linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-09 12:17:12 +09:00

Author	SHA1	Message	Date
Randy Dunlap	1ffb2ca650	ipmi: ASPEED_BT_IPMI_BMC: select REGMAP_MMIO instead of depending on it [ Upstream commit `2a587b9ad0` ] REGMAP is a hidden (not user visible) symbol. Users cannot set it directly thru "make *config", so drivers should select it instead of depending on it if they need it. Consistently using "select" or "depends on" can also help reduce Kconfig circular dependency issues. Therefore, change the use of "depends on REGMAP_MMIO" to "select REGMAP_MMIO", which will also set REGMAP. Fixes: `eb994594bc` ("ipmi: bt-bmc: Use a regmap for register access") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andrew Jeffery <andrew@aj.id.au> Cc: Corey Minyard <minyard@acm.org> Cc: openipmi-developer@lists.sourceforge.net Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20230226053953.4681-2-rdunlap@infradead.org> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Kuniyuki Iwashima	43e4197dd5	tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp. [ Upstream commit `50749f2dd6` ] syzkaller reported [0] memory leaks of an UDP socket and ZEROCOPY skbs. We can reproduce the problem with these sequences: sk = socket(AF_INET, SOCK_DGRAM, 0) sk.setsockopt(SOL_SOCKET, SO_TIMESTAMPING, SOF_TIMESTAMPING_TX_SOFTWARE) sk.setsockopt(SOL_SOCKET, SO_ZEROCOPY, 1) sk.sendto(b'', MSG_ZEROCOPY, ('127.0.0.1', 53)) sk.close() sendmsg() calls msg_zerocopy_alloc(), which allocates a skb, sets skb->cb->ubuf.refcnt to 1, and calls sock_hold(). Here, struct ubuf_info_msgzc indirectly holds a refcnt of the socket. When the skb is sent, __skb_tstamp_tx() clones it and puts the clone into the socket's error queue with the TX timestamp. When the original skb is received locally, skb_copy_ubufs() calls skb_unclone(), and pskb_expand_head() increments skb->cb->ubuf.refcnt. This additional count is decremented while freeing the skb, but struct ubuf_info_msgzc still has a refcnt, so __msg_zerocopy_callback() is not called. The last refcnt is not released unless we retrieve the TX timestamped skb by recvmsg(). Since we clear the error queue in inet_sock_destruct() after the socket's refcnt reaches 0, there is a circular dependency. If we close() the socket holding such skbs, we never call sock_put() and leak the count, sk, and skb. TCP has the same problem, and commit `e0c8bccd40` ("net: stream: purge sk_error_queue in sk_stream_kill_queues()") tried to fix it by calling skb_queue_purge() during close(). However, there is a small chance that skb queued in a qdisc or device could be put into the error queue after the skb_queue_purge() call. In __skb_tstamp_tx(), the cloned skb should not have a reference to the ubuf to remove the circular dependency, but skb_clone() does not call skb_copy_ubufs() for zerocopy skb. So, we need to call skb_orphan_frags_rx() for the cloned skb to call skb_copy_ubufs(). 
[0]: BUG: memory leak unreferenced object 0xffff88800c6d2d00 (size 1152): comm "syz-executor392", pid 264, jiffies 4294785440 (age 13.044s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 cd af e8 81 00 00 00 00 ................ 02 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<0000000055636812>] sk_prot_alloc+0x64/0x2a0 net/core/sock.c:2024 [<0000000054d77b7a>] sk_alloc+0x3b/0x800 net/core/sock.c:2083 [<0000000066f3c7e0>] inet_create net/ipv4/af_inet.c:319 [inline] [<0000000066f3c7e0>] inet_create+0x31e/0xe40 net/ipv4/af_inet.c:245 [<000000009b83af97>] __sock_create+0x2ab/0x550 net/socket.c:1515 [<00000000b9b11231>] sock_create net/socket.c:1566 [inline] [<00000000b9b11231>] __sys_socket_create net/socket.c:1603 [inline] [<00000000b9b11231>] __sys_socket_create net/socket.c:1588 [inline] [<00000000b9b11231>] __sys_socket+0x138/0x250 net/socket.c:1636 [<000000004fb45142>] __do_sys_socket net/socket.c:1649 [inline] [<000000004fb45142>] __se_sys_socket net/socket.c:1647 [inline] [<000000004fb45142>] __x64_sys_socket+0x73/0xb0 net/socket.c:1647 [<0000000066999e0e>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<0000000066999e0e>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<0000000017f238c1>] entry_SYSCALL_64_after_hwframe+0x63/0xcd BUG: memory leak unreferenced object 0xffff888017633a00 (size 240): comm "syz-executor392", pid 264, jiffies 4294785440 (age 13.044s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 2d 6d 0c 80 88 ff ff .........-m..... 
backtrace: [<000000002b1c4368>] __alloc_skb+0x229/0x320 net/core/skbuff.c:497 [<00000000143579a6>] alloc_skb include/linux/skbuff.h:1265 [inline] [<00000000143579a6>] sock_omalloc+0xaa/0x190 net/core/sock.c:2596 [<00000000be626478>] msg_zerocopy_alloc net/core/skbuff.c:1294 [inline] [<00000000be626478>] msg_zerocopy_realloc+0x1ce/0x7f0 net/core/skbuff.c:1370 [<00000000cbfc9870>] __ip_append_data+0x2adf/0x3b30 net/ipv4/ip_output.c:1037 [<0000000089869146>] ip_make_skb+0x26c/0x2e0 net/ipv4/ip_output.c:1652 [<00000000098015c2>] udp_sendmsg+0x1bac/0x2390 net/ipv4/udp.c:1253 [<0000000045e0e95e>] inet_sendmsg+0x10a/0x150 net/ipv4/af_inet.c:819 [<000000008d31bfde>] sock_sendmsg_nosec net/socket.c:714 [inline] [<000000008d31bfde>] sock_sendmsg+0x141/0x190 net/socket.c:734 [<0000000021e21aa4>] __sys_sendto+0x243/0x360 net/socket.c:2117 [<00000000ac0af00c>] __do_sys_sendto net/socket.c:2129 [inline] [<00000000ac0af00c>] __se_sys_sendto net/socket.c:2125 [inline] [<00000000ac0af00c>] __x64_sys_sendto+0xe1/0x1c0 net/socket.c:2125 [<0000000066999e0e>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<0000000066999e0e>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<0000000017f238c1>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: `f214f915e7` ("tcp: enable MSG_ZEROCOPY") Fixes: `b5947e5d1e` ("udp: msg_zerocopy") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Gencen Gan	1d2f799c16	net: amd: Fix link leak when verifying config failed [ Upstream commit `d325c34d9e` ] After failing to verify the configuration, the function returns directly without releasing the link, which may cause a memory leak. Paolo Abeni thinks that the whole code of this driver is quite "suboptimal" and looks unmaintained since at least ~15y, so he suggests that we could simply remove the whole driver; please take it into consideration. Simon Horman suggests that the fix label should be set to "Linux-2.6.12-rc2" considering that the problem has existed since the driver was introduced and the commit above doesn't seem to exist in net/net-next. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Gan Gecen <gangecen@hust.edu.cn> Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Kuniyuki Iwashima	5d6e5c054e	netlink: Use copy_to_user() for optval in netlink_getsockopt(). [ Upstream commit `d913d32cc2` ] Brad Spencer provided a detailed report [0] that when calling getsockopt() for AF_NETLINK, some SOL_NETLINK options set only 1 byte even though such options require at least sizeof(int) as length. The options return a flag value that fits into 1 byte, but such behaviour confuses users who do not initialise the variable before calling getsockopt() and do not strictly check the returned value as char. Currently, netlink_getsockopt() uses put_user() to copy data to optlen and optval, but put_user() casts the data based on the pointer, char *optval. As a result, only 1 byte is set to optval. To avoid this behaviour, we need to use copy_to_user() or cast optval for put_user(). Note that this changes the behaviour on big-endian systems, but we document that the size of optval is int in the man page. $ man 7 netlink ... Socket options To set or get a netlink socket option, call getsockopt(2) to read or setsockopt(2) to write the option with the option level argument set to SOL_NETLINK. Unless otherwise noted, optval is a pointer to an int. 
Fixes: `9a4595bc7e` ("[NETLINK]: Add set/getsockopt options to support more than 32 groups") Fixes: `be0c22a46c` ("netlink: add NETLINK_BROADCAST_ERROR socket option") Fixes: `38938bfe34` ("netlink: add NETLINK_NO_ENOBUFS socket flag") Fixes: `0a6a3a23ea` ("netlink: add NETLINK_CAP_ACK socket option") Fixes: `2d4bc93368` ("netlink: extended ACK reporting") Fixes: `89d35528d1` ("netlink: Add new socket option to enable strict checking on dumps") Reported-by: Brad Spencer <bspencer@blackberry.com> Link: https://lore.kernel.org/netdev/ZD7VkNWFfp22kTDt@datsun.rim.net/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Link: https://lore.kernel.org/r/20230421185255.94606-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Liu Jian	a789192f36	Revert "Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work" [ Upstream commit `db2bf510bd` ] This reverts commit `1e9ac114c4`. That patch introduces a possible null-ptr-deref problem, so revert it. The bug it fixed has already been resolved by commit `73f7b171b7` ("Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition"). Fixes: `1e9ac114c4` ("Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Ziyang Xuan	a54ec573d9	ipv4: Fix potential uninit variable access bug in __ip_make_skb() [ Upstream commit `99e5acae19` ] This is like commit `ea30388bae` ("ipv6: Fix an uninit variable access bug in __ip6_make_skb()"). With a SOCK_RAW socket, the icmphdr is not in the skb linear region, so accessing icmp_hdr(skb)->type directly triggers the uninit variable access bug. Use a local variable icmp_type to carry the correct value in different scenarios. Fixes: `96793b4825` ("[IPV4]: Add ICMPMsgStats MIB (RFC 4293)") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Davide Caratti	d0b43125ec	net/sched: sch_fq: fix integer overflow of "credit" [ Upstream commit `7041101ff6` ] If sch_fq is configured with "initial quantum" having values greater than INT_MAX, the first assignment of "credit" does signed integer overflow to a very negative value. In this situation, the syzkaller script provided by Christoph triggers the CPU soft-lockup warning even with few sockets. It's not an infinite loop, but "credit" probably wasn't meant to be minus 2Gb for each new flow. Capping "initial quantum" to INT_MAX proved to fix the issue. v2: validation of "initial quantum" is done in fq_policy, instead of open coding in fq_change(), as suggested by Jakub Kicinski Reported-by: Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/377 Fixes: `afe4fd0624` ("pkt_sched: fq: Fair Queue packet scheduler") Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7b3a3c7e36d03068707a021760a194a8eb5ad41a.1682002300.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:31 +09:00
Florian Westphal	7a45b4e1c8	netfilter: nf_tables: don't write table validation state without mutex [ Upstream commit `9a32e98506` ] The ->cleanup callback needs to be removed, this doesn't work anymore as the transaction mutex is already released in the ->abort function. Just do it after a successful validation pass, this either happens from commit or abort phases where transaction mutex is held. Fixes: `f102d66b33` ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Stanislav Fomichev	8913abddad	bpf: Don't EFAULT for getsockopt with optval=NULL [ Upstream commit `00e74ae086` ] Some socket options do getsockopt with optval=NULL to estimate the size of the final buffer (which is returned via optlen). This breaks BPF getsockopt assumptions about permitted optval buffer size. Let's enforce these assumptions only when non-NULL optval is provided. Fixes: `0d01da6afc` ("bpf: implement getsockopt and setsockopt hooks") Reported-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/ZD7Js4fj5YyI2oLd@google.com/T/#mb68daf700f87a9244a15d01d00c3f0e5b08f49f7 Link: https://lore.kernel.org/bpf/20230418225343.553806-2-sdf@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Yan Wang	77f245ce05	net: stmmac:fix system hang when setting up tag_8021q VLAN for DSA ports [ Upstream commit `35226750f7` ] The system hangs because of dsa_tag_8021q_port_setup() -> stmmac_vlan_rx_add_vid(). I found in stmmac_drv_probe() that calling pm_runtime_put() disabled the clock. First, when the kernel is compiled with CONFIG_PM=y, stmmac's resume/suspend is active. Secondly, with stmmac as DSA master, the dsa_tag_8021q_port_setup() function calls back stmmac_vlan_rx_add_vid() when the DSA driver starts. However, the system hangs because stmmac_vlan_rx_add_vid() accesses its registers after stmmac's clock is disabled. I would suggest adding pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid(). This guarantees that the clock output is resumed while in use. Fixes: `b3dcb31277` ("net: stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Yan Wang <rk.code@outlook.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Chris Mi	a9e96eef82	net/mlx5: E-switch, Don't destroy indirect table in split rule [ Upstream commit `4c81893025` ] Source port rewrite (forward to ovs internal port or stack device) isn't supported in the rule of split action, so there is no indirect table in a split rule. The cited commit destroys the indirect table in a split rule. As a result, the indirect table for other rules is destroyed wrongly, which causes traffic loss. Fix it by removing the destroy function in split rule. Also remove the destroy function in the error flow. Fixes: `10742efc20` ("net/mlx5e: VF tunnel TX traffic offloading") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Joe Damato	05cf6f353d	ixgbe: Enable setting RSS table to default values [ Upstream commit `e85d3d5587` ] ethtool uses `ETHTOOL_GRXRINGS` to compute how many queues are supported by RSS. The driver should return the smaller of either: - The maximum number of RSS queues the device supports, OR - The number of RX queues configured Prior to this change, running `ethtool -X $iface default` fails if the number of queues configured is larger than the number supported by RSS, even though changing the queue count correctly resets the flowhash to use all supported queues. Other drivers (for example, i40e) will succeed but the flow hash will reset to support the maximum number of queues supported by RSS, even if that amount is smaller than the configured amount. Prior to this change: $ sudo ethtool -L eth1 combined 20 $ sudo ethtool -x eth1 RX flow hash indirection table for eth1 with 20 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 10 11 12 13 14 15 16: 0 1 2 3 4 5 6 7 24: 8 9 10 11 12 13 14 15 32: 0 1 2 3 4 5 6 7 ... You can see that the flowhash was correctly set to use the maximum number of queues supported by the driver (16). However, asking the NIC to reset to "default" fails: $ sudo ethtool -X eth1 default Cannot set RX flow hash configuration: Invalid argument After this change, the flowhash can be reset to default which will use all of the available RSS queues (16) or the configured queue count, whichever is smaller. Starting with eth1 which has 10 queues and a flowhash distributing to all 10 queues: $ sudo ethtool -x eth1 RX flow hash indirection table for eth1 with 10 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 0 1 2 3 4 5 16: 6 7 8 9 0 1 2 3 ... Increasing the queue count to 48 resets the flowhash to distribute to 16 queues, as it did before this patch: $ sudo ethtool -L eth1 combined 48 $ sudo ethtool -x eth1 RX flow hash indirection table for eth1 with 16 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 10 11 12 13 14 15 16: 0 1 2 3 4 5 6 7 ... 
Due to the other bugfix in this series, the flowhash can be set to use queues 0-5: $ sudo ethtool -X eth1 equal 5 $ sudo ethtool -x eth1 RX flow hash indirection table for eth1 with 16 RX ring(s): 0: 0 1 2 3 4 0 1 2 8: 3 4 0 1 2 3 4 0 16: 1 2 3 4 0 1 2 3 ... Due to this bugfix, the flowhash can be reset to default and use 16 queues: $ sudo ethtool -X eth1 default $ sudo ethtool -x eth1 RX flow hash indirection table for eth1 with 16 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 10 11 12 13 14 15 16: 0 1 2 3 4 5 6 7 ... Fixes: `91cd94bfe4` ("ixgbe: add basic support for setting and getting nfc controls") Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Joe Damato	624b73f776	ixgbe: Allow flow hash to be set via ethtool [ Upstream commit `4f3ed1293f` ] ixgbe currently returns `EINVAL` whenever the flowhash is set by ethtool because the ethtool code in the kernel passes a non-zero value for hfunc that ixgbe should allow. When ethtool is called with `ETHTOOL_SRXFHINDIR`, `ethtool_set_rxfh_indir` will call ixgbe's set_rxfh function with `ETH_RSS_HASH_NO_CHANGE`. This value should be accepted. When ethtool is called with `ETHTOOL_SRSSH`, `ethtool_set_rxfh` will call ixgbe's set_rxfh function with `rxfh.hfunc`, which appears to be hardcoded in ixgbe to always be `ETH_RSS_HASH_TOP`. This value should also be accepted. Before this patch: $ sudo ethtool -L eth1 combined 10 $ sudo ethtool -X eth1 default Cannot set RX flow hash configuration: Invalid argument After this patch: $ sudo ethtool -L eth1 combined 10 $ sudo ethtool -X eth1 default $ sudo ethtool -x eth1 RX flow hash indirection table for eth1 with 10 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 8 9 0 1 2 3 4 5 16: 6 7 8 9 0 1 2 3 24: 4 5 6 7 8 9 0 1 ... Fixes: `1c7cf0784e` ("ixgbe: support for ethtool set_rxfh") Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Johannes Berg	e302e9ca14	wifi: iwlwifi: fw: fix memory leak in debugfs [ Upstream commit `3d90d2f4a0` ] Fix a memory leak that occurs when reading the fw_info file all the way, since we return NULL indicating no more data, but don't free the status tracking object. Fixes: `36dfe9ac6e` ("iwlwifi: dump api version in yaml format") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230418122405.239e501b3b8d.I4268f87809ef91209cbcd748eee0863195e70fa2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Johannes Berg	53b3b1f563	wifi: iwlwifi: mvm: check firmware response size [ Upstream commit `13513cec93` ] Check the firmware response size for responses to the memory read/write command in debugfs before using it. Fixes: `2b55f43f8e` ("iwlwifi: mvm: Add mem debugfs entry") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230417113648.0d56fcaf68ee.I70e9571f3ed7263929b04f8fabad23c9b999e4ea@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Quan Zhou	aa11a89445	wifi: mt76: mt7921e: improve reliability of dma reset [ Upstream commit `87714bf6ed` ] The hardware team has advised the driver that it is necessary to first put WFDMA into an idle state before resetting the WFDMA. Otherwise, the WFDMA may enter an unknown state where it cannot be polled with the right state successfully. To ensure that the DMA can work properly while a stressful cold reboot test was being made, we have reordered the programming sequence in the driver based on the hardware team's guidance. The patch modifies the WFDMA disabling flow from "DMA reset -> disabling DMASHDL -> disabling WFDMA -> polling and waiting until DMA idle" to "disabling WFDMA -> polling and waiting for DMA idle -> disabling DMASHDL -> DMA reset". The polling and waiting until WFDMA is idle is coordinated with the operation of disabling WFDMA. Even while WFDMA is being disabled, it can still handle Tx/Rx requests. The additional polling allows sufficient time for WFDMA to process the last Tx/Rx request. When the idle state of WFDMA is reached, it is a reliable indication that DMASHDL is also idle, so it is safe to disable it and perform the DMA reset. Fixes: `0a1059d0f0` ("mt76: mt7921: move mt7921_dma_reset in dma.c") Co-developed-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Co-developed-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Co-developed-by: Wang Zhao <wang.zhao@mediatek.com> Signed-off-by: Wang Zhao <wang.zhao@mediatek.com> Signed-off-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Ming Yen Hsieh	f8923ad9dd	wifi: mt76: fix 6GHz high channel not be scanned [ Upstream commit `23792cedaf` ] The mt76 scan command currently supports only 64 channels. If the channel count is larger than 64 (for 2+5+6GHz), some channels will not be scanned. Hence change the scan type to full channel scan when the command cannot include a proper channel list for the chip. Fixes: `399090ef96` ("mt76: mt76_connac: move hw_scan and sched_scan routine in mt76_connac_mcu module") Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Isaac Konikoff <konikofi@candelatech.com> Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Quan Zhou	613b51663f	wifi: mt76: mt7921e: fix probe timeout after reboot [ Upstream commit `c397fc1e63` ] During a system warm reboot, the polling timeout (now 1000us) is too short to wait for the dma to become idle in time, so the driver probe may fail with error code -ETIMEDOUT. Meanwhile, we also found the dma may take around 70ms to enter idle state. Change the polling idle timeout to 100ms to avoid the probabilistic probe failure. Tested with 5000 warm reboots on an x86 platform; all passed. [4.477496] pci 0000:01:00.0: attach allowed to drvr mt7921e [internal device] [4.478306] mt7921e 0000:01:00.0: ASIC revision: 79610010 [4.480063] mt7921e: probe of 0000:01:00.0 failed with error -110 Fixes: `0a1059d0f0` ("mt76: mt7921: move mt7921_dma_reset in dma.c") Signed-off-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Deren Wu	5279aaf9f5	wifi: mt76: add flexible polling wait-interval support [ Upstream commit `35effe6c0c` ] The default waiting unit is 10ms and the value is too much for data path related control. Provide a new API mt76_poll_msec_tick() to support different cases, such as 1ms polling waiting kick. Reviewed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Stable-dep-of: `c397fc1e63` ("wifi: mt76: mt7921e: fix probe timeout after reboot") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Kang Chen	ac9fec5b56	wifi: mt76: handle failure of vzalloc in mt7615_coredump_work [ Upstream commit `9e47dd9f64` ] vzalloc may fail, in which case dump will be NULL and cause an illegal address access later. Link: https://lore.kernel.org/all/Y%2Fy5Asxw3T3m4jCw@lore-desk Fixes: `d2bf7959d9` ("mt76: mt7663: introduce coredump support") Signed-off-by: Kang Chen <void0red@gmail.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Emmanuel Grumbach	210e6d01cc	wifi: iwlwifi: make the loop for card preparation effective [ Upstream commit `28965ec0b5` ] Since we didn't reset t to 0, only the first iteration of the loop checked the ready bit several times. From the second iteration on, we just tested the bit once and continued to the next iteration. Reported-and-tested-by: Lorenzo Zolfanelli <lorenzo@zolfa.nl> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216452 Fixes: `289e5501c3` ("iwlwifi: fix the preparation of the card") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230416154301.615b683ab9c8.Ic52c3229d3345b0064fa34263293db095d88daf8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:30 +09:00
Jan Kara	dff2a7b330	jdb2: Don't refuse invalidation of already invalidated buffers [ Upstream commit `bd159398a2` ] When invalidating buffers under the partial tail page, jbd2_journal_invalidate_folio() returns -EBUSY if the buffer is part of the committing transaction as we cannot safely modify buffer state. However if the buffer is already invalidated (due to previous invalidation attempts from ext4_wait_for_tail_page_commit()), there's nothing to do and there's no point in returning -EBUSY. This fixes occasional warnings from ext4_journalled_invalidate_folio() triggered by generic/051 fstest when blocksize < pagesize. Fixes: `53e872681f` ("ext4: fix deadlock in journal_unmap_buffer()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Tom Rix	358317ad9c	wifi: iwlwifi: fw: move memset before early return [ Upstream commit `8ce437dd5b` ] Clang static analysis reports this representative issue dbg.c:1455:6: warning: Branch condition evaluates to a garbage value if (!rxf_data.size) ^~~~~~~~~~~~~~ This check depends on iwl_ini_get_rxf_data() to clear rxf_data but the function can return early without doing the clear. So move the memset before the early return. Fixes: `cc9b6012d3` ("iwlwifi: yoyo: use hweight_long instead of bit manipulating") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230414130637.872a7175f1ff.I33802a77a91998276992b088fbe25f61c87c33ac@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Tom Rix	cccf85e047	wifi: iwlwifi: mvm: initialize seq variable [ Upstream commit `11e94d2bcd` ] Clang static analysis reports this issue d3.c:567:22: warning: The left operand of '>' is a garbage value if (seq.tkip.iv32 > cur_rx_iv32) ~~~~~~~~~~~~~ ^ seq is never initialized. Call ieee80211_get_key_rx_seq() to initialize seq. Fixes: `0419e5e672` ("iwlwifi: mvm: d3: separate TKIP data from key iteration") Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230414130637.6dd372f84f93.If1f708c90e6424a935b4eba3917dfb7582e0dd0a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Daniel Gabay	b3cecbb257	wifi: iwlwifi: yoyo: Fix possible division by zero [ Upstream commit `ba30415118` ] Don't allow buffer allocation TLV with zero req_size since it leads later to division by zero in iwl_dbg_tlv_alloc_fragments(). Also, NPK/SRAM locations are allowed to have zero buffer req_size, don't discard them. Fixes: `a9248de424` ("iwlwifi: dbg_ini: add TLV allocation new API support") Signed-off-by: Daniel Gabay <daniel.gabay@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230413213309.5d6688ed74d8.I5c2f3a882b50698b708d54f4524dc5bdf11e3d32@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Daniel Gabay	4636c35b7e	wifi: iwlwifi: yoyo: skip dump correctly on hw error [ Upstream commit `11195ab0d6` ] When NIC is in a bad state, reading data will return 28 bits as 0xa5a5a5a and the lowest 4 bits are not fixed value. Mask these bits in a few places to skip the dump correctly. Fixes: `89639e06d0` ("iwlwifi: yoyo: support for new DBGI_SRAM region") Signed-off-by: Daniel Gabay <daniel.gabay@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230413213309.df6c0663179d.I36d8487b2419c6fefa65e5514855d94327c3b1eb@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Yu Kuai	34222897e0	md/raid10: don't call bio_start_io_acct twice for bio which experienced read error [ Upstream commit `7cddb055bf` ] handle_read_error() will resubmit r10_bio by raid10_read_request(), which will call bio_start_io_acct() again, while bio_end_io_acct() will only be called once. Fix the problem by not accounting io again from handle_read_error(). Fixes: `528bc2cf2f` ("md/raid10: enable io accounting") Suggested-by: Song Liu <song@kernel.org> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230314012258.2395894-1-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Yu Kuai	d6cfcf98b8	md/raid10: fix memleak of md thread [ Upstream commit `f0ddb83da3` ] In raid10_run(), if setup_conf() succeeds and raid10_run() fails before setting 'mddev->thread', then in the error path 'conf->thread' is not freed. Fix the problem by setting 'mddev->thread' right after setup_conf(). Fixes: `43a521238a` ("md-cluster: choose correct label when clustered layout is not supported") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230310073855.1337560-7-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Yu Kuai	7f673fa34c	md/raid10: fix memleak for 'conf->bio_split' [ Upstream commit `c9ac2acde5` ] In the error path of raid10_run(), 'conf' needs to be freed; however, 'conf->bio_split' is missed and memory will be leaked. Since there are 3 places to free 'conf', factor out a helper to fix the problem. Fixes: `fc9977dd06` ("md/raid10: simplify the splitting of requests.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230310073855.1337560-6-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Yu Kuai	8d09065802	md/raid10: fix leak of 'r10bio->remaining' for recovery [ Upstream commit `26208a7cff` ] raid10_sync_request() will add 'r10bio->remaining' for both rdev and replacement rdev. However, if the read io fails, recovery_request_write() returns without issuing the write io, in this case, end_sync_request() is only called once and 'remaining' is leaked, cause an io hang. Fix the problem by decreasing 'remaining' according to if 'bio' and 'repl_bio' is valid. Fixes: `24afd80d99` ("md/raid10: handle recovery of replacement devices.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230310073855.1337560-5-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Li Nan	901b4918fa	md/raid10: fix task hung in raid10d [ Upstream commit `72c215ed87` ] commit `fe630de009` ("md/raid10: avoid deadlock on recovery.") allowed normal io and sync io to exist at the same time. Task hung will occur as below: T1 T2 T3 T4 raid10d handle_read_error allow_barrier conf->nr_pending-- -> 0 //submit sync io raid10_sync_request raise_barrier ->will not be blocked ... //submit to drivers raid10_read_request wait_barrier conf->nr_pending++ -> 1 //retry read fail raid10_end_read_request reschedule_retry add to retry_list conf->nr_queued++ -> 1 //sync io fail end_sync_read __end_sync_read reschedule_retry add to retry_list conf->nr_queued++ -> 2 ... handle_read_error get from retry_list conf->nr_queued-- freeze_array wait nr_pending == nr_queued+1 ->1 ->2 //task hung retry read and sync io will be added to retry_list(nr_queued->2) if they fail. raid10d() called handle_read_error() and hung in freeze_array(). nr_queued will not decrease because raid10d is blocked, nr_pending will not increase because conf->barrier is not released. Fix it by moving allow_barrier() after raid10_read_request(). raise_barrier() will wait for nr_waiting to become 0. Therefore, sync io and regular io will not be issued at the same time. Also remove the check of nr_queued in stop_waiting_barrier. It can be 0 but don't need to be blocking. Remove the check for MD_RECOVERY_RUNNING as the check is redundant. Fixes: `fe630de009` ("md/raid10: avoid deadlock on recovery.") Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230222041000.3341651-2-linan666@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Yu Kuai	fc04998351	md/raid10: factor out code from wait_barrier() to stop_waiting_barrier() [ Upstream commit `ed2e063f92` ] Currently the nasty condition in wait_barrier() is hard to read. This patch factors out the condition into a function. There are no functional changes. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org> Stable-dep-of: `72c215ed87` ("md/raid10: fix task hung in raid10d") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Vishal Verma	39db562b3f	md: raid10 add nowait support [ Upstream commit `c9aa889b03` ] This adds nowait support to the RAID10 driver. Very similar to raid1 driver changes. It makes RAID10 driver return with EAGAIN for situations where it could wait for eg: - Waiting for the barrier, - Reshape operation, - Discard operation. wait_barrier() and regular_request_wait() fn are modified to return bool to support error for wait barriers. They returns true in case of wait or if wait is not required and returns false if wait was required but not performed to support nowait. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Vishal Verma <vverma@digitalocean.com> Signed-off-by: Song Liu <song@kernel.org> Stable-dep-of: `72c215ed87` ("md/raid10: fix task hung in raid10d") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Mariusz Tkaczyk	74af08efa5	md: drop queue limitation for RAID1 and RAID10 [ Upstream commit `a92ce0feff` ] As suggested by Neil Brown[1], this limitation seems to be deprecated. With plugging in use, writes are processed behind the raid thread and conf->pending_count is not increased. This limitation occurs only if caller doesn't use plugs. It can be avoided and often it is (with plugging). There are no reports that queue is growing to enormous size so remove queue limitation for non-plugged IOs too. [1] https://lore.kernel.org/linux-raid/162496301481.7211.18031090130574610495@noble.neil.brown.name Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Song Liu <song@kernel.org> Stable-dep-of: `72c215ed87` ("md/raid10: fix task hung in raid10d") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Daniel Borkmann	337d1d88be	bpf, sockmap: Revert buggy deadlock fix in the sockhash and sockmap [ Upstream commit `8c5c2a4898` ] syzbot reported a splat and bisected it to recent commit `ed17aa92dc` ("bpf, sockmap: fix deadlocks in the sockhash and sockmap"): [...] WARNING: CPU: 1 PID: 9280 at kernel/softirq.c:376 __local_bh_enable_ip+0xbe/0x130 kernel/softirq.c:376 Modules linked in: CPU: 1 PID: 9280 Comm: syz-executor.1 Not tainted 6.2.0-syzkaller-13249-gd319f344561d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023 RIP: 0010:__local_bh_enable_ip+0xbe/0x130 kernel/softirq.c:376 [...] Call Trace: <TASK> spin_unlock_bh include/linux/spinlock.h:395 [inline] sock_map_del_link+0x2ea/0x510 net/core/sock_map.c:165 sock_map_unref+0xb0/0x1d0 net/core/sock_map.c:184 sock_hash_delete_elem+0x1ec/0x2a0 net/core/sock_map.c:945 map_delete_elem kernel/bpf/syscall.c:1536 [inline] __sys_bpf+0x2edc/0x53e0 kernel/bpf/syscall.c:5053 __do_sys_bpf kernel/bpf/syscall.c:5166 [inline] __se_sys_bpf kernel/bpf/syscall.c:5164 [inline] __x64_sys_bpf+0x79/0xc0 kernel/bpf/syscall.c:5164 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe8f7c8c169 </TASK> [...] Revert for now until we have a proper solution. Fixes: `ed17aa92dc` ("bpf, sockmap: fix deadlocks in the sockhash and sockmap") Reported-by: syzbot+49f6cef45247ff249498@syzkaller.appspotmail.com Cc: Hsin-Wei Hung <hsinweih@uci.edu> Cc: Xin Liu <liuxin350@huawei.com> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/000000000000f1db9605f939720e@google.com/ Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:29 +09:00
Song Liu	12e70c6f4e	selftests/bpf: Fix leaked bpf_link in get_stackid_cannot_attach [ Upstream commit `c1e07a80cf` ] skel->links.oncpu is leaked in one case. This causes test perf_branches fails when it runs after get_stackid_cannot_attach: ./test_progs -t get_stackid_cannot_attach,perf_branches 84 get_stackid_cannot_attach:OK test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:FAIL:output not valid no valid sample from prog 146/1 perf_branches/perf_branches_hw:FAIL 146/2 perf_branches/perf_branches_no_hw:OK 146 perf_branches:FAIL All error logs: test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:FAIL:output not valid no valid sample from prog 146/1 perf_branches/perf_branches_hw:FAIL 146 perf_branches:FAIL Summary: 1/1 PASSED, 0 SKIPPED, 1 FAILED Fix this by adding the missing bpf_link__destroy(). Fixes: `346938e938` ("selftests/bpf: Add get_stackid_cannot_attach") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-3-song@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Ming Lei	103a427542	nvme-fcloop: fix "inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage" [ Upstream commit `4f86a6ff6f` ] fcloop_fcp_op() could be called from flush request's ->end_io(flush_end_io) in which the spinlock of fq->mq_flush_lock is grabbed with irq saved/disabled. So fcloop_fcp_op() can't call spin_unlock_irq(&tfcp_req->reqlock) simply which enables irq unconditionally. Fixes the warning by switching to spin_lock_irqsave()/spin_unlock_irqrestore() Fixes: `c38dbbfab1` ("nvme-fcloop: fix inconsistent lock state warnings") Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Keith Busch	9fe41e6482	nvme: fix async event trace event [ Upstream commit `6622b76fe9` ] Mixing AER Event Type and Event Info has masking clashes. Just print the event type, but also include the event info of the AER result in the trace. Fixes: `09bd1ff4b1` ("nvme-core: add async event trace helper") Reported-by: Nate Thornton <nate.thornton@samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Michael Kelley	13475e6391	nvme: handle the persistent internal error AER [ Upstream commit `2c61c97fb1` ] In the NVM Express Revision 1.4 spec, Figure 145 describes possible values for an AER with event type "Error" (value 000b). For a Persistent Internal Error (value 03h), the host should perform a controller reset. Add support for this error using code that already exists for doing a controller reset. As part of this support, introduce two utility functions for parsing the AER type and subtype. This new support was tested in a lab environment where we can generate the persistent internal error on demand, and observe both the Linux side and NVMe controller side to see that the controller reset has been done. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: `6622b76fe9` ("nvme: fix async event trace event") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Damien Le Moal	30b9073583	nvmet: fix I/O Command Set specific Identify Controller [ Upstream commit `a5a6ab0950` ] For an identify command with cns set to NVME_ID_CNS_CS_CTRL, the NVMe 2.0 specification states that: If the I/O Command Set specified by the CSI field does not have an Identify Controller data structure, then the controller shall return a zero filled data structure. If the host requests a data structure for an I/O Command Set that the controller does not support, the controller shall abort the command with a status code of Invalid Field in Command. However, the current implementation of this identify command in nvmet_execute_identify() only handles the ZNS command set, returning an error for the NVM command set, which is not compliant with the specifications as we do support this command set. Fix this by: 1) Renaming nvmet_execute_identify_cns_cs_ctrl() to nvmet_execute_identify_ctrl_zns() to continue handling the ZNS command set as is. 2) Introduce a nvmet_execute_identify_ctrl_ns() helper to handle the NVM command set, returning a zero filled nvme_id_ctrl_nvm data structure. 3) Modify nvmet_execute_identify() to call these helpers based on the csi specified, returning an error for unsupported command sets. Fixes: `aaf2e048af` ("nvmet: add ZBD over ZNS backend support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Damien Le Moal	42bcbc2a90	nvmet: fix Identify Active Namespace ID list handling [ Upstream commit `97416f67d5` ] The identify command with cns set to NVME_ID_CNS_NS_ACTIVE_LIST does not depend on the command set. The execution of this command should thus not look at the csi field specified in the command. Simplify nvmet_execute_identify() to directly call nvmet_execute_identify_nslist() without the csi switch-case. Fixes: `ab5d0b38c0` ("nvmet: add Command Set Identifier support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Damien Le Moal	92cf81746e	nvmet: fix Identify Controller handling [ Upstream commit `62904b3b33` ] The identify command with cns set to NVME_ID_CNS_CTRL does not depend on the command set. The execution of this command should thus not look at the csi specified in the command. Simplify nvmet_execute_identify() to directly call nvmet_execute_identify_ctrl() without the csi switch-case. Fixes: `ab5d0b38c0` ("nvmet: add Command Set Identifier support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Damien Le Moal	ac86d59eaa	nvmet: fix Identify Namespace handling [ Upstream commit `8c098aa001` ] The identify command with cns set to NVME_ID_CNS_NS does not directly depend on the command set. The NVMe specifications is rather confusing here as it appears that this command only applies to the NVM command set. However, footnote 8 of Figure 273 in the NVMe 2.0 base specifications clearly state that this command applies to NVM command sets that support logical blocks, that is, NVM and ZNS. Both the NVM and ZNS command set specifications also list this identify as mandatory. The command handling should thus not look at the csi field since it is defined as unused for this command. Given that we do not support the KV command set, simply remove the csi switch-case for that command handling and call directly nvmet_execute_identify_ns() in nvmet_execute_identify(). Fixes: `ab5d0b38c0` ("nvmet: add Command Set Identifier support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Damien Le Moal	c7e98afeca	nvmet: fix error handling in nvmet_execute_identify_cns_cs_ns() [ Upstream commit `ab76e7206b` ] NVMe specifications state that: If the I/O Command Set associated with the namespace identified by the NSID field does not support the Identify Namespace data structure specified by the CSI field, the controller shall abort the command with a status code of Invalid Field in Command. In other words, if nvmet_execute_identify_cns_cs_ns() is called for a target with a block device that is not zoned, we should not return any data and set the status to NVME_SC_INVALID_FIELD. While at it, it is also better to revalidate the ns block device before checking if the block device is zoned, to ensure that nvmet_execute_identify_cns_cs_ns() operates against updated device characteristics. Fixes: `aaf2e048af` ("nvmet: add ZBD over ZNS backend support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Christoph Hellwig	537083b127	nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate [ Upstream commit `da78373396` ] nvmet_ns_changed states via lockdep that the ns->subsys->lock must be held. The only caller of nvmet_ns_changed which does not acquire that lock is nvmet_ns_revalidate. nvmet_ns_revalidate has 3 callers, of which 2 do not acquire that lock: nvmet_execute_identify_cns_cs_ns and nvmet_execute_identify_ns. The other caller nvmet_ns_revalidate_size_store does acquire the lock. Move the call to nvmet_ns_changed from nvmet_ns_revalidate to the callers so that they can perform the correct locking as needed. This issue was found using a static type-based analyser and manually verified. Reported-by: Niels Dossche <dossche.niels@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Stable-dep-of: `ab76e7206b` ("nvmet: fix error handling in nvmet_execute_identify_cns_cs_ns()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Chaitanya Kulkarni	080826d167	nvmet: use i_size_read() to set size for file-ns [ Upstream commit `2caecd62ea` ] Instead of calling vfs_getattr() use i_size_read() to read the size of file so we can read the size of not only file type but also block type with one call. This is needed to implement buffered_io support for the NVMeOF block device backend. We also change return type of function nvmet_file_ns_revalidate() from int to void, since this function does not return any meaningful value. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: `ab76e7206b` ("nvmet: fix error handling in nvmet_execute_identify_cns_cs_ns()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Xin Liu	f333854dce	bpf, sockmap: fix deadlocks in the sockhash and sockmap [ Upstream commit `ed17aa92dc` ] When huang uses sched_switch tracepoint, the tracepoint does only one thing in the mounted ebpf program, which deletes the fixed elements in sockhash ([0]) It seems that elements in sockhash are rarely actively deleted by users or ebpf program. Therefore, we do not pay much attention to their deletion. Compared with hash maps, sockhash only provides spin_lock_bh protection. This causes it to appear to have self-locking behavior in the interrupt context. [0]:https://lore.kernel.org/all/CABcoxUayum5oOqFMMqAeWuS8+EzojquSOSyDA3J_2omY=2EeAg@mail.gmail.com/ Reported-by: Hsin-Wei Hung <hsinweih@uci.edu> Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Xin Liu <liuxin350@huawei.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20230406122622.109978-1-liuxin350@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Sebastian Reichel	c8a67bc857	net: ethernet: stmmac: dwmac-rk: fix optional phy regulator handling [ Upstream commit `db21973263` ] The usual devm_regulator_get() call already handles "optional" regulators by returning a valid dummy and printing a warning that the dummy regulator should be described properly. This code open coded the same behaviour, but masked any errors that are not -EPROBE_DEFER and is quite noisy. This change effectively unmasks and propagates regulators errors not involving -ENODEV, downgrades the error print to warning level if no regulator is specified and captures the probe defer message for /sys/kernel/debug/devices_deferred. Fixes: `2e12f53663` ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator") Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Shuchang Li	fd8c83d837	scsi: lpfc: Fix ioremap issues in lpfc_sli4_pci_mem_setup() [ Upstream commit `91a0c0c141` ] When if_type equals zero and pci_resource_start(pdev, PCI_64BIT_BAR4) returns false, drbl_regs_memmap_p is not remapped. This passes a NULL pointer to iounmap(), which can trigger a WARN() on certain arches. When if_type equals six and pci_resource_start(pdev, PCI_64BIT_BAR4) returns true, drbl_regs_memmap_p may has been remapped and ctrl_regs_memmap_p is not remapped. This is a resource leak and passes a NULL pointer to iounmap(). To fix these issues, we need to add null checks before iounmap(), and change some goto labels. Fixes: `1351e69fc6` ("scsi: lpfc: Add push-to-adapter support to sli4") Signed-off-by: Shuchang Li <lishuchang@hust.edu.cn> Link: https://lore.kernel.org/r/20230404072133.1022-1-lishuchang@hust.edu.cn Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:28 +09:00
Chao Yu	9a7f63283a	f2fs: fix to avoid use-after-free for cached IPU bio [ Upstream commit `5cdb422c83` ] xfstest generic/019 reports a bug: kernel BUG at mm/filemap.c:1619! RIP: 0010:folio_end_writeback+0x8a/0x90 Call Trace: end_page_writeback+0x1c/0x60 f2fs_write_end_io+0x199/0x420 bio_endio+0x104/0x180 submit_bio_noacct+0xa5/0x510 submit_bio+0x48/0x80 f2fs_submit_write_bio+0x35/0x300 f2fs_submit_merged_ipu_write+0x2a0/0x2b0 f2fs_write_single_data_page+0x838/0x8b0 f2fs_write_cache_pages+0x379/0xa30 f2fs_write_data_pages+0x30c/0x340 do_writepages+0xd8/0x1b0 __writeback_single_inode+0x44/0x370 writeback_sb_inodes+0x233/0x4d0 __writeback_inodes_wb+0x56/0xf0 wb_writeback+0x1dd/0x2d0 wb_workfn+0x367/0x4a0 process_one_work+0x21d/0x430 worker_thread+0x4e/0x3c0 kthread+0x103/0x130 ret_from_fork+0x2c/0x50 The root cause is: after cp_error is set, f2fs_submit_merged_ipu_write() in f2fs_write_single_data_page() tries to flush IPU bio in cache, however f2fs_submit_merged_ipu_write() missed to check validity of @bio parameter, result in submitting random cached bio which belong to other IO context, then it will cause use-after-free issue, fix it by adding additional validity check. Fixes: `0b20fcec86` ("f2fs: cache global IPU bio") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-11 23:00:27 +09:00

1060974 Commits