linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-05 18:41:58 +09:00

Author	SHA1	Message	Date
Eric Dumazet	0317c8322d	net: annotate data-race around sk->sk_txrehash [ Upstream commit `c76a032889` ] sk_getsockopt() runs locklessly. This means sk->sk_txrehash can be read while other threads are changing its value. Other locations were handled in commit `cb6cd2cec7` ("tcp: Change SYN ACK retransmit behaviour to account for rehash") Fixes: `26859240e4` ("txhash: Add socket option to control TX hash rethink behavior") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Akhmat Karakotov <hmukos@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Eric Dumazet	60d92bc9c0	net: annotate data-races around sk->sk_reserved_mem [ Upstream commit `fe11fdcb42` ] sk_getsockopt() runs locklessly. This means sk->sk_reserved_mem can be read while other threads are changing its value. Add missing annotations where they are needed. Fixes: `2bb2f5fb21` ("net: add new socket option SO_RESERVE_MEM") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Konstantin Khorenko	9da9ea9b13	qed: Fix scheduling in a tasklet while getting stats [ Upstream commit `e346e231b4` ] Here we've got to a situation when tasklet called usleep_range() in PTT acquire logic, thus welcome to the "scheduling while atomic" BUG(). BUG: scheduling while atomic: swapper/24/0/0x00000100 [<ffffffffb41c6199>] schedule+0x29/0x70 [<ffffffffb41c5512>] schedule_hrtimeout_range_clock+0xb2/0x150 [<ffffffffb41c55c3>] schedule_hrtimeout_range+0x13/0x20 [<ffffffffb41c3bcf>] usleep_range+0x4f/0x70 [<ffffffffc08d3e58>] qed_ptt_acquire+0x38/0x100 [qed] [<ffffffffc08eac48>] _qed_get_vport_stats+0x458/0x580 [qed] [<ffffffffc08ead8c>] qed_get_vport_stats+0x1c/0xd0 [qed] [<ffffffffc08dffd3>] qed_get_protocol_stats+0x93/0x100 [qed] qed_mcp_send_protocol_stats case MFW_DRV_MSG_GET_LAN_STATS: case MFW_DRV_MSG_GET_FCOE_STATS: case MFW_DRV_MSG_GET_ISCSI_STATS: case MFW_DRV_MSG_GET_RDMA_STATS: [<ffffffffc08e36d8>] qed_mcp_handle_events+0x2d8/0x890 [qed] qed_int_assertion qed_int_attentions [<ffffffffc08d9490>] qed_int_sp_dpc+0xa50/0xdc0 [qed] [<ffffffffb3aa7623>] tasklet_action+0x83/0x140 [<ffffffffb41d9125>] __do_softirq+0x125/0x2bb [<ffffffffb41d560c>] call_softirq+0x1c/0x30 [<ffffffffb3a30645>] do_softirq+0x65/0xa0 [<ffffffffb3aa78d5>] irq_exit+0x105/0x110 [<ffffffffb41d8996>] do_IRQ+0x56/0xf0 Fix this by making caller to provide the context whether it could be in atomic context flow or not when getting stats from QED driver. QED driver based on the context provided decide to schedule out or not when acquiring the PTT BAR window. We faced the BUG_ON() while getting vport stats, but according to the code same issue could happen for fcoe and iscsi statistics as well, so fixing them too. Fixes: `6c75424612` ("qed: Add support for NCSI statistics.") Fixes: `1e128c8129` ("qed: Add support for hardware offloaded FCoE.") Fixes: `2f2b2614e8` ("qed: Provide iSCSI statistics to management") Cc: Sudarsana Kalluru <skalluru@marvell.com> Cc: David Miller <davem@davemloft.net> Cc: Manish Chopra <manishc@marvell.com> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Chengfeng Ye	3c42307abe	mISDN: hfcpci: Fix potential deadlock on &hc->lock [ Upstream commit `56c6be35fc` ] As &hc->lock is acquired by both timer _hfcpci_softirq() and hardirq hfcpci_int(), the timer should disable irq before lock acquisition otherwise deadlock could happen if the timmer is preemtped by the hadr irq. Possible deadlock scenario: hfcpci_softirq() (timer) -> _hfcpci_softirq() -> spin_lock(&hc->lock); <irq interruption> -> hfcpci_int() -> spin_lock(&hc->lock); (deadlock here) This flaw was found by an experimental static analysis tool I am developing for irq-related deadlock. The tentative patch fixes the potential deadlock by spin_lock_irq() in timer. Fixes: `b36b654a7e` ("mISDN: Create /sys/class/mISDN") Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Link: https://lore.kernel.org/r/20230727085619.7419-1-dg573847474@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Jamal Hadi Salim	d652c080b6	net: sched: cls_u32: Fix match key mis-addressing [ Upstream commit `e68409db99` ] A match entry is uniquely identified with an "address" or "path" in the form of: hashtable ID(12b):bucketid(8b):nodeid(12b). When creating table match entries all of hash table id, bucket id and node (match entry id) are needed to be either specified by the user or reasonable in-kernel defaults are used. The in-kernel default for a table id is 0x800(omnipresent root table); for bucketid it is 0x0. Prior to this fix there was none for a nodeid i.e. the code assumed that the user passed the correct nodeid and if the user passes a nodeid of 0 (as Mingi Cho did) then that is what was used. But nodeid of 0 is reserved for identifying the table. This is not a problem until we dump. The dump code notices that the nodeid is zero and assumes it is referencing a table and therefore references table struct tc_u_hnode instead of what was created i.e match entry struct tc_u_knode. Ming does an equivalent of: tc filter add dev dummy0 parent 10: prio 1 handle 0x1000 \ protocol ip u32 match ip src 10.0.0.1/32 classid 10:1 action ok Essentially specifying a table id 0, bucketid 1 and nodeid of zero Tableid 0 is remapped to the default of 0x800. Bucketid 1 is ignored and defaults to 0x00. Nodeid was assumed to be what Ming passed - 0x000 dumping before fix shows: ~$ tc filter ls dev dummy0 parent 10: filter protocol ip pref 1 u32 chain 0 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor -30591 Note that the last line reports a table instead of a match entry (you can tell this because it says "ht divisor..."). As a result of reporting the wrong data type (misinterpretting of struct tc_u_knode as being struct tc_u_hnode) the divisor is reported with value of -30591. Ming identified this as part of the heap address (physmap_base is 0xffff8880 (-30591 - 1)). The fix is to ensure that when table entry matches are added and no nodeid is specified (i.e nodeid == 0) then we get the next available nodeid from the table's pool. After the fix, this is what the dump shows: $ tc filter ls dev dummy0 parent 10: filter protocol ip pref 1 u32 chain 0 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1 filter protocol ip pref 1 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:1 not_in_hw match 0a000001/ffffffff at 12 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 Reported-by: Mingi Cho <mgcho.minic@gmail.com> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20230726135151.416917-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Georg Müller	22709d8537	perf test uprobe_from_different_cu: Skip if there is no gcc [ Upstream commit `98ce8e4a9d` ] Without gcc, the test will fail. On cleanup, ignore probe removal errors. Otherwise, in case of an error adding the probe, the temporary directory is not removed. Fixes: `56cbeacf14` ("perf probe: Add test for regression introduced by switch to die_get_decl_file()") Signed-off-by: Georg Müller <georgmueller@gmx.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Georg Müller <georgmueller@gmx.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230728151812.454806-2-georgmueller@gmx.net Link: https://lore.kernel.org/r/CAP-5=fUP6UuLgRty3t2=fQsQi3k4hDMz415vWdp1x88QMvZ8ug@mail.gmail.com/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Yuanjun Gong	5ef5b6e9c1	net: dsa: fix value check in bcm_sf2_sw_probe() [ Upstream commit `dadc5b86cc` ] in bcm_sf2_sw_probe(), check the return value of clk_prepare_enable() and return the error code if clk_prepare_enable() returns an unexpected value. Fixes: `e9ec5c3bd2` ("net: dsa: bcm_sf2: request and handle clocks") Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20230726170506.16547-1-ruc_gongyuanjun@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:13 +02:00
Lin Ma	8dfac8071d	rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length [ Upstream commit `d73ef2d69c` ] There are totally 9 ndo_bridge_setlink handlers in the current kernel, which are 1) bnxt_bridge_setlink, 2) be_ndo_bridge_setlink 3) i40e_ndo_bridge_setlink 4) ice_bridge_setlink 5) ixgbe_ndo_bridge_setlink 6) mlx5e_bridge_setlink 7) nfp_net_bridge_setlink 8) qeth_l2_bridge_setlink 9) br_setlink. By investigating the code, we find that 1-7 parse and use nlattr IFLA_BRIDGE_MODE but 3 and 4 forget to do the nla_len check. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as a 2 byte integer. To avoid such issues, also for other ndo_bridge_setlink handlers in the future. This patch adds the nla_len check in rtnl_bridge_setlink and does an early error return if length mismatches. To make it works, the break is removed from the parsing for IFLA_BRIDGE_FLAGS to make sure this nla_for_each_nested iterates every attribute. Fixes: `b1edc14a3f` ("ice: Implement ice_bridge_getlink and ice_bridge_setlink") Fixes: `51616018dd` ("i40e: Add support for getlink, setlink ndo ops") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Lin Ma <linma@zju.edu.cn> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20230726075314.1059224-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Lin Ma	24772cc31f	bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing [ Upstream commit `bcc29b7f5a` ] The nla_for_each_nested parsing in function bpf_sk_storage_diag_alloc does not check the length of the nested attribute. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as a 4 byte integer. This patch adds an additional check when the nlattr is getting counted. This makes sure the latter nla_get_u32 can access the attributes with the correct length. Fixes: `1ed4d92458` ("bpf: INET_DIAG support in bpf_sk_storage") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230725023330.422856-1-linma@zju.edu.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Jianbo Liu	d628ba98eb	net/mlx5e: Move representor neigh cleanup to profile cleanup_tx [ Upstream commit `d03b6e6f31` ] For IP tunnel encapsulation in ECMP (Equal-Cost Multipath) mode, as the flow is duplicated to the peer eswitch, the related neighbour information on the peer uplink representor is created as well. In the cited commit, eswitch devcom unpair is moved to uplink unload API, specifically the profile->cleanup_tx. If there is a encap rule offloaded in ECMP mode, when one eswitch does unpair (because of unloading the driver, for instance), and the peer rule from the peer eswitch is going to be deleted, the use-after-free error is triggered while accessing neigh info, as it is already cleaned up in uplink's profile->disable, which is before its profile->cleanup_tx. To fix this issue, move the neigh cleanup to profile's cleanup_tx callback, and after mlx5e_cleanup_uplink_rep_tx is called. The neigh init is moved to init_tx for symmeter. [ 2453.376299] BUG: KASAN: slab-use-after-free in mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.379125] Read of size 4 at addr ffff888127af9008 by task modprobe/2496 [ 2453.381542] CPU: 7 PID: 2496 Comm: modprobe Tainted: G B 6.4.0-rc7+ #15 [ 2453.383386] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 2453.384335] Call Trace: [ 2453.384625] <TASK> [ 2453.384891] dump_stack_lvl+0x33/0x50 [ 2453.385285] print_report+0xc2/0x610 [ 2453.385667] ? __virt_addr_valid+0xb1/0x130 [ 2453.386091] ? mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.386757] kasan_report+0xae/0xe0 [ 2453.387123] ? mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.387798] mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.388465] mlx5e_rep_encap_entry_detach+0xa6/0xe0 [mlx5_core] [ 2453.389111] mlx5e_encap_dealloc+0xa7/0x100 [mlx5_core] [ 2453.389706] mlx5e_tc_tun_encap_dests_unset+0x61/0xb0 [mlx5_core] [ 2453.390361] mlx5_free_flow_attr_actions+0x11e/0x340 [mlx5_core] [ 2453.391015] ? complete_all+0x43/0xd0 [ 2453.391398] ? free_flow_post_acts+0x38/0x120 [mlx5_core] [ 2453.392004] mlx5e_tc_del_fdb_flow+0x4ae/0x690 [mlx5_core] [ 2453.392618] mlx5e_tc_del_fdb_peers_flow+0x308/0x370 [mlx5_core] [ 2453.393276] mlx5e_tc_clean_fdb_peer_flows+0xf5/0x140 [mlx5_core] [ 2453.393925] mlx5_esw_offloads_unpair+0x86/0x540 [mlx5_core] [ 2453.394546] ? mlx5_esw_offloads_set_ns_peer.isra.0+0x180/0x180 [mlx5_core] [ 2453.395268] ? down_write+0xaa/0x100 [ 2453.395652] mlx5_esw_offloads_devcom_event+0x203/0x530 [mlx5_core] [ 2453.396317] mlx5_devcom_send_event+0xbb/0x190 [mlx5_core] [ 2453.396917] mlx5_esw_offloads_devcom_cleanup+0xb0/0xd0 [mlx5_core] [ 2453.397582] mlx5e_tc_esw_cleanup+0x42/0x120 [mlx5_core] [ 2453.398182] mlx5e_rep_tc_cleanup+0x15/0x30 [mlx5_core] [ 2453.398768] mlx5e_cleanup_rep_tx+0x6c/0x80 [mlx5_core] [ 2453.399367] mlx5e_detach_netdev+0xee/0x120 [mlx5_core] [ 2453.399957] mlx5e_netdev_change_profile+0x84/0x170 [mlx5_core] [ 2453.400598] mlx5e_vport_rep_unload+0xe0/0xf0 [mlx5_core] [ 2453.403781] mlx5_eswitch_unregister_vport_reps+0x15e/0x190 [mlx5_core] [ 2453.404479] ? mlx5_eswitch_register_vport_reps+0x200/0x200 [mlx5_core] [ 2453.405170] ? up_write+0x39/0x60 [ 2453.405529] ? kernfs_remove_by_name_ns+0xb7/0xe0 [ 2453.405985] auxiliary_bus_remove+0x2e/0x40 [ 2453.406405] device_release_driver_internal+0x243/0x2d0 [ 2453.406900] ? kobject_put+0x42/0x2d0 [ 2453.407284] bus_remove_device+0x128/0x1d0 [ 2453.407687] device_del+0x240/0x550 [ 2453.408053] ? waiting_for_supplier_show+0xe0/0xe0 [ 2453.408511] ? kobject_put+0xfa/0x2d0 [ 2453.408889] ? __kmem_cache_free+0x14d/0x280 [ 2453.409310] mlx5_rescan_drivers_locked.part.0+0xcd/0x2b0 [mlx5_core] [ 2453.409973] mlx5_unregister_device+0x40/0x50 [mlx5_core] [ 2453.410561] mlx5_uninit_one+0x3d/0x110 [mlx5_core] [ 2453.411111] remove_one+0x89/0x130 [mlx5_core] [ 2453.411628] pci_device_remove+0x59/0xf0 [ 2453.412026] device_release_driver_internal+0x243/0x2d0 [ 2453.412511] ? parse_option_str+0x14/0x90 [ 2453.412915] driver_detach+0x7b/0xf0 [ 2453.413289] bus_remove_driver+0xb5/0x160 [ 2453.413685] pci_unregister_driver+0x3f/0xf0 [ 2453.414104] mlx5_cleanup+0xc/0x20 [mlx5_core] Fixes: `2be5bd42a5` ("net/mlx5: Handle pairing of E-switch via uplink un/load APIs") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Amir Tzin	94a0eb9c12	net/mlx5e: Fix crash moving to switchdev mode when ntuple offload is set [ Upstream commit `3ec43c1b08` ] Moving to switchdev mode with ntuple offload on causes the kernel to crash since fs->arfs is freed during nic profile cleanup flow. Ntuple offload is not supported in switchdev mode and it is already unset by mlx5 fix feature ndo in switchdev mode. Verify fs->arfs is valid before disabling it. trace: [] RIP: 0010:_raw_spin_lock_bh+0x17/0x30 [] arfs_del_rules+0x44/0x1a0 [mlx5_core] [] mlx5e_arfs_disable+0xe/0x20 [mlx5_core] [] mlx5e_handle_feature+0x3d/0xb0 [mlx5_core] [] ? __rtnl_unlock+0x25/0x50 [] mlx5e_set_features+0xfe/0x160 [mlx5_core] [] __netdev_update_features+0x278/0xa50 [] ? netdev_run_todo+0x5e/0x2a0 [] netdev_update_features+0x22/0x70 [] ? _cond_resched+0x15/0x30 [] mlx5e_attach_netdev+0x12a/0x1e0 [mlx5_core] [] mlx5e_netdev_attach_profile+0xa1/0xc0 [mlx5_core] [] mlx5e_netdev_change_profile+0x77/0xe0 [mlx5_core] [] mlx5e_vport_rep_load+0x1ed/0x290 [mlx5_core] [] mlx5_esw_offloads_rep_load+0x88/0xd0 [mlx5_core] [] esw_offloads_load_rep.part.38+0x31/0x50 [mlx5_core] [] esw_offloads_enable+0x6c5/0x710 [mlx5_core] [] mlx5_eswitch_enable_locked+0x1bb/0x290 [mlx5_core] [] mlx5_devlink_eswitch_mode_set+0x14f/0x320 [mlx5_core] [] devlink_nl_cmd_eswitch_set_doit+0x94/0x120 [] genl_family_rcv_msg_doit.isra.17+0x113/0x150 [] genl_family_rcv_msg+0xb7/0x170 [] ? devlink_nl_cmd_port_split_doit+0x100/0x100 [] genl_rcv_msg+0x47/0xa0 [] ? genl_family_rcv_msg+0x170/0x170 [] netlink_rcv_skb+0x4c/0x130 [] genl_rcv+0x24/0x40 [] netlink_unicast+0x19a/0x230 [] netlink_sendmsg+0x204/0x3d0 [] sock_sendmsg+0x50/0x60 Fixes: `90b22b9bcd` ("net/mlx5e: Disable Rx ntuple offload for uplink representor") Signed-off-by: Amir Tzin <amirtz@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Yuanjun Gong	a7b5f00100	net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer() [ Upstream commit `e5bcb7564d` ] mlx5e_ipsec_remove_trailer() should return an error code if function pskb_trim() returns an unexpected value. Fixes: `2ac9cfe782` ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Zhengchao Shao	0582a3caaa	net/mlx5: fix potential memory leak in mlx5e_init_rep_rx [ Upstream commit `c6cf0b6097` ] The memory pointed to by the priv->rx_res pointer is not freed in the error path of mlx5e_init_rep_rx, which can lead to a memory leak. Fix by freeing the memory in the error path, thereby making the error path identical to mlx5e_cleanup_rep_rx(). Fixes: `af8bbf7300` ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Zhengchao Shao	3169c38543	net/mlx5: DR, fix memory leak in mlx5dr_cmd_create_reformat_ctx [ Upstream commit `5dd77585dd` ] when mlx5_cmd_exec failed in mlx5dr_cmd_create_reformat_ctx, the memory pointed by 'in' is not released, which will cause memory leak. Move memory release after mlx5_cmd_exec. Fixes: `1d9186476e` ("net/mlx5: DR, Add direct rule command utilities") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:12 +02:00
Zhengchao Shao	c818fff3b6	net/mlx5e: fix double free in macsec_fs_tx_create_crypto_table_groups [ Upstream commit `aeb660171b` ] In function macsec_fs_tx_create_crypto_table_groups(), when the ft->g memory is successfully allocated but the 'in' memory fails to be allocated, the memory pointed to by ft->g is released once. And in function macsec_fs_tx_create(), macsec_fs_tx_destroy() is called to release the memory pointed to by ft->g again. This will cause double free problem. Fixes: `e467b283ff` ("net/mlx5e: Add MACsec TX steering rules") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:11 +02:00
Ilan Peer	7a6fad03f5	wifi: cfg80211: Fix return value in scan logic [ Upstream commit `fd7f08d92f` ] The reporter noticed a warning when running iwlwifi: WARNING: CPU: 8 PID: 659 at mm/page_alloc.c:4453 __alloc_pages+0x329/0x340 As cfg80211_parse_colocated_ap() is not expected to return a negative value return 0 and not a negative value if cfg80211_calc_short_ssid() fails. Fixes: `c8cb5b854b` ("nl80211/cfg80211: support 6 GHz scanning") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217675 Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230723201043.3007430-1-ilan.peer@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:11 +02:00
Gao Xiang	05e0952ddb	erofs: fix wrong primary bvec selection on deduplicated extents [ Upstream commit `94c43de735` ] When handling deduplicated compressed data, there can be multiple decompressed extents pointing to the same compressed data in one shot. In such cases, the bvecs which belong to the longest extent will be selected as the primary bvecs for real decompressors to decode and the other duplicated bvecs will be directly copied from the primary bvecs. Previously, only relative offsets of the longest extent were checked to decompress the primary bvecs. On rare occasions, it can be incorrect if there are several extents with the same start relative offset. As a result, some short bvecs could be selected for decompression and then cause data corruption. For example, as Shijie Sun reported off-list, considering the following extents of a file: 117: 903345.. 915250 \| 11905 : 385024.. 389120 \| 4096 ... 119: 919729.. 930323 \| 10594 : 385024.. 389120 \| 4096 ... 124: 968881.. 980786 \| 11905 : 385024.. 389120 \| 4096 The start relative offset is the same: 2225, but extent 119 (919729.. 930323) is shorter than the others. Let's restrict the bvec length in addition to the start offset if bvecs are not full. Reported-by: Shijie Sun <sunshijie@xiaomi.com> Fixes: `5c2a64252c` ("erofs: introduce partial-referenced pclusters") Tested-by Shijie Sun <sunshijie@xiaomi.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230719065459.60083-1-hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:11 +02:00
Heiko Carstens	a759972d25	KVM: s390: fix sthyi error handling [ Upstream commit `0c02cc576e` ] Commit `9fb6c9b3fe` ("s390/sthyi: add cache to store hypervisor info") added cache handling for store hypervisor info. This also changed the possible return code for sthyi_fill(). Instead of only returning a condition code like the sthyi instruction would do, it can now also return a negative error value (-ENOMEM). handle_styhi() was not changed accordingly. In case of an error, the negative error value would incorrectly injected into the guest PSW. Add proper error handling to prevent this, and update the comment which describes the possible return values of sthyi_fill(). Fixes: `9fb6c9b3fe` ("s390/sthyi: add cache to store hypervisor info") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20230727182939.2050744-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:11 +02:00
ndesaulniers@google.com	f168188174	word-at-a-time: use the same return type for has_zero regardless of endianness [ Upstream commit `79e8328e5a` ] Compiling big-endian targets with Clang produces the diagnostic: fs/namei.c:2173:13: warning: use of bitwise '\|' with boolean operands [-Wbitwise-instead-of-logical] } while (!(has_zero(a, &adata, &constants) \| has_zero(b, &bdata, &constants))); ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \|\| fs/namei.c:2173:13: note: cast one or both operands to int to silence this warning It appears that when has_zero was introduced, two definitions were produced with different signatures (in particular different return types). Looking at the usage in hash_name() in fs/namei.c, I suspect that has_zero() is meant to be invoked twice per while loop iteration; using logical-or would not update `bdata` when `a` did not have zeros. So I think it's preferred to always return an unsigned long rather than a bool than update the while loop in hash_name() to use a logical-or rather than bitwise-or. [ Also changed powerpc version to do the same - Linus ] Link: https://github.com/ClangBuiltLinux/linux/issues/1832 Link: https://lore.kernel.org/lkml/20230801-bitwise-v1-1-799bec468dc4@google.com/ Fixes: `36126f8f2e` ("word-at-a-time: make the interfaces truly generic") Debugged-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:11 +02:00
Cristian Marussi	5b53b2b44f	firmware: arm_scmi: Fix chan_free cleanup on SMC [ Upstream commit `d1ff11d7ad` ] SCMI transport based on SMC can optionally use an additional IRQ to signal message completion. The associated interrupt handler is currently allocated using devres but on shutdown the core SCMI stack will call .chan_free() well before any managed cleanup is invoked by devres. As a consequence, the arrival of a late reply to an in-flight pending transaction could still trigger the interrupt handler well after the SCMI core has cleaned up the channels, with unpleasant results. Inhibit further message processing on the IRQ path by explicitly freeing the IRQ inside .chan_free() callback itself. Fixes: `dd820ee21d` ("firmware: arm_scmi: Augment SMC/HVC to allow optional interrupt") Reported-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20230719173533.2739319-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:11 +02:00
Yury Norov	6289d5486d	lib/bitmap: workaround const_eval test build failure [ Upstream commit `2356d198d2` ] When building with Clang, and when KASAN and GCOV_PROFILE_ALL are both enabled, the test fails to build [1]: >> lib/test_bitmap.c:920:2: error: call to '__compiletime_assert_239' declared with 'error' attribute: BUILD_BUG_ON failed: !__builtin_constant_p(res) BUILD_BUG_ON(!__builtin_constant_p(res)); ^ include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON' BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^ include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/compiler_types.h:352:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:185:1: note: expanded from here __compiletime_assert_239 Originally it was attributed to s390, which now looks seemingly wrong. The issue is not related to bitmap code itself, but it breaks build for a given configuration. Disabling the const_eval test under that config may potentially hide other bugs. Instead, workaround it by disabling GCOV for the test_bitmap unless the compiler will get fixed. [1] https://github.com/ClangBuiltLinux/linux/issues/1874 Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307171254.yFcH97ej-lkp@intel.com/ Fixes: `dc34d50366` ("lib: test_bitmap: add compile-time optimization/evaluations assertions") Co-developed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Punit Agrawal	0ca5de8309	firmware: smccc: Fix use of uninitialised results structure [ Upstream commit `d05799d7b4` ] Commit `35727af2b1` ("irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4") moved the initialisation of the SoC version to arm_smccc_version_init() but forgot to update the results structure and it's usage. Fix the use of the uninitialised results structure and update the error strings. Fixes: `35727af2b1` ("irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4") Signed-off-by: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Vikram Sethi <vsethi@nvidia.com> Cc: Shanker Donthineni <sdonthineni@nvidia.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230717171702.424253-1-punit.agrawal@bytedance.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Benjamin Gaignard	7b0582dddd	arm64: dts: freescale: Fix VPU G2 clock [ Upstream commit `b27bfc5103` ] Set VPU G2 clock to 300MHz like described in documentation. This fixes pixels error occurring with large resolution ( >= 2560x1600) HEVC test stream when using the postprocessor to produce NV12. Fixes: `4ac7e4a812` ("arm64: dts: imx8mq: Enable both G1 and G2 VPU's with vpu-blk-ctrl") Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Hugo Villeneuve	5841d3d0c3	arm64: dts: imx8mn-var-som: add missing pull-up for onboard PHY reset pinmux [ Upstream commit `253be5b53c` ] For SOMs with an onboard PHY, the RESET_N pull-up resistor is currently deactivated in the pinmux configuration. When the pinmux code selects the GPIO function for this pin, with a default direction of input, this prevents the RESET_N pin from being taken to the proper 3.3V level (deasserted), and this results in the PHY being not detected since it is held in reset. Taken from RESET_N pin description in ADIN13000 datasheet: This pin requires a 1K pull-up resistor to AVDD_3P3. Activate the pull-up resistor to fix the issue. Fixes: `ade0176dd8` ("arm64: dts: imx8mn-var-som: Add Variscite VAR-SOM-MX8MN System on Module") Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Yashwanth Varakala	a24f67b71a	arm64: dts: phycore-imx8mm: Correction in gpio-line-names [ Upstream commit `1ef0aa137a` ] Remove unused nINT_ETHPHY entry from gpio-line-names in gpio1 nodes of phyCORE-i.MX8MM and phyBOARD-Polis-i.MX8MM devicetrees. Fixes: `ae6847f26a` ("arm64: dts: freescale: Add phyBOARD-Polis-i.MX8MM support") Signed-off-by: Yashwanth Varakala <y.varakala@phytec.de> Signed-off-by: Cem Tenruh <c.tenruh@phytec.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Yashwanth Varakala	753a927c58	arm64: dts: phycore-imx8mm: Label typo-fix of VPU [ Upstream commit `cddeefc166` ] Corrected the label of the VPU regulator node (buck 3) from reg_vdd_gpu to reg_vdd_vpu. Fixes: `ae6847f26a` ("arm64: dts: freescale: Add phyBOARD-Polis-i.MX8MM support") Signed-off-by: Yashwanth Varakala <y.varakala@phytec.de> Signed-off-by: Cem Tenruh <c.tenruh@phytec.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Tim Harvey	608ac7ea5f	arm64: dts: imx8mm-venice-gw7904: disable disp_blk_ctrl [ Upstream commit `f7a0b57524` ] The GW7904 does not connect the VDD_MIPI power rails thus MIPI is disabled. However we must also disable disp_blk_ctrl as it uses the pgc_mipi power domain and without it being disabled imx8m-blk-ctrl will fail to probe: imx8m-blk-ctrl 32e28000.blk-ctrl: error -ETIMEDOUT: failed to attach power domain "mipi-dsi" imx8m-blk-ctrl: probe of 32e28000.blk-ctrl failed with error -110 Fixes: `b999bdaf05` ("arm64: dts: imx: Add i.mx8mm Gateworks gw7904 dts support") Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Tim Harvey	d060bbb2fe	arm64: dts: imx8mm-venice-gw7903: disable disp_blk_ctrl [ Upstream commit `3e7d3c5e13` ] The GW7903 does not connect the VDD_MIPI power rails thus MIPI is disabled. However we must also disable disp_blk_ctrl as it uses the pgc_mipi power domain and without it being disabled imx8m-blk-ctrl will fail to probe: imx8m-blk-ctrl 32e28000.blk-ctrl: error -ETIMEDOUT: failed to attach power domain "mipi-dsi" imx8m-blk-ctrl: probe of 32e28000.blk-ctrl failed with error -110 Fixes: `a72ba91e5b` ("arm64: dts: imx: Add i.mx8mm Gateworks gw7903 dts support") Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-08-11 12:08:10 +02:00
Robin Murphy	8ddb3183c4	iommu/arm-smmu-v3: Document nesting-related errata commit `0bfbfc526c` upstream Both MMU-600 and MMU-700 have similar errata around TLB invalidation while both stages of translation are active, which will need some consideration once nesting support is implemented. For now, though, it's very easy to make our implicit lack of nesting support explicit for those cases, so they're less likely to be missed in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/696da78d32bb4491f898f11b0bb4d850a8aa7c6a.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Robin Murphy	42d04acf1d	iommu/arm-smmu-v3: Add explicit feature for nesting commit `1d9777b9f3` upstream In certain cases we may want to refuse to allow nested translation even when both stages are implemented, so let's add an explicit feature for nesting support which we can control in its own right. For now this merely serves as documentation, but it means a nice convenient check will be ready and waiting for the future nesting code. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/136c3f4a3a84cc14a5a1978ace57dfd3ed67b688.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Robin Murphy	57ae3671ec	iommu/arm-smmu-v3: Document MMU-700 erratum 2812531 commit `309a15cb16` upstream To work around MMU-700 erratum 2812531 we need to ensure that certain sequences of commands cannot be issued without an intervening sync. In practice this falls out of our current command-batching machinery anyway - each batch only contains a single type of invalidation command, and ends with a sync. The only exception is when a batch is sufficiently large to need issuing across multiple command queue slots, wherein the earlier slots will not contain a sync and thus may in theory interleave with another batch being issued in parallel to create an affected sequence across the slot boundary. Since MMU-700 supports range invalidate commands and thus we will prefer to use them (which also happens to avoid conditions for other errata), I'm not entirely sure it's even possible for a single high-level invalidate call to generate a batch of more than 63 commands, but for the sake of robustness and documentation, wire up an option to enforce that a sync is always inserted for every slot issued. The other aspect is that the relative order of DVM commands cannot be controlled, so DVM cannot be used. Again that is already the status quo, but since we have at least defined ARM_SMMU_FEAT_BTM, we can explicitly disable it for documentation purposes even if it's not wired up anywhere yet. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/330221cdfd0003cd51b6c04e7ff3566741ad8374.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Robin Murphy	e3399bd014	iommu/arm-smmu-v3: Work around MMU-600 erratum 1076982 commit `f322e8af35` upstream MMU-600 versions prior to r1p0 fail to correctly generate a WFE wakeup event when the command queue transitions fom full to non-full. We can easily work around this by simply hiding the SEV capability such that we fall back to polling for space in the queue - since MMU-600 implements MSIs we wouldn't expect to need SEV for sync completion either, so this should have little to no impact. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/08adbe3d01024d8382a478325f73b56851f76e49.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Alex Elder	50c24f0c94	net: ipa: only reset hashed tables when supported commit `e11ec2b868` upstream. Last year, the code that manages GSI channel transactions switched from using spinlock-protected linked lists to using indexes into the ring buffer used for a channel. Recently, Google reported seeing transaction reference count underflows occasionally during shutdown. Doug Anderson found a way to reproduce the issue reliably, and bisected the issue to the commit that eliminated the linked lists and the lock. The root cause was ultimately determined to be related to unused transactions being committed as part of the modem shutdown cleanup activity. Unused transactions are not normally expected (except in error cases). The modem uses some ranges of IPA-resident memory, and whenever it shuts down we zero those ranges. In ipa_filter_reset_table() a transaction is allocated to zero modem filter table entries. If hashing is not supported, hashed table memory should not be zeroed. But currently nothing prevents that, and the result is an unused transaction. Something similar occurs when we zero routing table entries for the modem. By preventing any attempt to clear hashed tables when hashing is not supported, the reference count underflow is avoided in this case. Note that there likely remains an issue with properly freeing unused transactions (if they occur due to errors). This patch addresses only the underflows that Google originally reported. Cc: <stable@vger.kernel.org> # 6.1.x Fixes: `d338ae28d8` ("net: ipa: kill all other transaction lists") Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20230724224055.1688854-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Shay Drory	93f5b88112	net/mlx5: Free irqs only on shutdown callback commit `9c2d080109` upstream. Whenever a shutdown is invoked, free irqs only and keep mlx5_irq synthetic wrapper intact in order to avoid use-after-free on system shutdown. for example: ================================================================== BUG: KASAN: use-after-free in _find_first_bit+0x66/0x80 Read of size 8 at addr ffff88823fc0d318 by task kworker/u192:0/13608 CPU: 25 PID: 13608 Comm: kworker/u192:0 Tainted: G B W O 6.1.21-cloudflare-kasan-2023.3.21 #1 Hardware name: GIGABYTE R162-R2-GEN0/MZ12-HD2-CD, BIOS R14 05/03/2021 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core] Call Trace: <TASK> dump_stack_lvl+0x34/0x48 print_report+0x170/0x473 ? _find_first_bit+0x66/0x80 kasan_report+0xad/0x130 ? _find_first_bit+0x66/0x80 _find_first_bit+0x66/0x80 mlx5e_open_channels+0x3c5/0x3a10 [mlx5_core] ? console_unlock+0x2fa/0x430 ? _raw_spin_lock_irqsave+0x8d/0xf0 ? _raw_spin_unlock_irqrestore+0x42/0x80 ? preempt_count_add+0x7d/0x150 ? __wake_up_klogd.part.0+0x7d/0xc0 ? vprintk_emit+0xfe/0x2c0 ? mlx5e_trigger_napi_sched+0x40/0x40 [mlx5_core] ? dev_attr_show.cold+0x35/0x35 ? devlink_health_do_dump.part.0+0x174/0x340 ? devlink_health_report+0x504/0x810 ? mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] ? process_one_work+0x680/0x1050 mlx5e_safe_switch_params+0x156/0x220 [mlx5_core] ? mlx5e_switch_priv_channels+0x310/0x310 [mlx5_core] ? mlx5_eq_poll_irq_disabled+0xb6/0x100 [mlx5_core] mlx5e_tx_reporter_timeout_recover+0x123/0x240 [mlx5_core] ? __mutex_unlock_slowpath.constprop.0+0x2b0/0x2b0 devlink_health_reporter_recover+0xa6/0x1f0 devlink_health_report+0x2f7/0x810 ? vsnprintf+0x854/0x15e0 mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_reporter_tx_err_cqe+0x1a0/0x1a0 [mlx5_core] ? mlx5e_tx_reporter_timeout_dump+0x50/0x50 [mlx5_core] ? mlx5e_tx_reporter_dump_sq+0x260/0x260 [mlx5_core] ? newidle_balance+0x9b7/0xe30 ? psi_group_change+0x6a7/0xb80 ? mutex_lock+0x96/0xf0 ? __mutex_lock_slowpath+0x10/0x10 mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] process_one_work+0x680/0x1050 worker_thread+0x5a0/0xeb0 ? process_one_work+0x1050/0x1050 kthread+0x2a2/0x340 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 </TASK> Freed by task 1: kasan_save_stack+0x23/0x50 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x40 ____kasan_slab_free+0x169/0x1d0 slab_free_freelist_hook+0xd2/0x190 __kmem_cache_free+0x1a1/0x2f0 irq_pool_free+0x138/0x200 [mlx5_core] mlx5_irq_table_destroy+0xf6/0x170 [mlx5_core] mlx5_core_eq_free_irqs+0x74/0xf0 [mlx5_core] shutdown+0x194/0x1aa [mlx5_core] pci_device_shutdown+0x75/0x120 device_shutdown+0x35c/0x620 kernel_restart+0x60/0xa0 __do_sys_reboot+0x1cb/0x2c0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x4b/0xb5 The buggy address belongs to the object at ffff88823fc0d300 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 24 bytes inside of 192-byte region [ffff88823fc0d300, ffff88823fc0d3c0) The buggy address belongs to the physical page: page:0000000010139587 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x23fc0c head:0000000010139587 order:1 compound_mapcount:0 compound_pincount:0 flags: 0x2ffff800010200(slab\|head\|node=0\|zone=2\|lastcpupid=0x1ffff) raw: 002ffff800010200 0000000000000000 dead000000000122 ffff88810004ca00 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88823fc0d200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88823fc0d280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc >ffff88823fc0d300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88823fc0d380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88823fc0d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== general protection fault, probably for non-canonical address 0xdffffc005c40d7ac: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: probably user-memory-access in range [0x00000002e206bd60-0x00000002e206bd67] CPU: 25 PID: 13608 Comm: kworker/u192:0 Tainted: G B W O 6.1.21-cloudflare-kasan-2023.3.21 #1 Hardware name: GIGABYTE R162-R2-GEN0/MZ12-HD2-CD, BIOS R14 05/03/2021 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core] RIP: 0010:__alloc_pages+0x141/0x5c0 Call Trace: <TASK> ? sysvec_apic_timer_interrupt+0xa0/0xc0 ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? __alloc_pages_slowpath.constprop.0+0x1ec0/0x1ec0 ? _raw_spin_unlock_irqrestore+0x3d/0x80 __kmalloc_large_node+0x80/0x120 ? kvmalloc_node+0x4e/0x170 __kmalloc_node+0xd4/0x150 kvmalloc_node+0x4e/0x170 mlx5e_open_channels+0x631/0x3a10 [mlx5_core] ? console_unlock+0x2fa/0x430 ? _raw_spin_lock_irqsave+0x8d/0xf0 ? _raw_spin_unlock_irqrestore+0x42/0x80 ? preempt_count_add+0x7d/0x150 ? __wake_up_klogd.part.0+0x7d/0xc0 ? vprintk_emit+0xfe/0x2c0 ? mlx5e_trigger_napi_sched+0x40/0x40 [mlx5_core] ? dev_attr_show.cold+0x35/0x35 ? devlink_health_do_dump.part.0+0x174/0x340 ? devlink_health_report+0x504/0x810 ? mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] ? process_one_work+0x680/0x1050 mlx5e_safe_switch_params+0x156/0x220 [mlx5_core] ? mlx5e_switch_priv_channels+0x310/0x310 [mlx5_core] ? mlx5_eq_poll_irq_disabled+0xb6/0x100 [mlx5_core] mlx5e_tx_reporter_timeout_recover+0x123/0x240 [mlx5_core] ? __mutex_unlock_slowpath.constprop.0+0x2b0/0x2b0 devlink_health_reporter_recover+0xa6/0x1f0 devlink_health_report+0x2f7/0x810 ? vsnprintf+0x854/0x15e0 mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_reporter_tx_err_cqe+0x1a0/0x1a0 [mlx5_core] ? mlx5e_tx_reporter_timeout_dump+0x50/0x50 [mlx5_core] ? mlx5e_tx_reporter_dump_sq+0x260/0x260 [mlx5_core] ? newidle_balance+0x9b7/0xe30 ? psi_group_change+0x6a7/0xb80 ? mutex_lock+0x96/0xf0 ? __mutex_lock_slowpath+0x10/0x10 mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] process_one_work+0x680/0x1050 worker_thread+0x5a0/0xeb0 ? process_one_work+0x1050/0x1050 kthread+0x2a2/0x340 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 </TASK> ---[ end trace 0000000000000000 ]--- RIP: 0010:__alloc_pages+0x141/0x5c0 Code: e0 39 a3 96 89 e9 b8 22 01 32 01 83 e1 0f 48 89 fa 01 c9 48 c1 ea 03 d3 f8 83 e0 03 89 44 24 6c 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 0f 85 fc 03 00 00 89 e8 4a 8b 14 f5 e0 39 a3 96 4c 89 RSP: 0018:ffff888251f0f438 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 1ffff1104a3e1e8b RCX: 0000000000000000 RDX: 000000005c40d7ac RSI: 0000000000000003 RDI: 00000002e206bd60 RBP: 0000000000052dc0 R08: ffff8882b0044218 R09: ffff8882b0045e8a R10: fffffbfff300fefc R11: ffff888167af4000 R12: 0000000000000003 R13: 0000000000000000 R14: 00000000696c7070 R15: ffff8882373f4380 FS: 0000000000000000(0000) GS:ffff88bf2be80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005641d031eee8 CR3: 0000002e7ca14000 CR4: 0000000000350ee0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]---] Reported-by: Frederick Lawler <fred@cloudflare.com> Link: https://lore.kernel.org/netdev/be5b9271-7507-19c5-ded1-fa78f1980e69@cloudflare.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> [hardik: Refer to the irqn member of the mlx5_irq struct, instead of the msi_map, since we don't have upstream v6.4 commit `235a25fe28` ("net/mlx5: Modify struct mlx5_irq to use struct msi_map")]. [hardik: Refer to the pf_pool member of the mlx5_irq_table struct, instead of pcif_pool, since we don't have upstream v6.4 commit `8bebfd7679` ("net/mlx5: Improve naming of pci function vectors")]. Signed-off-by: Hardik Garg <hargar@linux.microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Peter Zijlstra	15c22cd1de	perf: Fix function pointer case commit `1af6239d1d` upstream. With the advent of CFI it is no longer acceptible to cast function pointers. The robot complains thusly: kernel-events-core.c:warning:cast-from-int-(-)(struct-perf_cpu_pmu_context-)-to-remote_function_f-(aka-int-(-)(void-)-)-converts-to-incompatible-function-type Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Cixi Geng <cixi.geng1@unisoc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:09 +02:00
Jens Axboe	c7920f9928	io_uring: gate iowait schedule on having pending requests Commit `7b72d661f1` upstream. A previous commit made all cqring waits marked as iowait, as a way to improve performance for short schedules with pending IO. However, for use cases that have a special reaper thread that does nothing but wait on events on the ring, this causes a cosmetic issue where we know have one core marked as being "busy" with 100% iowait. While this isn't a grave issue, it is confusing to users. Rather than always mark us as being in iowait, gate setting of current->in_iowait to 1 by whether or not the waiting task has pending requests. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/io-uring/CAMEGJJ2RxopfNQ7GNLhr7X9=bHXKo+G5OOe0LUq=+UgLXsv1Xg@mail.gmail.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=217699 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217700 Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reported-by: Phil Elwell <phil@raspberrypi.com> Tested-by: Andres Freund <andres@anarazel.de> Fixes: `8a796565ce` ("io_uring: Use io_schedule* in cqring wait") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:08 +02:00
Greg Kroah-Hartman	0a4a785530	Linux 6.1.44 Tested-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:51 +02:00
Greg Kroah-Hartman	dd5f2ef16e	x86: fix backwards merge of GDS/SRSO bit Stable-tree-only change. Due to the way the GDS and SRSO patches flowed into the stable tree, it was a 50% chance that the merge of the which value GDS and SRSO should be. Of course, I lost that bet, and chose the opposite of what Linus chose in commit `64094e7e31` ("Merge tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") Fix this up by switching the values to match what is now in Linus's tree as that is the correct value to mirror. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:51 +02:00
Ross Lagerwall	fa5b932b77	xen/netback: Fix buffer overrun triggered by unusual packet commit `534fc31d09` upstream. It is possible that a guest can send a packet that contains a head + 18 slots and yet has a len <= XEN_NETBACK_TX_COPY_LEN. This causes nr_slots to underflow in xenvif_get_requests() which then causes the subsequent loop's termination condition to be wrong, causing a buffer overrun of queue->tx_map_ops. Rework the code to account for the extra frag_overflow slots. This is CVE-2023-34319 / XSA-432. Fixes: `ad7f402ae4` ("xen/netback: Ensure protocol headers don't fall in the non-linear area") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:51 +02:00
Borislav Petkov (AMD)	4f25355540	x86/srso: Tie SBPB bit setting to microcode patch detection commit `5a15d83488` upstream. The SBPB bit in MSR_IA32_PRED_CMD is supported only after a microcode patch has been applied so set X86_FEATURE_SBPB only then. Otherwise, guests would attempt to set that bit and #GP on the MSR write. While at it, make SMT detection more robust as some guests - depending on how and what CPUID leafs their report - lead to cpu_smt_control getting set to CPU_SMT_NOT_SUPPORTED but SRSO_NO should be set for any guest incarnation where one simply cannot do SMT, for whatever reason. Fixes: `fb3bd914b3` ("x86/srso: Add a Speculative RAS Overflow mitigation") Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:51 +02:00
Borislav Petkov (AMD)	77cf32d0db	x86/srso: Add a forgotten NOENDBR annotation Upstream commit: `3bbbe97ad8` Fix: vmlinux.o: warning: objtool: .export_symbol+0x29e40: data relocation to !ENDBR: srso_untrain_ret_alias+0x0 Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:51 +02:00
Josh Poimboeuf	c7f2cd0455	x86/srso: Fix return thunks in generated code Upstream commit: `238ec850b9` Set X86_FEATURE_RETHUNK when enabling the SRSO mitigation so that generated code (e.g., ftrace, static call, eBPF) generates "jmp __x86_return_thunk" instead of RET. [ bp: Add a comment. ] Fixes: `fb3bd914b3` ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:51 +02:00
Borislav Petkov (AMD)	c9ae63d773	x86/srso: Add IBPB on VMEXIT Upstream commit: `d893832d0e` Add the option to flush IBPB only on VMEXIT in order to protect from malicious guests but one otherwise trusts the software that runs on the hypervisor. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Borislav Petkov (AMD)	79c8091888	x86/srso: Add IBPB Upstream commit: `233d6f68b9` Add the option to mitigate using IBPB on a kernel entry. Pull in the Retbleed alternative so that the IBPB call from there can be used. Also, if Retbleed mitigation is done using IBPB, the same mitigation can and must be used here. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Borislav Petkov (AMD)	98f62883e7	x86/srso: Add SRSO_NO support Upstream commit: `1b5277c0ea` Add support for the CPUID flag which denotes that the CPU is not affected by SRSO. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Borislav Petkov (AMD)	9139f4b6dd	x86/srso: Add IBPB_BRTYPE support Upstream commit: `79113e4060` Add support for the synthetic CPUID flag which "if this bit is 1, it indicates that MSR 49h (PRED_CMD) bit 0 (IBPB) flushes all branch type predictions from the CPU branch predictor." This flag is there so that this capability in guests can be detected easily (otherwise one would have to track microcode revisions which is impossible for guests). It is also needed only for Zen3 and -4. The other two (Zen1 and -2) always flush branch type predictions by default. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Borislav Petkov (AMD)	ac41e90d8d	x86/srso: Add a Speculative RAS Overflow mitigation Upstream commit: `fb3bd914b3` Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors. The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence. To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias() which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to Retbleed one: srso_untrain_ret() and srso_safe_ret(). Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Kim Phillips	dec3b91f2c	x86/cpu, kvm: Add support for CPUID_80000021_EAX commit `8415a74852` upstream. Add support for CPUID leaf 80000021, EAX. The majority of the features will be used in the kernel and thus a separate leaf is appropriate. Include KVM's reverse_cpuid entry because features are used by VM guests, too. [ bp: Massage commit message. ] Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-2-kim.phillips@amd.com [bwh: Backported to 6.1: adjust context] Signed-off-by: Ben Hutchings <benh@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Borislav Petkov (AMD)	dfede4cb8e	x86/bugs: Increase the x86 bugs vector size to two u32s Upstream commit: `0e52740ffd` There was never a doubt in my mind that they would not fit into a single u32 eventually. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:50 +02:00
Dave Hansen	dacb0bac2e	Documentation/x86: Fix backwards on/off logic about YMM support commit `1b0fc0345f` upstream These options clearly turn off XSAVE YMM support. Correct the typo. Reported-by: Ben Hutchings <ben@decadent.org.uk> Fixes: `553a5c03e9` ("x86/speculation: Add force option to GDS mitigation") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-08 20:03:49 +02:00

1 2 3 4 5 ...

1147250 Commits