linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-06 02:50:49 +09:00

Author	SHA1	Message	Date
Johannes Berg	fb195ff418	wifi: cfg80211: hold wiphy lock in auto-disconnect [ Upstream commit `e9da6df749` ] Most code paths in cfg80211 already hold the wiphy lock, mostly by virtue of being called from nl80211, so make the auto-disconnect worker also hold it, aligning the locking promises between different parts of cfg80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Stable-dep-of: `37c20b2eff` ("wifi: cfg80211: fix cqm_config access race") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:39 +02:00
Christophe JAILLET	6b3223449c	wifi: iwlwifi: mvm: Fix a memory corruption issue [ Upstream commit `8ba438ef3c` ] A few lines above, space is kzalloc()'ed for: sizeof(struct iwl_nvm_data) + sizeof(struct ieee80211_channel) + sizeof(struct ieee80211_rate) 'mvm->nvm_data' is a 'struct iwl_nvm_data', so it is fine. At the end of this structure, there is the 'channels' flex array. Each element is of type 'struct ieee80211_channel'. So only 1 element is allocated in this array. When doing: mvm->nvm_data->bands[0].channels = mvm->nvm_data->channels; We point at the first element of the 'channels' flex array. So this is fine. However, when doing: mvm->nvm_data->bands[0].bitrates = (void )((u8 )mvm->nvm_data->channels + 1); because of the "(u8 *)" cast, we add only 1 to the address of the beginning of the flex array. It is likely that we want point at the 'struct ieee80211_rate' allocated just after. Remove the spurious casting so that the pointer arithmetic works as expected. Fixes: `8ca151b568` ("iwlwifi: add the MVM driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/23f0ec986ef1529055f4f93dcb3940a6cf8d9a94.1690143750.git.christophe.jaillet@wanadoo.fr Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:39 +02:00
Arnd Bergmann	78b5c62ede	wifi: iwlwifi: dbg_ini: fix structure packing [ Upstream commit `424c82e8ad` ] The iwl_fw_ini_error_dump_range structure has conflicting alignment requirements for the inner union and the outer struct: In file included from drivers/net/wireless/intel/iwlwifi/fw/dbg.c:9: drivers/net/wireless/intel/iwlwifi/fw/error-dump.h:312:2: error: field within 'struct iwl_fw_ini_error_dump_range' is less aligned than 'union iwl_fw_ini_error_dump_range::(anonymous at drivers/net/wireless/intel/iwlwifi/fw/error-dump.h:312:2)' and is usually due to 'struct iwl_fw_ini_error_dump_range' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] union { As the original intention was apparently to make the entire structure unaligned, mark the innermost members the same way so the union becomes packed as well. Fixes: `973193554c` ("iwlwifi: dbg_ini: dump headers cleanup") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230616090343.2454061-1-arnd@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:39 +02:00
Gao Xiang	6a5a8f0a97	erofs: fix memory leak of LZMA global compressed deduplication [ Upstream commit `75a5221630` ] When stressing microLZMA EROFS images with the new global compressed deduplication feature enabled (`-Ededupe`), I found some short-lived temporary pages weren't properly released, which could slowly cause unexpected OOMs hours later. Let's fix it now (LZ4 and DEFLATE don't have this issue.) Fixes: `5c2a64252c` ("erofs: introduce partial-referenced pclusters") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230907050542.97152-1-hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:39 +02:00
Zhihao Cheng	91aeb418b9	ubi: Refuse attaching if mtd's erasesize is 0 [ Upstream commit `017c73a34a` ] There exists mtd devices with zero erasesize, which will trigger a divide-by-zero exception while attaching ubi device. Fix it by refusing attaching if mtd's erasesize is 0. Fixes: `801c135ce7` ("UBI: Unsorted Block Images") Reported-by: Yu Hao <yhao016@ucr.edu> Link: https://lore.kernel.org/lkml/977347543.226888.1682011999468.JavaMail.zimbra@nod.at/T/ Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:39 +02:00
Christophe JAILLET	f237b17611	HID: sony: Fix a potential memory leak in sony_probe() [ Upstream commit `e1cd4004cd` ] If an error occurs after a successful usb_alloc_urb() call, usb_free_urb() should be called. Fixes: `fb1a79a6b6` ("HID: sony: fix freeze when inserting ghlive ps3/wii dongles") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:39 +02:00
Rob Herring	6e3ae2927b	arm64: errata: Add Cortex-A520 speculative unprivileged load workaround commit `471470bc70` upstream. Implement the workaround for ARM Cortex-A520 erratum 2966298. On an affected Cortex-A520 core, a speculatively executed unprivileged load might leak data from a privileged load via a cache side channel. The issue only exists for loads within a translation regime with the same translation (e.g. same ASID and VMID). Therefore, the issue only affects the return to EL0. The workaround is to execute a TLBI before returning to EL0 after all loads of privileged data. A non-shareable TLBI to any address is sufficient. The workaround isn't necessary if page table isolation (KPTI) is enabled, but for simplicity it will be. Page table isolation should normally be disabled for Cortex-A520 as it supports the CSV3 feature and the E0PD feature (used when KASLR is enabled). Cc: stable@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230921194156.1050055-2-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:39 +02:00
Rob Herring	0a4ae26348	arm64: Add Cortex-A520 CPU part definition commit `a654a69b9f` upstream. Add the CPU Part number for the new Arm design. Cc: stable@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230921194156.1050055-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:39 +02:00
Mario Limonciello	d2894c4f47	drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters() commit `2a1fe39a5b` upstream. While aligning SMU11 with SMU13 implementation an assumption was made that `dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization like in SMU13 but it isn't. So restore some of the original logic and instead just check for amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode values; erring on the side of performance. Cc: stable@vger.kernel.org # 6.1+ Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382 Fixes: `e701156ccc` ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:39 +02:00
Mario Limonciello	c8bd3e12b3	drm/amd: Fix detection of _PR3 on the PCIe root port commit `134b8c5d86` upstream. On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:39 +02:00
Jordan Rife	fc8d9630c8	net: prevent rewrite of msg_name in sock_sendmsg() commit `86a7e0b69b` upstream. Callers of sock_sendmsg(), and similarly kernel_sendmsg(), in kernel space may observe their value of msg_name change in cases where BPF sendmsg hooks rewrite the send address. This has been confirmed to break NFS mounts running in UDP mode and has the potential to break other systems. This patch: 1) Creates a new function called __sock_sendmsg() with same logic as the old sock_sendmsg() function. 2) Replaces calls to sock_sendmsg() made by __sys_sendto() and __sys_sendmsg() with __sock_sendmsg() to avoid an unnecessary copy, as these system calls are already protected. 3) Modifies sock_sendmsg() so that it makes a copy of msg_name if present before passing it down the stack to insulate callers from changes to the send address. Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/ Fixes: `1cedee13d2` ("bpf: Hooks for sys_sendmsg") Cc: stable@vger.kernel.org Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jordan Rife <jrife@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:39 +02:00
Jordan Rife	34f9370ae4	net: replace calls to sock->ops->connect() with kernel_connect() commit `26297b4ce1` upstream. commit `0bdf399342` ("net: Avoid address overwrite in kernel_connect") ensured that kernel_connect() will not overwrite the address parameter in cases where BPF connect hooks perform an address rewrite. This change replaces direct calls to sock->ops->connect() in net with kernel_connect() to make these call safe. Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/ Fixes: `d74bad4e74` ("bpf: Hooks for sys_connect") Cc: stable@vger.kernel.org Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jordan Rife <jrife@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Sricharan Ramabadhran	2dfb5f324d	PCI: qcom: Fix IPQ8074 enumeration commit `6a878a54d0` upstream. PARF_SLV_ADDR_SPACE_SIZE_2_3_3 is used by qcom_pcie_post_init_2_3_3(). This PCIe slave address space size register offset is 0x358 but was incorrectly changed to 0x16c by `39171b33f6` ("PCI: qcom: Remove PCIE20_ prefix from register definitions"). This prevented access to slave address space registers like iATU, etc., so the IPQ8074 PCIe controller was not enumerated. Revert back to the correct 0x358 offset and remove the unused PARF_SLV_ADDR_SPACE_SIZE_2_3_3. Fixes: `39171b33f6` ("PCI: qcom: Remove PCIE20_ prefix from register definitions") Link: https://lore.kernel.org/r/20230919102948.1844909-1-quic_srichara@quicinc.com Tested-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Sricharan Ramabadhran <quic_srichara@quicinc.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
David Jeffery	ebf2d9a782	md/raid5: release batch_last before waiting for another stripe_head commit `2fd7b0f6d5` upstream. When raid5_get_active_stripe is called with a ctx containing a stripe_head in its batch_last pointer, it can cause a deadlock if the task sleeps waiting on another stripe_head to become available. The stripe_head held by batch_last can be blocking the advancement of other stripe_heads, leading to no stripe_heads being released so raid5_get_active_stripe waits forever. Like with the quiesce state handling earlier in the function, batch_last needs to be released by raid5_get_active_stripe before it waits for another stripe_head. Fixes: `3312e6c887` ("md/raid5: Keep a reference to last stripe_head for batch") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231002183422.13047-1-djeffery@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Gustavo A. R. Silva	c404d39e77	wifi: mwifiex: Fix tlv_buf_left calculation commit `eec679e4ac` upstream. In a TLV encoding scheme, the Length part represents the length after the header containing the values for type and length. In this case, `tlv_len` should be: tlv_len == (sizeof(tlv_rxba) - 1) - sizeof(tlv_rxba->header) + tlv_bitmap_len Notice that the `- 1` accounts for the one-element array `bitmap`, which 1-byte size is already included in `sizeof(tlv_rxba)`. So, if the above is correct, there is a double-counting of some members in `struct mwifiex_ie_types_rxba_sync`, when `tlv_buf_left` and `tmp` are calculated: 968 tlv_buf_left -= (sizeof(tlv_rxba) + tlv_len); 969 tmp = (u8 )tlv_rxba + tlv_len + sizeof(tlv_rxba); in specific, members: drivers/net/wireless/marvell/mwifiex/fw.h:777 777 u8 mac[ETH_ALEN]; 778 u8 tid; 779 u8 reserved; 780 __le16 seq_num; 781 __le16 bitmap_len; This is clearly wrong, and affects the subsequent decoding of data in `event_buf` through `tlv_rxba`: 970 tlv_rxba = (struct mwifiex_ie_types_rxba_sync )tmp; Fix this by using `sizeof(tlv_rxba->header)` instead of `sizeof(tlv_rxba)` in the calculation of `tlv_buf_left` and `tmp`. This results in the following binary differences before/after changes: \| drivers/net/wireless/marvell/mwifiex/11n_rxreorder.o \| @@ -4698,11 +4698,11 @@ \| drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c:968 \| tlv_buf_left -= (sizeof(tlv_rxba->header) + tlv_len); \| - 1da7: lea -0x11(%rbx),%edx \| + 1da7: lea -0x4(%rbx),%edx \| 1daa: movzwl %bp,%eax \| drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c:969 \| tmp = (u8 )tlv_rxba + sizeof(tlv_rxba->header) + tlv_len; \| - 1dad: lea 0x11(%r15,%rbp,1),%r15 \| + 1dad: lea 0x4(%r15,%rbp,1),%r15 The above reflects the desired change: avoid counting 13 too many bytes; which is the total size of the double-counted members in `struct mwifiex_ie_types_rxba_sync`: $ pahole -C mwifiex_ie_types_rxba_sync drivers/net/wireless/marvell/mwifiex/11n_rxreorder.o struct mwifiex_ie_types_rxba_sync { struct mwifiex_ie_types_header header; /* 0 4 / \|----------------------------------------------------------------------- \| u8 mac[6]; / 4 6 / \| \| u8 tid; / 10 1 / \| \| u8 reserved; / 11 1 / \| \| __le16 seq_num; / 12 2 / \| \| __le16 bitmap_len; / 14 2 / \| \| u8 bitmap[1]; / 16 1 / \| \|----------------------------------------------------------------------\| \| 13 bytes\| ----------- / size: 17, cachelines: 1, members: 7 / / last cacheline: 17 bytes */ } __attribute__((__packed__)); Fixes: `99ffe72cda` ("mwifiex: process rxba_sync event") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/06668edd68e7a26bbfeebd1201ae077a2a7a8bce.1692931954.git.gustavoars@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Luiz Augusto von Dentz	794ae3a9f8	Bluetooth: hci_sync: Fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER commit `941c998b42` upstream. When HCI_QUIRK_STRICT_DUPLICATE_FILTER is set LE scanning requires periodic restarts of the scanning procedure as the controller would consider device previously found as duplicated despite of RSSI changes, but in order to set the scan timeout properly set le_scan_restart needs to be synchronous so it shall not use hci_cmd_sync_queue which defers the command processing to cmd_sync_work. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-bluetooth/578e6d7afd676129decafba846a933f5@agner.ch/#t Fixes: `27d54b778a` ("Bluetooth: Rework le_scan_restart for hci_sync") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Luiz Augusto von Dentz	626535077b	Bluetooth: hci_codec: Fix leaking content of local_codecs commit `b938790e70` upstream. The following memory leak can be observed when the controller supports codecs which are stored in local_codecs list but the elements are never freed: unreferenced object 0xffff88800221d840 (size 32): comm "kworker/u3:0", pid 36, jiffies 4294898739 (age 127.060s) hex dump (first 32 bytes): f8 d3 02 03 80 88 ff ff 80 d8 21 02 80 88 ff ff ..........!..... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffb324f557>] __kmalloc+0x47/0x120 [<ffffffffb39ef37d>] hci_codec_list_add.isra.0+0x2d/0x160 [<ffffffffb39ef643>] hci_read_codec_capabilities+0x183/0x270 [<ffffffffb39ef9ab>] hci_read_supported_codecs+0x1bb/0x2d0 [<ffffffffb39f162e>] hci_read_local_codecs_sync+0x3e/0x60 [<ffffffffb39ff1b3>] hci_dev_open_sync+0x943/0x11e0 [<ffffffffb396d55d>] hci_power_on+0x10d/0x3f0 [<ffffffffb30c99b4>] process_one_work+0x404/0x800 [<ffffffffb30ca134>] worker_thread+0x374/0x670 [<ffffffffb30d9108>] kthread+0x188/0x1c0 [<ffffffffb304db6b>] ret_from_fork+0x2b/0x50 [<ffffffffb300206a>] ret_from_fork_asm+0x1a/0x30 Cc: stable@vger.kernel.org Fixes: `8961987f3f` ("Bluetooth: Enumerate local supported codec and cache details") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Gustavo A. R. Silva	01afbfb395	qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info commit `eea03d18af` upstream. The flexible structure (a structure that contains a flexible-array member at the end) `qed_ll2_tx_packet` is nested within the second layer of `struct qed_ll2_info`: struct qed_ll2_tx_packet { ... /* Flexible Array of bds_set determined by max_bds_per_packet / struct { struct core_tx_bd txq_bd; dma_addr_t tx_frag; u16 frag_len; } bds_set[]; }; struct qed_ll2_tx_queue { ... struct qed_ll2_tx_packet cur_completing_packet; }; struct qed_ll2_info { ... struct qed_ll2_tx_queue tx_queue; struct qed_ll2_cbs cbs; }; The problem is that member `cbs` in `struct qed_ll2_info` is placed just after an object of type `struct qed_ll2_tx_queue`, which is in itself an implicit flexible structure, which by definition ends in a flexible array member, in this case `bds_set`. This causes an undefined behavior bug at run-time when dynamic memory is allocated for `bds_set`, which could lead to a serious issue if `cbs` in `struct qed_ll2_info` is overwritten by the contents of `bds_set`. Notice that the type of `cbs` is a structure full of function pointers (and a cookie :) ): include/linux/qed/qed_ll2_if.h: 107 typedef 108 void (qed_ll2_complete_rx_packet_cb)(void cxt, 109 struct qed_ll2_comp_rx_data data); 110 111 typedef 112 void (qed_ll2_release_rx_packet_cb)(void cxt, 113 u8 connection_handle, 114 void cookie, 115 dma_addr_t rx_buf_addr, 116 bool b_last_packet); 117 118 typedef 119 void (qed_ll2_complete_tx_packet_cb)(void cxt, 120 u8 connection_handle, 121 void cookie, 122 dma_addr_t first_frag_addr, 123 bool b_last_fragment, 124 bool b_last_packet); 125 126 typedef 127 void (qed_ll2_release_tx_packet_cb)(void cxt, 128 u8 connection_handle, 129 void cookie, 130 dma_addr_t first_frag_addr, 131 bool b_last_fragment, bool b_last_packet); 132 133 typedef 134 void (qed_ll2_slowpath_cb)(void cxt, u8 connection_handle, 135 u32 opaque_data_0, u32 opaque_data_1); 136 137 struct qed_ll2_cbs { 138 qed_ll2_complete_rx_packet_cb rx_comp_cb; 139 qed_ll2_release_rx_packet_cb rx_release_cb; 140 qed_ll2_complete_tx_packet_cb tx_comp_cb; 141 qed_ll2_release_tx_packet_cb tx_release_cb; 142 qed_ll2_slowpath_cb slowpath_cb; 143 void *cookie; 144 }; Fix this by moving the declaration of `cbs` to the middle of its containing structure `qed_ll2_info`, preventing it from being overwritten by the contents of `bds_set` at run-time. This bug was introduced in 2017, when `bds_set` was converted to a one-element array, and started to be used as a Variable Length Object (VLO) at run-time. Fixes: `f5823fe689` ("qed: Add ll2 option to limit the number of bds per packet") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/ZQ+Nz8DfPg56pIzr@work Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Geliang Tang	454bb54b8f	mptcp: userspace pm allow creating id 0 subflow commit `e5ed101a60` upstream. This patch drops id 0 limitation in mptcp_nl_cmd_sf_create() to allow creating additional subflows with the local addr ID 0. There is no reason not to allow additional subflows from this local address: we should be able to create new subflows from the initial endpoint. This limitation was breaking fullmesh support from userspace. Fixes: `702c2f646d` ("mptcp: netlink: allow userspace-driven subflow establishment") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391 Cc: stable@vger.kernel.org Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-2-28de4ac663ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Christian Marangi	4674e9626b	net: ethernet: mediatek: disable irq before schedule napi commit `fcdfc46288` upstream. While searching for possible refactor of napi_schedule_prep and __napi_schedule it was notice that the mtk eth driver disable the interrupt for rx and tx AFTER napi is scheduled. While this is a very hard to repro case it might happen to have situation where the interrupt is disabled and never enabled again as the napi completes and the interrupt is enabled before. This is caused by the fact that a napi driven by interrupt expect a logic with: 1. interrupt received. napi prepared -> interrupt disabled -> napi scheduled 2. napi triggered. ring cleared -> interrupt enabled -> wait for new interrupt To prevent this case, disable the interrupt BEFORE the napi is scheduled. Fixes: `656e705243` ("net-next: mediatek: add support for MT7623 ethernet") Cc: stable@vger.kernel.org Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20231002140805.568-1-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Stefano Garzarella	3a72decd6b	vringh: don't use vringh_kiov_advance() in vringh_iov_xfer() commit `7aed44babc` upstream. In the while loop of vringh_iov_xfer(), `partlen` could be 0 if one of the `iov` has 0 lenght. In this case, we should skip the iov and go to the next one. But calling vringh_kiov_advance() with 0 lenght does not cause the advancement, since it returns immediately if asked to advance by 0 bytes. Let's restore the code that was there before commit `b8c06ad4d6` ("vringh: implement vringh_kiov_advance()"), avoiding using vringh_kiov_advance(). Fixes: `b8c06ad4d6` ("vringh: implement vringh_kiov_advance()") Cc: stable@vger.kernel.org Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Zhang Rui	c12ef025ad	iommu/vt-d: Avoid memory allocation in iommu_suspend() commit `59df44bfb0` upstream. The iommu_suspend() syscore suspend callback is invoked with IRQ disabled. Allocating memory with the GFP_KERNEL flag may re-enable IRQs during the suspend callback, which can cause intermittent suspend/hibernation problems with the following kernel traces: Calling iommu_suspend+0x0/0x1d0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 15 at kernel/time/timekeeping.c:868 ktime_get+0x9b/0xb0 ... CPU: 0 PID: 15 Comm: rcu_preempt Tainted: G U E 6.3-intel #r1 RIP: 0010:ktime_get+0x9b/0xb0 ... Call Trace: <IRQ> tick_sched_timer+0x22/0x90 ? __pfx_tick_sched_timer+0x10/0x10 __hrtimer_run_queues+0x111/0x2b0 hrtimer_interrupt+0xfa/0x230 __sysvec_apic_timer_interrupt+0x63/0x140 sysvec_apic_timer_interrupt+0x7b/0xa0 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1f/0x30 ... ------------[ cut here ]------------ Interrupts enabled after iommu_suspend+0x0/0x1d0 WARNING: CPU: 0 PID: 27420 at drivers/base/syscore.c:68 syscore_suspend+0x147/0x270 CPU: 0 PID: 27420 Comm: rtcwake Tainted: G U W E 6.3-intel #r1 RIP: 0010:syscore_suspend+0x147/0x270 ... Call Trace: <TASK> hibernation_snapshot+0x25b/0x670 hibernate+0xcd/0x390 state_store+0xcf/0xe0 kobj_attr_store+0x13/0x30 sysfs_kf_write+0x3f/0x50 kernfs_fop_write_iter+0x128/0x200 vfs_write+0x1fd/0x3c0 ksys_write+0x6f/0xf0 __x64_sys_write+0x1d/0x30 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Given that only 4 words memory is needed, avoid the memory allocation in iommu_suspend(). CC: stable@kernel.org Fixes: `33e0715710` ("iommu/vt-d: Avoid GFP_ATOMIC where it is not needed") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Ooi, Chin Hao <chin.hao.ooi@intel.com> Link: https://lore.kernel.org/r/20230921093956.234692-1-rui.zhang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230925120417.55977-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Dinghao Liu	cdf18e7585	scsi: zfcp: Fix a double put in zfcp_port_enqueue() commit `b481f644d9` upstream. When device_register() fails, zfcp_port_release() will be called after put_device(). As a result, zfcp_ccw_adapter_put() will be called twice: one in zfcp_port_release() and one in the error path after device_register(). So the reference on the adapter object is doubly put, which may lead to a premature free. Fix this by adjusting the error tag after device_register(). Fixes: `f3450c7b91` ("[SCSI] zfcp: Replace local reference counting with common kref") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20230923103723.10320-1-dinghao.liu@zju.edu.cn Acked-by: Benjamin Block <bblock@linux.ibm.com> Cc: stable@vger.kernel.org # v2.6.33+ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Yajun Deng	ef167cc188	i40e: fix the wrong PTP frequency calculation The new adjustment should be based on the base frequency, not the I40E_PTP_40GB_INCVAL in i40e_ptp_adjfine(). This issue was introduced in commit `3626a690b7` ("i40e: use mul_u64_u64_div_u64 for PTP frequency calculation"), frequency is left just as base I40E_PTP_40GB_INCVAL before the commit. After the commit, frequency is the I40E_PTP_40GB_INCVAL times the ptp_adj_mult value. But then the diff is applied on the wrong value, and no multiplication is done afterwards. It was accidentally fixed in commit `1060707e38` ("ptp: introduce helpers to adjust by scaled parts per million"). It uses adjust_by_scaled_ppm correctly performs the calculation and uses the base adjustment, so there's no error here. But it is a new feature and doesn't need to backported to the stable releases. This issue affects both v6.0 and v6.1, and the v6.1 version is an LTS release. Therefore, the patch only needs to be applied to v6.1 stable. Fixes: `3626a690b7` ("i40e: use mul_u64_u64_div_u64 for PTP frequency calculation") Cc: <stable@vger.kernel.org> # 6.1 Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:38 +02:00
Aleksandr Mezin	a0829d9cf2	hwmon: (nzxt-smart2) add another USB ID commit `4a148e9b1e` upstream. This seems to be a new revision of the device. RGB controls have changed, but this driver doesn't touch them anyway. Fan speed control reported to be working with existing userspace (hidraw) software, so I assume it's compatible. Fan channel count is the same. Recently added (0x1e71, 0x2019) seems to be the same device. Discovered in liquidctl project: https://github.com/liquidctl/liquidctl/issues/541 Signed-off-by: Aleksandr Mezin <mezin.alexander@gmail.com> Link: https://lore.kernel.org/r/20230219105924.333007-1-mezin.alexander@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Herman Fries	6ddb9e6b9b	hwmon: (nzxt-smart2) Add device id commit `e247510e1b` upstream. Adding support for new device id 1e71:2019 NZXT NZXT RGB & Fan Controller Signed-off-by: Herman Fries <baracoder@googlemail.com> Link: https://lore.kernel.org/r/20221214194627.135692-1-baracoder@googlemail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Ming Lei	752ec2d93e	block: fix use-after-free of q->q_usage_counter commit `d36a9ea5e7` upstream. For blk-mq, queue release handler is usually called after blk_mq_freeze_queue_wait() returns. However, the q_usage_counter->release() handler may not be run yet at that time, so this can cause a use-after-free. Fix the issue by moving percpu_ref_exit() into blk_free_queue_rcu(). Since ->release() is called with rcu read lock held, it is agreed that the race should be covered in caller per discussion from the two links. Reported-by: Zhang Wensheng <zhangwensheng@huaweicloud.com> Reported-by: Zhong Jinghua <zhongjinghua@huawei.com> Link: https://lore.kernel.org/linux-block/Y5prfOjyyjQKUrtH@T590/T/#u Link: https://lore.kernel.org/lkml/Y4%2FmzMd4evRg9yDi@fedora/ Cc: Hillf Danton <hdanton@sina.com> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Dennis Zhou <dennis@kernel.org> Fixes: `2b0d3d3e4f` ("percpu_ref: reduce memory footprint of percpu_ref in fast path") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20221215021629.74870-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Ilya Dryomov	77d0e7e8e5	rbd: take header_rwsem in rbd_dev_refresh() only when updating commit `0b207d02bd` upstream. rbd_dev_refresh() has been holding header_rwsem across header and parent info read-in unnecessarily for ages. With commit `870611e487` ("rbd: get snapshot context after exclusive lock is ensured to be held"), the potential for deadlocks became much more real owning to a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores not allowing new readers after a writer is registered. For example, assuming that I/O request 1, I/O request 2 and header read-in request all target the same OSD: 1. I/O request 1 comes in and gets submitted 2. watch error occurs 3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and releases lock_rwsem 4. after reestablishing the watch, rbd_reregister_watch() calls rbd_dev_refresh() which takes header_rwsem for write and submits a header read-in request 5. I/O request 2 comes in: after taking lock_rwsem for read in __rbd_img_handle_request(), it blocks trying to take header_rwsem for read in rbd_img_object_requests() 6. another watch error occurs 7. rbd_watch_errcb() blocks trying to take lock_rwsem for write 8. I/O request 1 completion is received by the messenger but can't be processed because lock_rwsem won't be granted anymore 9. header read-in request completion can't be received, let alone processed, because the messenger is stranded Change rbd_dev_refresh() to take header_rwsem only for actually updating rbd_dev->header. Header and parent info read-in don't need any locking. Cc: stable@vger.kernel.org # `0b035401c5`: rbd: move rbd_dev_refresh() definition Cc: stable@vger.kernel.org # `510a7330c8`: rbd: decouple header read-in from updating rbd_dev->header Cc: stable@vger.kernel.org # `c10311776f`: rbd: decouple parent info read-in from updating rbd_dev Cc: stable@vger.kernel.org Fixes: `870611e487` ("rbd: get snapshot context after exclusive lock is ensured to be held") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:37 +02:00
Ilya Dryomov	698039a461	rbd: decouple parent info read-in from updating rbd_dev commit `c10311776f` upstream. Unlike header read-in, parent info read-in is already decoupled in get_parent_info(), but it's buried in rbd_dev_v2_parent_info() along with the processing logic. Separate the initial read-in and update read-in logic into rbd_dev_setup_parent() and rbd_dev_update_parent() respectively and have rbd_dev_v2_parent_info() just populate struct parent_image_info (i.e. what get_parent_info() did). Some existing QoI issues, like flatten of a standalone clone being disregarded on refresh, remain. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:37 +02:00
Ilya Dryomov	377d26174e	rbd: decouple header read-in from updating rbd_dev->header commit `510a7330c8` upstream. Make rbd_dev_header_info() populate a passed struct rbd_image_header instead of rbd_dev->header and introduce rbd_dev_update_header() for updating mutable fields in rbd_dev->header upon refresh. The initial read-in of both mutable and immutable fields in rbd_dev_image_probe() passes in rbd_dev->header so no update step is required there. rbd_init_layout() is now called directly from rbd_dev_image_probe() instead of individually in format 1 and format 2 implementations. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:37 +02:00
Ilya Dryomov	33ecf5f5a8	rbd: move rbd_dev_refresh() definition commit `0b035401c5` upstream. Move rbd_dev_refresh() definition further down to avoid having to move struct parent_image_info definition in the next commit. This spares some forward declarations too. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> [idryomov@gmail.com: backport to 5.10-6.1: context] Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:37 +02:00
Robin Murphy	ff09fa5f23	iommu/arm-smmu-v3: Avoid constructing invalid range commands [ Upstream commit `eb6c97647b` ] Although io-pgtable's non-leaf invalidations are always for full tables, I missed that SVA also uses non-leaf invalidations, while being at the mercy of whatever range the MMU notifier throws at it. This means it definitely wants the previous TTL fix as well, since it also doesn't know exactly which leaf level(s) may need invalidating, but it can also give us less-aligned ranges wherein certain corners may lead to building an invalid command where TTL, Num and Scale are all 0. It should be fine to handle this by over-invalidating an extra page, since falling back to a non-range command opens up a whole can of errata-flavoured worms. Fixes: `6833b8f2e1` ("iommu/arm-smmu-v3: Set TTL invalidation hint better") Reported-by: Rui Zhu <zhurui3@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b99cfe71af2bd93a8a2930f20967fb2a4f7748dd.1694432734.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:37 +02:00
Robin Murphy	357ba59b9d	iommu/arm-smmu-v3: Set TTL invalidation hint better [ Upstream commit `6833b8f2e1` ] When io-pgtable unmaps a whole table, rather than waste time walking it to find the leaf entries to invalidate exactly, it simply expects .tlb_flush_walk with nominal last-level granularity to invalidate any leaf entries at higher intermediate levels as well. This works fine with page-based invalidation, but with range commands we need to be careful with the TTL hint - unconditionally setting it based on the given level 3 granule means that an invalidation for a level 1 table would strictly not be required to affect level 2 block entries. It's easy to comply with the expected behaviour by simply not setting the TTL hint for non-leaf invalidations, so let's do that. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b409d9a17c52dc0db51faee91d92737bb7975f5b.1685637456.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:37 +02:00
Wayne Lin	7147287293	drm/amd/display: Adjust the MST resume flow commit `ec5fa9fcde` upstream. [Why] In drm_dp_mst_topology_mgr_resume() today, it will resume the mst branch to be ready handling mst mode and also consecutively do the mst topology probing. Which will cause the dirver have chance to fire hotplug event before restoring the old state. Then Userspace will react to the hotplug event based on a wrong state. [How] Adjust the mst resume flow as: 1. set dpcd to resume mst branch status 2. restore source old state 3. Do mst resume topology probing For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to pull out topology probing work into a 2nd part procedure of the mst resume. Will have a follow up patch in drm. Reviewed-by: Chao-kai Wang <stylon.wang@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [ Adjust for missing variable rename in `f0127cb112` ("drm/amdgpu/display/mst: adjust the naming of mst_port and port of aconnector") ] Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Kristina Martsenko	b0fe378674	arm64: cpufeature: Fix CLRBHB and BC detection commit `479965a2b7` upstream. ClearBHB support is indicated by the CLRBHB field in ID_AA64ISAR2_EL1. Following some refactoring the kernel incorrectly checks the BC field instead. Fix the detection to use the right field. (Note: The original ClearBHB support had it as FTR_HIGHER_SAFE, but this patch uses FTR_LOWER_SAFE, which seems more correct.) Also fix the detection of BC (hinted conditional branches) to use FTR_LOWER_SAFE, so that it is not reported on mismatched systems. Fixes: `356137e68a` ("arm64/sysreg: Make BHB clear feature defines match the architecture") Fixes: `8fcc8285c0` ("arm64/sysreg: Convert ID_AA64ISAR2_EL1 to automatic generation") Cc: stable@vger.kernel.org Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230912133429.2606875-1-kristina.martsenko@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Patrick Rohr	b691264274	net: release reference to inet6_dev pointer commit `5cb249686e` upstream. addrconf_prefix_rcv returned early without releasing the inet6_dev pointer when the PIO lifetime is less than accept_ra_min_lft. Fixes: `5027d54a9c` ("net: change accept_ra_min_rtr_lft to affect all RA lifetimes") Cc: Maciej Żenczykowski <maze@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: David Ahern <dsahern@kernel.org> Cc: Simon Horman <horms@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Patrick Rohr <prohr@google.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Patrick Rohr	bad004c384	net: change accept_ra_min_rtr_lft to affect all RA lifetimes commit `5027d54a9c` upstream. accept_ra_min_rtr_lft only considered the lifetime of the default route and discarded entire RAs accordingly. This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and applies the value to individual RA sections; in particular, router lifetime, PIO preferred lifetime, and RIO lifetime. If any of those lifetimes are lower than the configured value, the specific RA section is ignored. In order for the sysctl to be useful to Android, it should really apply to all lifetimes in the RA, since that is what determines the minimum frequency at which RAs must be processed by the kernel. Android uses hardware offloads to drop RAs for a fraction of the minimum of all lifetimes present in the RA (some networks have very frequent RAs (5s) with high lifetimes (2h)). Despite this, we have encountered networks that set the router lifetime to 30s which results in very frequent CPU wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the WiFi firmware) entirely on such networks, it seems better to ignore the misconfigured routers while still processing RAs from other IPv6 routers on the same network (i.e. to support IoT applications). The previous implementation dropped the entire RA based on router lifetime. This turned out to be hard to expand to the other lifetimes present in the RA in a consistent manner; dropping the entire RA based on RIO/PIO lifetimes would essentially require parsing the whole thing twice. Fixes: `1671bcfd76` ("net: add sysctl accept_ra_min_rtr_lft") Cc: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Patrick Rohr <prohr@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:37 +02:00
Patrick Rohr	ec4162bb70	net: add sysctl accept_ra_min_rtr_lft commit `1671bcfd76` upstream. This change adds a new sysctl accept_ra_min_rtr_lft to specify the minimum acceptable router lifetime in an RA. If the received RA router lifetime is less than the configured value (and not 0), the RA is ignored. This is useful for mobile devices, whose battery life can be impacted by networks that configure RAs with a short lifetime. On such networks, the device should never gain IPv6 provisioning and should attempt to drop RAs via hardware offload, if available. Signed-off-by: Patrick Rohr <prohr@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-10-10 22:00:36 +02:00
Gabriel Krisman Bertazi	9d91134c16	arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path [ Upstream commit `a89c6bcdac` ] Accessing AA64MMFR1_EL1 is expensive in KVM guests, since it is emulated in the hypervisor. In fact, ARM documentation mentions some feature registers are not supposed to be accessed frequently by the OS, and therefore should be emulated for guests [1]. Commit `0388f9c743` ("arm64: mm: Implement arch_wants_old_prefaulted_pte()") introduced a read of this register in the page fault path. But, even when the feature of setting faultaround pages with the old flag is disabled for a given cpu, we are still paying the cost of checking the register on every pagefault. This results in an explosion of vmexit events in KVM guests, which directly impacts the performance of virtualized workloads. For instance, running kernbench yields a 15% increase in system time solely due to the increased vmexit cycles. This patch avoids the extra cost by using the sanitized cached value. It should be safe to do so, since this register mustn't change for a given cpu. [1] https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Learn%20the%20Architecture/Armv8-A%20virtualization.pdf?revision=a765a7df-1a00-434d-b241-357bfda2dd31 Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20230109151955.8292-1-krisman@suse.de Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Benjamin Coddington	dd8c836930	Revert "NFSv4: Retry LOCK on OLD_STATEID during delegation return" [ Upstream commit `5b4a82a072` ] Olga Kornievskaia reports that this patch breaks NFSv4.0 state recovery. It also introduces additional complexity in the error paths for cases not related to the original problem. Let's revert it for now, and address the original problem in another manner. This reverts commit `f5ea16137a`. Fixes: `f5ea16137a` ("NFSv4: Retry LOCK on OLD_STATEID during delegation return") Reported-by: Kornievskaia, Olga <Olga.Kornievskaia@netapp.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Sweet Tea Dorminy	ef54db5b5d	btrfs: use struct fscrypt_str instead of struct qstr [ Upstream commit `6db7531882` ] While struct qstr is more natural without fscrypt, since it's provided by dentries, struct fscrypt_str is provided by the fscrypt handlers processing dentries, and is thus more natural in the fscrypt world. Replace all of the struct qstr uses with struct fscrypt_str. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: `9af86694fd` ("btrfs: file_remove_privs needs an exclusive lock in direct io write") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Sweet Tea Dorminy	68ad364ec8	btrfs: setup qstr from dentrys using fscrypt helper [ Upstream commit `ab3c5c18e8` ] Most places where we get a struct qstr, we are doing so from a dentry. With fscrypt, the dentry's name may be encrypted on-disk, so fscrypt provides a helper to convert a dentry name to the appropriate disk name if necessary. Convert each of the dentry name accesses to use fscrypt_setup_filename(), then convert the resulting fscrypt_name back to an unencrypted qstr. This does not work for nokey names, but the specific locations that could spawn nokey names are noted. At present, since there are no encrypted directories, nothing goes down the filename encryption paths. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: `9af86694fd` ("btrfs: file_remove_privs needs an exclusive lock in direct io write") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Sweet Tea Dorminy	1cf474cd47	btrfs: use struct qstr instead of name and namelen pairs [ Upstream commit `e43eec81c5` ] Many functions throughout btrfs take name buffer and name length arguments. Most of these functions at the highest level are usually called with these arguments extracted from a supplied dentry's name. But the entire name can be passed instead, making each function a little more elegant. Each function whose arguments are currently the name and length extracted from a dentry is herein converted to instead take a pointer to the name in the dentry. The couple of calls to these calls without a struct dentry are converted to create an appropriate qstr to pass in. Additionally, every function which is only called with a name/len extracted directly from a qstr is also converted. This change has positive effect on stack consumption, frame of many functions is reduced but this will be used in the future for fscrypt related structures. Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: `9af86694fd` ("btrfs: file_remove_privs needs an exclusive lock in direct io write") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Zheng Yejian	87efd87d36	ring-buffer: Fix bytes info in per_cpu buffer stats [ Upstream commit `45d99ea451` ] The 'bytes' info in file 'per_cpu/cpu<X>/stats' means the number of bytes in cpu buffer that have not been consumed. However, currently after consuming data by reading file 'trace_pipe', the 'bytes' info was not changed as expected. # cat per_cpu/cpu0/stats entries: 0 overrun: 0 commit overrun: 0 bytes: 568 <--- 'bytes' is problematical !!! oldest event ts: 8651.371479 now ts: 8653.912224 dropped events: 0 read events: 8 The root cause is incorrect stat on cpu_buffer->read_bytes. To fix it: 1. When stat 'read_bytes', account consumed event in rb_advance_reader(); 2. When stat 'entries_bytes', exclude the discarded padding event which is smaller than minimum size because it is invisible to reader. Then use rb_page_commit() instead of BUF_PAGE_SIZE at where accounting for page-based read/remove/overrun. Also correct the comments of ring_buffer_bytes_cpu() in this patch. Link: https://lore.kernel.org/linux-trace-kernel/20230921125425.1708423-1-zhengyejian1@huawei.com Cc: stable@vger.kernel.org Fixes: `c64e148a3b` ("trace: Add ring buffer stats to measure rate of events") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Vlastimil Babka	62eed43e03	ring-buffer: remove obsolete comment for free_buffer_page() [ Upstream commit `a98151ad53` ] The comment refers to mm/slob.c which is being removed. It comes from commit `ed56829cb3` ("ring_buffer: reset buffer page when freeing") and according to Steven the borrowed code was a page mapcount and mapping reset, which was later removed by commit `e4c2ce82ca` ("ring_buffer: allocate buffer page pointer"). Thus the comment is not accurate anyway, remove it. Link: https://lore.kernel.org/linux-trace-kernel/20230315142446.27040-1-vbabka@suse.cz Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Reported-by: Mike Rapoport <mike.rapoport@gmail.com> Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Fixes: `e4c2ce82ca` ("ring_buffer: allocate buffer page pointer") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Stable-dep-of: `45d99ea451` ("ring-buffer: Fix bytes info in per_cpu buffer stats") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Johannes Weiner	836adaddc6	mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list [ Upstream commit `7b086755fb` ] Commit `4b23a68f95` ("mm/page_alloc: protect PCP lists with a spinlock") bypasses the pcplist on lock contention and returns the page directly to the buddy list of the page's migratetype. For pages that don't have their own pcplist, such as CMA and HIGHATOMIC, the migratetype is temporarily updated such that the page can hitch a ride on the MOVABLE pcplist. Their true type is later reassessed when flushing in free_pcppages_bulk(). However, when lock contention is detected after the type was already overridden, the bypass will then put the page on the wrong buddy list. Once on the MOVABLE buddy list, the page becomes eligible for fallbacks and even stealing. In the case of HIGHATOMIC, otherwise ineligible allocations can dip into the highatomic reserves. In the case of CMA, the page can be lost from the CMA region permanently. Use a separate pcpmigratetype variable for the pcplist override. Use the original migratetype when going directly to the buddy. This fixes the bug and should make the intentions more obvious in the code. Originally sent here to address the HIGHATOMIC case: https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/ Changelog updated in response to the CMA-specific bug report. [mgorman@techsingularity.net: updated changelog] Link: https://lkml.kernel.org/r/20230911181108.GA104295@cmpxchg.org Fixes: `4b23a68f95` ("mm/page_alloc: protect PCP lists with a spinlock") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Joe Liu <joe.liu@mediatek.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Mel Gorman	d1da921452	mm/page_alloc: leave IRQs enabled for per-cpu page allocations [ Upstream commit `5749077415` ] The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task allocating from the PCP must not re-enter the allocator from IRQ context. In each instance where IRQ-reentrancy is possible, the lock is acquired using pcp_spin_trylock_irqsave() even though IRQs are disabled and re-entrancy is impossible. Demote the lock to pcp_spin_lock avoids an IRQ disable/enable in the common case at the cost of some IRQ allocations taking a slower path. If the PCP lists need to be refilled, the zone lock still needs to disable IRQs but that will only happen on PCP refill and drain. If an IRQ is raised when a PCP allocation is in progress, the trylock will fail and fallback to using the buddy lists directly. Note that this may not be a universal win if an interrupt-intensive workload also allocates heavily from interrupt context and contends heavily on the zone->lock as a result. [mgorman@techsingularity.net: migratetype might be wrong if a PCP was locked] Link: https://lkml.kernel.org/r/20221122131229.5263-2-mgorman@techsingularity.net [yuzhao@google.com: reported lockdep issue on IO completion from softirq] [hughd@google.com: fix list corruption, lock improvements, micro-optimsations] Link: https://lkml.kernel.org/r/20221118101714.19590-3-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: `7b086755fb` ("mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Mel Gorman	570786ac6f	mm/page_alloc: always remove pages from temporary list [ Upstream commit `c3e58a7042` ] Patch series "Leave IRQs enabled for per-cpu page allocations", v3. This patch (of 2): free_unref_page_list() has neglected to remove pages properly from the list of pages to free since forever. It works by coincidence because list_add happened to do the right thing adding the pages to just the PCP lists. However, a later patch added pages to either the PCP list or the zone list but only properly deleted the page from the list in one path leading to list corruption and a subsequent failure. As a preparation patch, always delete the pages from one list properly before adding to another. On its own, this fixes nothing although it adds a fractional amount of overhead but is critical to the next patch. Link: https://lkml.kernel.org/r/20221118101714.19590-1-mgorman@techsingularity.net Link: https://lkml.kernel.org/r/20221118101714.19590-2-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Hugh Dickins <hughd@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: `7b086755fb` ("mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:36 +02:00
Yang Shi	939189aedf	mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified [ Upstream commit `24526268f4` ] When calling mbind() with MPOL_MF_{MOVE\|MOVEALL} \| MPOL_MF_STRICT, kernel should attempt to migrate all existing pages, and return -EIO if there is misplaced or unmovable page. Then commit `6f4576e368` ("mempolicy: apply page table walker on queue_pages_range()") messed up the return value and didn't break VMA scan early ianymore when MPOL_MF_STRICT alone. The return value problem was fixed by commit `a7f40cfe3b` ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified"), but it broke the VMA walk early if unmovable page is met, it may cause some pages are not migrated as expected. The code should conceptually do: if (MPOL_MF_MOVE\|MOVEALL) scan all vmas try to migrate the existing pages return success else if (MPOL_MF_MOVE* \| MPOL_MF_STRICT) scan all vmas try to migrate the existing pages return -EIO if unmovable or migration failed else /* MPOL_MF_STRICT alone / break early if meets unmovable and don't call mbind_range() at all else / none of those flags */ check the ranges in test_walk, EFAULT without mbind_range() if discontig. Fixed the behavior. Link: https://lkml.kernel.org/r/20230920223242.3425775-1-yang@os.amperecomputing.com Fixes: `a7f40cfe3b` ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified") Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Cc: Hugh Dickins <hughd@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael Aquini <aquini@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [4.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:35 +02:00
Vishal Moola (Oracle)	ce9f3441fc	mm/mempolicy: convert migrate_page_add() to migrate_folio_add() [ Upstream commit `4a64981dfe` ] Replace migrate_page_add() with migrate_folio_add(). migrate_folio_add() does the same a migrate_page_add() but takes in a folio instead of a page. This removes a couple of calls to compound_head(). Link: https://lkml.kernel.org/r/20230130201833.27042-7-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jane Chu <jane.chu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: `24526268f4` ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-10-10 22:00:35 +02:00

1 2 3 4 5 ...

1149140 Commits