linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 03:40:35 +09:00

Author	SHA1	Message	Date
Ong Boon Leong	7f101d035d	net: stmmac: fix incorrect DMA channel intr enable setting of EQoS v4.10 commit `879c348c35` upstream. We introduce dwmac410_dma_init_channel() here for both EQoS v4.10 and above which use different DMA_CH(n)_Interrupt_Enable bit definitions for NIE and AIE. Fixes: `48863ce594` ("stmmac: add DMA support for GMAC 4.xx") Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Ramesh Babu B <ramesh.babu.b@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:34 +01:00
Kevin(Yudong) Yang	0fbbcf797e	net/mlx4_en: update moderation when config reset commit `00ff801bb8` upstream. This patch fixes a bug that the moderation config will not be applied when calling mlx4_en_reset_config. For example, when turning on rx timestamping, mlx4_en_reset_config() will be called, causing the NIC to forget previous moderation config. This fix is in phase with a previous fix: commit `79c54b6bbf` ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Tested: Before this patch, on a host with NIC using mlx4, run netserver and stream TCP to the host at full utilization. $ sar -I SUM 1 INTR intr/s 14:03:56 sum 48758.00 After rx hwtstamp is enabled: $ sar -I SUM 1 14:10:38 sum 317771.00 We see the moderation is not working properly and issued 7x more interrupts. After the patch, and turned on rx hwtstamp, the rate of interrupts is as expected: $ sar -I SUM 1 14:52:11 sum 49332.00 Fixes: `79c54b6bbf` ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Signed-off-by: Kevin(Yudong) Yang <yyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> CC: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:34 +01:00
Vladimir Oltean	78cbd0a474	net: enetc: don't overwrite the RSS indirection table when initializing commit `c646d10dda` upstream. After the blamed patch, all RX traffic gets hashed to CPU 0 because the hashing indirection table set up in: enetc_pf_probe -> enetc_alloc_si_resources -> enetc_configure_si -> enetc_setup_default_rss_table is overwritten later in: enetc_pf_probe -> enetc_init_port_rss_memory which zero-initializes the entire port RSS table in order to avoid ECC errors. The trouble really is that enetc_init_port_rss_memory really neads enetc_alloc_si_resources to be called, because it depends upon enetc_alloc_cbdr and enetc_setup_cbdr. But that whole enetc_configure_si thing could have been better thought out, it has nothing to do in a function called "alloc_si_resources", especially since its counterpart, "free_si_resources", does nothing to unwind the configuration of the SI. The point is, we need to pull out enetc_configure_si out of enetc_alloc_resources, and move it after enetc_init_port_rss_memory. This allows us to set up the default RSS indirection table after initializing the memory. Fixes: `07bf34a50e` ("net: enetc: initialize the RFS and RSS memories") Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:34 +01:00
Linus Torvalds	6547ec4286	Revert "mm, slub: consider rest of partial list if acquire_slab() fails" commit `9b1ea29bc0` upstream. This reverts commit `8ff60eb052`. The kernel test robot reports a huge performance regression due to the commit, and the reason seems fairly straightforward: when there is contention on the page list (which is what causes acquire_slab() to fail), we do _not_ want to just loop and try again, because that will transfer the contention to the 'n->list_lock' spinlock we hold, and just make things even worse. This is admittedly likely a problem only on big machines - the kernel test robot report comes from a 96-thread dual socket Intel Xeon Gold 6252 setup, but the regression there really is quite noticeable: -47.9% regression of stress-ng.rawpkt.ops_per_sec and the commit that was marked as being fixed (`7ced371971`: "slub: Acquire_slab() avoid loop") actually did the loop exit early very intentionally (the hint being that "avoid loop" part of that commit message), exactly to avoid this issue. The correct thing to do may be to pick some kind of reasonable middle ground: instead of breaking out of the loop on the very first sign of contention, or trying over and over and over again, the right thing may be to re-try _once_, and then give up on the second failure (or pick your favorite value for "once"..). Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/lkml/20210301080404.GF12822@xsang-OptiPlex-9020/ Cc: Jann Horn <jannh@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:34 +01:00
Paulo Alcantara	55e6ede3b9	cifs: return proper error code in statfs(2) commit `14302ee330` upstream. In cifs_statfs(), if server->ops->queryfs is not NULL, then we should use its return value rather than always returning 0. Instead, use rc variable as it is properly set to 0 in case there is no server->ops->queryfs. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:34 +01:00
Christian Brauner	a1ff418d3e	mount: fix mounting of detached mounts onto targets that reside on shared mounts commit `ee2e3f5062` upstream. Creating a series of detached mounts, attaching them to the filesystem, and unmounting them can be used to trigger an integer overflow in ns->mounts causing the kernel to block any new mounts in count_mounts() and returning ENOSPC because it falsely assumes that the maximum number of mounts in the mount namespace has been reached, i.e. it thinks it can't fit the new mounts into the mount namespace anymore. Depending on the number of mounts in your system, this can be reproduced on any kernel that supportes open_tree() and move_mount() by compiling and running the following program: /* SPDX-License-Identifier: LGPL-2.1+ / #define _GNU_SOURCE #include <errno.h> #include <fcntl.h> #include <getopt.h> #include <limits.h> #include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/mount.h> #include <sys/stat.h> #include <sys/syscall.h> #include <sys/types.h> #include <unistd.h> / open_tree() / #ifndef OPEN_TREE_CLONE #define OPEN_TREE_CLONE 1 #endif #ifndef OPEN_TREE_CLOEXEC #define OPEN_TREE_CLOEXEC O_CLOEXEC #endif #ifndef __NR_open_tree #if defined __alpha__ #define __NR_open_tree 538 #elif defined _MIPS_SIM #if _MIPS_SIM == _MIPS_SIM_ABI32 / o32 / #define __NR_open_tree 4428 #endif #if _MIPS_SIM == _MIPS_SIM_NABI32 / n32 / #define __NR_open_tree 6428 #endif #if _MIPS_SIM == _MIPS_SIM_ABI64 / n64 / #define __NR_open_tree 5428 #endif #elif defined __ia64__ #define __NR_open_tree (428 + 1024) #else #define __NR_open_tree 428 #endif #endif / move_mount() / #ifndef MOVE_MOUNT_F_EMPTY_PATH #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004 / Empty from path permitted / #endif #ifndef __NR_move_mount #if defined __alpha__ #define __NR_move_mount 539 #elif defined _MIPS_SIM #if _MIPS_SIM == _MIPS_SIM_ABI32 / o32 / #define __NR_move_mount 4429 #endif #if _MIPS_SIM == _MIPS_SIM_NABI32 / n32 / #define __NR_move_mount 6429 #endif #if _MIPS_SIM == _MIPS_SIM_ABI64 / n64 / #define __NR_move_mount 5429 #endif #elif defined __ia64__ #define __NR_move_mount (428 + 1024) #else #define __NR_move_mount 429 #endif #endif static inline int sys_open_tree(int dfd, const char filename, unsigned int flags) { return syscall(__NR_open_tree, dfd, filename, flags); } static inline int sys_move_mount(int from_dfd, const char from_pathname, int to_dfd, const char to_pathname, unsigned int flags) { return syscall(__NR_move_mount, from_dfd, from_pathname, to_dfd, to_pathname, flags); } static bool is_shared_mountpoint(const char path) { bool shared = false; FILE f = NULL; char line = NULL; int i; size_t len = 0; f = fopen("/proc/self/mountinfo", "re"); if (!f) return 0; while (getline(&line, &len, f) > 0) { char slider1, slider2; for (slider1 = line, i = 0; slider1 && i < 4; i++) slider1 = strchr(slider1 + 1, ' '); if (!slider1) continue; slider2 = strchr(slider1 + 1, ' '); if (!slider2) continue; slider2 = '\0'; if (strcmp(slider1 + 1, path) == 0) { /* This is the path. Is it shared? / slider1 = strchr(slider2 + 1, ' '); if (slider1 && strstr(slider1, "shared:")) { shared = true; break; } } } fclose(f); free(line); return shared; } static void usage(void) { const char text = "mount-new [--recursive] <base-dir>\n"; fprintf(stderr, "%s", text); _exit(EXIT_SUCCESS); } #define exit_usage(format, ...) \ ({ \ fprintf(stderr, format "\n", ##__VA_ARGS__); \ usage(); \ }) #define exit_log(format, ...) \ ({ \ fprintf(stderr, format "\n", ##__VA_ARGS__); \ exit(EXIT_FAILURE); \ }) static const struct option longopts[] = { {"help", no_argument, 0, 'a'}, { NULL, no_argument, 0, 0 }, }; int main(int argc, char argv[]) { int exit_code = EXIT_SUCCESS, index = 0; int dfd, fd_tree, new_argc, ret; char base_dir; char const new_argv; char target[PATH_MAX]; while ((ret = getopt_long_only(argc, argv, "", longopts, &index)) != -1) { switch (ret) { case 'a': /* fallthrough / default: usage(); } } new_argv = &argv[optind]; new_argc = argc - optind; if (new_argc < 1) exit_usage("Missing base directory\n"); base_dir = new_argv[0]; if (base_dir != '/') exit_log("Please specify an absolute path"); /* Ensure that target is a shared mountpoint. / if (!is_shared_mountpoint(base_dir)) exit_log("Please ensure that \"%s\" is a shared mountpoint", base_dir); dfd = open(base_dir, O_RDONLY \| O_DIRECTORY \| O_CLOEXEC); if (dfd < 0) exit_log("%m - Failed to open base directory \"%s\"", base_dir); ret = mkdirat(dfd, "detached-move-mount", 0755); if (ret < 0) exit_log("%m - Failed to create required temporary directories"); ret = snprintf(target, sizeof(target), "%s/detached-move-mount", base_dir); if (ret < 0 \|\| (size_t)ret >= sizeof(target)) exit_log("%m - Failed to assemble target path"); / * Having a mount table with 10000 mounts is already quite excessive * and shoult account even for weird test systems. */ for (size_t i = 0; i < 10000; i++) { fd_tree = sys_open_tree(dfd, "detached-move-mount", OPEN_TREE_CLONE \| OPEN_TREE_CLOEXEC \| AT_EMPTY_PATH); if (fd_tree < 0) { fprintf(stderr, "%m - Failed to open %d(detached-move-mount)", dfd); exit_code = EXIT_FAILURE; break; } ret = sys_move_mount(fd_tree, "", dfd, "detached-move-mount", MOVE_MOUNT_F_EMPTY_PATH); if (ret < 0) { if (errno == ENOSPC) fprintf(stderr, "%m - Buggy mount counting"); else fprintf(stderr, "%m - Failed to attach mount to %d(detached-move-mount)", dfd); exit_code = EXIT_FAILURE; break; } close(fd_tree); ret = umount2(target, MNT_DETACH); if (ret < 0) { fprintf(stderr, "%m - Failed to unmount %s", target); exit_code = EXIT_FAILURE; break; } } (void)unlinkat(dfd, "detached-move-mount", AT_REMOVEDIR); close(dfd); exit(exit_code); } and wait for the kernel to refuse any new mounts by returning ENOSPC. How many iterations are needed depends on the number of mounts in your system. Assuming you have something like 50 mounts on a standard system it should be almost instantaneous. The root cause of this is that detached mounts aren't handled correctly when source and target mount are identical and reside on a shared mount causing a broken mount tree where the detached source itself is propagated which propagation prevents for regular bind-mounts and new mounts. This ultimately leads to a miscalculation of the number of mounts in the mount namespace. Detached mounts created via open_tree(fd, path, OPEN_TREE_CLONE) are essentially like an unattached new mount, or an unattached bind-mount. They can then later on be attached to the filesystem via move_mount() which calls into attach_recursive_mount(). Part of attaching it to the filesystem is making sure that mounts get correctly propagated in case the destination mountpoint is MS_SHARED, i.e. is a shared mountpoint. This is done by calling into propagate_mnt() which walks the list of peers calling propagate_one() on each mount in this list making sure it receives the propagation event. The propagate_one() functions thereby skips both new mounts and bind mounts to not propagate them "into themselves". Both are identified by checking whether the mount is already attached to any mount namespace in mnt->mnt_ns. The is what the IS_MNT_NEW() helper is responsible for. However, detached mounts have an anonymous mount namespace attached to them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't realize they need to be skipped causing the mount to propagate "into itself" breaking the mount table and causing a disconnect between the number of mounts recorded as being beneath or reachable from the target mountpoint and the number of mounts actually recorded/counted in ns->mounts ultimately causing an overflow which in turn prevents any new mounts via the ENOSPC issue. So teach propagation to handle detached mounts by making it aware of them. I've been tracking this issue down for the last couple of days and then verifying that the fix is correct by unmounting everything in my current mount table leaving only /proc and /sys mounted and running the reproducer above overnight verifying the number of mounts counted in ns->mounts. With this fix the counts are correct and the ENOSPC issue can't be reproduced. This change will only have an effect on mounts created with the new mount API since detached mounts cannot be created with the old mount API so regressions are extremely unlikely. Link: https://lore.kernel.org/r/20210306101010.243666-1-christian.brauner@ubuntu.com Fixes: `2db154b3ea` ("vfs: syscall: Add move_mount(2) to move mounts around") Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:33 +01:00
Christophe Leroy	59a057a891	powerpc/603: Fix protection of user pages mapped with PROT_NONE commit `c119565a15` upstream. On book3s/32, page protection is defined by the PP bits in the PTE which provide the following protection depending on the access keys defined in the matching segment register: - PP 00 means RW with key 0 and N/A with key 1. - PP 01 means RW with key 0 and RO with key 1. - PP 10 means RW with both key 0 and key 1. - PP 11 means RO with both key 0 and key 1. Since the implementation of kernel userspace access protection, PP bits have been set as follows: - PP00 for pages without _PAGE_USER - PP01 for pages with _PAGE_USER and _PAGE_RW - PP11 for pages with _PAGE_USER and without _PAGE_RW For kernelspace segments, kernel accesses are performed with key 0 and user accesses are performed with key 1. As PP00 is used for non _PAGE_USER pages, user can't access kernel pages not flagged _PAGE_USER while kernel can. For userspace segments, both kernel and user accesses are performed with key 0, therefore pages not flagged _PAGE_USER are still accessible to the user. This shouldn't be an issue, because userspace is expected to be accessible to the user. But unlike most other architectures, powerpc implements PROT_NONE protection by removing _PAGE_USER flag instead of flagging the page as not valid. This means that pages in userspace that are not flagged _PAGE_USER shall remain inaccessible. To get the expected behaviour, just mimic other architectures in the TLB miss handler by checking _PAGE_USER permission on userspace accesses as if it was the _PAGE_PRESENT bit. Note that this problem only is only for 603 cores. The 604+ have an hash table, and hash_page() function already implement the verification of _PAGE_USER permission on userspace pages. Fixes: `f342adca3a` ("powerpc/32s: Prepare Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Christoph Plattner <christoph.plattner@thalesgroup.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4a0c6e3bb8f0c162457bf54d9bc6fd8d7b55129f.1612160907.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:33 +01:00
Lorenzo Bianconi	da9f2219f6	mt76: dma: do not report truncated frames to mac80211 commit `d0bd52c591` upstream. Commit `b102f0c522` ("mt76: fix array overflow on receiving too many fragments for a packet") fixes a possible OOB access but it introduces a memory leak since the pending frame is not released to page_frag_cache if the frag array of skb_shared_info is full. Commit `93a1d4791c` ("mt76: dma: fix a possible memory leak in mt76_add_fragment()") fixes the issue but does not free the truncated skb that is forwarded to mac80211 layer. Fix the leftover issue discarding even truncated skbs. Fixes: `93a1d4791c` ("mt76: dma: fix a possible memory leak in mt76_add_fragment()") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/a03166fcc8214644333c68674a781836e0f57576.1612697217.git.lorenzo@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:33 +01:00
Jiri Wiesner	95b0a3b090	ibmvnic: always store valid MAC address commit `67eb211487` upstream. The last change to ibmvnic_set_mac(), `8fc3672a8a`, meant to prevent users from setting an invalid MAC address on an ibmvnic interface that has not been brought up yet. The change also prevented the requested MAC address from being stored by the adapter object for an ibmvnic interface when the state of the ibmvnic interface is VNIC_PROBED - that is after probing has finished but before the ibmvnic interface is brought up. The MAC address stored by the adapter object is used and sent to the hypervisor for checking when an ibmvnic interface is brought up. The ibmvnic driver ignoring the requested MAC address when in VNIC_PROBED state caused LACP bonds (bonds in 802.3ad mode) with more than one slave to malfunction. The bonding code must be able to change the MAC address of its slaves before they are brought up during enslaving. The inability of kernels with `8fc3672a8a` to set the MAC addresses of bonding slaves is observable in the output of "ip address show". The MAC addresses of the slaves are the same as the MAC address of the bond on a working system whereas the slaves retain their original MAC addresses on a system with a malfunctioning LACP bond. Fixes: `8fc3672a8a` ("ibmvnic: fix ibmvnic_set_mac") Signed-off-by: Jiri Wiesner <jwiesner@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:33 +01:00
Maciej Fijalkowski	3e8ab75f33	samples, bpf: Add missing munmap in xdpsock commit `6bc6699881` upstream. We mmap the umem region, but we never munmap it. Add the missing call at the end of the cleanup. Fixes: `3945b37a97` ("samples/bpf: use hugepages in xdpsock app") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20210303185636.18070-3-maciej.fijalkowski@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:33 +01:00
Yauheni Kaliuta	c2c3a85ab0	selftests/bpf: Mask bpf_csum_diff() return value to 16 bits in test_verifier commit `6185266c5a` upstream. The verifier test labelled "valid read map access into a read-only array 2" calls the bpf_csum_diff() helper and checks its return value. However, architecture implementations of csum_partial() (which is what the helper uses) differ in whether they fold the return value to 16 bit or not. For example, x86 version has ... if (unlikely(odd)) { result = from32to16(result); result = ((result >> 8) & 0xff) \| ((result & 0xff) << 8); } ... while generic lib/checksum.c does: result = from32to16(result); if (odd) result = ((result >> 8) & 0xff) \| ((result & 0xff) << 8); This makes the helper return different values on different architectures, breaking the test on non-x86. To fix this, add an additional instruction to always mask the return value to 16 bits, and update the expected return value accordingly. Fixes: `fb2abb73e5` ("bpf, selftest: test {rd, wr}only flags and direct value access") Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210228103017.320240-1-yauheni.kaliuta@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:33 +01:00
Hangbin Liu	57b9f13e8a	selftests/bpf: No need to drop the packet when there is no geneve opt commit `557c223b64` upstream. In bpf geneve tunnel test we set geneve option on tx side. On rx side we only call bpf_skb_get_tunnel_opt(). Since commit `9c2e14b481` ("ip_tunnels: Set tunnel option flag when tunnel metadata is present") geneve_rx() will not add TUNNEL_GENEVE_OPT flag if there is no geneve option, which cause bpf_skb_get_tunnel_opt() return ENOENT and _geneve_get_tunnel() in test_tunnel_kern.c drop the packet. As it should be valid that bpf_skb_get_tunnel_opt() return error when there is not tunnel option, there is no need to drop the packet and break all geneve rx traffic. Just set opt_class to 0 in this test and keep returning TC_ACT_OK. Fixes: `933a741e3b` ("selftests/bpf: bpf tunnel test.") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: William Tu <u9012063@gmail.com> Link: https://lore.kernel.org/bpf/20210224081403.1425474-1-liuhangbin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:32 +01:00
Vasily Averin	82e85c0e7f	netfilter: x_tables: gpf inside xt_find_revision() commit `8e24edddad` upstream. nested target/match_revfn() calls work with xt[NFPROTO_UNSPEC] lists without taking xt[NFPROTO_UNSPEC].mutex. This can race with module unload and cause host to crash: general protection fault: 0000 [#1] Modules linked in: ... [last unloaded: xt_cluster] CPU: 0 PID: 542455 Comm: iptables RIP: 0010:[<ffffffff8ffbd518>] [<ffffffff8ffbd518>] strcmp+0x18/0x40 RDX: 0000000000000003 RSI: ffff9a5a5d9abe10 RDI: dead000000000111 R13: ffff9a5a5d9abe10 R14: ffff9a5a5d9abd8c R15: dead000000000100 (VvS: %R15 -- &xt_match, %RDI -- &xt_match.name, xt_cluster unregister match in xt[NFPROTO_UNSPEC].match list) Call Trace: [<ffffffff902ccf44>] match_revfn+0x54/0xc0 [<ffffffff902ccf9f>] match_revfn+0xaf/0xc0 [<ffffffff902cd01e>] xt_find_revision+0x6e/0xf0 [<ffffffffc05a5be0>] do_ipt_get_ctl+0x100/0x420 [ip_tables] [<ffffffff902cc6bf>] nf_getsockopt+0x4f/0x70 [<ffffffff902dd99e>] ip_getsockopt+0xde/0x100 [<ffffffff903039b5>] raw_getsockopt+0x25/0x50 [<ffffffff9026c5da>] sock_common_getsockopt+0x1a/0x20 [<ffffffff9026b89d>] SyS_getsockopt+0x7d/0xf0 [<ffffffff903cbf92>] system_call_fastpath+0x25/0x2a Fixes: `656caff20e` ("netfilter 04/09: x_tables: fix match/target revision lookup") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:32 +01:00
Florian Westphal	f66b8e7381	netfilter: nf_nat: undo erroneous tcp edemux lookup commit `03a3ca37e4` upstream. Under extremely rare conditions TCP early demux will retrieve the wrong socket. 1. local machine establishes a connection to a remote server, S, on port p. This gives: laddr:lport -> S:p ... both in tcp and conntrack. 2. local machine establishes a connection to host H, on port p2. 2a. TCP stack choses same laddr:lport, so we have laddr:lport -> H:p2 from TCP point of view. 2b). There is a destination NAT rewrite in place, translating H:p2 to S:p. This results in following conntrack entries: I) laddr:lport -> S:p (origin) S:p -> laddr:lport (reply) II) laddr:lport -> H:p2 (origin) S:p -> laddr:lport2 (reply) NAT engine has rewritten laddr:lport to laddr:lport2 to map the reply packet to the correct origin. When server sends SYN/ACK to laddr:lport2, the PREROUTING hook will undo-the SNAT transformation, rewriting IP header to S:p -> laddr:lport This causes TCP early demux to associate the skb with the TCP socket of the first connection. The INPUT hook will then reverse the DNAT transformation, rewriting the IP header to H:p2 -> laddr:lport. Because packet ends up with the wrong socket, the new connection never completes: originator stays in SYN_SENT and conntrack entry remains in SYN_RECV until timeout, and responder retransmits SYN/ACK until it gives up. To resolve this, orphan the skb after the input rewrite: Because the source IP address changed, the socket must be incorrect. We can't move the DNAT undo to prerouting due to backwards compatibility, doing so will make iptables/nftables rules to no longer match the way they did. After orphan, the packet will be handed to the next protocol layer (tcp, udp, ...) and that will repeat the socket lookup just like as if early demux was disabled. Fixes: `41063e9dd1` ("ipv4: Early TCP socket demux.") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1427 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:32 +01:00
Eric Dumazet	3bf899438c	tcp: add sanity tests to TCP_QUEUE_SEQ commit `8811f4a983` upstream. Qingyu Li reported a syzkaller bug where the repro changes RCV SEQ _after_ restoring data in the receive queue. mprotect(0x4aa000, 12288, PROT_READ) = 0 mmap(0x1ffff000, 4096, PROT_NONE, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x1ffff000 mmap(0x20000000, 16777216, PROT_READ\|PROT_WRITE\|PROT_EXEC, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x20000000 mmap(0x21000000, 4096, PROT_NONE, MAP_PRIVATE\|MAP_FIXED\|MAP_ANONYMOUS, -1, 0) = 0x21000000 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0 setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="0x0000000000000003\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 setsockopt(3, SOL_TCP, TCP_REPAIR, [0], 4) = 0 setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [128], 4) = 0 recvfrom(3, NULL, 20, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer) syslog shows: [ 111.205099] TCP recvmsg seq # bug 2: copied 80, seq 0, rcvnxt 80, fl 0 [ 111.207894] WARNING: CPU: 1 PID: 356 at net/ipv4/tcp.c:2343 tcp_recvmsg_locked+0x90e/0x29a0 This should not be allowed. TCP_QUEUE_SEQ should only be used when queues are empty. This patch fixes this case, and the tx path as well. Fixes: `ee9952831c` ("tcp: Initial repair mode") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212005 Reported-by: Qingyu Li <ieatmuttonchuan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:32 +01:00
Torin Cooper-Bennun	b7049b6156	can: tcan4x5x: tcan4x5x_init(): fix initialization - clear MRAM before entering Normal Mode commit `2712625200` upstream. This patch prevents a potentially destructive race condition. The device is fully operational on the bus after entering Normal Mode, so zeroing the MRAM after entering this mode may lead to loss of information, e.g. new received messages. This patch fixes the problem by first initializing the MRAM, then bringing the device into Normale Mode. Fixes: `5443c226ba` ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Link: https://lore.kernel.org/r/20210226163440.313628-1-torin@maxiluxsystems.com Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:32 +01:00
Joakim Zhang	a7e187a87e	can: flexcan: invoke flexcan_chip_freeze() to enter freeze mode commit `c63820045e` upstream. Invoke flexcan_chip_freeze() to enter freeze mode, since need poll freeze mode acknowledge. Fixes: `e955cead03` ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/20210218110037.16591-4-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:31 +01:00
Joakim Zhang	e0eccdfc5c	can: flexcan: enable RX FIFO after FRZ/HALT valid commit `ec15e27cc8` upstream. RX FIFO enable failed could happen when do system reboot stress test: [ 0.303958] flexcan 5a8d0000.can: 5a8d0000.can supply xceiver not found, using dummy regulator [ 0.304281] flexcan 5a8d0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.314640] flexcan 5a8d0000.can: registering netdev failed [ 0.320728] flexcan 5a8e0000.can: 5a8e0000.can supply xceiver not found, using dummy regulator [ 0.320991] flexcan 5a8e0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.331360] flexcan 5a8e0000.can: registering netdev failed [ 0.337444] flexcan 5a8f0000.can: 5a8f0000.can supply xceiver not found, using dummy regulator [ 0.337716] flexcan 5a8f0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.348117] flexcan 5a8f0000.can: registering netdev failed RX FIFO should be enabled after the FRZ/HALT are valid. But the current code enable RX FIFO and FRZ/HALT at the same time. Fixes: `e955cead03` ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/20210218110037.16591-3-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:31 +01:00
Joakim Zhang	ca483b872d	can: flexcan: assert FRZ bit in flexcan_chip_freeze() commit `449052cfeb` upstream. Assert HALT bit to enter freeze mode, there is a premise that FRZ bit is asserted. This patch asserts FRZ bit in flexcan_chip_freeze, although the reset value is 1b'1. This is a prepare patch, later patch will invoke flexcan_chip_freeze() to enter freeze mode, which polling freeze mode acknowledge. Fixes: `b1aa1c7a21` ("can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze") Link: https://lore.kernel.org/r/20210218110037.16591-2-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:31 +01:00
Oleksij Rempel	6676e510d1	can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership commit `e940e0895a` upstream. There are two ref count variables controlling the free()ing of a socket: - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put() - struct sock::sk_wmem_alloc - which accounts the memory allocated by the skbs in the send path. In case there are still TX skbs on the fly and the socket() is closed, the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack clones an "echo" skb, calls sock_hold() on the original socket and references it. This produces the following back trace: \| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134 \| refcount_t: addition on 0; use-after-free. \| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E) \| CPU: 0 PID: 280 Comm: test_can.sh Tainted: G E 5.11.0-04577-gf8ff6603c617 #203 \| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) \| Backtrace: \| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220 \| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8) \| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90 \| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2 \| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540 \| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50) \| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c) \| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600 \| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00 \| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600 \| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400 \| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534) To fix this problem, only set skb ownership to sockets which have still a ref count > 0. Fixes: `0ae89beb28` ("can: add destructor for self generated skbs") Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Andre Naujoks <nautsch2@gmail.com> Link: https://lore.kernel.org/r/20210226092456.27126-1-o.rempel@pengutronix.de Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:31 +01:00
Sergey Shtylyov	718769eb1b	sh_eth: fix TRSCER mask for SH771x commit `8c91bc3d44` upstream. According to the SH7710, SH7712, SH7713 Group User's Manual: Hardware, Rev. 3.00, the TRSCER register actually has only bit 7 valid (and named differently), with all the other bits reserved. Apparently, this was not the case with some early revisions of the manual as we have the other bits declared (and set) in the original driver. Follow the suit and add the explicit sh_eth_cpu_data::trscer_err_mask initializer for SH771x... Fixes: `86a74ff21a` ("net: sh_eth: add support for Renesas SuperH Ethernet") Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:31 +01:00
Balazs Nemeth	8baa52f26b	net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 commit `d348ede32e` upstream. A packet with skb_inner_network_header(skb) == skb_network_header(skb) and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers from the packet. Subsequently, the call to skb_mac_gso_segment will again call mpls_gso_segment with the same packet leading to an infinite loop. In addition, ensure that the header length is a multiple of four, which should hold irrespective of the number of stacked labels. Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:31 +01:00
Balazs Nemeth	ca278267d6	net: check if protocol extracted by virtio_net_hdr_set_proto is correct commit `924a9bc362` upstream. For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't set) based on the type in the virtio net hdr, but the skb could contain anything since it could come from packet_snd through a raw socket. If there is a mismatch between what virtio_net_hdr_set_proto sets and the actual protocol, then the skb could be handled incorrectly later on. An example where this poses an issue is with the subsequent call to skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set correctly. A specially crafted packet could fool skb_flow_dissect_flow_keys_basic preventing EINVAL to be returned. Avoid blindly trusting the information provided by the virtio net header by checking that the protocol in the packet actually matches the protocol set by virtio_net_hdr_set_proto. Note that since the protocol is only checked if skb->dev implements header_ops->parse_protocol, packets from devices without the implementation are not checked at this stage. Fixes: `9274124f02` ("net: stricter validation of untrusted gso packets") Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:30 +01:00
Daniel Borkmann	f2d78bbbca	net: Fix gro aggregation for udp encaps with zero csum commit `89e5c58fc1` upstream. We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the csum for the UDP header itself is 0. In that case, GRO aggregation does not take place on the phys dev, but instead is deferred to the vxlan/geneve driver (see trace below). The reason is essentially that GRO aggregation bails out in udp_gro_receive() for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, others) where for non-zero csums `2abb7cdc0d` ("udp: Add support for doing checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE and napi context has csum_valid set. This is however not the case for zero UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false). At the same time `57c67ff4bd` ("udp: additional GRO support") added matches on !uh->check ^ !uh2->check as part to determine candidates for aggregation, so it certainly is expected to handle zero csums in udp_gro_receive(). The purpose of the check added via `662880f442` ("net: Allow GRO to use and set levels of checksum unnecessary") seems to catch bad csum and stop aggregation right away. One way to fix aggregation in the zero case is to only perform the !csum_valid check in udp_gro_receive() if uh->check is infact non-zero. Before: [...] swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1) swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500 swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500 swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2) swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500 swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774 swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1) swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2) [...] # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.01 13129.24 After: [...] swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1) swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3) [...] # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.01 24576.53 Fixes: `57c67ff4bd` ("udp: additional GRO support") Fixes: `662880f442` ("net: Allow GRO to use and set levels of checksum unnecessary") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tom Herbert <tom@herbertland.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20210226212248.8300-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:30 +01:00
Felix Fietkau	9be7691611	ath9k: fix transmitting to stations in dynamic SMPS mode commit `3b9ea7206d` upstream. When transmitting to a receiver in dynamic SMPS mode, all transmissions that use multiple spatial streams need to be sent using CTS-to-self or RTS/CTS to give the receiver's extra chains some time to wake up. This fixes the tx rate getting stuck at <= MCS7 for some clients, especially Intel ones, which make aggressive use of SMPS. Cc: stable@vger.kernel.org Reported-by: Martin Kennedy <hurricos@gmail.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210214184911.96702-1-nbd@nbd.name Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:30 +01:00
Jakub Kicinski	5555ee33b6	ethernet: alx: fix order of calls on resume commit `a4dcfbc4ee` upstream. netif_device_attach() will unpause the queues so we can't call it before __alx_open(). This went undetected until commit `b0999223f2` ("alx: add ability to allocate and free alx_napi structures") but now if stack tries to xmit immediately on resume before __alx_open() we'll crash on the NAPI being null: BUG: kernel NULL pointer dereference, address: 0000000000000198 CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G OE 5.10.0-3-amd64 #1 Debian 5.10.13-1 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77-D3H, BIOS F15 11/14/2013 RIP: 0010:alx_start_xmit+0x34/0x650 [alx] Code: 41 56 41 55 41 54 55 53 48 83 ec 20 0f b7 57 7c 8b 8e b0 0b 00 00 39 ca 72 06 89 d0 31 d2 f7 f1 89 d2 48 8b 84 df RSP: 0018:ffffb09240083d28 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffffa04d80ae7800 RCX: 0000000000000004 RDX: 0000000000000000 RSI: ffffa04d80afa000 RDI: ffffa04e92e92a00 RBP: 0000000000000042 R08: 0000000000000100 R09: ffffa04ea3146700 R10: 0000000000000014 R11: 0000000000000000 R12: ffffa04e92e92100 R13: 0000000000000001 R14: ffffa04e92e92a00 R15: ffffa04e92e92a00 FS: 0000000000000000(0000) GS:ffffa0508f600000(0000) knlGS:0000000000000000 i915 0000:00:02.0: vblank wait timed out on crtc 0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000198 CR3: 000000004460a001 CR4: 00000000001706f0 Call Trace: dev_hard_start_xmit+0xc7/0x1e0 sch_direct_xmit+0x10f/0x310 Cc: <stable@vger.kernel.org> # 4.9+ Fixes: `bc2bebe8de` ("alx: remove WoL support") Reported-by: Zbynek Michl <zbynek.michl@gmail.com> Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595 Signed-off-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Zbynek Michl <zbynek.michl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:30 +01:00
Greg Kurz	dcb9579082	powerpc/pseries: Don't enforce MSI affinity with kdump commit `f9619d5e51` upstream. Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as per-design with managed IRQs. This can be a problem with multi-queue block devices driven by blk-mq : such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point. This is a regression caused by commit `9ea69a55b3` ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()"). Note that this only happens with the XIVE interrupt controller because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter. The issue comes from a combination of factors: - discrepancy between the number of queues detected by the multi-queue block driver, that was used to create the MSI vectors, and the single queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue) - CPU#0 offline (i.e. kdump always succeed with CPU#0) Given that I couldn't reproduce on x86, which seems to always have CPU#0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach : fine-grained affinity is for performance and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only. Fixes: `9ea69a55b3` ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:30 +01:00
Dmitry V. Levin	fd1824bf96	uapi: nfnetlink_cthelper.h: fix userspace compilation error commit `c33cb0020e` upstream. Apparently, <linux/netfilter/nfnetlink_cthelper.h> and <linux/netfilter/nfnetlink_acct.h> could not be included into the same compilation unit because of a cut-and-paste typo in the former header. Fixes: `12f7a50533` ("netfilter: add user-space connection tracking helper infrastructure") Cc: <stable@vger.kernel.org> # v3.6 Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:03:29 +01:00
Greg Kroah-Hartman	ce615a0840	Linux 5.4.105 Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Jason Self <jason@bluehome.net> Tested-by: Ross Schmidt <ross.schm.dev@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Hulk Robot <hulkrobot@huawei.com> Link: https://lore.kernel.org/r/20210310132320.550932445@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-11 14:06:51 +01:00
Pascal Terjan	d17cf4cb19	nvme-pci: add quirks for Lexar 256GB SSD [ Upstream commit `6e6a6828c5` ] Add the NVME_QUIRK_NO_NS_DESC_LIST and NVME_QUIRK_IGNORE_DEV_SUBNQN quirks for this buggy device. Reported and tested in https://bugs.mageia.org/show_bug.cgi?id=28417 Signed-off-by: Pascal Terjan <pterjan@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:51 +01:00
Julian Einwag	1d08ff8464	nvme-pci: mark Seagate Nytro XM1440 as QUIRK_NO_NS_DESC_LIST. [ Upstream commit `5e112d3fb8` ] The kernel fails to fully detect these SSDs, only the character devices are present: [ 10.785605] nvme nvme0: pci function 0000:04:00.0 [ 10.876787] nvme nvme1: pci function 0000:81:00.0 [ 13.198614] nvme nvme0: missing or invalid SUBNQN field. [ 13.198658] nvme nvme1: missing or invalid SUBNQN field. [ 13.206896] nvme nvme0: Shutdown timeout set to 20 seconds [ 13.215035] nvme nvme1: Shutdown timeout set to 20 seconds [ 13.225407] nvme nvme0: 16/0/0 default/read/poll queues [ 13.233602] nvme nvme1: 16/0/0 default/read/poll queues [ 13.239627] nvme nvme0: Identify Descriptors failed (8194) [ 13.246315] nvme nvme1: Identify Descriptors failed (8194) Adding the NVME_QUIRK_NO_NS_DESC_LIST fixes this problem. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205679 Signed-off-by: Julian Einwag <jeinwag-nvme@marcapo.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Hans de Goede	9106a7844e	HID: i2c-hid: Add I2C_HID_QUIRK_NO_IRQ_AFTER_RESET for ITE8568 EC on Voyo Winpad A15 [ Upstream commit `fc6a31b007` ] The ITE8568 EC on the Voyo Winpad A15 presents itself as an I2C-HID attached keyboard and mouse (which seems to never send any events). This needs the I2C_HID_QUIRK_NO_IRQ_AFTER_RESET quirk, otherwise we get the following errors: [ 3688.770850] i2c_hid i2c-ITE8568:00: failed to reset device. [ 3694.915865] i2c_hid i2c-ITE8568:00: failed to reset device. [ 3701.059717] i2c_hid i2c-ITE8568:00: failed to reset device. [ 3707.205944] i2c_hid i2c-ITE8568:00: failed to reset device. [ 3708.227940] i2c_hid i2c-ITE8568:00: can't add hid device: -61 [ 3708.236518] i2c_hid: probe of i2c-ITE8568:00 failed with error -61 Which leads to a significant boot delay. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Jisheng Zhang	b5e10e9b30	mmc: sdhci-of-dwcmshc: set SDHCI_QUIRK2_PRESET_VALUE_BROKEN [ Upstream commit `5f7dfda4f2` ] The SDHCI_PRESET_FOR_* registers are not set(all read as zeros), so set the quirk. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Link: https://lore.kernel.org/r/20201210165510.76b917e5@xhacker.debian Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
AngeloGioacchino Del Regno	21f3fb36b5	drm/msm/a5xx: Remove overwriting A5XX_PC_DBG_ECO_CNTL register [ Upstream commit `8f03c30cb8` ] The PC_DBG_ECO_CNTL register on the Adreno A5xx family gets programmed to some different values on a per-model basis. At least, this is what we intend to do here; Unfortunately, though, this register is being overwritten with a static magic number, right after applying the GPU-specific configuration (including the GPU-specific quirks) and that is effectively nullifying the efforts. Let's remove the redundant and wrong write to the PC_DBG_ECO_CNTL register in order to retain the wanted configuration for the target GPU. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Aswath Govindraju	1d113893ff	misc: eeprom_93xx46: Add quirk to support Microchip 93LC46B eeprom [ Upstream commit `f6f1f8e6e3` ] A dummy zero bit is sent preceding the data during a read transfer by the Microchip 93LC46B eeprom (section 2.7 of[1]). This results in right shift of data during a read. In order to ignore this bit a quirk can be added to send an extra zero bit after the read address. Add a quirk to ignore the zero bit sent before data by adding a zero bit after the read address. [1] - https://www.mouser.com/datasheet/2/268/20001749K-277859.pdf Signed-off-by: Aswath Govindraju <a-govindraju@ti.com> Link: https://lore.kernel.org/r/20210105105817.17644-3-a-govindraju@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Bjorn Helgaas	9f1f098875	PCI: Add function 1 DMA alias quirk for Marvell 9215 SATA controller [ Upstream commit `059983790a` ] Add function 1 DMA alias quirk for Marvell 88SS9215 PCIe SSD Controller. Link: https://bugzilla.kernel.org/show_bug.cgi?id=42679#c135 Link: https://lore.kernel.org/r/20201110220516.697934-1-helgaas@kernel.org Reported-by: John Smith <LK7S2ED64JHGLKj75shg9klejHWG49h5hk@protonmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Chris Chiu	f40fdcb7ca	ASoC: Intel: bytcr_rt5640: Add quirk for ARCHOS Cesium 140 [ Upstream commit `1bea2256aa` ] Tha ARCHOS Cesium 140 tablet has problem with the jack-sensing, thus the heaset functions are not working. Add quirk for this model to select the correct input map, jack-detect options and channel map to enable jack sensing and headset microphone. This device uses IN1 for its internal MIC and JD2 for jack-detect. Signed-off-by: Chris Chiu <chiu@endlessos.org> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20201208060414.27646-1-chiu@endlessos.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Jasper St. Pierre	242be7cddd	ACPI: video: Add DMI quirk for GIGABYTE GB-BXBT-2807 [ Upstream commit `25417185e9` ] The GIGABYTE GB-BXBT-2807 is a mini-PC which uses off the shelf components, like an Intel GPU which is meant for mobile systems. As such, it, by default, has a backlight controller exposed. Unfortunately, the backlight controller only confuses userspace, which sees the existence of a backlight device node and has the unrealistic belief that there is actually a backlight there! Add a DMI quirk to force the backlight off on this system. Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net> Reviewed-by: Chris Chiu <chiu@endlessos.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Daniel Lee Kruse	86c8848d68	media: cx23885: add more quirks for reset DMA on some AMD IOMMU [ Upstream commit `dbf0b3a7b7` ] On AMD Family 15h (Models 30h-3fh), I/O Memory Management Unit RiSC engine sometimes stalls, requiring a reset. As result, MythTV and w-scan won't scan channels on the AMD Kaveri APU with the Hauppauge QuadHD TV tuner card. For the solution I added the Input/Output Memory Management Unit's PCI Identity of 0x1423 to the broken_dev_id[] array, which is used by a quirks logic meant to fix similar problems with other AMD chipsets. Signed-off-by: Daniel Lee Kruse <daniel.lee.kruse@protonmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Ethan Warth	fd476c6d4e	HID: mf: add support for 0079:1846 Mayflash/Dragonrise USB Gamecube Adapter [ Upstream commit `1008230f2a` ] Mayflash/Dragonrise seems to have yet another device ID for one of their Gamecube controller adapters. Previous to this commit, the adapter registered only one /dev/input/js* device, and all controller inputs (from any controller) were mapped to this device. This patch defines the 1846 USB device ID and enables the HID_QUIRK_MULTI_INPUT quirk for it, which fixes that (with the patch, four /dev/input/js* devices are created, one for each of the four controller ports). Signed-off-by: Ethan Warth <redyoshi49q@gmail.com> Tested-by: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Hans de Goede	ef9fa6bb85	platform/x86: acer-wmi: Add ACER_CAP_KBD_DOCK quirk for the Aspire Switch 10E SW3-016 [ Upstream commit `bf75340028` ] Add the Acer Aspire Switch 10E SW3-016 to the list of models which use the Acer Switch WMI interface for reporting SW_TABLET_MODE. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201123151625.5530-1-hdegoede@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:50 +01:00
Hans de Goede	3a8eb20cb8	platform/x86: acer-wmi: Add support for SW_TABLET_MODE on Switch devices [ Upstream commit `5c54cb6c62` ] Add support for SW_TABLET_MODE on the Acer Switch 10 (SW5-012) and the acer Switch 10 (S1003) models. There is no way to detect if this is supported, so this uses DMI based quirks setting force_caps to ACER_CAP_KBD_DOCK (these devices have no other acer-wmi based functionality). The new SW_TABLET_MODE functionality can be tested on devices which are not in the DMI table by passing acer_wmi.force_caps=0x40 on the kernel commandline. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201019185628.264473-6-hdegoede@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:49 +01:00
Hans de Goede	e3a3a69da1	platform/x86: acer-wmi: Add ACER_CAP_SET_FUNCTION_MODE capability flag [ Upstream commit `82cb8a5c39` ] Not all devices supporting WMID_GUID3 support the wmid3_set_function_mode() call, leading to errors like these: [ 60.138358] acer_wmi: Enabling RF Button failed: 0x1 - 0xff [ 60.140036] acer_wmi: Enabling Launch Manager failed: 0x1 - 0xff Add an ACER_CAP_SET_FUNCTION_MODE capability flag, so that these calls can be disabled through the new force_caps mechanism. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201019185628.264473-5-hdegoede@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:49 +01:00
Hans de Goede	b734af305c	platform/x86: acer-wmi: Add new force_caps module parameter [ Upstream commit `39aa009bb6` ] Add a new force_caps module parameter to allow overriding the drivers builtin capability detection mechanism. This can be used to for example: -Disable rfkill functionality on devices where there is an AA OEM DMI record advertising non functional rfkill switches -Force loading of the driver on devices with a missing AA OEM DMI record Note that force_caps is -1 when unset, this allows forcing the capability field to 0, which results in acer-wmi only providing WMI hotkey handling while disabling all other (led, rfkill, backlight) functionality. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201019185628.264473-4-hdegoede@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:49 +01:00
Hans de Goede	0251802442	platform/x86: acer-wmi: Cleanup accelerometer device handling [ Upstream commit `9feb0763e4` ] Cleanup accelerometer device handling: -Drop acer_wmi_accel_destroy instead directly call input_unregister_device. -The information tracked by the CAP_ACCEL flag mirrors acer_wmi_accel_dev being NULL. Drop the CAP flag, this is a preparation change for allowing users to override the capability flags. Dropping the flag stops users from causing a NULL pointer dereference by forcing the capability. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201019185628.264473-3-hdegoede@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:49 +01:00
Hans de Goede	37b4324cb7	platform/x86: acer-wmi: Cleanup ACER_CAP_FOO defines [ Upstream commit `7c936d8d26` ] Cleanup the ACER_CAP_FOO defines: -Switch to using BIT() macro. -The ACER_CAP_RFBTN flag is set, but it is never checked anywhere, drop it. -Drop the unused ACER_CAP_ANY define. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201019185628.264473-2-hdegoede@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:49 +01:00
Tsuchiya Yuto	200e14759d	mwifiex: pcie: skip cancel_work_sync() on reset failure path [ Upstream commit `4add4d988f` ] If a reset is performed, but even the reset fails for some reasons (e.g., on Surface devices, the fw reset requires another quirks), cancel_work_sync() hangs in mwifiex_cleanup_pcie(). # firmware went into a bad state [...] [ 1608.281690] mwifiex_pcie 0000:03:00.0: info: shutdown mwifiex... [ 1608.282724] mwifiex_pcie 0000:03:00.0: rx_pending=0, tx_pending=1, cmd_pending=0 [ 1608.292400] mwifiex_pcie 0000:03:00.0: PREP_CMD: card is removed [ 1608.292405] mwifiex_pcie 0000:03:00.0: PREP_CMD: card is removed # reset performed after firmware went into a bad state [ 1609.394320] mwifiex_pcie 0000:03:00.0: WLAN FW already running! Skip FW dnld [ 1609.394335] mwifiex_pcie 0000:03:00.0: WLAN FW is active # but even the reset failed [ 1619.499049] mwifiex_pcie 0000:03:00.0: mwifiex_cmd_timeout_func: Timeout cmd id = 0xfa, act = 0xe000 [ 1619.499094] mwifiex_pcie 0000:03:00.0: num_data_h2c_failure = 0 [ 1619.499103] mwifiex_pcie 0000:03:00.0: num_cmd_h2c_failure = 0 [ 1619.499110] mwifiex_pcie 0000:03:00.0: is_cmd_timedout = 1 [ 1619.499117] mwifiex_pcie 0000:03:00.0: num_tx_timeout = 0 [ 1619.499124] mwifiex_pcie 0000:03:00.0: last_cmd_index = 0 [ 1619.499133] mwifiex_pcie 0000:03:00.0: last_cmd_id: fa 00 07 01 07 01 07 01 07 01 [ 1619.499140] mwifiex_pcie 0000:03:00.0: last_cmd_act: 00 e0 00 00 00 00 00 00 00 00 [ 1619.499147] mwifiex_pcie 0000:03:00.0: last_cmd_resp_index = 3 [ 1619.499155] mwifiex_pcie 0000:03:00.0: last_cmd_resp_id: 07 81 07 81 07 81 07 81 07 81 [ 1619.499162] mwifiex_pcie 0000:03:00.0: last_event_index = 2 [ 1619.499169] mwifiex_pcie 0000:03:00.0: last_event: 58 00 58 00 58 00 58 00 58 00 [ 1619.499177] mwifiex_pcie 0000:03:00.0: data_sent=0 cmd_sent=1 [ 1619.499185] mwifiex_pcie 0000:03:00.0: ps_mode=0 ps_state=0 [ 1619.499215] mwifiex_pcie 0000:03:00.0: info: _mwifiex_fw_dpc: unregister device # mwifiex_pcie_work hang happening [ 1823.233923] INFO: task kworker/3:1:44 blocked for more than 122 seconds. [ 1823.233932] Tainted: G WC OE 5.10.0-rc1-1-mainline #1 [ 1823.233935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1823.233940] task:kworker/3:1 state:D stack: 0 pid: 44 ppid: 2 flags:0x00004000 [ 1823.233960] Workqueue: events mwifiex_pcie_work [mwifiex_pcie] [ 1823.233965] Call Trace: [ 1823.233981] __schedule+0x292/0x820 [ 1823.233990] schedule+0x45/0xe0 [ 1823.233995] schedule_timeout+0x11c/0x160 [ 1823.234003] wait_for_completion+0x9e/0x100 [ 1823.234012] __flush_work.isra.0+0x156/0x210 [ 1823.234018] ? flush_workqueue_prep_pwqs+0x130/0x130 [ 1823.234026] __cancel_work_timer+0x11e/0x1a0 [ 1823.234035] mwifiex_cleanup_pcie+0x28/0xd0 [mwifiex_pcie] [ 1823.234049] mwifiex_free_adapter+0x24/0xe0 [mwifiex] [ 1823.234060] _mwifiex_fw_dpc+0x294/0x560 [mwifiex] [ 1823.234074] mwifiex_reinit_sw+0x15d/0x300 [mwifiex] [ 1823.234080] mwifiex_pcie_reset_done+0x50/0x80 [mwifiex_pcie] [ 1823.234087] pci_try_reset_function+0x5c/0x90 [ 1823.234094] process_one_work+0x1d6/0x3a0 [ 1823.234100] worker_thread+0x4d/0x3d0 [ 1823.234107] ? rescuer_thread+0x410/0x410 [ 1823.234112] kthread+0x142/0x160 [ 1823.234117] ? __kthread_bind_mask+0x60/0x60 [ 1823.234124] ret_from_fork+0x22/0x30 [...] This is a deadlock caused by calling cancel_work_sync() in mwifiex_cleanup_pcie(): - Device resets are done via mwifiex_pcie_card_reset() - which schedules card->work to call mwifiex_pcie_card_reset_work() - which calls pci_try_reset_function(). - This leads to mwifiex_pcie_reset_done() be called on the same workqueue, which in turn calls - mwifiex_reinit_sw() and that calls - _mwifiex_fw_dpc(). The problem is now that _mwifiex_fw_dpc() calls mwifiex_free_adapter() in case firmware initialization fails. That ends up calling mwifiex_cleanup_pcie(). Note that all those calls are still running on the workqueue. So when mwifiex_cleanup_pcie() now calls cancel_work_sync(), it's really waiting on itself to complete, causing a deadlock. This commit fixes the deadlock by skipping cancel_work_sync() on a reset failure path. After this commit, when reset fails, the following output is expected to be shown: kernel: mwifiex_pcie 0000:03:00.0: info: _mwifiex_fw_dpc: unregister device kernel: mwifiex: Failed to bring up adapter: -5 kernel: mwifiex_pcie 0000:03:00.0: reinit failed: -5 To reproduce this issue, for example, try putting the root port of wifi into D3 (replace "00:1d.3" with your setup). # put into D3 (root port) sudo setpci -v -s 00:1d.3 CAP_PM+4.b=0b Cc: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20201028142346.18355-1-kitakar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 14:06:49 +01:00
Andrey Ryabinin	c699a89d38	iommu/amd: Fix sleeping in atomic in increase_address_space() commit `140456f994` upstream. increase_address_space() calls get_zeroed_page(gfp) under spin_lock with disabled interrupts. gfp flags passed to increase_address_space() may allow sleeping, so it comes to this: BUG: sleeping function called from invalid context at mm/page_alloc.c:4342 in_atomic(): 1, irqs_disabled(): 1, pid: 21555, name: epdcbbf1qnhbsd8 Call Trace: dump_stack+0x66/0x8b ___might_sleep+0xec/0x110 __alloc_pages_nodemask+0x104/0x300 get_zeroed_page+0x15/0x40 iommu_map_page+0xdd/0x3e0 amd_iommu_map+0x50/0x70 iommu_map+0x106/0x220 vfio_iommu_type1_ioctl+0x76e/0x950 [vfio_iommu_type1] do_vfs_ioctl+0xa3/0x6f0 ksys_ioctl+0x66/0x70 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4e/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by moving get_zeroed_page() out of spin_lock/unlock section. Fixes: `754265bcab` ("iommu/amd: Fix race in increase_address_space()") Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com> Acked-by: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210217143004.19165-1-arbn@yandex-team.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-11 14:06:49 +01:00
Hans de Goede	fa56bf637e	ACPICA: Fix race in generic_serial_bus (I2C) and GPIO op_region parameter handling commit `c27f3d011b` upstream. ACPICA commit c9e0116952363b0fa815143dca7e9a2eb4fefa61 The handling of the generic_serial_bus (I2C) and GPIO op_regions in acpi_ev_address_space_dispatch() passes a number of extra parameters to the address-space handler through the address-space Context pointer (instead of using more function parameters). The Context is shared between threads, so if multiple threads try to call the handler for the same address-space at the same time, then a second thread could change the parameters of a first thread while the handler is running for the first thread. An example of this race hitting is the Lenovo Yoga Tablet2 1015L, where there are both attrib_bytes accesses and attrib_byte accesses to the same address-space. The attrib_bytes access stores the number of bytes to transfer in Context->access_length. Where as for the attrib_byte access the number of bytes to transfer is always 1 and field_obj->Field.access_length is unused (so 0). Both types of accesses racing from different threads leads to the following problem: 1. Thread a. starts an attrib_bytes access, stores a non 0 value from field_obj->Field.access_length in Context->access_length 2. Thread b. starts an attrib_byte access, stores 0 in Context->access_length 3. Thread a. calls i2c_acpi_space_handler() (under Linux). Which sees that the access-type is ACPI_GSB_ACCESS_ATTRIB_MULTIBYTE and calls acpi_gsb_i2c_read_bytes(..., Context->access_length) 4. At this point Context->access_length is 0 (set by thread b.) rather then the field_obj->Field.access_length value from thread a. This 0 length reads leads to the following errors being logged: i2c i2c-0: adapter quirk: no zero length (addr 0x0078, size 0, read) i2c i2c-0: i2c read 0 bytes from client@0x78 starting at reg 0x0 failed, error: -95 Note this is just an example of the problems which this race can cause. There are likely many more (sporadic) problems caused by this race. This commit adds a new context_mutex to struct acpi_object_addr_handler and makes acpi_ev_address_space_dispatch() take that mutex when using the shared Context to pass extra parameters to an address-space handler, fixing this race. Note the new mutex must be taken after exiting the interpreter, therefor the existing acpi_ex_exit_interpreter() call is moved to above the code which stores the extra parameters in the Context. Link: https://github.com/acpica/acpica/commit/c9e01169 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Kaneda <erik.kaneda@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-11 14:06:49 +01:00
Jeffle Xu	f27765adb3	dm table: fix zoned iterate_devices based device capability checks commit `24f6b6036c` upstream. Fix dm_table_supports_zoned_model() and invert logic of both iterate_devices_callout_fn so that all devices' zoned capabilities are properly checked. Add one more parameter to dm_table_any_dev_attr(), which is actually used as the @data parameter of iterate_devices_callout_fn, so that dm_table_matches_zone_sectors() can be replaced by dm_table_any_dev_attr(). Fixes: `dd88d313be` ("dm table: add zoned block devices validation") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> [jeffle: also convert partial completion check] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-11 14:06:49 +01:00

1 2 3 4 5 ...

885279 Commits