linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-06 02:50:49 +09:00

Author	SHA1	Message	Date
Jeson Gao	64999249d5	ANDROID: thermal: Add hook to enable/disable thermal power throttle By default, thermal power throttle is always enabled, but sometimes it needs to be disabled for a period of time, so add it to meet platform thermal requirements. Bug: 209386157 Signed-off-by: Jeson Gao <jeson.gao@unisoc.com> Change-Id: If9c53a9669eec8e2821d837cfa3c660a9cfbf934	2021-12-07 20:44:29 +00:00
Donnie Pollitz	d1b2876104	ANDROID: GKI: Add symbols abi for USB IP kernel modules. Bug: 207100354 Test: Manually, Emulator boots up. Signed-off-by: Donnie Pollitz <donpollitz@google.com> Change-Id: If60fdd76bf70f956d8e9c534924c5c796e75d0b4	2021-12-07 17:36:20 +00:00
David Brazdil	8a8500235e	Revert "ANDROID: KVM: arm64: Unmap S2MPU MMIO regions in MPT" Unmapping S2MPU MMIO regions from MPTs is causing issues on oriole/raven because it forces a 4K mapping in the first physical GB region instead of 1G. During suspend AOC/APM can issue traffic through the S2MPUs and the page lookup transaction fails because the MIF is powered off. Revert the patch until the problem is fixed by AOC/APM device drivers. This reverts commit `533c59945d`. Test: unplug device, wait until it suspends, observe no crash Bug: 190463801 Bug: 209399107 Signed-off-by: David Brazdil <dbrazdil@google.com> Change-Id: I9a84ccf4ace459dc35918fc31d86933bc5b923f7	2021-12-07 10:32:54 +00:00
Alex Hong	f398bdcb07	ANDROID: Update the ABI symbol list Update the generic symbol list. Bug: 199698959 Signed-off-by: Alex Hong <rurumihong@google.com> Change-Id: I68b786d210bb5303bf462d3798510d4789212be8	2021-12-07 08:37:24 +08:00
Jaegeuk Kim	35cfa55917	FROMGIT: f2fs: show number of pending discard commands This information can be used to check how much time we need to give to issue all the discard commands. Bug: 206863097 Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit fc4ae5492ca4afd7a8a9d261f4908b09f221d314 git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev) Change-Id: Ibd2f1d6c171f584ec9ca3817d9ea561db98f4693	2021-12-06 18:58:38 +00:00
David Brazdil	395d045123	ANDROID: KVM: arm64: Initialize pkvm_pgtable.mm_ops earlier The `init` callback of an IOMMU driver is called just before `finalize_host_mappings` so that EL2 mappings created by drivers are subsequently unmapped from host stage-2. However, at this point hyp has already switched to the buddy allocator, having reserved pages allocated by the early allocator, but `pkvm_pgtable.mm_ops` have not been switched to buddy allocator callbacks. As a result, pages allocated for EL2 mappings of the IOMMU driver are allocated by the obsoleted early allocator and remain treated as free by the buddy allocator. This likely leads to a corruption in the free page lists and a later hyp panic. Move the initialization of `pkvm_pgtable.mm_ops` before `finalize_host_mappings` and the call to IOMMU's `init`. Test: run a VM Test: adb shell cmd jobscheduler run -f android 5132250 Bug: 190463801 Bug: 209004831 Signed-off-by: David Brazdil <dbrazdil@google.com> Change-Id: I42ad28e6b4cfb014d50dbbf722e4c02d5f684178	2021-12-06 11:33:50 +00:00
Konstantin Vyshetsky	ee507cfc11	ANDROID: ABI: update generic symbol list add blk_crypto_init_key Bug: 202417931 Signed-off-by: Konstantin Vyshetsky <vkon@google.com> Change-Id: I6976ec1519ec9f5fcc4d557cccfce8a26335ce17	2021-12-03 11:00:27 -08:00
Quentin Perret	52ec143f0b	ANDROID: sched: Make uclamp changes depend on CAP_SYS_NICE There is currently nothing preventing tasks from changing their per-task clamp values in anyway that they like. The rationale is probably that system administrators are still able to limit those clamps thanks to the cgroup interface. However, this causes pain in a system where both per-task and per-cgroup clamp values are expected to be under the control of core system components (as is the case for Android). To fix this, let's require CAP_SYS_NICE to change per-task clamp values. There are ongoing discussions upstream about more flexible approaches than this using the RLIMIT API -- see [1]. But the upstream discussion has not converged yet, and this is way too late for UAPI changes in android12-5.10 anyway, so let's apply this change which provides the behaviour we want without actually impacting UAPIs. [1] https://lore.kernel.org/lkml/20210623123441.592348-4-qperret@google.com/ Bug: 187186685 Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I749312a77306460318ac5374cf243d00b78120dd	2021-12-03 18:10:01 +00:00
Martin Liu	4266de9b4b	ANDROID: sched: move blocked reason trace point to cover all class Now, we only export CFS tasks' blocked reasons but it's important and useful to know other classes' blocked reasons such as RT tasks. Move the blocked reason trace point to the scheduler core layer, before the task's state moves to the waking state. Thus, we could cover all the sched classes. Bug: 203080186 Test: check traces Signed-off-by: Martin Liu <liumartin@google.com> Change-Id: Ic61865642d852d0127cdcf474adf8c06e4c2d570 (cherry picked from commit `44447dec6e`) Signed-off-by: Quentin Perret <qperret@google.com>	2021-12-03 18:09:21 +00:00
vincent.wang	cf33d6fae0	ANDROID: vendor_hooks: add a hook to control the delay time of frequency up and down. balance the power and performance via the different delay time. Bug: 208722787 Signed-off-by: vincent.wang <vincent.wang@unisoc.com> Change-Id: I104a090bc697ab9c68007081e6533820364f1351	2021-12-03 05:42:33 +00:00
David Brazdil	160ebf93a9	UPSTREAM: of: restricted dma: Fix condition for rmem init of_dma_set_restricted_buffer fails to handle negative return values from of_property_count_elems_of_size, e.g. when the property does not exist. This results in an attempt to assign a non-existent reserved memory region to the device and a warning being printed. Fix the condition to take negative values into account. Fixes: `f3cfd136ae` ("of: restricted dma: Don't fail device probe on rmem init failure") Cc: Will Deacon <will@kernel.org> Signed-off-by: David Brazdil <dbrazdil@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210917131423.2760155-1-dbrazdil@google.com Signed-off-by: Rob Herring <robh@kernel.org> (cherry picked from commit `31c8025fac`) Bug: 190591509 Signed-off-by: Will Deacon <willdeacon@google.com> Change-Id: I2a5ea9ba6f78d2998cb5db040a2e13e85ec0194f	2021-12-02 09:43:37 +00:00
Jeson Gao	fc827b344f	ANDROID: thermal: Add vendor hooks for thermal Need to get the request frequency and target frequency use it to do frequency check and modify it according to the platform thermal requirements. Bug: 208166320 Signed-off-by: Jeson Gao <jeson.gao@unisoc.com> Change-Id: I776b43c8f559b8a072abd8d3abcb3528348b2c5d	2021-12-01 18:46:06 +00:00
Quentin Perret	479f72db7d	UPSTREAM: sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS SCHED_FLAG_KEEP_PARAMS can be passed to sched_setattr to specify that the call must not touch scheduling parameters (nice or priority). This is particularly handy for uclamp when used in conjunction with SCHED_FLAG_KEEP_POLICY as that allows to issue a syscall that only impacts uclamp values. However, sched_setattr always checks whether the priorities and nice values passed in sched_attr are valid first, even if those never get used down the line. This is useless at best since userspace can trivially bypass this check to set the uclamp values by specifying low priorities. However, it is cumbersome to do so as there is no single expression of this that skips both RT and CFS checks at once. As such, userspace needs to query the task policy first with e.g. sched_getattr and then set sched_attr.sched_priority accordingly. This is racy and slower than a single call. As the priority and nice checks are useless when SCHED_FLAG_KEEP_PARAMS is specified, simply inherit them in this case to match the policy inheritance of SCHED_FLAG_KEEP_POLICY. Bug: 208552362 Reported-by: Wei Wang <wvw@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Link: https://lore.kernel.org/r/20210805102154.590709-3-qperret@google.com (cherry picked from commit `f4dddf90d5`) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I18913667e69e558d96cbe3d991c1920e8342bf8b	2021-12-01 18:04:02 +00:00
Quentin Perret	c7ba583d00	UPSTREAM: sched: Don't report SCHED_FLAG_SUGOV in sched_getattr() SCHED_FLAG_SUGOV is supposed to be a kernel-only flag that userspace cannot interact with. However, sched_getattr() currently reports it in sched_flags if called on a sugov worker even though it is not actually defined in a UAPI header. To avoid this, make sure to clean-up the sched_flags field in sched_getattr() before returning to userspace. Bug: 208552362 Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210727101103.2729607-3-qperret@google.com (cherry picked from commit `7ad721bf10`) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I6ffe847ce0cc1135519b7061882d118cc8d683e1	2021-12-01 18:03:56 +00:00
Huang Jianan	13b2af2668	UPSTREAM: erofs: fix deadlock when shrink erofs slab We observed the following deadlock in the stress test under low memory scenario: Thread A Thread B - erofs_shrink_scan - erofs_try_to_release_workgroup - erofs_workgroup_try_to_freeze -- A - z_erofs_do_read_page - z_erofs_collection_begin - z_erofs_register_collection - erofs_insert_workgroup - xa_lock(&sbi->managed_pslots) -- B - erofs_workgroup_get - erofs_wait_on_workgroup_freezed -- A - xa_erase - xa_lock(&sbi->managed_pslots) -- B To fix this, it needs to hold xa_lock before freezing the workgroup since xarray will be touched then. So let's hold the lock before accessing each workgroup, just like what we did with the radix tree before. [ Gao Xiang: Jianhua Hao also reports this issue at https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xiaomi.com ] Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@oppo.com Fixes: `64094a0441` ("erofs: convert workstn to XArray") Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Reported-by: Jianhua Hao <haojianhua1@xiaomi.com> Signed-off-by: Gao Xiang <xiang@kernel.org> Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com> Change-Id: Icdfb520c073de8773d44cff64fd5f7a8499cdc85 Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com> Bug:208318427 (cherry picked from commit `57bbeacdbe`) Signed-off-by: Gao Xiang <xiang@kernel.org> Change-Id: Ie282b4362898fa8e0c829b8c2454887e953a2610 Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com>	2021-12-01 17:48:15 +00:00
Catalin Marinas	02e966de95	UPSTREAM: KVM: arm64: Avoid setting the upper 32 bits of TCR_EL2 and CPTR_EL2 to 1 Having a signed (1 << 31) constant for TCR_EL2_RES1 and CPTR_EL2_TCPAC causes the upper 32-bit to be set to 1 when assigning them to a 64-bit variable. Bit 32 in TCR_EL2 is no longer RES0 in ARMv8.7: with FEAT_LPA2 it changes the meaning of bits 49:48 and 9:8 in the stage 1 EL2 page table entries. As a result of the sign-extension, a non-VHE kernel can no longer boot on a model with ARMv8.7 enabled. CPTR_EL2 still has the top 32 bits RES0 but we should preempt any future problems Make these top bit constants unsigned as per commit `df655b75c4` ("arm64: KVM: Avoid setting the upper 32 bits of VTCR_EL2 to 1"). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Chris January <Chris.January@arm.com> Cc: <stable@vger.kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211125152014.2806582-1-catalin.marinas@arm.com (cherry picked from commit `1f80d15020`) Bug: 204960018 Signed-off-by: Will Deacon <willdeacon@google.com> Change-Id: I078b87529df898086e0f883f28bd2e9ff7f0e74a	2021-12-01 15:09:15 +00:00
Marc Zyngier	cb3cc13c72	UPSTREAM: KVM: arm64: Move pkvm's special 32bit handling into a generic infrastructure Protected KVM is trying to turn AArch32 exceptions into an illegal exception entry. Unfortunately, it does that in a way that is a bit abrupt, and too early for PSTATE to be available. Instead, move it to the fixup code, which is a more reasonable place for it. This will also be useful for the NV code. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> (cherry picked from commit `7183b2b5ae`) Bug: 204960018 Signed-off-by: Will Deacon <willdeacon@google.com> Change-Id: I9e8926e8c9cd9399073216abb8885a3e2613836f	2021-12-01 15:09:15 +00:00
Marc Zyngier	e740c119d2	UPSTREAM: KVM: arm64: Save PSTATE early on exit In order to be able to use primitives such as vcpu_mode_is_32bit(), we need to synchronize the guest PSTATE. However, this is currently done deep into the bowels of the world-switch code, and we do have helpers evaluating this much earlier (__vgic_v3_perform_cpuif_access and handle_aarch32_guest, for example). Move the saving of the guest pstate into the early fixups, which cures the first issue. The second one will be addressed separately. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> (cherry picked from commit `83bb2c1a01`) Bug: 204960018 Signed-off-by: Will Deacon <willdeacon@google.com> Change-Id: Iddbe02a20d34c470651da59390fb7f76ba89ba4b	2021-12-01 15:09:15 +00:00
Vitaly Kuznetsov	df21a6213e	UPSTREAM: KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus() Generally, it doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Note: ARM64 is special as the value returned by KVM_CAP_MAX_VCPUS differs depending on whether it is a system-wide ioctl or a per-VM one. Previously, KVM_CAP_NR_VCPUS didn't have this difference and it seems preferable to keep the status quo. Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus() which is what gets returned by system-wide KVM_CAP_MAX_VCPUS. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211116163443.88707-2-vkuznets@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `f60a00d729`) Bug: 204960018 Signed-off-by: Will Deacon <willdeacon@google.com> Change-Id: I8f4470d8a79d111b8f4a2730f97ec1bde4d76935	2021-12-01 15:09:15 +00:00
Greg Kroah-Hartman	6a79abcd18	Merge 5.10.83 into android13-5.10 Changes in 5.10.83 bpf: Fix toctou on read-only map's constant scalar tracking ACPI: Get acpi_device's parent from the parent field USB: serial: option: add Telit LE910S1 0x9200 composition USB: serial: option: add Fibocom FM101-GL variants usb: dwc2: gadget: Fix ISOC flow for elapsed frames usb: dwc2: hcd_queue: Fix use of floating point literal usb: dwc3: gadget: Ignore NoStream after End Transfer usb: dwc3: gadget: Check for L1/L2/U3 for Start Transfer usb: dwc3: gadget: Fix null pointer exception net: nexthop: fix null pointer dereference when IPv6 is not enabled usb: chipidea: ci_hdrc_imx: fix potential error pointer dereference in probe usb: typec: fusb302: Fix masking of comparator and bc_lvl interrupts usb: hub: Fix usb enumeration issue due to address0 race usb: hub: Fix locking issues with address0_mutex binder: fix test regression due to sender_euid change ALSA: ctxfi: Fix out-of-range access ALSA: hda/realtek: Add quirk for ASRock NUC Box 1100 ALSA: hda/realtek: Fix LED on HP ProBook 435 G7 media: cec: copy sequence field for the reply Revert "parisc: Fix backtrace to always include init funtion names" HID: wacom: Use "Confidence" flag to prevent reporting invalid contacts staging/fbtft: Fix backlight staging: greybus: Add missing rwsem around snd_ctl_remove() calls staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect() fuse: release pipe buf after last use xen: don't continue xenstore initialization in case of errors xen: detect uninitialized xenbus in xenbus_init KVM: PPC: Book3S HV: Prevent POWER7/8 TLB flush flushing SLB tracing/uprobe: Fix uprobe_perf_open probes iteration tracing: Fix pid filtering when triggers are attached mmc: sdhci-esdhc-imx: disable CMDQ support mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiB mdio: aspeed: Fix "Link is Down" issue powerpc/32: Fix hardlockup on vmap stack overflow PCI: aardvark: Deduplicate code in advk_pcie_rd_conf() PCI: aardvark: Update comment about disabling link training PCI: aardvark: Implement re-issuing config requests on CRS response PCI: aardvark: Simplify initialization of rootcap on virtual bridge PCI: aardvark: Fix link training proc/vmcore: fix clearing user buffer by properly using clear_user() netfilter: ctnetlink: fix filtering with CTA_TUPLE_REPLY netfilter: ctnetlink: do not erase error code with EINVAL netfilter: ipvs: Fix reuse connection if RS weight is 0 netfilter: flowtable: fix IPv6 tunnel addr match ARM: dts: BCM5301X: Fix I2C controller interrupt ARM: dts: BCM5301X: Add interrupt properties to GPIO node ARM: dts: bcm2711: Fix PCIe interrupts ASoC: qdsp6: q6routing: Conditionally reset FrontEnd Mixer ASoC: qdsp6: q6asm: fix q6asm_dai_prepare error handling ASoC: topology: Add missing rwsem around snd_ctl_remove() calls ASoC: codecs: wcd934x: return error code correctly from hw_params net: ieee802154: handle iftypes as u32 firmware: arm_scmi: pm: Propagate return value to caller NFSv42: Don't fail clone() unless the OP_CLONE operation failed ARM: socfpga: Fix crash with CONFIG_FORTIRY_SOURCE drm/nouveau/acr: fix a couple NULL vs IS_ERR() checks scsi: mpt3sas: Fix kernel panic during drive powercycle test drm/vc4: fix error code in vc4_create_object() net: marvell: prestera: fix double free issue on err path iavf: Prevent changing static ITR values if adaptive moderation is on ALSA: intel-dsp-config: add quirk for JSL devices based on ES8336 codec mptcp: fix delack timer firmware: smccc: Fix check for ARCH_SOC_ID not implemented ipv6: fix typos in __ip6_finish_output() nfp: checking parameter process for rx-usecs/tx-usecs is invalid net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume net: stmmac: retain PTP clock time during SIOCSHWTSTAMP ioctls net: ipv6: add fib6_nh_release_dsts stub net: nexthop: release IPv6 per-cpu dsts when replacing a nexthop group ice: fix vsi->txq_map sizing ice: avoid bpf_prog refcount underflow scsi: core: sysfs: Fix setting device state to SDEV_RUNNING scsi: scsi_debug: Zero clear zones at reset write pointer erofs: fix deadlock when shrink erofs slab net/smc: Ensure the active closing peer first closes clcsock mlxsw: Verify the accessed index doesn't exceed the array length mlxsw: spectrum: Protect driver from buggy firmware net: marvell: mvpp2: increase MTU limit when XDP enabled nvmet-tcp: fix incomplete data digest send net/ncsi : Add payload to be 32-bit aligned to fix dropped packets PM: hibernate: use correct mode for swsusp_close() drm/amd/display: Set plane update flags for all planes in reset tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows lan743x: fix deadlock in lan743x_phy_link_status_change() net: phylink: Force link down and retrigger resolve on interface change net: phylink: Force retrigger in case of latched link-fail indicator net/smc: Fix NULL pointer dereferencing in smc_vlan_by_tcpsk() net/smc: Fix loop in smc_listen nvmet: use IOCB_NOWAIT only if the filesystem supports it igb: fix netpoll exit with traffic MIPS: loongson64: fix FTLB configuration MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48 tls: splice_read: fix record type check tls: fix replacing proto_ops net/sched: sch_ets: don't peek at classes beyond 'nbands' net: vlan: fix underflow for the real_dev refcnt net/smc: Don't call clcsock shutdown twice when smc shutdown net: hns3: fix VF RSS failed problem after PF enable multi-TCs net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP net: mscc: ocelot: correctly report the timestamping RX filters in ethtool tcp: correctly handle increased zerocopy args struct size sched/scs: Reset task stack state in bringup_cpu() f2fs: set SBI_NEED_FSCK flag when inconsistent node block found ceph: properly handle statfs on multifs setups smb3: do not error on fsync when readonly iommu/amd: Clarify AMD IOMMUv2 initialization messages vhost/vsock: fix incorrect used length reported to the guest tracing: Check pid filtering when creating events xen: sync include/xen/interface/io/ring.h with Xen's newest version xen/blkfront: read response from backend only once xen/blkfront: don't take local copy of a request from the ring page xen/blkfront: don't trust the backend response data blindly xen/netfront: read response from backend only once xen/netfront: don't read data from request on the ring page xen/netfront: disentangle tx_skb_freelist xen/netfront: don't trust the backend response data blindly tty: hvc: replace BUG_ON() with negative return value s390/mm: validate VMA in PGSTE manipulation functions shm: extend forced shm destroy to support objects from several IPC nses net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP drm/amdgpu/gfx9: switch to golden tsc registers for renoir+ Linux 5.10.83 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I934dc727030cfb60b31525252df104436ff00ae0	2021-12-01 09:37:11 +01:00
Greg Kroah-Hartman	a324ad7945	Linux 5.10.83 Link: https://lore.kernel.org/r/20211129181711.642046348@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Hulk Robot <hulkrobot@huawei.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Fox Chen <foxhlchen@gmail.com> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:10 +01:00
Alex Deucher	45b42cd053	drm/amdgpu/gfx9: switch to golden tsc registers for renoir+ commit `53af98c091` upstream. Renoir and newer gfx9 APUs have new TSC register that is not part of the gfxoff tile, so it can be read without needing to disable gfx off. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:10 +01:00
Joakim Zhang	98b02755d5	net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP commit `2a48d96fd5` upstream. Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid build warning with !CONFIG_PM_SLEEP: >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function] 796 | static int stmmac_pltfr_noirq_resume(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function] 775 | static int stmmac_pltfr_noirq_suspend(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Fixes: `276aae3772` ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:10 +01:00
Alexander Mikhalitsyn	a15261d2a1	shm: extend forced shm destroy to support objects from several IPC nses commit `85b6d24646` upstream. Currently, the exit_shm() function is not designed to work properly when task->sysvshm.shm_clist holds shm objects from different IPC namespaces. This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it leads to use-after-free (reproducer exists). This is an attempt to fix the problem by extending exit_shm mechanism to handle shm's destroy from several IPC ns'es. To achieve that we do several things: 1. add a namespace (non-refcounted) pointer to the struct shmid_kernel 2. during new shm object creation (newseg()/shmget syscall) we initialize this pointer by current task IPC ns 3. exit_shm() fully reworked such that it traverses over all shp's in task->sysvshm.shm_clist and gets IPC namespace not from current task as it was before but from shp's object itself, then call shm_destroy(shp, ns). Note: We need to be really careful here, because as it was said before (1), our pointer to IPC ns non-refcnt'ed. To be on the safe side we using special helper get_ipc_ns_not_zero() which allows to get IPC ns refcounter only if IPC ns not in the "state of destruction". Q/A Q: Why can we access shp->ns memory using non-refcounted pointer? A: Because shp object lifetime is always shorter than IPC namespace lifetime, so, if we get shp object from the task->sysvshm.shm_clist while holding task_lock(task) nobody can steal our namespace. Q: Does this patch change semantics of unshare/setns/clone syscalls? A: No. It's just fixes non-covered case when process may leave IPC namespace without getting task->sysvshm.shm_clist list cleaned up. Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com Fixes: `ab602f7991` ("shm: make exit_shm work proportional to task activity") Co-developed-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:10 +01:00
David Hildenbrand	aa20e966d8	s390/mm: validate VMA in PGSTE manipulation functions commit `fe3d100240` upstream. We should not walk/touch page tables outside of VMA boundaries when holding only the mmap sem in read mode. Evil user space can modify the VMA layout just before this function runs and e.g., trigger races with page table removal code since commit `dd2283f260` ("mm: mmap: zap pages with read mmap_sem in munmap"). gfn_to_hva() will only translate using KVM memory regions, but won't validate the VMA. Further, we should not allocate page tables outside of VMA boundaries: if evil user space decides to map hugetlbfs to these ranges, bad things will happen because we suddenly have PTE or PMD page tables where we shouldn't have them. Similarly, we have to check if we suddenly find a hugetlbfs VMA, before calling get_locked_pte(). Fixes: `2d42f94773` ("s390/kvm: Add PGSTE manipulation functions") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20210909162248.14969-4-david@redhat.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:10 +01:00
Juergen Gross	a94e4a7b77	tty: hvc: replace BUG_ON() with negative return value commit `e679004dec` upstream. Xen frontends shouldn't BUG() in case of illegal data received from their backends. So replace the BUG_ON()s when reading illegal data from the ring page with negative return values. Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210707091045.460-1-jgross@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:10 +01:00
Juergen Gross	1c5f722a8f	xen/netfront: don't trust the backend response data blindly commit `a884daa61a` upstream. Today netfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Note that only the tx queue needs special id handling, as for the rx queue the id is equal to the index in the ring page. Introduce a new indicator for the device whether it is broken and let the device stop working when it is set. Set this indicator in case the backend sets any weird data. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	334b0f2787	xen/netfront: disentangle tx_skb_freelist commit `21631d2d74` upstream. The tx_skb_freelist elements are in a single linked list with the request id used as link reference. The per element link field is in a union with the skb pointer of an in use request. Move the link reference out of the union in order to enable a later reuse of it for requests which need a populated skb pointer. Rename add_id_to_freelist() and get_id_from_freelist() to add_id_to_list() and get_id_from_list() in order to prepare using those for other lists as well. Define ~0 as value to indicate the end of a list and place that value into the link for a request not being on the list. When freeing a skb zero the skb pointer in the request. Use a NULL value of the skb pointer instead of skb_entry_is_link() for deciding whether a request has a skb linked to it. Remove skb_entry_set_link() and open code it instead as it is really trivial now. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	e17ee047ee	xen/netfront: don't read data from request on the ring page commit `162081ec33` upstream. In order to avoid a malicious backend being able to influence the local processing of a request build the request locally first and then copy it to the ring page. Any reading from the request influencing the processing in the frontend needs to be done on the local instance. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	f5e4937098	xen/netfront: read response from backend only once commit `8446066bf8` upstream. In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	1ffb20f052	xen/blkfront: don't trust the backend response data blindly commit `b94e4b147f` upstream. Today blkfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Introduce a new state of the ring BLKIF_STATE_ERROR which will be switched to in case an inconsistency is being detected. Recovering from this state is possible only via removing and adding the virtual device again (e.g. via a suspend/resume cycle). Make all warning messages issued due to valid error responses rate limited in order to avoid message floods being triggered by a malicious backend. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	8e147855fc	xen/blkfront: don't take local copy of a request from the ring page commit `8f5a695d99` upstream. In order to avoid a malicious backend being able to influence the local copy of a request build the request locally first and then copy it to the ring page instead of doing it the other way round as today. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	273f04d5d1	xen/blkfront: read response from backend only once commit `71b66243f9` upstream. In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Juergen Gross	b98284aa3f	xen: sync include/xen/interface/io/ring.h with Xen's newest version commit `629a5d87e2` upstream. Sync include/xen/interface/io/ring.h with Xen's newest version in order to get the RING_COPY_RESPONSE() and RING_RESPONSE_PROD_OVERFLOW() macros. Note that this will correct the wrong license info by adding the missing original copyright notice. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Steven Rostedt (VMware)	406f2d5fe3	tracing: Check pid filtering when creating events commit `6cb206508b` upstream. When pid filtering is activated in an instance, all of the events trace files for that instance has the PID_FILTER flag set. This determines whether or not pid filtering needs to be done on the event, otherwise the event is executed as normal. If pid filtering is enabled when an event is created (via a dynamic event or modules), its flag is not updated to reflect the current state, and the events are not filtered properly. Cc: stable@vger.kernel.org Fixes: `3fdaf80f4a` ("tracing: Implement event pid filtering") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Stefano Garzarella	4fd0ad08ee	vhost/vsock: fix incorrect used length reported to the guest commit `49d8c5ffad` upstream. The "used length" reported by calling vhost_add_used() must be the number of bytes written by the device (using "in" buffers). In vhost_vsock_handle_tx_kick() the device only reads the guest buffers (they are all "out" buffers), without writing anything, so we must pass 0 as "used length" to comply virtio spec. Fixes: `433fc58e6b` ("VSOCK: Introduce vhost_vsock.ko") Cc: stable@vger.kernel.org Reported-by: Halil Pasic <pasic@linux.ibm.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20211122163525.294024-2-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Joerg Roedel	fbc0514e1a	iommu/amd: Clarify AMD IOMMUv2 initialization messages commit `717e88aad3` upstream. The messages printed on the initialization of the AMD IOMMUv2 driver have caused some confusion in the past. Clarify the messages to lower the confusion in the future. Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20211123105507.7654-3-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-01 09:19:09 +01:00
Steve French	5655b8bccb	smb3: do not error on fsync when readonly [ Upstream commit `71e6864eac` ] Linux allows doing a flush/fsync on a file open for read-only, but the protocol does not allow that. If the file passed in on the flush is read-only try to find a writeable handle for the same inode, if that is not possible skip sending the fsync call to the server to avoid breaking the apps. Reported-by: Julian Sikorski <belegdol@gmail.com> Tested-by: Julian Sikorski <belegdol@gmail.com> Suggested-by: Jeremy Allison <jra@samba.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Jeff Layton	c380062d08	ceph: properly handle statfs on multifs setups [ Upstream commit `8cfc0c7ed3` ] ceph_statfs currently stuffs the cluster fsid into the f_fsid field. This was fine when we only had a single filesystem per cluster, but now that we have multiples we need to use something that will vary between them. Change ceph_statfs to xor each 32-bit chunk of the fsid (aka cluster id) into the lower bits of the statfs->f_fsid. Change the lower bits to hold the fscid (filesystem ID within the cluster). That should give us a value that is guaranteed to be unique between filesystems within a cluster, and should minimize the chance of collisions between mounts of different clusters. URL: https://tracker.ceph.com/issues/52812 Reported-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Weichao Guo	22423c966e	f2fs: set SBI_NEED_FSCK flag when inconsistent node block found [ Upstream commit `6663b138de` ] Inconsistent node block will cause a file fail to open or read, which could make the user process crashes or stucks. Let's mark SBI_NEED_FSCK flag to trigger a fix at next fsck time. After unlinking the corrupted file, the user process could regenerate a new one and work correctly. Signed-off-by: Weichao Guo <guoweichao@oppo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Mark Rutland	e6ee7abd6b	sched/scs: Reset task stack state in bringup_cpu() [ Upstream commit `dce1ca0525` ] To hot unplug a CPU, the idle task on that CPU calls a few layers of C code before finally leaving the kernel. When KASAN is in use, poisoned shadow is left around for each of the active stack frames, and when shadow call stacks are in use. When shadow call stacks (SCS) are in use the task's saved SCS SP is left pointing at an arbitrary point within the task's shadow call stack. When a CPU is offlined than onlined back into the kernel, this stale state can adversely affect execution. Stale KASAN shadow can alias new stackframes and result in bogus KASAN warnings. A stale SCS SP is effectively a memory leak, and prevents a portion of the shadow call stack being used. Across a number of hotplug cycles the idle task's entire shadow call stack can become unusable. We previously fixed the KASAN issue in commit: `e1b77c9298` ("sched/kasan: remove stale KASAN poison after hotplug") ... by removing any stale KASAN stack poison immediately prior to onlining a CPU. Subsequently in commit: `f1a0a376ca` ("sched/core: Initialize the idle task with preemption disabled") ... the refactoring left the KASAN and SCS cleanup in one-time idle thread initialization code rather than something invoked prior to each CPU being onlined, breaking both as above. We fixed SCS (but not KASAN) in commit: `63acd42c0d` ("sched/scs: Reset the shadow stack when idle_task_exit") ... but as this runs in the context of the idle task being offlined it's potentially fragile. To fix these consistently and more robustly, reset the SCS SP and KASAN shadow of a CPU's idle task immediately before we online that CPU in bringup_cpu(). This ensures the idle task always has a consistent state when it is running, and removes the need to so so when exiting an idle task. 
Whenever any thread is created, dup_task_struct() will give the task a stack which is free of KASAN shadow, and initialize the task's SCS SP, so there's no need to specially initialize either for idle thread within init_idle(), as this was only necessary to handle hotplug cycles. I've tested this on arm64 with: * gcc 11.1.0, defconfig +KASAN_INLINE, KASAN_STACK * clang 12.0.0, defconfig +KASAN_INLINE, KASAN_STACK, SHADOW_CALL_STACK ... offlining and onlining CPUS with: | while true; do | for C in /sys/devices/system/cpu/cpu*/online; do | echo 0 > $C; | echo 1 > $C; | done | done Fixes: `f1a0a376ca` ("sched/core: Initialize the idle task with preemption disabled") Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Qian Cai <quic_qiancai@quicinc.com> Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/ Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Arjun Roy	71e38a0c7c	tcp: correctly handle increased zerocopy args struct size [ Upstream commit `e0fecb289a` ] A prior patch increased the size of struct tcp_zerocopy_receive but did not update do_tcp_getsockopt() handling to properly account for this. This patch simply reintroduces content erroneously cut from the referenced prior patch that handles the new struct size. Fixes: `18fb76ed53` ("net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.") Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Vladimir Oltean	72f2117e45	net: mscc: ocelot: correctly report the timestamping RX filters in ethtool [ Upstream commit `c49a35eedf` ] The driver doesn't support RX timestamping for non-PTP packets, but it declares that it does. Restrict the reported RX filters to PTP v2 over L2 and over L4. Fixes: `4e3b0468e6` ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Vladimir Oltean	73115a2b38	net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP [ Upstream commit `8a075464d1` ] The ocelot driver, when asked to timestamp all receiving packets, 1588 v1 or NTP, says "nah, here's 1588 v2 for you". According to this discussion: https://patchwork.kernel.org/project/netdevbpf/patch/20211104133204.19757-8-martin.kaistra@linutronix.de/#24577647 drivers that downgrade from a wider request to a narrower response (or even a response where the intersection with the request is empty) are buggy, and should return -ERANGE instead. This patch fixes that. Fixes: `4e3b0468e6` ("net: mscc: PTP Hardware Clock (PHC) support") Suggested-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Guangbin Huang	62343dadbb	net: hns3: fix VF RSS failed problem after PF enable multi-TCs [ Upstream commit `8d2ad993aa` ] When PF is set to multi-TCs and configured mapping relationship between priorities and TCs, the hardware will active these settings for this PF and its VFs. In this case when VF just uses one TC and its rx packets contain priority, and if the priority is not mapped to TC0, as other TCs of VF is not valid, hardware always put this kind of packets to the queue 0. It cause this kind of packets of VF can not be used RSS function. To fix this problem, set tc mode of all unused TCs of VF to the setting of TC0, then rx packet with priority which map to unused TC will be direct to TC0. Fixes: `e2cb1dec97` ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Tony Lu	215167df45	net/smc: Don't call clcsock shutdown twice when smc shutdown [ Upstream commit `bacb6c1e47` ] When applications call shutdown() with SHUT_RDWR in userspace, smc_close_active() calls kernel_sock_shutdown(), and it is called twice in smc_shutdown(). This fixes this by checking sk_state before do clcsock shutdown, and avoids missing the application's call of smc_shutdown(). Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: `606a63c978` ("net/smc: Ensure the active closing peer first closes clcsock") Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20211126024134.45693-1-tonylu@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Ziyang Xuan	6e800ee432	net: vlan: fix underflow for the real_dev refcnt [ Upstream commit `01d9cc2dea` ] Inject error before dev_hold(real_dev) in register_vlan_dev(), and execute the following testcase: ip link add dev dummy1 type dummy ip link add name dummy1.100 link dummy1 type vlan id 100 ip link del dev dummy1 When the dummy netdevice is removed, we will get a WARNING as following: ======================================================================= refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 2 PID: 0 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 and an endless loop of: ======================================================================= unregister_netdevice: waiting for dummy1 to become free. Usage count = -1073741824 That is because dev_put(real_dev) in vlan_dev_free() be called without dev_hold(real_dev) in register_vlan_dev(). It makes the refcnt of real_dev underflow. Move the dev_hold(real_dev) to vlan_dev_init() which is the call-back of ndo_init(). That makes dev_hold() and dev_put() for vlan's real_dev symmetrical. Fixes: `563bcbae3b` ("net: vlan: fix a UAF in vlan_dev_real_dev()") Reported-by: Petr Machata <petrm@nvidia.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20211126015942.2918542-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:08 +01:00
Davide Caratti	ae2659d2c6	net/sched: sch_ets: don't peek at classes beyond 'nbands' [ Upstream commit `de6d25924c` ] when the number of DRR classes decreases, the round-robin active list can contain elements that have already been freed in ets_qdisc_change(). As a consequence, it's possible to see a NULL dereference crash, caused by the attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets] Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287 RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000 RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0 R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100 FS: 00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0 Call Trace: <TASK> qdisc_peek_dequeued+0x29/0x70 [sch_ets] tbf_dequeue+0x22/0x260 [sch_tbf] __qdisc_run+0x7f/0x630 net_tx_action+0x290/0x4c0 __do_softirq+0xee/0x4f8 irq_exit_rcu+0xf4/0x130 sysvec_apic_timer_interrupt+0x52/0xc0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0033:0x7f2aa7fc9ad4 Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 
83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00 RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202 RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720 RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720 RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380 R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460 </TASK> Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000018 Ensuring that 'alist' was never zeroed [1] was not sufficient, we need to remove from the active list those elements that are no more SP nor DRR. [1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/ v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock acquired, thanks to Cong Wang. v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb from the next list item. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: `dcc68b4d80` ("net: sch_ets: Add a new Qdisc") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:07 +01:00
Jakub Kicinski	e3509feb46	tls: fix replacing proto_ops [ Upstream commit `f3911f73f5` ] We replace proto_ops whenever TLS is configured for RX. But our replacement also overrides sendpage_locked, which will crash unless TX is also configured. Similarly we plug both of those in for TLS_HW (NIC crypto offload) even tho TLS_HW has a completely different implementation for TX. Last but not least we always plug in something based on inet_stream_ops even though a few of the callbacks differ for IPv6 (getname, release, bind). Use a callback building method similar to what we do for struct proto. Fixes: `c46234ebb4` ("tls: RX path for ktls") Fixes: `d4ffb02dee` ("net/tls: enable sk_msg redirect to tls socket egress") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:07 +01:00
Jakub Kicinski	22156242b1	tls: splice_read: fix record type check [ Upstream commit `520493f66f` ] We don't support splicing control records. TLS 1.3 changes moved the record type check into the decrypt if(). The skb may already be decrypted and still be an alert. Note that decrypt_skb_update() is idempotent and updates ctx->decrypted so the if() is pointless. Reorder the check for decryption errors with the content type check while touching them. This part is not really a bug, because if decryption failed in TLS 1.3 content type will be DATA, and for TLS 1.2 it will be correct. Nevertheless its strange to touch output before checking if the function has failed. Fixes: `fedf201e12` ("net: tls: Refactor control message handling on recv") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-12-01 09:19:07 +01:00

985159 Commits