Commit Graph

35505 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
e05e5bd887 Merge 5.10.42 into android13-5.10
Changes in 5.10.42
	ALSA: hda/realtek: the bass speaker can't output sound on Yoga 9i
	ALSA: hda/realtek: Headphone volume is controlled by Front mixer
	ALSA: hda/realtek: Chain in pop reduction fixup for ThinkStation P340
	ALSA: hda/realtek: fix mute/micmute LEDs for HP 855 G8
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook G8
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 15 G8
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 17 G8
	ALSA: usb-audio: scarlett2: Fix device hang with ehci-pci
	ALSA: usb-audio: scarlett2: Improve driver startup messages
	cifs: set server->cipher_type to AES-128-CCM for SMB3.0
	NFSv4: Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
	iommu/vt-d: Fix sysfs leak in alloc_iommu()
	perf intel-pt: Fix sample instruction bytes
	perf intel-pt: Fix transaction abort handling
	perf scripts python: exported-sql-viewer.py: Fix copy to clipboard from Top Calls by elapsed Time report
	perf scripts python: exported-sql-viewer.py: Fix Array TypeError
	perf scripts python: exported-sql-viewer.py: Fix warning display
	proc: Check /proc/$pid/attr/ writes against file opener
	net: hso: fix control-request directions
	net/sched: fq_pie: re-factor fix for fq_pie endless loop
	net/sched: fq_pie: fix OOB access in the traffic path
	netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version
	mac80211: assure all fragments are encrypted
	mac80211: prevent mixed key and fragment cache attacks
	mac80211: properly handle A-MSDUs that start with an RFC 1042 header
	cfg80211: mitigate A-MSDU aggregation attacks
	mac80211: drop A-MSDUs on old ciphers
	mac80211: add fragment cache to sta_info
	mac80211: check defrag PN against current frame
	mac80211: prevent attacks on TKIP/WEP as well
	mac80211: do not accept/forward invalid EAPOL frames
	mac80211: extend protection against mixed key and fragment cache attacks
	ath10k: add CCMP PN replay protection for fragmented frames for PCIe
	ath10k: drop fragments with multicast DA for PCIe
	ath10k: drop fragments with multicast DA for SDIO
	ath10k: drop MPDU which has discard flag set by firmware for SDIO
	ath10k: Fix TKIP Michael MIC verification for PCIe
	ath10k: Validate first subframe of A-MSDU before processing the list
	ath11k: Clear the fragment cache during key install
	dm snapshot: properly fix a crash when an origin has no snapshots
	drm/amd/pm: correct MGpuFanBoost setting
	drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate
	drm/amdkfd: correct sienna_cichlid SDMA RLC register offset error
	drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gate
	drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gate
	drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gate
	selftests/gpio: Use TEST_GEN_PROGS_EXTENDED
	selftests/gpio: Move include of lib.mk up
	selftests/gpio: Fix build when source tree is read only
	kgdb: fix gcc-11 warnings harder
	Documentation: seccomp: Fix user notification documentation
	seccomp: Refactor notification handler to prepare for new semantics
	serial: core: fix suspicious security_locked_down() call
	misc/uss720: fix memory leak in uss720_probe
	thunderbolt: usb4: Fix NVM read buffer bounds and offset issue
	thunderbolt: dma_port: Fix NVM read buffer bounds and offset issue
	KVM: X86: Fix vCPU preempted state from guest's point of view
	KVM: arm64: Prevent mixed-width VM creation
	mei: request autosuspend after sending rx flow control
	staging: iio: cdc: ad7746: avoid overwrite of num_channels
	iio: gyro: fxas21002c: balance runtime power in error path
	iio: dac: ad5770r: Put fwnode in error case during ->probe()
	iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()
	iio: adc: ad7124: Fix missbalanced regulator enable / disable on error.
	iio: adc: ad7124: Fix potential overflow due to non sequential channel numbers
	iio: adc: ad7923: Fix undersized rx buffer.
	iio: adc: ad7793: Add missing error code in ad7793_setup()
	iio: adc: ad7192: Avoid disabling a clock that was never enabled.
	iio: adc: ad7192: handle regulator voltage error first
	serial: 8250: Add UART_BUG_TXRACE workaround for Aspeed VUART
	serial: 8250_dw: Add device HID for new AMD UART controller
	serial: 8250_pci: Add support for new HPE serial device
	serial: 8250_pci: handle FL_NOIRQ board flag
	USB: trancevibrator: fix control-request direction
	Revert "irqbypass: do not start cons/prod when failed connect"
	USB: usbfs: Don't WARN about excessively large memory allocations
	drivers: base: Fix device link removal
	serial: tegra: Fix a mask operation that is always true
	serial: sh-sci: Fix off-by-one error in FIFO threshold register setting
	serial: rp2: use 'request_firmware' instead of 'request_firmware_nowait'
	USB: serial: ti_usb_3410_5052: add startech.com device id
	USB: serial: option: add Telit LE910-S1 compositions 0x7010, 0x7011
	USB: serial: ftdi_sio: add IDs for IDS GmbH Products
	USB: serial: pl2303: add device id for ADLINK ND-6530 GC
	thermal/drivers/intel: Initialize RW trip to THERMAL_TEMP_INVALID
	usb: dwc3: gadget: Properly track pending and queued SG
	usb: gadget: udc: renesas_usb3: Fix a race in usb3_start_pipen()
	usb: typec: mux: Fix matching with typec_altmode_desc
	net: usb: fix memory leak in smsc75xx_bind
	Bluetooth: cmtp: fix file refcount when cmtp_attach_device fails
	fs/nfs: Use fatal_signal_pending instead of signal_pending
	NFS: fix an incorrect limit in filelayout_decode_layout()
	NFS: Fix an Oopsable condition in __nfs_pageio_add_request()
	NFS: Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
	NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
	drm/meson: fix shutdown crash when component not probed
	net/mlx5e: reset XPS on error flow if netdev isn't registered yet
	net/mlx5e: Fix multipath lag activation
	net/mlx5e: Fix error path of updating netdev queues
	{net,vdpa}/mlx5: Configure interface MAC into mpfs L2 table
	net/mlx5e: Fix nullptr in add_vlan_push_action()
	net/mlx5: Set reformat action when needed for termination rules
	net/mlx5e: Fix null deref accessing lag dev
	net/mlx4: Fix EEPROM dump support
	net/mlx5: Set term table as an unmanaged flow table
	SUNRPC in case of backlog, hand free slots directly to waiting task
	Revert "net:tipc: Fix a double free in tipc_sk_mcast_rcv"
	tipc: wait and exit until all work queues are done
	tipc: skb_linearize the head skb when reassembling msgs
	spi: spi-fsl-dspi: Fix a resource leak in an error handling path
	netfilter: flowtable: Remove redundant hw refresh bit
	net: dsa: mt7530: fix VLAN traffic leaks
	net: dsa: fix a crash if ->get_sset_count() fails
	net: dsa: sja1105: update existing VLANs from the bridge VLAN list
	net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic
	net: dsa: sja1105: error out on unsupported PHY mode
	net: dsa: sja1105: add error handling in sja1105_setup()
	net: dsa: sja1105: call dsa_unregister_switch when allocating memory fails
	net: dsa: sja1105: fix VL lookup command packing for P/Q/R/S
	i2c: s3c2410: fix possible NULL pointer deref on read message after write
	i2c: mediatek: Disable i2c start_en and clear intr_stat brfore reset
	i2c: i801: Don't generate an interrupt on bus reset
	i2c: sh_mobile: Use new clock calculation formulas for RZ/G2E
	afs: Fix the nlink handling of dir-over-dir rename
	perf jevents: Fix getting maximum number of fds
	nvmet-tcp: fix inline data size comparison in nvmet_tcp_queue_response
	mptcp: avoid error message on infinite mapping
	mptcp: drop unconditional pr_warn on bad opt
	mptcp: fix data stream corruption
	platform/x86: hp_accel: Avoid invoking _INI to speed up resume
	gpio: cadence: Add missing MODULE_DEVICE_TABLE
	Revert "crypto: cavium/nitrox - add an error message to explain the failure of pci_request_mem_regions"
	Revert "media: usb: gspca: add a missed check for goto_low_power"
	Revert "ALSA: sb: fix a missing check of snd_ctl_add"
	Revert "serial: max310x: pass return value of spi_register_driver"
	serial: max310x: unregister uart driver in case of failure and abort
	Revert "net: fujitsu: fix a potential NULL pointer dereference"
	net: fujitsu: fix potential null-ptr-deref
	Revert "net/smc: fix a NULL pointer dereference"
	net/smc: properly handle workqueue allocation failure
	Revert "net: caif: replace BUG_ON with recovery code"
	net: caif: remove BUG_ON(dev == NULL) in caif_xmit
	Revert "char: hpet: fix a missing check of ioremap"
	char: hpet: add checks after calling ioremap
	Revert "ALSA: gus: add a check of the status of snd_ctl_add"
	Revert "ALSA: usx2y: Fix potential NULL pointer dereference"
	Revert "isdn: mISDNinfineon: fix potential NULL pointer dereference"
	isdn: mISDNinfineon: check/cleanup ioremap failure correctly in setup_io
	Revert "ath6kl: return error code in ath6kl_wmi_set_roam_lrssi_cmd()"
	ath6kl: return error code in ath6kl_wmi_set_roam_lrssi_cmd()
	Revert "isdn: mISDN: Fix potential NULL pointer dereference of kzalloc"
	isdn: mISDN: correctly handle ph_info allocation failure in hfcsusb_ph_info
	Revert "dmaengine: qcom_hidma: Check for driver register failure"
	dmaengine: qcom_hidma: comment platform_driver_register call
	Revert "libertas: add checks for the return value of sysfs_create_group"
	libertas: register sysfs groups properly
	Revert "ASoC: cs43130: fix a NULL pointer dereference"
	ASoC: cs43130: handle errors in cs43130_probe() properly
	Revert "media: dvb: Add check on sp8870_readreg"
	media: dvb: Add check on sp8870_readreg return
	Revert "media: gspca: mt9m111: Check write_bridge for timeout"
	media: gspca: mt9m111: Check write_bridge for timeout
	Revert "media: gspca: Check the return value of write_bridge for timeout"
	media: gspca: properly check for errors in po1030_probe()
	Revert "net: liquidio: fix a NULL pointer dereference"
	net: liquidio: Add missing null pointer checks
	Revert "brcmfmac: add a check for the status of usb_register"
	brcmfmac: properly check for bus register errors
	btrfs: return whole extents in fiemap
	scsi: ufs: ufs-mediatek: Fix power down spec violation
	scsi: BusLogic: Fix 64-bit system enumeration error for Buslogic
	openrisc: Define memory barrier mb
	scsi: pm80xx: Fix drives missing during rmmod/insmod loop
	btrfs: release path before starting transaction when cloning inline extent
	btrfs: do not BUG_ON in link_to_fixup_dir
	platform/x86: hp-wireless: add AMD's hardware id to the supported list
	platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI
	platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet
	SMB3: incorrect file id in requests compounded with open
	drm/amd/display: Disconnect non-DP with no EDID
	drm/amd/amdgpu: fix refcount leak
	drm/amdgpu: Fix a use-after-free
	drm/amd/amdgpu: fix a potential deadlock in gpu reset
	drm/amdgpu: stop touching sched.ready in the backend
	platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet
	block: fix a race between del_gendisk and BLKRRPART
	linux/bits.h: fix compilation error with GENMASK
	net: netcp: Fix an error message
	net: dsa: fix error code getting shifted with 4 in dsa_slave_get_sset_count
	interconnect: qcom: bcm-voter: add a missing of_node_put()
	interconnect: qcom: Add missing MODULE_DEVICE_TABLE
	ASoC: cs42l42: Regmap must use_single_read/write
	net: stmmac: Fix MAC WoL not working if PHY does not support WoL
	net: ipa: memory region array is variable size
	vfio-ccw: Check initialized flag in cp_init()
	spi: Assume GPIO CS active high in ACPI case
	net: really orphan skbs tied to closing sk
	net: packetmmap: fix only tx timestamp on request
	net: fec: fix the potential memory leak in fec_enet_init()
	chelsio/chtls: unlock on error in chtls_pt_recvmsg()
	net: mdio: thunder: Fix a double free issue in the .remove function
	net: mdio: octeon: Fix some double free issues
	cxgb4/ch_ktls: Clear resources when pf4 device is removed
	openvswitch: meter: fix race when getting now_ms.
	tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
	net: sched: fix packet stuck problem for lockless qdisc
	net: sched: fix tx action rescheduling issue during deactivation
	net: sched: fix tx action reschedule issue with stopped queue
	net: hso: check for allocation failure in hso_create_bulk_serial_device()
	net: bnx2: Fix error return code in bnx2_init_board()
	bnxt_en: Include new P5 HV definition in VF check.
	bnxt_en: Fix context memory setup for 64K page size.
	mld: fix panic in mld_newpack()
	net/smc: remove device from smcd_dev_list after failed device_add()
	gve: Check TX QPL was actually assigned
	gve: Update mgmt_msix_idx if num_ntfy changes
	gve: Add NULL pointer checks when freeing irqs.
	gve: Upgrade memory barrier in poll routine
	gve: Correct SKB queue index validation.
	iommu/virtio: Add missing MODULE_DEVICE_TABLE
	net: hns3: fix incorrect resp_msg issue
	net: hns3: put off calling register_netdev() until client initialize complete
	iommu/vt-d: Use user privilege for RID2PASID translation
	cxgb4: avoid accessing registers when clearing filters
	staging: emxx_udc: fix loop in _nbu2ss_nuke()
	ASoC: cs35l33: fix an error code in probe()
	bpf, offload: Reorder offload callback 'prepare' in verifier
	bpf: Set mac_len in bpf_skb_change_head
	ixgbe: fix large MTU request from VF
	ASoC: qcom: lpass-cpu: Use optional clk APIs
	scsi: libsas: Use _safe() loop in sas_resume_port()
	net: lantiq: fix memory corruption in RX ring
	ipv6: record frag_max_size in atomic fragments in input path
	ALSA: usb-audio: scarlett2: snd_scarlett_gen2_controls_create() can be static
	net: ethernet: mtk_eth_soc: Fix packet statistics support for MT7628/88
	sch_dsmark: fix a NULL deref in qdisc_reset()
	net: hsr: fix mac_len checks
	MIPS: alchemy: xxs1500: add gpio-au1000.h header file
	MIPS: ralink: export rt_sysc_membase for rt2880_wdt.c
	net: zero-initialize tc skb extension on allocation
	net: mvpp2: add buffer header handling in RX
	i915: fix build warning in intel_dp_get_link_status()
	samples/bpf: Consider frame size in tx_only of xdpsock sample
	net: hns3: check the return of skb_checksum_help()
	bpftool: Add sock_release help info for cgroup attach/prog load command
	SUNRPC: More fixes for backlog congestion
	Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference""
	net: hso: bail out on interrupt URB allocation failure
	scripts/clang-tools: switch explicitly to Python 3
	neighbour: Prevent Race condition in neighbour subsytem
	usb: core: reduce power-on-good delay time of root hub
	Linux 5.10.42

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0ab94ea8b4d662a079f7998ad023c4f7d2749bc7
2021-06-03 09:16:26 +02:00
Yinjun Zhang
24cb8bb7f6 bpf, offload: Reorder offload callback 'prepare' in verifier
[ Upstream commit ceb11679d9 ]

Commit 4976b718c3 ("bpf: Introduce pseudo_btf_id") switched the
order of resolve_pseudo_ldimm(), in which some pseudo instructions
are rewritten. Thus those rewritten instructions cannot be passed
to driver via 'prepare' offload callback.

Reorder the 'prepare' offload callback to fix it.

Fixes: 4976b718c3 ("bpf: Introduce pseudo_btf_id")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210520085834.15023-1-simon.horman@netronome.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03 09:00:49 +02:00
Sargun Dhillon
b71781c589 seccomp: Refactor notification handler to prepare for new semantics
commit ddc4739169 upstream.

This refactors the user notification code to have a do / while loop around
the completion condition. This has a small change in semantic, in that
previously we ignored addfd calls upon wakeup if the notification had been
responded to, but instead with the new change we check for an outstanding
addfd calls prior to returning to userspace.

Rodrigo Campos also identified a bug that can result in addfd causing
an early return, when the supervisor didn't actually handle the
syscall [1].

[1]: https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@kinvolk.io/

Fixes: 7cf97b1254 ("seccomp: Introduce addfd ioctl to seccomp user notifier")
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Rodrigo Campos <rodrigo@kinvolk.io>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210517193908.3113-3-sargun@sargun.me
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03 09:00:31 +02:00
Will Deacon
94155f60a5 FROMLIST: sched: Defer wakeup in ttwu() for unschedulable frozen tasks
Asymmetric systems may not offer the same level of userspace ISA support
across all CPUs, meaning that some applications cannot be executed by
some CPUs. As a concrete example, upcoming arm64 big.LITTLE designs do
not feature support for 32-bit applications on both clusters.

Although we take care to prevent explicit hot-unplug of all 32-bit
capable CPUs on such a system, this is required when suspending on some
SoCs where the firmware mandates that the suspend/resume operation is
handled by CPU 0, which may not be capable of running 32-bit tasks.

Consequently, there is a window on the resume path where no 32-bit
capable CPUs are available for scheduling and waking up a 32-bit task
will result in a scheduler BUG() due to failure of select_fallback_rq():

  | kernel BUG at kernel/sched/core.c:2858!
  | Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  | ...
  | Call trace:
  |  select_fallback_rq+0x4b0/0x4e4
  |  try_to_wake_up.llvm.4388853297126348405+0x460/0x5b0
  |  default_wake_function+0x1c/0x30
  |  autoremove_wake_function+0x1c/0x60
  |  __wake_up_common.llvm.11763074518265335900+0x100/0x1b8
  |  __wake_up+0x78/0xc4
  |  ep_poll_callback+0x20c/0x3fc

Prevent wakeups of unschedulable frozen tasks in ttwu() and instead
defer the wakeup to __thaw_tasks(), which runs only once all the
secondary CPUs are back online.

Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/linux-arch/20210525151432.16875-17-will@kernel.org/
Bug: 186372082
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I5a0531b48d537a79e1926289b5a87edcd7dd78ad
2021-06-02 16:45:04 +01:00
Will Deacon
9c12d36117 FROMLIST: freezer: Add frozen_or_skipped() helper function
Occasionally it is necessary to see if a task is either frozen or
sleeping in the PF_FREEZER_SKIP state. In preparation for adding
additional users of this check, introduce a frozen_or_skipped() helper
function and convert the hung task detector over to using it.

Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/linux-arch/20210525151432.16875-16-will@kernel.org/
Bug: 186372082
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I138ffe2fae5a2da96df6f30d50d3a8a0dc61724c
2021-06-02 16:44:56 +01:00
Greg Kroah-Hartman
8fd49ccc4d Merge 5.10.41 into android13-5.10
Changes in 5.10.41
	bpf: Wrap aux data inside bpf_sanitize_info container
	bpf: Fix mask direction swap upon off reg sign change
	bpf: No need to simulate speculative domain for immediates
	context_tracking: Move guest exit context tracking to separate helpers
	context_tracking: Move guest exit vtime accounting to separate helpers
	KVM: x86: Defer vtime accounting 'til after IRQ handling
	perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder
	perf unwind: Set userdata for all __report_module() paths
	NFC: nci: fix memory leak in nci_allocate_device
	Linux 5.10.41

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia121e11c6f5a1e6387fc4566e873adf37ad7a705
2021-05-28 13:29:05 +02:00
Daniel Borkmann
27acfd11ba bpf: No need to simulate speculative domain for immediates
commit a703619127 upstream.

In 801c6058d1 ("bpf: Fix leakage of uninitialized bpf stack under
speculation") we replaced masking logic with direct loads of immediates
if the register is a known constant. Given in this case we do not apply
any masking, there is also no reason for the operation to be truncated
under the speculative domain.

Therefore, there is also zero reason for the verifier to branch-off and
simulate this case, it only needs to do it for unknown but bounded scalars.
As a side-effect, this also enables few test cases that were previously
rejected due to simulation under zero truncation.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-28 13:17:43 +02:00
Daniel Borkmann
c87ef240a8 bpf: Fix mask direction swap upon off reg sign change
commit bb01a1bba5 upstream.

Masking direction as indicated via mask_to_left is considered to be
calculated once and then used to derive pointer limits. Thus, this
needs to be placed into bpf_sanitize_info instead so we can pass it
to sanitize_ptr_alu() call after the pointer move. Piotr noticed a
corner case where the off reg causes masking direction change which
then results in an incorrect final aux->alu_limit.

Fixes: 7fedb63a83 ("bpf: Tighten speculative pointer arithmetic mask")
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-28 13:17:43 +02:00
Daniel Borkmann
4e2c7b2974 bpf: Wrap aux data inside bpf_sanitize_info container
commit 3d0220f686 upstream.

Add a container structure struct bpf_sanitize_info which holds
the current aux info, and update call-sites to sanitize_ptr_alu()
to pass it in. This is needed for passing in additional state
later on.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-28 13:17:43 +02:00
Greg Kroah-Hartman
04bd735420 Merge 5.10.40 into android13-5.10
Changes in 5.10.40
	firmware: arm_scpi: Prevent the ternary sign expansion bug
	openrisc: Fix a memory leak
	tee: amdtee: unload TA only when its refcount becomes 0
	RDMA/siw: Properly check send and receive CQ pointers
	RDMA/siw: Release xarray entry
	RDMA/core: Prevent divide-by-zero error triggered by the user
	RDMA/rxe: Clear all QP fields if creation failed
	scsi: ufs: core: Increase the usable queue depth
	scsi: qedf: Add pointer checks in qedf_update_link_speed()
	scsi: qla2xxx: Fix error return code in qla82xx_write_flash_dword()
	RDMA/mlx5: Recover from fatal event in dual port mode
	RDMA/core: Don't access cm_id after its destruction
	nvmet: remove unused ctrl->cqs
	nvmet: fix memory leak in nvmet_alloc_ctrl()
	nvme-loop: fix memory leak in nvme_loop_create_ctrl()
	nvme-tcp: rerun io_work if req_list is not empty
	nvme-fc: clear q_live at beginning of association teardown
	platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue
	platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when using s2idle
	platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios
	RDMA/mlx5: Fix query DCT via DEVX
	RDMA/uverbs: Fix a NULL vs IS_ERR() bug
	tools/testing/selftests/exec: fix link error
	powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
	ptrace: make ptrace() fail if the tracee changed its pid unexpectedly
	nvmet: seset ns->file when open fails
	perf/x86: Avoid touching LBR_TOS MSR for Arch LBR
	locking/lockdep: Correct calling tracepoints
	locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal
	powerpc: Fix early setup to make early_ioremap() work
	btrfs: avoid RCU stalls while running delayed iputs
	cifs: fix memory leak in smb2_copychunk_range
	misc: eeprom: at24: check suspend status before disable regulator
	ALSA: dice: fix stream format for TC Electronic Konnekt Live at high sampling transfer frequency
	ALSA: intel8x0: Don't update period unless prepared
	ALSA: firewire-lib: fix amdtp_packet tracepoints event for packet_index field
	ALSA: line6: Fix racy initialization of LINE6 MIDI
	ALSA: dice: fix stream format at middle sampling rate for Alesis iO 26
	ALSA: firewire-lib: fix calculation for size of IR context payload
	ALSA: usb-audio: Validate MS endpoint descriptors
	ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro
	ALSA: hda: fixup headset for ASUS GU502 laptop
	Revert "ALSA: sb8: add a check for request_region"
	ALSA: firewire-lib: fix check for the size of isochronous packet payload
	ALSA: hda/realtek: reset eapd coeff to default value for alc287
	ALSA: hda/realtek: Add some CLOVE SSIDs of ALC293
	ALSA: hda/realtek: Fix silent headphone output on ASUS UX430UA
	ALSA: hda/realtek: Add fixup for HP OMEN laptop
	ALSA: hda/realtek: Add fixup for HP Spectre x360 15-df0xxx
	uio_hv_generic: Fix a memory leak in error handling paths
	Revert "rapidio: fix a NULL pointer dereference when create_workqueue() fails"
	rapidio: handle create_workqueue() failure
	Revert "serial: mvebu-uart: Fix to avoid a potential NULL pointer dereference"
	nvme-tcp: fix possible use-after-completion
	x86/sev-es: Move sev_es_put_ghcb() in prep for follow on patch
	x86/sev-es: Invalidate the GHCB after completing VMGEXIT
	x86/sev-es: Don't return NULL from sev_es_get_ghcb()
	x86/sev-es: Use __put_user()/__get_user() for data accesses
	x86/sev-es: Forward page-faults which happen during emulation
	drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE
	drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
	drm/amdgpu: update gc golden setting for Navi12
	drm/amdgpu: update sdma golden setting for Navi12
	powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between sc and scv syscalls
	powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls
	mmc: sdhci-pci-gli: increase 1.8V regulator wait
	xen-pciback: redo VF placement in the virtual topology
	xen-pciback: reconfigure also from backend watch handler
	ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
	dm snapshot: fix crash with transient storage and zero chunk size
	kcsan: Fix debugfs initcall return type
	Revert "video: hgafb: fix potential NULL pointer dereference"
	Revert "net: stmicro: fix a missing check of clk_prepare"
	Revert "leds: lp5523: fix a missing check of return value of lp55xx_read"
	Revert "hwmon: (lm80) fix a missing check of bus read in lm80 probe"
	Revert "video: imsttfb: fix potential NULL pointer dereferences"
	Revert "ecryptfs: replace BUG_ON with error handling code"
	Revert "scsi: ufs: fix a missing check of devm_reset_control_get"
	Revert "gdrom: fix a memory leak bug"
	cdrom: gdrom: deallocate struct gdrom_unit fields in remove_gdrom
	cdrom: gdrom: initialize global variable at init time
	Revert "media: rcar_drif: fix a memory disclosure"
	Revert "rtlwifi: fix a potential NULL pointer dereference"
	Revert "qlcnic: Avoid potential NULL pointer dereference"
	Revert "niu: fix missing checks of niu_pci_eeprom_read"
	ethernet: sun: niu: fix missing checks of niu_pci_eeprom_read()
	net: stmicro: handle clk_prepare() failure during init
	scsi: ufs: handle cleanup correctly on devm_reset_control_get error
	net: rtlwifi: properly check for alloc_workqueue() failure
	ics932s401: fix broken handling of errors when word reading fails
	leds: lp5523: check return value of lp5xx_read and jump to cleanup code
	qlcnic: Add null check after calling netdev_alloc_skb
	video: hgafb: fix potential NULL pointer dereference
	vgacon: Record video mode changes with VT_RESIZEX
	vt_ioctl: Revert VT_RESIZEX parameter handling removal
	vt: Fix character height handling with VT_RESIZEX
	tty: vt: always invoke vc->vc_sw->con_resize callback
	drm/i915/gt: Disable HiZ Raw Stall Optimization on broken gen7
	openrisc: mm/init.c: remove unused memblock_region variable in map_ram()
	x86/Xen: swap NX determination and GDT setup on BSP
	nvme-multipath: fix double initialization of ANA state
	rtc: pcf85063: fallback to parent of_node
	x86/boot/compressed/64: Check SEV encryption in the 32-bit boot-path
	nvmet: use new ana_log_size instead the old one
	video: hgafb: correctly handle card detect failure during probe
	Bluetooth: SMP: Fail if remote and local public keys are identical
	Linux 5.10.40

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic4c293b7ba38718ef030f5697afb6288051d19ba
2021-05-27 08:32:23 +02:00
Arnd Bergmann
fae4f4debf kcsan: Fix debugfs initcall return type
commit 976aac5f88 upstream.

clang with CONFIG_LTO_CLANG points out that an initcall function should
return an 'int' due to the changes made to the initcall macros in commit
3578ad11f3 ("init: lto: fix PREL32 relocations"):

kernel/kcsan/debugfs.c:274:15: error: returning 'void' from a function with incompatible result type 'int'
late_initcall(kcsan_debugfs_init);
~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
include/linux/init.h:292:46: note: expanded from macro 'late_initcall'
 #define late_initcall(fn)               __define_initcall(fn, 7)

Fixes: e36299efe7 ("kcsan, debugfs: Move debugfs file creation out of early init")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26 12:06:54 +02:00
Zqiang
e354e3744b locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal
[ Upstream commit 3a010c4932 ]

When a interruptible mutex locker is interrupted by a signal
without acquiring this lock and removed from the wait queue.
if the mutex isn't contended enough to have a waiter
put into the wait queue again, the setting of the WAITER
bit will force mutex locker to go into the slowpath to
acquire the lock every time, so if the wait queue is empty,
the WAITER bit need to be clear.

Fixes: 040a0a3710 ("mutex: Add support for wound/wait style locks")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210517034005.30828-1-qiang.zhang@windriver.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26 12:06:50 +02:00
Leo Yan
5dfed1be0e locking/lockdep: Correct calling tracepoints
[ Upstream commit 89e70d5c58 ]

The commit eb1f00237a ("lockdep,trace: Expose tracepoints") reverses
tracepoints for lock_contended() and lock_acquired(), thus the ftrace
log shows the wrong locking sequence that "acquired" event is prior to
"contended" event:

  <idle>-0       [001] d.s3 20803.501685: lock_acquire: 0000000008b91ab4 &sg_policy->update_lock
  <idle>-0       [001] d.s3 20803.501686: lock_acquired: 0000000008b91ab4 &sg_policy->update_lock
  <idle>-0       [001] d.s3 20803.501689: lock_contended: 0000000008b91ab4 &sg_policy->update_lock
  <idle>-0       [001] d.s3 20803.501690: lock_release: 0000000008b91ab4 &sg_policy->update_lock

This patch fixes calling tracepoints for lock_contended() and
lock_acquired().

Fixes: eb1f00237a ("lockdep,trace: Expose tracepoints")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210512120937.90211-1-leo.yan@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26 12:06:50 +02:00
Oleg Nesterov
6f08af55ea ptrace: make ptrace() fail if the tracee changed its pid unexpectedly
[ Upstream commit dbb5afad10 ]

Suppose we have 2 threads, the group-leader L and a sub-theread T,
both parked in ptrace_stop(). Debugger tries to resume both threads
and does

	ptrace(PTRACE_CONT, T);
	ptrace(PTRACE_CONT, L);

If the sub-thread T execs in between, the 2nd PTRACE_CONT doesn not
resume the old leader L, it resumes the post-exec thread T which was
actually now stopped in PTHREAD_EVENT_EXEC. In this case the
PTHREAD_EVENT_EXEC event is lost, and the tracer can't know that the
tracee changed its pid.

This patch makes ptrace() fail in this case until debugger does wait()
and consumes PTHREAD_EVENT_EXEC which reports old_pid. This affects all
ptrace requests except the "asynchronous" PTRACE_INTERRUPT/KILL.

The patch doesn't add the new PTRACE_ option to not complicate the API,
and I _hope_ this won't cause any noticeable regression:

	- If debugger uses PTRACE_O_TRACEEXEC and the thread did an exec
	  and the tracer does a ptrace request without having consumed
	  the exec event, it's 100% sure that the thread the ptracer
	  thinks it is targeting does not exist anymore, or isn't the
	  same as the one it thinks it is targeting.

	- To some degree this patch adds nothing new. In the scenario
	  above ptrace(L) can fail with -ESRCH if it is called after the
	  execing sub-thread wakes the leader up and before it "steals"
	  the leader's pid.

Test-case:

	#include <stdio.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>
	#include <errno.h>
	#include <pthread.h>
	#include <assert.h>

	void *tf(void *arg)
	{
		execve("/usr/bin/true", NULL, NULL);
		assert(0);

		return NULL;
	}

	int main(void)
	{
		int leader = fork();
		if (!leader) {
			kill(getpid(), SIGSTOP);

			pthread_t th;
			pthread_create(&th, NULL, tf, NULL);
			for (;;)
				pause();

			return 0;
		}

		waitpid(leader, NULL, WSTOPPED);

		ptrace(PTRACE_SEIZE, leader, 0,
				PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC);
		waitpid(leader, NULL, 0);

		ptrace(PTRACE_CONT, leader, 0,0);
		waitpid(leader, NULL, 0);

		int status, thread = waitpid(-1, &status, 0);
		assert(thread > 0 && thread != leader);
		assert(status == 0x80137f);

		ptrace(PTRACE_CONT, thread, 0,0);
		/*
		 * waitid() because waitpid(leader, &status, WNOWAIT) does not
		 * report status. Why ????
		 *
		 * Why WEXITED? because we have another kernel problem connected
		 * to mt-exec.
		 */
		siginfo_t info;
		assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) == 0);
		assert(info.si_pid == leader && info.si_status == 0x0405);

		/* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */
		assert(ptrace(PTRACE_CONT, leader, 0,0) == -1);
		assert(errno == ESRCH);

		assert(leader == waitpid(leader, &status, WNOHANG));
		assert(status == 0x04057f);

		assert(ptrace(PTRACE_CONT, leader, 0,0) == 0);

		return 0;
	}

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Simon Marchi <simon.marchi@efficios.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Pedro Alves <palves@redhat.com>
Acked-by: Simon Marchi <simon.marchi@efficios.com>
Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26 12:06:49 +02:00
Will Deacon
1732e398f9 ANDROID: sched: fix race with CPU hot-unplug when overriding affinity
Migrating a task to a CPU which is concurrently being taken offline can
cause the migration to fail silently, with the task left running on the
old CPU. This is usually not the end of the world, but when forcefully
migrating a 32-bit task during execve() from a 64-bit task, it is
imperative that we do not attempt to return to userspace on a
64-bit-only CPU.

Take the CPU hotplug lock for read while forcefully migrating a 32-bit
task on execve() so that the migration cannot fail.

Bug: 187917024
Change-Id: I6eaf2a564fe3ad73c03f0a6029aade09c707330f
Signed-off-by: Will Deacon <willdeacon@google.com>
2021-05-25 16:56:52 +01:00
Greg Kroah-Hartman
999e8eed26 Merge 5.10.38 into android13-5.10
Changes in 5.10.38
	KEYS: trusted: Fix memory leak on object td
	tpm: fix error return code in tpm2_get_cc_attrs_tbl()
	tpm, tpm_tis: Extend locality handling to TPM2 in tpm_tis_gen_interrupt()
	tpm, tpm_tis: Reserve locality in tpm_tis_resume()
	KVM: x86/mmu: Remove the defunct update_pte() paging hook
	KVM/VMX: Invoke NMI non-IST entry instead of IST entry
	ACPI: PM: Add ACPI ID of Alder Lake Fan
	PM: runtime: Fix unpaired parent child_count for force_resume
	cpufreq: intel_pstate: Use HWP if enabled by platform firmware
	kvm: Cap halt polling at kvm->max_halt_poll_ns
	ath11k: fix thermal temperature read
	fs: dlm: fix debugfs dump
	fs: dlm: add errno handling to check callback
	fs: dlm: check on minimum msglen size
	fs: dlm: flush swork on shutdown
	tipc: convert dest node's address to network order
	ASoC: Intel: bytcr_rt5640: Enable jack-detect support on Asus T100TAF
	net/mlx5e: Use net_prefetchw instead of prefetchw in MPWQE TX datapath
	net: stmmac: Set FIFO sizes for ipq806x
	ASoC: rsnd: core: Check convert rate in rsnd_hw_params
	Bluetooth: Fix incorrect status handling in LE PHY UPDATE event
	i2c: bail out early when RDWR parameters are wrong
	ALSA: hdsp: don't disable if not enabled
	ALSA: hdspm: don't disable if not enabled
	ALSA: rme9652: don't disable if not enabled
	ALSA: bebob: enable to deliver MIDI messages for multiple ports
	Bluetooth: Set CONF_NOT_COMPLETE as l2cap_chan default
	Bluetooth: initialize skb_queue_head at l2cap_chan_create()
	net/sched: cls_flower: use ntohs for struct flow_dissector_key_ports
	net: bridge: when suppression is enabled exclude RARP packets
	Bluetooth: check for zapped sk before connecting
	selftests/powerpc: Fix L1D flushing tests for Power10
	powerpc/32: Statically initialise first emergency context
	net: hns3: remediate a potential overflow risk of bd_num_list
	net: hns3: add handling for xmit skb with recursive fraglist
	ip6_vti: proper dev_{hold|put} in ndo_[un]init methods
	ASoC: Intel: bytcr_rt5640: Add quirk for the Chuwi Hi8 tablet
	ice: handle increasing Tx or Rx ring sizes
	Bluetooth: btusb: Enable quirk boolean flag for Mediatek Chip.
	ASoC: rt5670: Add a quirk for the Dell Venue 10 Pro 5055
	i2c: Add I2C_AQ_NO_REP_START adapter quirk
	MIPS: Loongson64: Use _CACHE_UNCACHED instead of _CACHE_UNCACHED_ACCELERATED
	coresight: Do not scan for graph if none is present
	IB/hfi1: Correct oversized ring allocation
	mac80211: clear the beacon's CRC after channel switch
	pinctrl: samsung: use 'int' for register masks in Exynos
	rtw88: 8822c: add LC calibration for RTL8822C
	mt76: mt7615: support loading EEPROM for MT7613BE
	mt76: mt76x0: disable GTK offloading
	mt76: mt7915: fix txpower init for TSSI off chips
	fuse: invalidate attrs when page writeback completes
	virtiofs: fix userns
	cuse: prevent clone
	iwlwifi: pcie: make cfg vs. trans_cfg more robust
	powerpc/mm: Add cond_resched() while removing hpte mappings
	ASoC: rsnd: call rsnd_ssi_master_clk_start() from rsnd_ssi_init()
	Revert "iommu/amd: Fix performance counter initialization"
	iommu/amd: Remove performance counter pre-initialization test
	drm/amd/display: Force vsync flip when reconfiguring MPCC
	selftests: Set CC to clang in lib.mk if LLVM is set
	kconfig: nconf: stop endless search loops
	ALSA: hda/realtek: Add quirk for Lenovo Ideapad S740
	ASoC: Intel: sof_sdw: add quirk for new ADL-P Rvp
	ALSA: hda/hdmi: fix race in handling acomp ELD notification at resume
	sctp: Fix out-of-bounds warning in sctp_process_asconf_param()
	flow_dissector: Fix out-of-bounds warning in __skb_flow_bpf_to_target()
	powerpc/smp: Set numa node before updating mask
	ASoC: rt286: Generalize support for ALC3263 codec
	ethtool: ioctl: Fix out-of-bounds warning in store_link_ksettings_for_user()
	net: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule
	samples/bpf: Fix broken tracex1 due to kprobe argument change
	powerpc/pseries: Stop calling printk in rtas_stop_self()
	drm/amd/display: fixed divide by zero kernel crash during dsc enablement
	drm/amd/display: add handling for hdcp2 rx id list validation
	drm/amdgpu: Add mem sync flag for IB allocated by SA
	mt76: mt7615: fix entering driver-own state on mt7663
	crypto: ccp: Free SEV device if SEV init fails
	wl3501_cs: Fix out-of-bounds warnings in wl3501_send_pkt
	wl3501_cs: Fix out-of-bounds warnings in wl3501_mgmt_join
	qtnfmac: Fix possible buffer overflow in qtnf_event_handle_external_auth
	powerpc/iommu: Annotate nested lock for lockdep
	iavf: remove duplicate free resources calls
	net: ethernet: mtk_eth_soc: fix RX VLAN offload
	selftests: mlxsw: Increase the tolerance of backlog buildup
	selftests: mlxsw: Fix mausezahn invocation in ERSPAN scale test
	kbuild: generate Module.symvers only when vmlinux exists
	bnxt_en: Add PCI IDs for Hyper-V VF devices.
	ia64: module: fix symbolizer crash on fdescr
	watchdog: rename __touch_watchdog() to a better descriptive name
	watchdog: explicitly update timestamp when reporting softlockup
	watchdog/softlockup: remove logic that tried to prevent repeated reports
	watchdog: fix barriers when printing backtraces from all CPUs
	ASoC: rt286: Make RT286_SET_GPIO_* readable and writable
	thermal: thermal_of: Fix error return code of thermal_of_populate_bind_params()
	f2fs: move ioctl interface definitions to separated file
	f2fs: fix compat F2FS_IOC_{MOVE,GARBAGE_COLLECT}_RANGE
	f2fs: fix to allow migrating fully valid segment
	f2fs: fix panic during f2fs_resize_fs()
	f2fs: fix a redundant call to f2fs_balance_fs if an error occurs
	remoteproc: qcom_q6v5_mss: Replace ioremap with memremap
	remoteproc: qcom_q6v5_mss: Validate p_filesz in ELF loader
	PCI: iproc: Fix return value of iproc_msi_irq_domain_alloc()
	PCI: Release OF node in pci_scan_device()'s error path
	ARM: 9064/1: hw_breakpoint: Do not directly check the event's overflow_handler hook
	f2fs: fix to align to section for fallocate() on pinned file
	f2fs: fix to update last i_size if fallocate partially succeeds
	PCI: endpoint: Make *_get_first_free_bar() take into account 64 bit BAR
	PCI: endpoint: Add helper API to get the 'next' unreserved BAR
	PCI: endpoint: Make *_free_bar() to return error codes on failure
	PCI: endpoint: Fix NULL pointer dereference for ->get_features()
	f2fs: fix to avoid touching checkpointed data in get_victim()
	f2fs: fix to cover __allocate_new_section() with curseg_lock
	f2fs: Fix a hungtask problem in atomic write
	f2fs: fix to avoid accessing invalid fio in f2fs_allocate_data_block()
	rpmsg: qcom_glink_native: fix error return code of qcom_glink_rx_data()
	NFS: nfs4_bitmask_adjust() must not change the server global bitmasks
	NFS: Fix attribute bitmask in _nfs42_proc_fallocate()
	NFSv4.2: Always flush out writes in nfs42_proc_fallocate()
	NFS: Deal correctly with attribute generation counter overflow
	PCI: endpoint: Fix missing destroy_workqueue()
	pNFS/flexfiles: fix incorrect size check in decode_nfs_fh()
	NFSv4.2 fix handling of sr_eof in SEEK's reply
	SUNRPC: Move fault injection call sites
	SUNRPC: Remove trace_xprt_transmit_queued
	SUNRPC: Handle major timeout in xprt_adjust_timeout()
	thermal/drivers/tsens: Fix missing put_device error
	NFSv4.x: Don't return NFS4ERR_NOMATCHING_LAYOUT if we're unmounting
	nfsd: ensure new clients break delegations
	rtc: fsl-ftm-alarm: add MODULE_TABLE()
	dmaengine: idxd: Fix potential null dereference on pointer status
	dmaengine: idxd: fix dma device lifetime
	dmaengine: idxd: fix cdev setup and free device lifetime issues
	SUNRPC: fix ternary sign expansion bug in tracing
	pwm: atmel: Fix duty cycle calculation in .get_state()
	xprtrdma: Avoid Receive Queue wrapping
	xprtrdma: Fix cwnd update ordering
	xprtrdma: rpcrdma_mr_pop() already does list_del_init()
	swiotlb: Fix the type of index
	ceph: fix inode leak on getattr error in __fh_to_dentry
	scsi: qla2xxx: Prevent PRLI in target mode
	scsi: ufs: core: Do not put UFS power into LPM if link is broken
	scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspend
	scsi: ufs: core: Narrow down fast path in system suspend path
	rtc: ds1307: Fix wday settings for rx8130
	net: hns3: fix incorrect configuration for igu_egu_hw_err
	net: hns3: initialize the message content in hclge_get_link_mode()
	net: hns3: add check for HNS3_NIC_STATE_INITED in hns3_reset_notify_up_enet()
	net: hns3: fix for vxlan gpe tx checksum bug
	net: hns3: use netif_tx_disable to stop the transmit queue
	net: hns3: disable phy loopback setting in hclge_mac_start_phy
	sctp: do asoc update earlier in sctp_sf_do_dupcook_a
	RISC-V: Fix error code returned by riscv_hartid_to_cpuid()
	sunrpc: Fix misplaced barrier in call_decode
	libbpf: Fix signed overflow in ringbuf_process_ring
	block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
	block/rnbd-clt: Check the return value of the function rtrs_clt_query
	ethernet:enic: Fix a use after free bug in enic_hard_start_xmit
	sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b
	netfilter: xt_SECMARK: add new revision to fix structure layout
	xsk: Fix for xp_aligned_validate_desc() when len == chunk_size
	net: stmmac: Clear receive all(RA) bit when promiscuous mode is off
	drm/radeon: Fix off-by-one power_state index heap overwrite
	drm/radeon: Avoid power table parsing memory leaks
	arm64: entry: factor irq triage logic into macros
	arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
	khugepaged: fix wrong result value for trace_mm_collapse_huge_page_isolate()
	mm/hugeltb: handle the error case in hugetlb_fix_reserve_counts()
	mm/migrate.c: fix potential indeterminate pte entry in migrate_vma_insert_page()
	ksm: fix potential missing rmap_item for stable_node
	mm/gup: check every subpage of a compound page during isolation
	mm/gup: return an error on migration failure
	mm/gup: check for isolation errors
	ethtool: fix missing NLM_F_MULTI flag when dumping
	net: fix nla_strcmp to handle more then one trailing null character
	smc: disallow TCP_ULP in smc_setsockopt()
	netfilter: nfnetlink_osf: Fix a missing skb_header_pointer() NULL check
	netfilter: nftables: Fix a memleak from userdata error path in new objects
	can: mcp251xfd: mcp251xfd_probe(): add missing can_rx_offload_del() in error path
	can: mcp251x: fix resume from sleep before interface was brought up
	can: m_can: m_can_tx_work_queue(): fix tx_skb race condition
	sched: Fix out-of-bound access in uclamp
	sched/fair: Fix unfairness caused by missing load decay
	fs/proc/generic.c: fix incorrect pde_is_permanent check
	kernel: kexec_file: fix error return code of kexec_calculate_store_digests()
	kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
	kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
	netfilter: nftables: avoid overflows in nft_hash_buckets()
	i40e: fix broken XDP support
	i40e: Fix use-after-free in i40e_client_subtask()
	i40e: fix the restart auto-negotiation after FEC modified
	i40e: Fix PHY type identifiers for 2.5G and 5G adapters
	mptcp: fix splat when closing unaccepted socket
	f2fs: avoid unneeded data copy in f2fs_ioc_move_range()
	ARC: entry: fix off-by-one error in syscall number validation
	ARC: mm: PAE: use 40-bit physical page mask
	ARC: mm: Use max_high_pfn as a HIGHMEM zone border
	powerpc/64s: Fix crashes when toggling stf barrier
	powerpc/64s: Fix crashes when toggling entry flush barrier
	hfsplus: prevent corruption in shrinking truncate
	squashfs: fix divide error in calculate_skip()
	userfaultfd: release page in error path to avoid BUG_ON
	kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled
	mm/hugetlb: fix F_SEAL_FUTURE_WRITE
	blk-iocost: fix weight updates of inner active iocgs
	arm64: mte: initialize RGSR_EL1.SEED in __cpu_setup
	arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache()
	btrfs: fix race leading to unpersisted data and metadata on fsync
	drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected
	drm/amd/display: Initialize attribute for hdcp_srm sysfs file
	drm/i915: Avoid div-by-zero on gen2
	kvm: exit halt polling on need_resched() as well
	KVM: LAPIC: Accurately guarantee busy wait for timer to expire when using hv_timer
	drm/msm/dp: initialize audio_comp when audio starts
	KVM: x86: Cancel pvclock_gtod_work on module removal
	KVM: x86: Prevent deadlock against tk_core.seq
	dax: Add an enum for specifying dax wakup mode
	dax: Add a wakeup mode parameter to put_unlocked_entry()
	dax: Wake up all waiters after invalidating dax entry
	xen/unpopulated-alloc: consolidate pgmap manipulation
	xen/unpopulated-alloc: fix error return code in fill_list()
	perf tools: Fix dynamic libbpf link
	usb: dwc3: gadget: Free gadget structure only after freeing endpoints
	iio: light: gp2ap002: Fix rumtime PM imbalance on error
	iio: proximity: pulsedlight: Fix rumtime PM imbalance on error
	iio: hid-sensors: select IIO_TRIGGERED_BUFFER under HID_SENSOR_IIO_TRIGGER
	usb: fotg210-hcd: Fix an error message
	hwmon: (occ) Fix poll rate limiting
	usb: musb: Fix an error message
	ACPI: scan: Fix a memory leak in an error handling path
	kyber: fix out of bounds access when preempted
	nvmet: add lba to sect conversion helpers
	nvmet: fix inline bio check for bdev-ns
	nvmet-rdma: Fix NULL deref when SEND is completed with error
	f2fs: compress: fix to free compress page correctly
	f2fs: compress: fix race condition of overwrite vs truncate
	f2fs: compress: fix to assign cc.cluster_idx correctly
	nbd: Fix NULL pointer in flush_workqueue
	blk-mq: plug request for shared sbitmap
	blk-mq: Swap two calls in blk_mq_exit_queue()
	usb: dwc3: omap: improve extcon initialization
	usb: dwc3: pci: Enable usb2-gadget-lpm-disable for Intel Merrifield
	usb: xhci: Increase timeout for HC halt
	usb: dwc2: Fix gadget DMA unmap direction
	usb: core: hub: fix race condition about TRSMRCY of resume
	usb: dwc3: gadget: Enable suspend events
	usb: dwc3: gadget: Return success always for kick transfer in ep queue
	usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4
	usb: typec: ucsi: Put fwnode in any case during ->probe()
	xhci-pci: Allow host runtime PM as default for Intel Alder Lake xHCI
	xhci: Do not use GFP_KERNEL in (potentially) atomic context
	xhci: Add reset resume quirk for AMD xhci controller.
	iio: gyro: mpu3050: Fix reported temperature value
	iio: tsl2583: Fix division by a zero lux_val
	cdc-wdm: untangle a circular dependency between callback and softint
	xen/gntdev: fix gntdev_mmap() error exit path
	KVM: x86: Emulate RDPID only if RDTSCP is supported
	KVM: x86: Move RDPID emulation intercept to its own enum
	KVM: nVMX: Always make an attempt to map eVMCS after migration
	KVM: VMX: Do not advertise RDPID if ENABLE_RDTSCP control is unsupported
	KVM: VMX: Disable preemption when probing user return MSRs
	Revert "iommu/vt-d: Remove WO permissions on second-level paging entries"
	Revert "iommu/vt-d: Preset Access/Dirty bits for IOVA over FL"
	iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
	iommu/vt-d: Remove WO permissions on second-level paging entries
	mm: fix struct page layout on 32-bit systems
	MIPS: Reinstate platform `__div64_32' handler
	MIPS: Avoid DIVU in `__div64_32' is result would be zero
	MIPS: Avoid handcoded DIVU in `__div64_32' altogether
	clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue
	clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940
	ARM: 9011/1: centralize phys-to-virt conversion of DT/ATAGS address
	ARM: 9012/1: move device tree mapping out of linear region
	ARM: 9020/1: mm: use correct section size macro to describe the FDT virtual address
	ARM: 9027/1: head.S: explicitly map DT even if it lives in the first physical section
	usb: typec: tcpm: Fix error while calculating PPS out values
	kobject_uevent: remove warning in init_uevent_argv()
	drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp
	drm/i915: Read C0DRB3/C1DRB3 as 16 bits again
	drm/i915/overlay: Fix active retire callback alignment
	drm/i915: Fix crash in auto_retire
	clk: exynos7: Mark aclk_fsys1_200 as critical
	media: rkvdec: Remove of_match_ptr()
	i2c: mediatek: Fix send master code at more than 1MHz
	dt-bindings: media: renesas,vin: Make resets optional on R-Car Gen1
	dt-bindings: serial: 8250: Remove duplicated compatible strings
	debugfs: Make debugfs_allow RO after init
	ext4: fix debug format string warning
	nvme: do not try to reconfigure APST when the controller is not live
	ASoC: rsnd: check all BUSIF status when error
	Linux 5.10.38

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic57b1be8934d1c7e740960be4cb5e0bac085f001
2021-05-19 11:26:25 +02:00
Christoph Hellwig
b67c8ee2f2 UPSTREAM: module: unexport find_module and module_mutex
find_module is not used by modular code any more, and random driver code
has no business calling it to start with.

Bug: 157965270
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
(cherry picked from commit 089049f6c9)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: If736b690e00331156626ff41de224628f5aed7c5
2021-05-19 10:27:25 +02:00
David Hildenbrand
f665dedeed kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
[ Upstream commit 3c9c797534 ]

It used to be true that we can have system RAM (IORESOURCE_SYSTEM_RAM |
IORESOURCE_BUSY) only on the first level in the resource tree.  However,
this is no longer holds for driver-managed system RAM (i.e., added via
dax/kmem and virtio-mem), which gets added on lower levels, for example,
inside device containers.

IORESOURCE_SYSTEM_RAM is defined as IORESOURCE_MEM | IORESOURCE_SYSRAM and
just a special type of IORESOURCE_MEM.

The function walk_mem_res() only considers the first level and is used in
arch/x86/mm/ioremap.c:__ioremap_check_mem() only.  We currently fail to
identify System RAM added by dax/kmem and virtio-mem as
"IORES_MAP_SYSTEM_RAM", for example, allowing for remapping of such
"normal RAM" in __ioremap_caller().

Let's find all IORESOURCE_MEM | IORESOURCE_BUSY resources, making the
function behave similar to walk_system_ram_res().

Link: https://lkml.kernel.org/r/20210325115326.7826-3-david@redhat.com
Fixes: ebf71552bb ("virtio-mem: Add parent resource for all added "System RAM"")
Fixes: c221c0b030 ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:09 +02:00
David Hildenbrand
1ec1932552 kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
[ Upstream commit 97f61c8f44 ]

Patch series "kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree", v2.

Playing with kdump+virtio-mem I noticed that kexec_file_load() does not
consider System RAM added via dax/kmem and virtio-mem when preparing the
elf header for kdump.  Looking into the details, the logic used in
walk_system_ram_res() and walk_mem_res() seems to be outdated.

walk_system_ram_range() already does the right thing, let's change
walk_system_ram_res() and walk_mem_res(), and clean up.

Loading a kdump kernel via "kexec -p -s" ...  will result in the kdump
kernel to also dump dax/kmem and virtio-mem added System RAM now.

Note: kexec-tools on x86-64 also have to be updated to consider this
memory in the kexec_load() case when processing /proc/iomem.

This patch (of 3):

It used to be true that we can have system RAM (IORESOURCE_SYSTEM_RAM |
IORESOURCE_BUSY) only on the first level in the resource tree.  However,
this is no longer holds for driver-managed system RAM (i.e., added via
dax/kmem and virtio-mem), which gets added on lower levels, for example,
inside device containers.

We have two users of walk_system_ram_res(), which currently only
consideres the first level:

a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
   IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
   locate_mem_hole_callback(), so even after this change, we won't be
   placing kexec images onto dax/kmem and virtio-mem added memory.  No
   change.

b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
   not adding relevant ranges to the crash elf header, resulting in them
   not getting dumped via kdump.

This change fixes loading a crashkernel via kexec_file_load() and
including dax/kmem and virtio-mem added System RAM in the crashdump on
x86-64.  Note that e.g,, arm64 relies on memblock data and, therefore,
always considers all added System RAM already.

Let's find all IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY resources, making
the function behave like walk_system_ram_range().

Link: https://lkml.kernel.org/r/20210325115326.7826-1-david@redhat.com
Link: https://lkml.kernel.org/r/20210325115326.7826-2-david@redhat.com
Fixes: ebf71552bb ("virtio-mem: Add parent resource for all added "System RAM"")
Fixes: c221c0b030 ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:09 +02:00
Jia-Ju Bai
0886bb143c kernel: kexec_file: fix error return code of kexec_calculate_store_digests()
[ Upstream commit 31d82c2c78 ]

When vzalloc() returns NULL to sha_regions, no error return code of
kexec_calculate_store_digests() is assigned.  To fix this bug, ret is
assigned with -ENOMEM in this case.

Link: https://lkml.kernel.org/r/20210309083904.24321-1-baijiaju1990@gmail.com
Fixes: a43cac0d9d ("kexec: split kexec_file syscall code to kexec_file.c")
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:09 +02:00
Odin Ugedal
f89b408d50 sched/fair: Fix unfairness caused by missing load decay
[ Upstream commit 0258bdfaff ]

This fixes an issue where old load on a cfs_rq is not properly decayed,
resulting in strange behavior where fairness can decrease drastically.
Real workloads with equally weighted control groups have ended up
getting a respective 99% and 1%(!!) of cpu time.

When an idle task is attached to a cfs_rq by attaching a pid to a cgroup,
the old load of the task is attached to the new cfs_rq and sched_entity by
attach_entity_cfs_rq. If the task is then moved to another cpu (and
therefore cfs_rq) before being enqueued/woken up, the load will be moved
to cfs_rq->removed from the sched_entity. Such a move will happen when
enforcing a cpuset on the task (eg. via a cgroup) that force it to move.

The load will however not be removed from the task_group itself, making
it look like there is a constant load on that cfs_rq. This causes the
vruntime of tasks on other sibling cfs_rq's to increase faster than they
are supposed to; causing severe fairness issues. If no other task is
started on the given cfs_rq, and due to the cpuset it would not happen,
this load would never be properly unloaded. With this patch the load
will be properly removed inside update_blocked_averages. This also
applies to tasks moved to the fair scheduling class and moved to another
cpu, and this path will also fix that. For fork, the entity is queued
right away, so this problem does not affect that.

This applies to cases where the new process is the first in the cfs_rq,
issue introduced 3d30544f02 ("sched/fair: Apply more PELT fixes"), and
when there has previously been load on the cgroup but the cgroup was
removed from the leaflist due to having null PELT load, indroduced
in 039ae8bcf7 ("sched/fair: Fix O(nr_cgroups) in the load balancing
path").

For a simple cgroup hierarchy (as seen below) with two equally weighted
groups, that in theory should get 50/50 of cpu time each, it often leads
to a load of 60/40 or 70/30.

parent/
  cg-1/
    cpu.weight: 100
    cpuset.cpus: 1
  cg-2/
    cpu.weight: 100
    cpuset.cpus: 1

If the hierarchy is deeper (as seen below), while keeping cg-1 and cg-2
equally weighted, they should still get a 50/50 balance of cpu time.
This however sometimes results in a balance of 10/90 or 1/99(!!) between
the task groups.

$ ps u -C stress
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       18568  1.1  0.0   3684   100 pts/12   R+   13:36   0:00 stress --cpu 1
root       18580 99.3  0.0   3684   100 pts/12   R+   13:36   0:09 stress --cpu 1

parent/
  cg-1/
    cpu.weight: 100
    sub-group/
      cpu.weight: 1
      cpuset.cpus: 1
  cg-2/
    cpu.weight: 100
    sub-group/
      cpu.weight: 10000
      cpuset.cpus: 1

This can be reproduced by attaching an idle process to a cgroup and
moving it to a given cpuset before it wakes up. The issue is evident in
many (if not most) container runtimes, and has been reproduced
with both crun and runc (and therefore docker and all its "derivatives"),
and with both cgroup v1 and v2.

Fixes: 3d30544f02 ("sched/fair: Apply more PELT fixes")
Fixes: 039ae8bcf7 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210501141950.23622-2-odin@uged.al
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:09 +02:00
Quentin Perret
f7347c8549 sched: Fix out-of-bound access in uclamp
[ Upstream commit 6d2f8909a5 ]

Util-clamp places tasks in different buckets based on their clamp values
for performance reasons. However, the size of buckets is currently
computed using a rounding division, which can lead to an off-by-one
error in some configurations.

For instance, with 20 buckets, the bucket size will be 1024/20=51. A
task with a clamp of 1024 will be mapped to bucket id 1024/51=20. Sadly,
correct indexes are in range [0,19], hence leading to an out of bound
memory access.

Clamp the bucket id to fix the issue.

Fixes: 69842cba9a ("sched/uclamp: Add CPU's clamp buckets refcounting")
Suggested-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20210430151412.160913-1-qperret@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:09 +02:00
Claire Chang
a01572e21f swiotlb: Fix the type of index
[ Upstream commit 95b079d821 ]

Fix the type of index from unsigned int to int since find_slots() might
return -1.

Fixes: 26a7e09478 ("swiotlb: refactor swiotlb_tbl_map_single")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Claire Chang <tientzu@chromium.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:04 +02:00
Petr Mladek
5b66867966 watchdog: fix barriers when printing backtraces from all CPUs
[ Upstream commit 9f113bf760 ]

Any parallel softlockup reports are skipped when one CPU is already
printing backtraces from all CPUs.

The exclusive rights are synchronized using one bit in
soft_lockup_nmi_warn.  There is also one memory barrier that does not make
much sense.

Use two barriers on the right location to prevent mixing two reports.

[pmladek@suse.com: use bit lock operations to prevent multiple soft-lockup reports]
  Link: https://lkml.kernel.org/r/YFSVsLGVWMXTvlbk@alley

Link: https://lkml.kernel.org/r/20210311122130.6788-6-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:00 +02:00
Petr Mladek
a68c246065 watchdog/softlockup: remove logic that tried to prevent repeated reports
[ Upstream commit 1bc503cb4a ]

The softlockup detector does some gymnastic with the variable
soft_watchdog_warn.  It was added by the commit 58687acba5
("lockup_detector: Combine nmi_watchdog and softlockup detector").

The purpose is not completely clear.  There are the following clues.  They
describe the situation how it looked after the above mentioned commit:

  1. The variable was checked with a comment "only warn once".

  2. The variable was set when softlockup was reported. It was cleared
     only when the CPU was not longer in the softlockup state.

  3. watchdog_touch_ts was not explicitly updated when the softlockup
     was reported. Without this variable, the report would normally
     be printed again during every following watchdog_timer_fn()
     invocation.

The logic has got even more tangled up by the commit ed235875e2
("kernel/watchdog.c: print traces for all cpus on lockup detection").
After this commit, soft_watchdog_warn is set only when
softlockup_all_cpu_backtrace is enabled.  But multiple reports from all
CPUs are prevented by a new variable soft_lockup_nmi_warn.

Conclusion:

The variable probably never worked as intended.  In each case, it has not
worked last many years because the softlockup was reported repeatedly
after the full period defined by watchdog_thresh.

The reason is that watchdog gets touched in many known slow paths, for
example, in printk_stack_address().  This code is called also when
printing the softlockup report.  It means that the watchdog timestamp gets
updated after each report.

Solution:

Simply remove the logic. People want the periodic report anyway.

Link: https://lkml.kernel.org/r/20210311122130.6788-5-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:13:00 +02:00
Petr Mladek
9413b1ee38 watchdog: explicitly update timestamp when reporting softlockup
[ Upstream commit c9ad17c991 ]

The softlockup situation might stay for a long time or even forever.  When
it happens, the softlockup debug messages are printed in regular intervals
defined by get_softlockup_thresh().

There is a mystery.  The repeated message is printed after the full
interval that is defined by get_softlockup_thresh().  But the timer
callback is called more often as defined by sample_period.  The code looks
like the soflockup should get reported in every sample_period when it was
once behind the thresh.

It works only by chance.  The watchdog is touched when printing the stall
report, for example, in printk_stack_address().

Make the behavior clear and predictable by explicitly updating the
timestamp in watchdog_timer_fn() when the report gets printed.

Link: https://lkml.kernel.org/r/20210311122130.6788-3-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:12:59 +02:00
Petr Mladek
018655f875 watchdog: rename __touch_watchdog() to a better descriptive name
[ Upstream commit 7c0012f522 ]

Patch series "watchdog/softlockup: Report overall time and some cleanup", v2.

I dug deep into the softlockup watchdog history when time permitted this
year.  And reworked the patchset that fixed timestamps and cleaned up the
code[2].

I split it into very small steps and did even more code clean up.  The
result looks quite strightforward and I am pretty confident with the
changes.

[1] v2: https://lore.kernel.org/r/20201210160038.31441-1-pmladek@suse.com
[2] v1: https://lore.kernel.org/r/20191024114928.15377-1-pmladek@suse.com

This patch (of 6):

There are many touch_*watchdog() functions.  They are called in situations
where the watchdog could report false positives or create unnecessary
noise.  For example, when CPU is entering idle mode, a virtual machine is
stopped, or a lot of messages are printed in the atomic context.

These functions set SOFTLOCKUP_RESET instead of a real timestamp.  It
allows to call them even in a context where jiffies might be outdated.
For example, in an atomic context.

The real timestamp is set by __touch_watchdog() that is called from the
watchdog timer callback.

Rename this callback to update_touch_ts().  It better describes the effect
and clearly distinguish is from the other touch_*watchdog() functions.

Another motivation is that two timestamps are going to be used.  One will
be used for the total softlockup time.  The other will be used to measure
time since the last report.  The new function name will help to
distinguish which timestamp is being updated.

Link: https://lkml.kernel.org/r/20210311122130.6788-1-pmladek@suse.com
Link: https://lkml.kernel.org/r/20210311122130.6788-2-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19 10:12:59 +02:00
Greg Kroah-Hartman
0eda5f918b Merge 5.10.37 into android13-5.10
Changes in 5.10.37
	Bluetooth: verify AMP hci_chan before amp_destroy
	bluetooth: eliminate the potential race condition when removing the HCI controller
	net/nfc: fix use-after-free llcp_sock_bind/connect
	io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
	Revert "USB: cdc-acm: fix rounding error in TIOCSSERIAL"
	usb: roles: Call try_module_get() from usb_role_switch_find_by_fwnode()
	tty: moxa: fix TIOCSSERIAL jiffies conversions
	tty: amiserial: fix TIOCSSERIAL permission check
	USB: serial: usb_wwan: fix TIOCSSERIAL jiffies conversions
	staging: greybus: uart: fix TIOCSSERIAL jiffies conversions
	USB: serial: ti_usb_3410_5052: fix TIOCSSERIAL permission check
	staging: fwserial: fix TIOCSSERIAL jiffies conversions
	tty: moxa: fix TIOCSSERIAL permission check
	staging: fwserial: fix TIOCSSERIAL permission check
	drm: bridge: fix LONTIUM use of mipi_dsi_() functions
	usb: typec: tcpm: Address incorrect values of tcpm psy for fixed supply
	usb: typec: tcpm: Address incorrect values of tcpm psy for pps supply
	usb: typec: tcpm: update power supply once partner accepts
	usb: xhci-mtk: remove or operator for setting schedule parameters
	usb: xhci-mtk: improve bandwidth scheduling with TT
	ASoC: samsung: tm2_wm5110: check of of_parse return value
	ASoC: Intel: kbl_da7219_max98927: Fix kabylake_ssp_fixup function
	ASoC: tlv320aic32x4: Register clocks before registering component
	ASoC: tlv320aic32x4: Increase maximum register in regmap
	MIPS: pci-mt7620: fix PLL lock check
	MIPS: pci-rt2880: fix slot 0 configuration
	FDDI: defxx: Bail out gracefully with unassigned PCI resource for CSR
	PCI: Allow VPD access for QLogic ISP2722
	KVM: x86: Defer the MMU unload to the normal path on an global INVPCID
	PCI: xgene: Fix cfg resource mapping
	PCI: keystone: Let AM65 use the pci_ops defined in pcie-designware-host.c
	PM / devfreq: Unlock mutex and free devfreq struct in error path
	soc/tegra: regulators: Fix locking up when voltage-spread is out of range
	iio: inv_mpu6050: Fully validate gyro and accel scale writes
	iio:accel:adis16201: Fix wrong axis assignment that prevents loading
	iio:adc:ad7476: Fix remove handling
	sc16is7xx: Defer probe if device read fails
	phy: cadence: Sierra: Fix PHY power_on sequence
	misc: lis3lv02d: Fix false-positive WARN on various HP models
	phy: ti: j721e-wiz: Invoke wiz_init() before of_platform_device_create()
	misc: vmw_vmci: explicitly initialize vmci_notify_bm_set_msg struct
	misc: vmw_vmci: explicitly initialize vmci_datagram payload
	selinux: add proper NULL termination to the secclass_map permissions
	x86, sched: Treat Intel SNC topology as default, COD as exception
	async_xor: increase src_offs when dropping destination page
	md/bitmap: wait for external bitmap writes to complete during tear down
	md-cluster: fix use-after-free issue when removing rdev
	md: split mddev_find
	md: factor out a mddev_find_locked helper from mddev_find
	md: md_open returns -EBUSY when entering racing area
	md: Fix missing unused status line of /proc/mdstat
	mt76: mt7615: use ieee80211_free_txskb() in mt7615_tx_token_put()
	ipw2x00: potential buffer overflow in libipw_wx_set_encodeext()
	cfg80211: scan: drop entry from hidden_list on overflow
	rtw88: Fix array overrun in rtw_get_tx_power_params()
	mt76: fix potential DMA mapping leak
	FDDI: defxx: Make MMIO the configuration default except for EISA
	drm/i915/gvt: Fix virtual display setup for BXT/APL
	drm/i915/gvt: Fix vfio_edid issue for BXT/APL
	drm/qxl: use ttm bo priorities
	drm/panfrost: Clear MMU irqs before handling the fault
	drm/panfrost: Don't try to map pages that are already mapped
	drm/radeon: fix copy of uninitialized variable back to userspace
	drm/dp_mst: Revise broadcast msg lct & lcr
	drm/dp_mst: Set CLEAR_PAYLOAD_ID_TABLE as broadcast
	drm: bridge/panel: Cleanup connector on bridge detach
	drm/amd/display: Reject non-zero src_y and src_x for video planes
	drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2
	ALSA: hda/realtek: Re-order ALC882 Acer quirk table entries
	ALSA: hda/realtek: Re-order ALC882 Sony quirk table entries
	ALSA: hda/realtek: Re-order ALC882 Clevo quirk table entries
	ALSA: hda/realtek: Re-order ALC269 HP quirk table entries
	ALSA: hda/realtek: Re-order ALC269 Acer quirk table entries
	ALSA: hda/realtek: Re-order ALC269 Dell quirk table entries
	ALSA: hda/realtek: Re-order ALC269 ASUS quirk table entries
	ALSA: hda/realtek: Re-order ALC269 Sony quirk table entries
	ALSA: hda/realtek: Re-order ALC269 Lenovo quirk table entries
	ALSA: hda/realtek: Re-order remaining ALC269 quirk table entries
	ALSA: hda/realtek: Re-order ALC662 quirk table entries
	ALSA: hda/realtek: Remove redundant entry for ALC861 Haier/Uniwill devices
	ALSA: hda/realtek: ALC285 Thinkpad jack pin quirk is unreachable
	ALSA: hda/realtek: Fix speaker amp on HP Envy AiO 32
	KVM: s390: VSIE: correctly handle MVPG when in VSIE
	KVM: s390: split kvm_s390_logical_to_effective
	KVM: s390: fix guarded storage control register handling
	s390: fix detection of vector enhancements facility 1 vs. vector packed decimal facility
	KVM: s390: VSIE: fix MVPG handling for prefixing and MSO
	KVM: s390: split kvm_s390_real_to_abs
	KVM: s390: extend kvm_s390_shadow_fault to return entry pointer
	KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with 64-bit
	KVM: x86: Remove emulator's broken checks on CR0/CR3/CR4 loads
	KVM: nSVM: Set the shadow root level to the TDP level for nested NPT
	KVM: SVM: Don't strip the C-bit from CR2 on #PF interception
	KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created
	KVM: SVM: Inject #GP on guest MSR_TSC_AUX accesses if RDTSCP unsupported
	KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch
	KVM: nVMX: Truncate bits 63:32 of VMCS field on nested check in !64-bit
	KVM: nVMX: Truncate base/index GPR value on address calc in !64-bit
	KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read
	KVM: Destroy I/O bus devices on unregister failure _after_ sync'ing SRCU
	KVM: Stop looking for coalesced MMIO zones if the bus is destroyed
	KVM: arm64: Fully zero the vcpu state on reset
	KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
	Revert "drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit"
	Revert "i3c master: fix missing destroy_workqueue() on error in i3c_master_register"
	ovl: fix missing revert_creds() on error path
	Revert "drm/qxl: do not run release if qxl failed to init"
	usb: gadget: pch_udc: Revert d3cb25a121 completely
	Revert "tools/power turbostat: adjust for temperature offset"
	firmware: xilinx: Fix dereferencing freed memory
	firmware: xilinx: Add a blank line after function declaration
	firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE)
	fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER
	crypto: sun8i-ss - fix result memory leak on error path
	memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]
	ARM: dts: exynos: correct fuel gauge interrupt trigger level on GT-I9100
	ARM: dts: exynos: correct fuel gauge interrupt trigger level on Midas family
	ARM: dts: exynos: correct MUIC interrupt trigger level on Midas family
	ARM: dts: exynos: correct PMIC interrupt trigger level on Midas family
	ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid X/U3 family
	ARM: dts: exynos: correct PMIC interrupt trigger level on SMDK5250
	ARM: dts: exynos: correct PMIC interrupt trigger level on Snow
	ARM: dts: s5pv210: correct fuel gauge interrupt trigger level on Fascinate family
	ARM: dts: renesas: Add mmc aliases into R-Car Gen2 board dts files
	arm64: dts: renesas: Add mmc aliases into board dts files
	x86/platform/uv: Set section block size for hubless architectures
	serial: stm32: fix code cleaning warnings and checks
	serial: stm32: add "_usart" prefix in functions name
	serial: stm32: fix probe and remove order for dma
	serial: stm32: Use of_device_get_match_data()
	serial: stm32: fix startup by enabling usart for reception
	serial: stm32: fix incorrect characters on console
	serial: stm32: fix TX and RX FIFO thresholds
	serial: stm32: fix a deadlock condition with wakeup event
	serial: stm32: fix wake-up flag handling
	serial: stm32: fix a deadlock in set_termios
	serial: stm32: fix tx dma completion, release channel
	serial: stm32: call stm32_transmit_chars locked
	serial: stm32: fix FIFO flush in startup and set_termios
	serial: stm32: add FIFO flush when port is closed
	serial: stm32: fix tx_empty condition
	usb: typec: tcpci: Check ROLE_CONTROL while interpreting CC_STATUS
	usb: typec: tps6598x: Fix return value check in tps6598x_probe()
	usb: typec: stusb160x: fix return value check in stusb160x_probe()
	regmap: set debugfs_name to NULL after it is freed
	spi: rockchip: avoid objtool warning
	mtd: rawnand: fsmc: Fix error code in fsmc_nand_probe()
	mtd: rawnand: brcmnand: fix OOB R/W with Hamming ECC
	mtd: Handle possible -EPROBE_DEFER from parse_mtd_partitions()
	mtd: rawnand: qcom: Return actual error code instead of -ENODEV
	mtd: don't lock when recursively deleting partitions
	mtd: maps: fix error return code of physmap_flash_remove()
	ARM: dts: stm32: fix usart 2 & 3 pinconf to wake up with flow control
	arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity
	arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical timer
	arm64: dts: qcom: sdm845: fix number of pins in 'gpio-ranges'
	arm64: dts: qcom: sm8150: fix number of pins in 'gpio-ranges'
	arm64: dts: qcom: sm8250: fix number of pins in 'gpio-ranges'
	arm64: dts: qcom: db845c: fix correct powerdown pin for WSA881x
	crypto: sun8i-ss - Fix memory leak of object d when dma_iv fails to map
	spi: stm32: drop devres version of spi_register_master
	regulator: bd9576: Fix return from bd957x_probe()
	arm64: dts: renesas: r8a77980: Fix vin4-7 endpoint binding
	spi: stm32: Fix use-after-free on unbind
	x86/microcode: Check for offline CPUs before requesting new microcode
	devtmpfs: fix placement of complete() call
	usb: gadget: pch_udc: Replace cpu_to_le32() by lower_32_bits()
	usb: gadget: pch_udc: Check if driver is present before calling ->setup()
	usb: gadget: pch_udc: Check for DMA mapping error
	usb: gadget: pch_udc: Initialize device pointer before use
	usb: gadget: pch_udc: Provide a GPIO line used on Intel Minnowboard (v1)
	crypto: ccp - fix command queuing to TEE ring buffer
	crypto: qat - don't release uninitialized resources
	crypto: qat - ADF_STATUS_PF_RUNNING should be set after adf_dev_init
	fotg210-udc: Fix DMA on EP0 for length > max packet size
	fotg210-udc: Fix EP0 IN requests bigger than two packets
	fotg210-udc: Remove a dubious condition leading to fotg210_done
	fotg210-udc: Mask GRP2 interrupts we don't handle
	fotg210-udc: Don't DMA more than the buffer can take
	fotg210-udc: Complete OUT requests on short packets
	usb: gadget: s3c: Fix incorrect resources releasing
	usb: gadget: s3c: Fix the error handling path in 's3c2410_udc_probe()'
	dt-bindings: serial: stm32: Use 'type: object' instead of false for 'additionalProperties'
	mtd: require write permissions for locking and badblock ioctls
	arm64: dts: renesas: r8a779a0: Fix PMU interrupt
	bus: qcom: Put child node before return
	soundwire: bus: Fix device found flag correctly
	phy: ti: j721e-wiz: Delete "clk_div_sel" clk provider during cleanup
	phy: marvell: ARMADA375_USBCLUSTER_PHY should not default to y, unconditionally
	arm64: dts: mediatek: fix reset GPIO level on pumpkin
	NFSD: Fix sparse warning in nfs4proc.c
	NFSv4.2: fix copy stateid copying for the async copy
	crypto: poly1305 - fix poly1305_core_setkey() declaration
	crypto: qat - fix error path in adf_isr_resource_alloc()
	usb: gadget: aspeed: fix dma map failure
	USB: gadget: udc: fix wrong pointer passed to IS_ERR() and PTR_ERR()
	drivers: nvmem: Fix voltage settings for QTI qfprom-efuse
	driver core: platform: Declare early_platform_cleanup() prototype
	memory: pl353: fix mask of ECC page_size config register
	soundwire: stream: fix memory leak in stream config error path
	m68k: mvme147,mvme16x: Don't wipe PCC timer config bits
	firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool
	firmware: qcom_scm: Reduce locking section for __get_convention()
	firmware: qcom_scm: Workaround lack of "is available" call on SC7180
	iio: adc: Kconfig: make AD9467 depend on ADI_AXI_ADC symbol
	mtd: rawnand: gpmi: Fix a double free in gpmi_nand_init
	irqchip/gic-v3: Fix OF_BAD_ADDR error handling
	staging: comedi: tests: ni_routes_test: Fix compilation error
	staging: rtl8192u: Fix potential infinite loop
	staging: fwserial: fix TIOCSSERIAL implementation
	staging: fwserial: fix TIOCGSERIAL implementation
	staging: greybus: uart: fix unprivileged TIOCCSERIAL
	soc: qcom: pdr: Fix error return code in pdr_register_listener
	PM / devfreq: Use more accurate returned new_freq as resume_freq
	clocksource/drivers/timer-ti-dm: Fix posted mode status check order
	clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped
	clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe()
	spi: Fix use-after-free with devm_spi_alloc_*
	spi: fsl: add missing iounmap() on error in of_fsl_spi_probe()
	soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
	soc: qcom: mdt_loader: Detect truncated read of segments
	PM: runtime: Replace inline function pm_runtime_callbacks_present()
	cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration
	ACPI: CPPC: Replace cppc_attr with kobj_attribute
	crypto: allwinner - add missing CRYPTO_ prefix
	crypto: sun8i-ss - Fix memory leak of pad
	crypto: sa2ul - Fix memory leak of rxd
	crypto: qat - Fix a double free in adf_create_ring
	cpufreq: armada-37xx: Fix setting TBG parent for load levels
	clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
	cpufreq: armada-37xx: Fix the AVS value for load L1
	clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
	clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
	cpufreq: armada-37xx: Fix driver cleanup when registration failed
	cpufreq: armada-37xx: Fix determining base CPU frequency
	spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible
	spi: spi-zynqmp-gqspi: add mutex locking for exec_op
	spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality
	spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op
	spi: fsl-lpspi: Fix PM reference leak in lpspi_prepare_xfer_hardware()
	usb: gadget: r8a66597: Add missing null check on return from platform_get_resource
	USB: cdc-acm: fix unprivileged TIOCCSERIAL
	USB: cdc-acm: fix TIOCGSERIAL implementation
	tty: actually undefine superseded ASYNC flags
	tty: fix return value for unsupported ioctls
	tty: Remove dead termiox code
	tty: fix return value for unsupported termiox ioctls
	serial: core: return early on unsupported ioctls
	firmware: qcom-scm: Fix QCOM_SCM configuration
	node: fix device cleanups in error handling code
	crypto: chelsio - Read rxchannel-id from firmware
	usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
	m68k: Add missing mmap_read_lock() to sys_cacheflush()
	spi: spi-zynqmp-gqspi: Fix missing unlock on error in zynqmp_qspi_exec_op()
	memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
	memory: samsung: exynos5422-dmc: handle clk_set_parent() failure
	security: keys: trusted: fix TPM2 authorizations
	platform/x86: pmc_atom: Match all Beckhoff Automation baytrail boards with critclk_systems DMI table
	ARM: dts: aspeed: Rainier: Fix humidity sensor bus address
	Drivers: hv: vmbus: Use after free in __vmbus_open()
	spi: spi-zynqmp-gqspi: fix clk_enable/disable imbalance issue
	spi: spi-zynqmp-gqspi: fix hang issue when suspend/resume
	spi: spi-zynqmp-gqspi: fix use-after-free in zynqmp_qspi_exec_op
	spi: spi-zynqmp-gqspi: return -ENOMEM if dma_map_single fails
	x86/platform/uv: Fix !KEXEC build failure
	hwmon: (pmbus/pxe1610) don't bail out when not all pages are active
	Drivers: hv: vmbus: Increase wait time for VMbus unload
	PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check
	usb: dwc2: Fix host mode hibernation exit with remote wakeup flow.
	usb: dwc2: Fix hibernation between host and device modes.
	ttyprintk: Add TTY hangup callback.
	serial: omap: don't disable rs485 if rts gpio is missing
	serial: omap: fix rs485 half-duplex filtering
	xen-blkback: fix compatibility bug with single page rings
	soc: aspeed: fix a ternary sign expansion bug
	drm/tilcdc: send vblank event when disabling crtc
	drm/stm: Fix bus_flags handling
	drm/amd/display: Fix off by one in hdmi_14_process_transaction()
	drm/mcde/panel: Inverse misunderstood flag
	sched/fair: Fix shift-out-of-bounds in load_balance()
	afs: Fix updating of i_mode due to 3rd party change
	rcu: Remove spurious instrumentation_end() in rcu_nmi_enter()
	media: vivid: fix assignment of dev->fbuf_out_flags
	media: saa7134: use sg_dma_len when building pgtable
	media: saa7146: use sg_dma_len when building pgtable
	media: omap4iss: return error code when omap4iss_get() failed
	media: rkisp1: rsz: crash fix when setting src format
	media: aspeed: fix clock handling logic
	drm/probe-helper: Check epoch counter in output_poll_execute()
	media: venus: core: Fix some resource leaks in the error path of 'venus_probe()'
	media: platform: sunxi: sun6i-csi: fix error return code of sun6i_video_start_streaming()
	media: m88ds3103: fix return value check in m88ds3103_probe()
	media: docs: Fix data organization of MEDIA_BUS_FMT_RGB101010_1X30
	media: [next] staging: media: atomisp: fix memory leak of object flash
	media: atomisp: Fixed error handling path
	media: m88rs6000t: avoid potential out-of-bounds reads on arrays
	media: atomisp: Fix use after free in atomisp_alloc_css_stat_bufs()
	drm/amdkfd: fix build error with AMD_IOMMU_V2=m
	of: overlay: fix for_each_child.cocci warnings
	x86/kprobes: Fix to check non boostable prefixes correctly
	selftests: fix prepending $(OUTPUT) to $(TEST_PROGS)
	pata_arasan_cf: fix IRQ check
	pata_ipx4xx_cf: fix IRQ check
	sata_mv: add IRQ checks
	ata: libahci_platform: fix IRQ check
	seccomp: Fix CONFIG tests for Seccomp_filters
	nvme-tcp: block BH in sk state_change sk callback
	nvmet-tcp: fix incorrect locking in state_change sk callback
	clk: imx: Fix reparenting of UARTs not associated with stdout
	power: supply: bq25980: Move props from battery node
	nvme: retrigger ANA log update if group descriptor isn't found
	media: i2c: imx219: Move out locking/unlocking of vflip and hflip controls from imx219_set_stream
	media: i2c: imx219: Balance runtime PM use-count
	media: v4l2-ctrls.c: fix race condition in hdl->requests list
	vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
	vfio/pci: Move VGA and VF initialization to functions
	vfio/pci: Re-order vfio_pci_probe()
	vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
	clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback
	clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable
	drm: xlnx: zynqmp: fix a memset in zynqmp_dp_train()
	clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
	clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE
	drm/amd/display: use GFP_ATOMIC in dcn20_resource_construct
	drm/radeon: Fix a missing check bug in radeon_dp_mst_detect()
	clk: uniphier: Fix potential infinite loop
	scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()
	scsi: pm80xx: Fix potential infinite loop
	scsi: ufs: ufshcd-pltfrm: Fix deferred probing
	scsi: hisi_sas: Fix IRQ checks
	scsi: jazz_esp: Add IRQ check
	scsi: sun3x_esp: Add IRQ check
	scsi: sni_53c710: Add IRQ check
	scsi: ibmvfc: Fix invalid state machine BUG_ON()
	mailbox: sprd: Introduce refcnt when clients requests/free channels
	mfd: stm32-timers: Avoid clearing auto reload register
	nvmet-tcp: fix a segmentation fault during io parsing error
	nvme-pci: don't simple map sgl when sgls are disabled
	media: cedrus: Fix H265 status definitions
	HSI: core: fix resource leaks in hsi_add_client_from_dt()
	x86/events/amd/iommu: Fix sysfs type mismatch
	perf/amd/uncore: Fix sysfs type mismatch
	io_uring: fix overflows checks in provide buffers
	sched/debug: Fix cgroup_path[] serialization
	drivers/block/null_blk/main: Fix a double free in null_init.
	xsk: Respect device's headroom and tailroom on generic xmit path
	HID: plantronics: Workaround for double volume key presses
	perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of printed chars
	ASoC: Intel: boards: sof-wm8804: add check for PLL setting
	ASoC: Intel: Skylake: Compile when any configuration is selected
	RDMA/mlx5: Fix mlx5 rates to IB rates map
	wilc1000: write value to WILC_INTR2_ENABLE register
	KVM: x86/mmu: Retry page faults that hit an invalid memslot
	Bluetooth: avoid deadlock between hci_dev->lock and socket lock
	net: lapbether: Prevent racing when checking whether the netif is running
	libbpf: Add explicit padding to bpf_xdp_set_link_opts
	bpftool: Fix maybe-uninitialized warnings
	iommu: Check dev->iommu in iommu_dev_xxx functions
	iommu/vt-d: Reject unsupported page request modes
	selftests/bpf: Re-generate vmlinux.h and BPF skeletons if bpftool changed
	libbpf: Add explicit padding to btf_dump_emit_type_decl_opts
	powerpc/fadump: Mark fadump_calculate_reserve_size as __init
	powerpc/prom: Mark identical_pvr_fixup as __init
	MIPS: fix local_irq_{disable,enable} in asmmacro.h
	ima: Fix the error code for restoring the PCR value
	inet: use bigger hash table for IP ID generation
	pinctrl: pinctrl-single: remove unused parameter
	pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is not zero
	MIPS: loongson64: fix bug when PAGE_SIZE > 16KB
	ASoC: wm8960: Remove bitclk relax condition in wm8960_configure_sysclk
	iommu/arm-smmu-v3: add bit field SFM into GERROR_ERR_MASK
	RDMA/mlx5: Fix drop packet rule in egress table
	IB/isert: Fix a use after free in isert_connect_request
	powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration
	MIPS/bpf: Enable bpf_probe_read{, str}() on MIPS again
	gpio: guard gpiochip_irqchip_add_domain() with GPIOLIB_IRQCHIP
	ALSA: core: remove redundant spin_lock pair in snd_card_disconnect
	net: phy: lan87xx: fix access to wrong register of LAN87xx
	udp: never accept GSO_FRAGLIST packets
	powerpc/pseries: Only register vio drivers if vio bus exists
	net/tipc: fix missing destroy_workqueue() on error in tipc_crypto_start()
	bug: Remove redundant condition check in report_bug
	RDMA/core: Fix corrupted SL on passive side
	nfc: pn533: prevent potential memory corruption
	net: hns3: Limiting the scope of vector_ring_chain variable
	mips: bmips: fix syscon-reboot nodes
	iommu/vt-d: Don't set then clear private data in prq_event_thread()
	iommu: Fix a boundary issue to avoid performance drop
	iommu/vt-d: Report right snoop capability when using FL for IOVA
	iommu/vt-d: Report the right page fault address
	iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
	iommu/vt-d: Remove WO permissions on second-level paging entries
	iommu/vt-d: Invalidate PASID cache when root/context entry changed
	ALSA: usb-audio: Add error checks for usb_driver_claim_interface() calls
	HID: lenovo: Use brightness_set_blocking callback for setting LEDs brightness
	HID: lenovo: Fix lenovo_led_set_tp10ubkbd() error handling
	HID: lenovo: Check hid_get_drvdata() returns non NULL in lenovo_event()
	HID: lenovo: Map mic-mute button to KEY_F20 instead of KEY_MICMUTE
	KVM: arm64: Initialize VCPU mdcr_el2 before loading it
	ASoC: simple-card: fix possible uninitialized single_cpu local variable
	liquidio: Fix unintented sign extension of a left shift of a u16
	IB/hfi1: Use kzalloc() for mmu_rb_handler allocation
	powerpc/64s: Fix pte update for kernel memory on radix
	powerpc/perf: Fix PMU constraint check for EBB events
	powerpc: iommu: fix build when neither PCI or IBMVIO is set
	mac80211: bail out if cipher schemes are invalid
	perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
	xfs: fix return of uninitialized value in variable error
	rtw88: Fix an error code in rtw_debugfs_set_rsvd_page()
	mt7601u: fix always true expression
	mt76: mt7615: fix tx skb dma unmap
	mt76: mt7915: fix tx skb dma unmap
	mt76: mt7915: fix aggr len debugfs node
	mt76: mt7615: fix mib stats counter reporting to mac80211
	mt76: mt7915: fix mib stats counter reporting to mac80211
	mt76: mt7663s: make all of packets 4-bytes aligned in sdio tx aggregation
	mt76: mt7663s: fix the possible device hang in high traffic
	KVM: PPC: Book3S HV P9: Restore host CTRL SPR after guest exit
	ovl: invalidate readdir cache on changes to dir with origin
	RDMA/qedr: Fix error return code in qedr_iw_connect()
	IB/hfi1: Fix error return code in parse_platform_config()
	RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()
	cxgb4: Fix unintentional sign extension issues
	net: thunderx: Fix unintentional sign extension issue
	RDMA/srpt: Fix error return code in srpt_cm_req_recv()
	RDMA/rtrs-clt: destroy sysfs after removing session from active list
	i2c: cadence: fix reference leak when pm_runtime_get_sync fails
	i2c: img-scb: fix reference leak when pm_runtime_get_sync fails
	i2c: imx-lpi2c: fix reference leak when pm_runtime_get_sync fails
	i2c: imx: fix reference leak when pm_runtime_get_sync fails
	i2c: omap: fix reference leak when pm_runtime_get_sync fails
	i2c: sprd: fix reference leak when pm_runtime_get_sync fails
	i2c: stm32f7: fix reference leak when pm_runtime_get_sync fails
	i2c: xiic: fix reference leak when pm_runtime_get_sync fails
	i2c: cadence: add IRQ check
	i2c: emev2: add IRQ check
	i2c: jz4780: add IRQ check
	i2c: mlxbf: add IRQ check
	i2c: rcar: make sure irq is not threaded on Gen2 and earlier
	i2c: rcar: protect against supurious interrupts on V3U
	i2c: rcar: add IRQ check
	i2c: sh7760: add IRQ check
	powerpc/xive: Drop check on irq_data in xive_core_debug_show()
	powerpc/xive: Fix xmon command "dxi"
	ASoC: ak5558: correct reset polarity
	net/mlx5: Fix bit-wise and with zero
	net/packet: make packet_fanout.arr size configurable up to 64K
	net/packet: remove data races in fanout operations
	drm/i915/gvt: Fix error code in intel_gvt_init_device()
	iommu/amd: Put newline after closing bracket in warning
	perf beauty: Fix fsconfig generator
	drm/amd/pm: fix error code in smu_set_power_limit()
	MIPS: pci-legacy: stop using of_pci_range_to_resource
	powerpc/pseries: extract host bridge from pci_bus prior to bus removal
	powerpc/smp: Reintroduce cpu_core_mask
	KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid
	rtlwifi: 8821ae: upgrade PHY and RF parameters
	wlcore: fix overlapping snprintf arguments in debugfs
	i2c: sh7760: fix IRQ error path
	i2c: mediatek: Fix wrong dma sync flag
	mwl8k: Fix a double Free in mwl8k_probe_hw
	netfilter: nft_payload: fix C-VLAN offload support
	netfilter: nftables_offload: VLAN id needs host byteorder in flow dissector
	netfilter: nftables_offload: special ethertype handling for VLAN
	vsock/vmci: log once the failed queue pair allocation
	libbpf: Initialize the bpf_seq_printf parameters array field by field
	net: ethernet: ixp4xx: Set the DMA masks explicitly
	gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check
	RDMA/cxgb4: add missing qpid increment
	RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
	ALSA: usb: midi: don't return -ENOMEM when usb_urb_ep_type_check fails
	sfc: ef10: fix TX queue lookup in TX event handling
	vsock/virtio: free queued packets when closing socket
	net: marvell: prestera: fix port event handling on init
	net: davinci_emac: Fix incorrect masking of tx and rx error channel
	mt76: mt7615: fix memleak when mt7615_unregister_device()
	crypto: ccp: Detect and reject "invalid" addresses destined for PSP
	nfp: devlink: initialize the devlink port attribute "lanes"
	net: stmmac: fix TSO and TBS feature enabling during driver open
	net: renesas: ravb: Fix a stuck issue when a lot of frames are received
	net: phy: intel-xway: enable integrated led functions
	RDMA/rxe: Fix a bug in rxe_fill_ip_info()
	RDMA/core: Add CM to restrack after successful attachment to a device
	powerpc/64: Fix the definition of the fixmap area
	ath9k: Fix error check in ath9k_hw_read_revisions() for PCI devices
	ath10k: Fix a use after free in ath10k_htc_send_bundle
	ath10k: Fix ath10k_wmi_tlv_op_pull_peer_stats_info() unlock without lock
	wlcore: Fix buffer overrun by snprintf due to incorrect buffer size
	powerpc/perf: Fix the threshold event selection for memory events in power10
	powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')
	net: phy: marvell: fix m88e1011_set_downshift
	net: phy: marvell: fix m88e1111_set_downshift
	net: enetc: fix link error again
	bnxt_en: fix ternary sign extension bug in bnxt_show_temp()
	ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
	arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
	net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb
	selftests: net: mirror_gre_vlan_bridge_1q: Make an FDB entry static
	selftests: mlxsw: Remove a redundant if statement in tc_flower_scale test
	bnxt_en: Fix RX consumer index logic in the error path.
	KVM: VMX: Intercept FS/GS_BASE MSR accesses for 32-bit KVM
	net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
	selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
	selftests/bpf: Fix field existence CO-RE reloc tests
	selftests/bpf: Fix core_reloc test runner
	bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds
	RDMA/siw: Fix a use after free in siw_alloc_mr
	RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
	net: bridge: mcast: fix broken length + header check for MRDv6 Adv.
	net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
	perf tools: Change fields type in perf_record_time_conv
	perf jit: Let convert_timestamp() to be backwards-compatible
	perf session: Add swap operation for event TIME_CONV
	ia64: fix EFI_DEBUG build
	kfifo: fix ternary sign extension bugs
	mm/sl?b.c: remove ctor argument from kmem_cache_flags
	mm: memcontrol: slab: fix obtain a reference to a freeing memcg
	mm/sparse: add the missing sparse_buffer_fini() in error branch
	mm/memory-failure: unnecessary amount of unmapping
	afs: Fix speculative status fetches
	bpf: Fix alu32 const subreg bound tracking on bitwise operations
	bpf, ringbuf: Deny reserve of buffers larger than ringbuf
	bpf: Prevent writable memory-mapping of read-only ringbuf pages
	arm64: Remove arm64_dma32_phys_limit and its uses
	net: Only allow init netns to set default tcp cong to a restricted algo
	smp: Fix smp_call_function_single_async prototype
	Revert "net/sctp: fix race condition in sctp_destroy_sock"
	sctp: delay auto_asconf init until binding the first addr
	Linux 5.10.37

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie783c33e5795514a40672a8273d584ea970fb549
2021-05-14 10:31:07 +02:00
Arnd Bergmann
41f1aed56d smp: Fix smp_call_function_single_async prototype
commit 1139aeb1c5 upstream.

As of commit 966a967116 ("smp: Avoid using two cache lines for struct
call_single_data"), the smp code prefers 32-byte aligned call_single_data
objects for performance reasons, but the block layer includes an instance
of this structure in the main 'struct request' that is more senstive
to size than to performance here, see 4ccafe0320 ("block: unalign
call_single_data in struct request").

The result is a violation of the calling conventions that clang correctly
points out:

block/blk-mq.c:630:39: warning: passing 8-byte aligned argument to 32-byte aligned parameter 2 of 'smp_call_function_single_async' may result in an unaligned pointer access [-Walign-mismatch]
                smp_call_function_single_async(cpu, &rq->csd);

It does seem that the usage of the call_single_data without cache line
alignment should still be allowed by the smp code, so just change the
function prototype so it accepts both, but leave the default alignment
unchanged for the other users. This seems better to me than adding
a local hack to shut up an otherwise correct warning in the caller.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lkml.kernel.org/r/20210505211300.3174456-1-arnd@kernel.org
[nc: Fix conflicts, modify rq_csd_init]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14 09:50:46 +02:00
Andrii Nakryiko
00d9f429af bpf: Prevent writable memory-mapping of read-only ringbuf pages
commit 04ea3086c4 upstream.

Only the very first page of BPF ringbuf that contains consumer position
counter is supposed to be mapped as writeable by user-space. Producer
position is read-only and can be modified only by the kernel code. BPF ringbuf
data pages are read-only as well and are not meant to be modified by
user-code to maintain integrity of per-record headers.

This patch allows to map only consumer position page as writeable and
everything else is restricted to be read-only. remap_vmalloc_range()
internally adds VM_DONTEXPAND, so all the established memory mappings can't be
extended, which prevents any future violations through mremap()'ing.

Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Ryota Shiga (Flatt Security)
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14 09:50:46 +02:00
Thadeu Lima de Souza Cascardo
1ca284f086 bpf, ringbuf: Deny reserve of buffers larger than ringbuf
commit 4b81ccebae upstream.

A BPF program might try to reserve a buffer larger than the ringbuf size.
If the consumer pointer is way ahead of the producer, that would be
successfully reserved, allowing the BPF program to read or write out of
the ringbuf allocated area.

Reported-by: Ryota Shiga (Flatt Security)
Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14 09:50:45 +02:00
Daniel Borkmann
282bfc8848 bpf: Fix alu32 const subreg bound tracking on bitwise operations
commit 049c4e1371 upstream.

Fix a bug in the verifier's scalar32_min_max_*() functions which leads to
incorrect tracking of 32 bit bounds for the simulation of and/or/xor bitops.
When both the src & dst subreg is a known constant, then the assumption is
that scalar_min_max_*() will take care to update bounds correctly. However,
this is not the case, for example, consider a register R2 which has a tnum
of 0xffffffff00000000, meaning, lower 32 bits are known constant and in this
case of value 0x00000001. R2 is then and'ed with a register R3 which is a
64 bit known constant, here, 0x100000002.

What can be seen in line '10:' is that 32 bit bounds reach an invalid state
where {u,s}32_min_value > {u,s}32_max_value. The reason is scalar32_min_max_*()
delegates 32 bit bounds updates to scalar_min_max_*(), however, that really
only takes place when both the 64 bit src & dst register is a known constant.
Given scalar32_min_max_*() is intended to be designed as closely as possible
to scalar_min_max_*(), update the 32 bit bounds in this situation through
__mark_reg32_known() which will set all {u,s}32_{min,max}_value to the correct
constant, which is 0x00000000 after the fix (given 0x00000001 & 0x00000002 in
32 bit space). This is possible given var32_off already holds the final value
as dst_reg->var_off is updated before calling scalar32_min_max_*().

Before fix, invalid tracking of R2:

  [...]
  9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
  9: (5f) r2 &= r3
  10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=1,s32_max_value=0,u32_min_value=1,u32_max_value=0) R3_w=inv4294967298 R10=fp0
  [...]

After fix, correct tracking of R2:

  [...]
  9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
  9: (5f) r2 &= r3
  10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=0,s32_max_value=0,u32_min_value=0,u32_max_value=0) R3_w=inv4294967298 R10=fp0
  [...]

Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Fixes: 2921c90d47 ("bpf: Fix a verifier failure with xor")
Reported-by: Manfred Paul (@_manfp)
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-14 09:50:45 +02:00
Daniel Borkmann
4394be0a18 bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds
[ Upstream commit 10bf4e8316 ]

Similarly as b02709587e ("bpf: Fix propagation of 32-bit signed bounds
from 64-bit bounds."), we also need to fix the propagation of 32 bit
unsigned bounds from 64 bit counterparts. That is, really only set the
u32_{min,max}_value when /both/ {umin,umax}_value safely fit in 32 bit
space. For example, the register with a umin_value == 1 does /not/ imply
that u32_min_value is also equal to 1, since umax_value could be much
larger than 32 bit subregister can hold, and thus u32_min_value is in
the interval [0,1] instead.

Before fix, invalid tracking result of R2_w=inv1:

  [...]
  5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0
  5: (35) if r2 >= 0x1 goto pc+1
  [...] // goto path
  7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0
  7: (b6) if w2 <= 0x1 goto pc+1
  [...] // goto path
  9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smin_value=-9223372036854775807,smax_value=9223372032559808513,umin_value=1,umax_value=18446744069414584321,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_max_value=1) R10=fp0
  9: (bc) w2 = w2
  10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv1 R10=fp0
  [...]

After fix, correct tracking result of R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)):

  [...]
  5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0
  5: (35) if r2 >= 0x1 goto pc+1
  [...] // goto path
  7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0
  7: (b6) if w2 <= 0x1 goto pc+1
  [...] // goto path
  9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smax_value=9223372032559808513,umax_value=18446744069414584321,var_off=(0x0; 0xffffffff00000001),s32_min_value=0,s32_max_value=1,u32_max_value=1) R10=fp0
  9: (bc) w2 = w2
  10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)) R10=fp0
  [...]

Thus, same issue as in b02709587e holds for unsigned subregister tracking.
Also, align __reg64_bound_u32() similarly to __reg64_bound_s32() as done in
b02709587e to make them uniform again.

Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Manfred Paul (@_manfp)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:44 +02:00
Waiman Long
94f1bdf01b sched/debug: Fix cgroup_path[] serialization
[ Upstream commit ad789f84c9 ]

The handling of sysrq key can be activated by echoing the key to
/proc/sysrq-trigger or via the magic key sequence typed into a terminal
that is connected to the system in some way (serial, USB or other mean).
In the former case, the handling is done in a user context. In the
latter case, it is likely to be in an interrupt context.

Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is
taken with interrupt disabled for the whole duration of the calls to
print_*_stats() and print_rq() which could last for the quite some time
if the information dump happens on the serial console.

If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic
depending on the actually serial console implementation of the
system.

The purpose of sched_debug_lock is to serialize the use of the global
cgroup_path[] buffer in print_cpu(). The rests of the printk calls don't
need serialization from sched_debug_lock.

Calling printk() with interrupt disabled can still be problematic if
multiple instances are running. Allocating a stack buffer of PATH_MAX
bytes is not feasible because of the limited size of the kernel stack.

The solution implemented in this patch is to allow only one caller at a
time to use the full size group_path[], while other simultaneous callers
will have to use shorter stack buffers with the possibility of path
name truncation. A "..." suffix will be printed if truncation may have
happened.  The cgroup path name is provided for informational purpose
only, so occasional path name truncation should not be a big problem.

Fixes: efe25c2c7b ("sched: Reinstate group names in /proc/sched_debug")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210415195426.6677-1-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:28 +02:00
Zhouyi Zhou
7d81aff289 rcu: Remove spurious instrumentation_end() in rcu_nmi_enter()
[ Upstream commit 6494ccb932 ]

In rcu_nmi_enter(), there is an erroneous instrumentation_end() in the
second branch of the "if" statement.  Oddly enough, "objtool check -f
vmlinux.o" fails to complain because it is unable to correctly cover
all cases.  Instead, objtool visits the third branch first, which marks
following trace_rcu_dyntick() as visited.  This commit therefore removes
the spurious instrumentation_end().

Fixes: 04b25a495b ("rcu: Mark rcu_nmi_enter() call to rcu_cleanup_after_idle() noinstr")
Reported-by Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:22 +02:00
Valentin Schneider
80862cbf76 sched/fair: Fix shift-out-of-bounds in load_balance()
[ Upstream commit 39a2a6eb5c ]

Syzbot reported a handful of occurrences where an sd->nr_balance_failed can
grow to much higher values than one would expect.

A successful load_balance() resets it to 0; a failed one increments
it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an
active balance, which will either set it to sd->cache_nice_tries+1 or reset
it to 0. However, in case the to-be-active-balanced task is not allowed to
run on env->dst_cpu, then the increment is done without any further
modification.

This could then be repeated ad nauseam, and would explain the absurdly high
values reported by syzbot (86, 149). VincentG noted there is value in
letting sd->cache_nice_tries grow, so the shift itself should be
fixed. That means preventing:

  """
  If the value of the right operand is negative or is greater than or equal
  to the width of the promoted left operand, the behavior is undefined.
  """

Thus we need to cap the shift exponent to
  BITS_PER_TYPE(typeof(lefthand)) - 1.

I had a look around for other similar cases via coccinelle:

  @expr@
  position pos;
  expression E1;
  expression E2;
  @@
  (
  E1 >> E2@pos
  |
  E1 >> E2@pos
  )

  @cst depends on expr@
  position pos;
  expression expr.E1;
  constant cst;
  @@
  (
  E1 >> cst@pos
  |
  E1 << cst@pos
  )

  @script:python depends on !cst@
  pos << expr.pos;
  exp << expr.E2;
  @@
  # Dirty hack to ignore constexpr
  if exp.upper() != exp:
     coccilib.report.print_report(pos[0], "Possible UB shift here")

The only other match in kernel/sched is rq_clock_thermal() which employs
sched_thermal_decay_shift, and that exponent is already capped to 10, so
that one is fine.

Fixes: 5a7f555904 ("sched/fair: Relax constraint on task's load during load balance")
Reported-by: syzbot+d7581744d5fd27c9fbe1@syzkaller.appspotmail.com
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lore.kernel.org/r/000000000000ffac1205b9a2112f@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:22 +02:00
Greg Kroah-Hartman
d6eb3d6a61 Merge 5.10.36 into android13-5.10
Changes in 5.10.36
	bus: mhi: core: Fix check for syserr at power_up
	bus: mhi: core: Clear configuration from channel context during reset
	bus: mhi: core: Sanity check values from remote device before use
	nitro_enclaves: Fix stale file descriptors on failed usercopy
	dyndbg: fix parsing file query without a line-range suffix
	s390/disassembler: increase ebpf disasm buffer size
	s390/zcrypt: fix zcard and zqueue hot-unplug memleak
	vhost-vdpa: fix vm_flags for virtqueue doorbell mapping
	tpm: acpi: Check eventlog signature before using it
	ACPI: custom_method: fix potential use-after-free issue
	ACPI: custom_method: fix a possible memory leak
	ftrace: Handle commands when closing set_ftrace_filter file
	ARM: 9056/1: decompressor: fix BSS size calculation for LLVM ld.lld
	arm64: dts: marvell: armada-37xx: add syscon compatible to NB clk node
	arm64: dts: mt8173: fix property typo of 'phys' in dsi node
	ecryptfs: fix kernel panic with null dev_name
	fs/epoll: restore waking from ep_done_scan()
	mtd: spi-nor: core: Fix an issue of releasing resources during read/write
	Revert "mtd: spi-nor: macronix: Add support for mx25l51245g"
	mtd: spinand: core: add missing MODULE_DEVICE_TABLE()
	mtd: rawnand: atmel: Update ecc_stats.corrected counter
	mtd: physmap: physmap-bt1-rom: Fix unintentional stack access
	erofs: add unsupported inode i_format check
	spi: stm32-qspi: fix pm_runtime usage_count counter
	spi: spi-ti-qspi: Free DMA resources
	scsi: qla2xxx: Fix crash in qla2xxx_mqueuecommand()
	scsi: mpt3sas: Block PCI config access from userspace during reset
	mmc: uniphier-sd: Fix an error handling path in uniphier_sd_probe()
	mmc: uniphier-sd: Fix a resource leak in the remove function
	mmc: sdhci: Check for reset prior to DMA address unmap
	mmc: sdhci-pci: Fix initialization of some SD cards for Intel BYT-based controllers
	mmc: sdhci-tegra: Add required callbacks to set/clear CQE_EN bit
	mmc: block: Update ext_csd.cache_ctrl if it was written
	mmc: block: Issue a cache flush only when it's enabled
	mmc: core: Do a power cycle when the CMD11 fails
	mmc: core: Set read only for SD cards with permanent write protect bit
	mmc: core: Fix hanging on I/O during system suspend for removable cards
	irqchip/gic-v3: Do not enable irqs when handling spurious interrups
	cifs: Return correct error code from smb2_get_enc_key
	cifs: fix out-of-bound memory access when calling smb3_notify() at mount point
	cifs: detect dead connections only when echoes are enabled.
	smb2: fix use-after-free in smb2_ioctl_query_info()
	btrfs: handle remount to no compress during compression
	x86/build: Disable HIGHMEM64G selection for M486SX
	btrfs: fix metadata extent leak after failure to create subvolume
	intel_th: pci: Add Rocket Lake CPU support
	btrfs: fix race between transaction aborts and fsyncs leading to use-after-free
	posix-timers: Preserve return value in clock_adjtime32()
	fbdev: zero-fill colormap in fbcmap.c
	cpuidle: tegra: Fix C7 idling state on Tegra114
	bus: ti-sysc: Probe for l4_wkup and l4_cfg interconnect devices first
	staging: wimax/i2400m: fix byte-order issue
	spi: ath79: always call chipselect function
	spi: ath79: remove spi-master setup and cleanup assignment
	bus: mhi: core: Destroy SBL devices when moving to mission mode
	crypto: api - check for ERR pointers in crypto_destroy_tfm()
	crypto: qat - fix unmap invalid dma address
	usb: gadget: uvc: add bInterval checking for HS mode
	usb: webcam: Invalid size of Processing Unit Descriptor
	x86/sev: Do not require Hypervisor CPUID bit for SEV guests
	crypto: hisilicon/sec - fixes a printing error
	genirq/matrix: Prevent allocation counter corruption
	usb: gadget: f_uac2: validate input parameters
	usb: gadget: f_uac1: validate input parameters
	usb: dwc3: gadget: Ignore EP queue requests during bus reset
	usb: xhci: Fix port minor revision
	kselftest/arm64: mte: Fix compilation with native compiler
	ARM: tegra: acer-a500: Rename avdd to vdda of touchscreen node
	PCI: PM: Do not read power state in pci_enable_device_flags()
	kselftest/arm64: mte: Fix MTE feature detection
	ARM: dts: BCM5301X: fix "reg" formatting in /memory node
	ARM: dts: ux500: Fix up TVK R3 sensors
	x86/build: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)
	x86/boot: Add $(CLANG_FLAGS) to compressed KBUILD_CFLAGS
	efi/libstub: Add $(CLANG_FLAGS) to x86 flags
	soc/tegra: pmc: Fix completion of power-gate toggling
	arm64: dts: imx8mq-librem5-r3: Mark buck3 as always on
	tee: optee: do not check memref size on return from Secure World
	soundwire: cadence: only prepare attached devices on clock stop
	perf/arm_pmu_platform: Use dev_err_probe() for IRQ errors
	perf/arm_pmu_platform: Fix error handling
	random: initialize ChaCha20 constants with correct endianness
	usb: xhci-mtk: support quirk to disable usb2 lpm
	fpga: dfl: pci: add DID for D5005 PAC cards
	xhci: check port array allocation was successful before dereferencing it
	xhci: check control context is valid before dereferencing it.
	xhci: fix potential array out of bounds with several interrupters
	bus: mhi: core: Clear context for stopped channels from remove()
	ARM: dts: at91: change the key code of the gpio key
	tools/power/x86/intel-speed-select: Increase string size
	platform/x86: ISST: Account for increased timeout in some cases
	spi: dln2: Fix reference leak to master
	spi: omap-100k: Fix reference leak to master
	spi: qup: fix PM reference leak in spi_qup_remove()
	usb: gadget: tegra-xudc: Fix possible use-after-free in tegra_xudc_remove()
	usb: musb: fix PM reference leak in musb_irq_work()
	usb: core: hub: Fix PM reference leak in usb_port_resume()
	usb: dwc3: gadget: Check for disabled LPM quirk
	tty: n_gsm: check error while registering tty devices
	intel_th: Consistency and off-by-one fix
	phy: phy-twl4030-usb: Fix possible use-after-free in twl4030_usb_remove()
	crypto: sun8i-ss - Fix PM reference leak when pm_runtime_get_sync() fails
	crypto: sun8i-ce - Fix PM reference leak in sun8i_ce_probe()
	crypto: stm32/hash - Fix PM reference leak on stm32-hash.c
	crypto: stm32/cryp - Fix PM reference leak on stm32-cryp.c
	crypto: sa2ul - Fix PM reference leak in sa_ul_probe()
	crypto: omap-aes - Fix PM reference leak on omap-aes.c
	platform/x86: intel_pmc_core: Don't use global pmcdev in quirks
	spi: sync up initial chipselect state
	btrfs: do proper error handling in create_reloc_root
	btrfs: do proper error handling in btrfs_update_reloc_root
	btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s
	drm: Added orientation quirk for OneGX1 Pro
	drm/qxl: do not run release if qxl failed to init
	drm/qxl: release shadow on shutdown
	drm/ast: Fix invalid usage of AST_MAX_HWC_WIDTH in cursor atomic_check
	drm/amd/display: changing sr exit latency
	drm/ast: fix memory leak when unload the driver
	drm/amd/display: Check for DSC support instead of ASIC revision
	drm/amd/display: Don't optimize bandwidth before disabling planes
	drm/amdgpu/display: buffer INTERRUPT_LOW_IRQ_CONTEXT interrupt work
	drm/amd/display/dc/dce/dce_aux: Remove duplicate line causing 'field overwritten' issue
	scsi: lpfc: Fix incorrect dbde assignment when building target abts wqe
	scsi: lpfc: Fix pt2pt connection does not recover after LOGO
	drm/amdgpu: Fix some unload driver issues
	sched/pelt: Fix task util_est update filtering
	kvfree_rcu: Use same set of GFP flags as does single-argument
	scsi: target: pscsi: Fix warning in pscsi_complete_cmd()
	media: ite-cir: check for receive overflow
	media: drivers: media: pci: sta2x11: fix Kconfig dependency on GPIOLIB
	media: imx: capture: Return -EPIPE from __capture_legacy_try_fmt()
	atomisp: don't let it go past pipes array
	power: supply: bq27xxx: fix power_avg for newer ICs
	extcon: arizona: Fix some issues when HPDET IRQ fires after the jack has been unplugged
	extcon: arizona: Fix various races on driver unbind
	media: media/saa7164: fix saa7164_encoder_register() memory leak bugs
	media: gspca/sq905.c: fix uninitialized variable
	power: supply: Use IRQF_ONESHOT
	backlight: qcom-wled: Use sink_addr for sync toggle
	backlight: qcom-wled: Fix FSC update issue for WLED5
	drm/amdgpu: mask the xgmi number of hops reported from psp to kfd
	drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
	drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac
	drm/amd/pm: fix workload mismatch on vega10
	drm/amd/display: Fix UBSAN warning for not a valid value for type '_Bool'
	drm/amd/display: DCHUB underflow counter increasing in some scenarios
	drm/amd/display: fix dml prefetch validation
	scsi: qla2xxx: Always check the return value of qla24xx_get_isp_stats()
	drm/vkms: fix misuse of WARN_ON
	scsi: qla2xxx: Fix use after free in bsg
	mmc: sdhci-esdhc-imx: validate pinctrl before use it
	mmc: sdhci-pci: Add PCI IDs for Intel LKF
	mmc: sdhci-brcmstb: Remove CQE quirk
	ata: ahci: Disable SXS for Hisilicon Kunpeng920
	drm/komeda: Fix bit check to import to value of proper type
	nvmet: return proper error code from discovery ctrl
	selftests/resctrl: Enable gcc checks to detect buffer overflows
	selftests/resctrl: Fix compilation issues for global variables
	selftests/resctrl: Fix compilation issues for other global variables
	selftests/resctrl: Clean up resctrl features check
	selftests/resctrl: Fix missing options "-n" and "-p"
	selftests/resctrl: Use resctrl/info for feature detection
	selftests/resctrl: Fix incorrect parsing of iMC counters
	selftests/resctrl: Fix checking for < 0 for unsigned values
	power: supply: cpcap-charger: Add usleep to cpcap charger to avoid usb plug bounce
	scsi: smartpqi: Use host-wide tag space
	scsi: smartpqi: Correct request leakage during reset operations
	scsi: smartpqi: Add new PCI IDs
	scsi: scsi_dh_alua: Remove check for ASC 24h in alua_rtpg()
	media: em28xx: fix memory leak
	media: vivid: update EDID
	drm/msm/dp: Fix incorrect NULL check kbot warnings in DP driver
	clk: socfpga: arria10: Fix memory leak of socfpga_clk on error return
	power: supply: generic-adc-battery: fix possible use-after-free in gab_remove()
	power: supply: s3c_adc_battery: fix possible use-after-free in s3c_adc_bat_remove()
	media: tc358743: fix possible use-after-free in tc358743_remove()
	media: adv7604: fix possible use-after-free in adv76xx_remove()
	media: i2c: adv7511-v4l2: fix possible use-after-free in adv7511_remove()
	media: i2c: tda1997: Fix possible use-after-free in tda1997x_remove()
	media: i2c: adv7842: fix possible use-after-free in adv7842_remove()
	media: platform: sti: Fix runtime PM imbalance in regs_show
	media: sun8i-di: Fix runtime PM imbalance in deinterlace_start_streaming
	media: dvb-usb: fix memory leak in dvb_usb_adapter_init
	media: gscpa/stv06xx: fix memory leak
	sched/fair: Ignore percpu threads for imbalance pulls
	drm/msm/mdp5: Configure PP_SYNC_HEIGHT to double the vtotal
	drm/msm/mdp5: Do not multiply vclk line count by 100
	drm/amdgpu/ttm: Fix memory leak userptr pages
	drm/radeon/ttm: Fix memory leak userptr pages
	drm/amd/display: Fix debugfs link_settings entry
	drm/amd/display: Fix UBSAN: shift-out-of-bounds warning
	drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug
	amdgpu: avoid incorrect %hu format string
	drm/amd/display: Try YCbCr420 color when YCbCr444 fails
	drm/amdgpu: fix NULL pointer dereference
	scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response
	scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode
	scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic
	mfd: intel-m10-bmc: Fix the register access range
	mfd: da9063: Support SMBus and I2C mode
	mfd: arizona: Fix rumtime PM imbalance on error
	scsi: libfc: Fix a format specifier
	perf: Rework perf_event_exit_event()
	sched,fair: Alternative sched_slice()
	block/rnbd-clt: Fix missing a memory free when unloading the module
	s390/archrandom: add parameter check for s390_arch_random_generate
	sched,psi: Handle potential task count underflow bugs more gracefully
	power: supply: cpcap-battery: fix invalid usage of list cursor
	ALSA: emu8000: Fix a use after free in snd_emu8000_create_mixer
	ALSA: hda/conexant: Re-order CX5066 quirk table entries
	ALSA: sb: Fix two use after free in snd_sb_qsound_build
	ALSA: usb-audio: Explicitly set up the clock selector
	ALSA: usb-audio: Add dB range mapping for Sennheiser Communications Headset PC 8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 445 G7
	ALSA: hda/realtek: GA503 use same quirks as GA401
	ALSA: hda/realtek: fix mic boost on Intel NUC 8
	ALSA: hda/realtek - Headset Mic issue on HP platform
	ALSA: hda/realtek: fix static noise on ALC285 Lenovo laptops
	ALSA: hda/realtek: Add quirk for Intel Clevo PCx0Dx
	tools/power/turbostat: Fix turbostat for AMD Zen CPUs
	btrfs: fix race when picking most recent mod log operation for an old root
	arm64/vdso: Discard .note.gnu.property sections in vDSO
	Makefile: Move -Wno-unused-but-set-variable out of GCC only block
	fs: fix reporting supported extra file attributes for statx()
	virtiofs: fix memory leak in virtio_fs_probe()
	kcsan, debugfs: Move debugfs file creation out of early init
	ubifs: Only check replay with inode type to judge if inode linked
	f2fs: fix error handling in f2fs_end_enable_verity()
	f2fs: fix to avoid out-of-bounds memory access
	mlxsw: spectrum_mr: Update egress RIF list before route's action
	openvswitch: fix stack OOB read while fragmenting IPv4 packets
	ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure
	NFS: fs_context: validate UDP retrans to prevent shift out-of-bounds
	NFS: Don't discard pNFS layout segments that are marked for return
	NFSv4: Don't discard segments marked for return in _pnfs_return_layout()
	Input: ili210x - add missing negation for touch indication on ili210x
	jffs2: Fix kasan slab-out-of-bounds problem
	jffs2: Hook up splice_write callback
	powerpc/powernv: Enable HAIL (HV AIL) for ISA v3.1 processors
	powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
	powerpc/kexec_file: Use current CPU info while setting up FDT
	powerpc/32: Fix boot failure with CONFIG_STACKPROTECTOR
	powerpc: fix EDEADLOCK redefinition error in uapi/asm/errno.h
	intel_th: pci: Add Alder Lake-M support
	tpm: efi: Use local variable for calculating final log size
	tpm: vtpm_proxy: Avoid reading host log when using a virtual device
	crypto: arm/curve25519 - Move '.fpu' after '.arch'
	crypto: rng - fix crypto_rng_reset() refcounting when !CRYPTO_STATS
	md/raid1: properly indicate failure when ending a failed write request
	dm raid: fix inconclusive reshape layout on fast raid4/5/6 table reload sequences
	fuse: fix write deadlock
	exfat: fix erroneous discard when clear cluster bit
	sfc: farch: fix TX queue lookup in TX flush done handling
	sfc: farch: fix TX queue lookup in TX event handling
	security: commoncap: fix -Wstringop-overread warning
	Fix misc new gcc warnings
	jffs2: check the validity of dstlen in jffs2_zlib_compress()
	smb3: when mounting with multichannel include it in requested capabilities
	smb3: do not attempt multichannel to server which does not support it
	Revert 337f13046f ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op")
	futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI
	x86/cpu: Initialize MSR_TSC_AUX if RDTSCP *or* RDPID is supported
	kbuild: update config_data.gz only when the content of .config is changed
	ext4: annotate data race in start_this_handle()
	ext4: annotate data race in jbd2_journal_dirty_metadata()
	ext4: fix check to prevent false positive report of incorrect used inodes
	ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()
	ext4: fix error code in ext4_commit_super
	ext4: fix ext4_error_err save negative errno into superblock
	ext4: fix error return code in ext4_fc_perform_commit()
	ext4: allow the dax flag to be set and cleared on inline directories
	ext4: Fix occasional generic/418 failure
	media: dvbdev: Fix memory leak in dvb_media_device_free()
	media: dvb-usb: Fix use-after-free access
	media: dvb-usb: Fix memory leak at error in dvb_usb_device_init()
	media: staging/intel-ipu3: Fix memory leak in imu_fmt
	media: staging/intel-ipu3: Fix set_fmt error handling
	media: staging/intel-ipu3: Fix race condition during set_fmt
	media: v4l2-ctrls: fix reference to freed memory
	media: venus: hfi_parser: Don't initialize parser on v1
	usb: gadget: dummy_hcd: fix gpf in gadget_setup
	usb: gadget: Fix double free of device descriptor pointers
	usb: gadget/function/f_fs string table fix for multiple languages
	usb: dwc3: gadget: Remove FS bInterval_m1 limitation
	usb: dwc3: gadget: Fix START_TRANSFER link state check
	usb: dwc3: core: Do core softreset when switch mode
	usb: dwc2: Fix session request interrupt handler
	tty: fix memory leak in vc_deallocate
	rsi: Use resume_noirq for SDIO
	tools/power turbostat: Fix offset overflow issue in index converting
	tracing: Map all PIDs to command lines
	tracing: Restructure trace_clock_global() to never block
	dm persistent data: packed struct should have an aligned() attribute too
	dm space map common: fix division bug in sm_ll_find_free_block()
	dm integrity: fix missing goto in bitmap_flush_interval error handling
	dm rq: fix double free of blk_mq_tag_set in dev remove after table load fails
	lib/vsprintf.c: remove leftover 'f' and 'F' cases from bstr_printf()
	thermal/drivers/cpufreq_cooling: Fix slab OOB issue
	thermal/core/fair share: Lock the thermal zone while looping over instances
	Linux 5.10.36

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9161a5537b9f1cafd9b75b9253f06559ce135306
2021-05-11 15:08:50 +02:00
Steven Rostedt (VMware)
a33614d52e tracing: Restructure trace_clock_global() to never block
commit aafe104aa9 upstream.

It was reported that a fix to the ring buffer recursion detection would
cause a hung machine when performing suspend / resume testing. The
following backtrace was extracted from debugging that case:

Call Trace:
 trace_clock_global+0x91/0xa0
 __rb_reserve_next+0x237/0x460
 ring_buffer_lock_reserve+0x12a/0x3f0
 trace_buffer_lock_reserve+0x10/0x50
 __trace_graph_return+0x1f/0x80
 trace_graph_return+0xb7/0xf0
 ? trace_clock_global+0x91/0xa0
 ftrace_return_to_handler+0x8b/0xf0
 ? pv_hash+0xa0/0xa0
 return_to_handler+0x15/0x30
 ? ftrace_graph_caller+0xa0/0xa0
 ? trace_clock_global+0x91/0xa0
 ? __rb_reserve_next+0x237/0x460
 ? ring_buffer_lock_reserve+0x12a/0x3f0
 ? trace_event_buffer_lock_reserve+0x3c/0x120
 ? trace_event_buffer_reserve+0x6b/0xc0
 ? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0
 ? dpm_run_callback+0x3b/0xc0
 ? pm_ops_is_empty+0x50/0x50
 ? platform_get_irq_byname_optional+0x90/0x90
 ? trace_device_pm_callback_start+0x82/0xd0
 ? dpm_run_callback+0x49/0xc0

With the following RIP:

RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200

Since the fix to the recursion detection would allow a single recursion to
happen while tracing, this lead to the trace_clock_global() taking a spin
lock and then trying to take it again:

ring_buffer_lock_reserve() {
  trace_clock_global() {
    arch_spin_lock() {
      queued_spin_lock_slowpath() {
        /* lock taken */
        (something else gets traced by function graph tracer)
          ring_buffer_lock_reserve() {
            trace_clock_global() {
              arch_spin_lock() {
                queued_spin_lock_slowpath() {
                /* DEAD LOCK! */

Tracing should *never* block, as it can lead to strange lockups like the
above.

Restructure the trace_clock_global() code to instead of simply taking a
lock to update the recorded "prev_time" simply use it, as two events
happening on two different CPUs that calls this at the same time, really
doesn't matter which one goes first. Use a trylock to grab the lock for
updating the prev_time, and if it fails, simply try again the next time.
If it failed to be taken, that means something else is already updating
it.

Link: https://lkml.kernel.org/r/20210430121758.650b6e8a@gandalf.local.home

Cc: stable@vger.kernel.org
Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Fixes: b02414c8f0 ("ring-buffer: Fix recursion protection transitions between interrupt context") # started showing the problem
Fixes: 14131f2f98 ("tracing: implement trace_clock_*() APIs") # where the bug happened
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:40 +02:00
Steven Rostedt (VMware)
9e40ef5391 tracing: Map all PIDs to command lines
commit 785e3c0a3a upstream.

The default max PID is set by PID_MAX_DEFAULT, and the tracing
infrastructure uses this number to map PIDs to the comm names of the
tasks, such output of the trace can show names from the recorded PIDs in
the ring buffer. This mapping is also exported to user space via the
"saved_cmdlines" file in the tracefs directory.

But currently the mapping expects the PIDs to be less than
PID_MAX_DEFAULT, which is the default maximum and not the real maximum.
Recently, systemd will increases the maximum value of a PID on the system,
and when tasks are traced that have a PID higher than PID_MAX_DEFAULT, its
comm is not recorded. This leads to the entire trace to have "<...>" as
the comm name, which is pretty useless.

Instead, keep the array mapping the size of PID_MAX_DEFAULT, but instead
of just mapping the index to the comm, map a mask of the PID
(PID_MAX_DEFAULT - 1) to the comm, and find the full PID from the
map_cmdline_to_pid array (that already exists).

This bug goes back to the beginning of ftrace, but hasn't been an issue
until user space started increasing the maximum value of PIDs.

Link: https://lkml.kernel.org/r/20210427113207.3c601884@gandalf.local.home

Cc: stable@vger.kernel.org
Fixes: bc0c38d139 ("ftrace: latency tracer infrastructure")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:40 +02:00
Masahiro Yamada
a27aad3217 kbuild: update config_data.gz only when the content of .config is changed
commit 46b41d5dd8 upstream.

If the timestamp of the .config file is updated, config_data.gz is
regenerated, then vmlinux is re-linked. This occurs even if the content
of the .config has not changed at all.

This issue was mitigated by commit 67424f61f8 ("kconfig: do not write
.config if the content is the same"); Kconfig does not update the
.config when it ends up with the identical configuration.

The issue is remaining when the .config is created by *_defconfig with
some config fragment(s) applied on top.

This is typical for powerpc and mips, where several *_defconfig targets
are constructed by using merge_config.sh.

One workaround is to have the copy of the .config. The filechk rule
updates the copy, kernel/config_data, by checking the content instead
of the timestamp.

With this commit, the second run with the same configuration avoids
the needless rebuilds.

  $ make ARCH=mips defconfig all
   [ snip ]
  $ make ARCH=mips defconfig all
  *** Default configuration is based on target '32r2el_defconfig'
  Using ./arch/mips/configs/generic_defconfig as base
  Merging arch/mips/configs/generic/32r2.config
  Merging arch/mips/configs/generic/el.config
  Merging ./arch/mips/configs/generic/board-boston.config
  Merging ./arch/mips/configs/generic/board-ni169445.config
  Merging ./arch/mips/configs/generic/board-ocelot.config
  Merging ./arch/mips/configs/generic/board-ranchu.config
  Merging ./arch/mips/configs/generic/board-sead-3.config
  Merging ./arch/mips/configs/generic/board-xilfpga.config
  #
  # configuration written to .config
  #
    SYNC    include/config/auto.conf
    CALL    scripts/checksyscalls.sh
    CALL    scripts/atomic/check-atomics.sh
    CHK     include/generated/compile.h
    CHK     include/generated/autoksyms.h

Reported-by: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:37 +02:00
Thomas Gleixner
d19a456aca futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI
commit cdf78db407 upstream.

FUTEX_LOCK_PI does not require to have the FUTEX_CLOCK_REALTIME bit set
because it has been using CLOCK_REALTIME based absolute timeouts
forever. Due to that, the time namespace adjustment which is applied when
FUTEX_CLOCK_REALTIME is not set, will wrongly take place for FUTEX_LOCK_PI
and wreckage the timeout.

Exclude it from that procedure.

Fixes: c2f7d08ccc ("futex: Adjust absolute futex timeouts with per time namespace offset")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210422194704.984540159@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:37 +02:00
Thomas Gleixner
2543329485 Revert 337f13046f ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op")
commit 4fbf5d6837 upstream.

The FUTEX_WAIT operand has historically a relative timeout which means that
the clock id is irrelevant as relative timeouts on CLOCK_REALTIME are not
subject to wall clock changes and therefore are mapped by the kernel to
CLOCK_MONOTONIC for simplicity.

If a caller would set FUTEX_CLOCK_REALTIME for FUTEX_WAIT the timeout is
still treated relative vs. CLOCK_MONOTONIC and then the wait arms that
timeout based on CLOCK_REALTIME which is broken and obviously has never
been used or even tested.

Reject any attempt to use FUTEX_CLOCK_REALTIME with FUTEX_WAIT again.

The desired functionality can be achieved with FUTEX_WAIT_BITSET and a
FUTEX_BITSET_MATCH_ANY argument.

Fixes: 337f13046f ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210422194704.834797921@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:37 +02:00
Marco Elver
5a876a46d7 kcsan, debugfs: Move debugfs file creation out of early init
commit e36299efe7 upstream.

Commit 56348560d4 ("debugfs: do not attempt to create a new file
before the filesystem is initalized") forbids creating new debugfs files
until debugfs is fully initialized.  This means that KCSAN's debugfs
file creation, which happened at the end of __init(), no longer works.
And was apparently never supposed to work!

However, there is no reason to create KCSAN's debugfs file so early.
This commit therefore moves its creation to a late_initcall() callback.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable <stable@vger.kernel.org>
Fixes: 56348560d4 ("debugfs: do not attempt to create a new file before the filesystem is initalized")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:33 +02:00
Charan Teja Reddy
a15f68a5d5 sched,psi: Handle potential task count underflow bugs more gracefully
[ Upstream commit 9d10a13d1e ]

psi_group_cpu->tasks, represented by the unsigned int, stores the
number of tasks that could be stalled on a psi resource(io/mem/cpu).
Decrementing these counters at zero leads to wrapping which further
leads to the psi_group_cpu->state_mask is being set with the
respective pressure state. This could result into the unnecessary time
sampling for the pressure state thus cause the spurious psi events.
This can further lead to wrong actions being taken at the user land
based on these psi events.

Though psi_bug is set under these conditions but that just for debug
purpose. Fix it by decrementing the ->tasks count only when it is
non-zero.

Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/1618585336-37219-1-git-send-email-charante@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:31 +02:00
Peter Zijlstra
ae7fe4794d sched,fair: Alternative sched_slice()
[ Upstream commit 0c2de3f054 ]

The current sched_slice() seems to have issues; there's two possible
things that could be improved:

 - the 'nr_running' used for __sched_period() is daft when cgroups are
   considered. Using the RQ wide h_nr_running seems like a much more
   consistent number.

 - (esp) cgroups can slice it real fine, which makes for easy
   over-scheduling, ensure min_gran is what the name says.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210412102001.611897312@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:31 +02:00
Peter Zijlstra
94902ee299 perf: Rework perf_event_exit_event()
[ Upstream commit ef54c1a476 ]

Make perf_event_exit_event() more robust, such that we can use it from
other contexts. Specifically the up and coming remove_on_exec.

For this to work we need to address a few issues. Remove_on_exec will
not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to
disable event_function_call() and we thus have to use
perf_remove_from_context().

When using perf_remove_from_context(), there's two races to consider.
The first is against close(), where we can have concurrent tear-down
of the event. The second is against child_list iteration, which should
not find a half baked event.

To address this, teach perf_remove_from_context() to special case
!ctx->is_active and about DETACH_CHILD.

[ elver@google.com: fix racing parent/child exit in sync_child_event(). ]
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:31 +02:00
Lingutla Chandrasekhar
2f5f4cce49 sched/fair: Ignore percpu threads for imbalance pulls
[ Upstream commit 9bcb959d05 ]

During load balance, LBF_SOME_PINNED will be set if any candidate task
cannot be detached due to CPU affinity constraints. This can result in
setting env->sd->parent->sgc->group_imbalance, which can lead to a group
being classified as group_imbalanced (rather than any of the other, lower
group_type) when balancing at a higher level.

In workloads involving a single task per CPU, LBF_SOME_PINNED can often be
set due to per-CPU kthreads being the only other runnable tasks on any
given rq. This results in changing the group classification during
load-balance at higher levels when in reality there is nothing that can be
done for this affinity constraint: per-CPU kthreads, as the name implies,
don't get to move around (modulo hotplug shenanigans).

It's not as clear for userspace tasks - a task could be in an N-CPU cpuset
with N-1 offline CPUs, making it an "accidental" per-CPU task rather than
an intended one. KTHREAD_IS_PER_CPU gives us an indisputable signal which
we can leverage here to not set LBF_SOME_PINNED.

Note that the aforementioned classification to group_imbalance (when
nothing can be done) is especially problematic on big.LITTLE systems, which
have a topology the likes of:

  DIE [          ]
  MC  [    ][    ]
       0  1  2  3
       L  L  B  B

  arch_scale_cpu_capacity(L) < arch_scale_cpu_capacity(B)

Here, setting LBF_SOME_PINNED due to a per-CPU kthread when balancing at MC
level on CPUs [0-1] will subsequently prevent CPUs [2-3] from classifying
the [0-1] group as group_misfit_task when balancing at DIE level. Thus, if
CPUs [0-1] are running CPU-bound (misfit) tasks, ill-timed per-CPU kthreads
can significantly delay the upgmigration of said misfit tasks. Systems
relying on ASYM_PACKING are likely to face similar issues.

Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
[Use kthread_is_per_cpu() rather than p->nr_cpus_allowed]
[Reword changelog]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210407220628.3798191-2-valentin.schneider@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:29 +02:00
Uladzislau Rezki (Sony)
09a27d6620 kvfree_rcu: Use same set of GFP flags as does single-argument
[ Upstream commit ee6ddf5847 ]

Running an rcuscale stress-suite can lead to "Out of memory" of a
system. This can happen under high memory pressure with a small amount
of physical memory.

For example, a KVM test configuration with 64 CPUs and 512 megabytes
can result in OOM when running rcuscale with below parameters:

../kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig CONFIG_NR_CPUS=64 \
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 \
  rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make

<snip>
[   12.054448] kworker/1:1H invoked oom-killer: gfp_mask=0x2cc0(GFP_KERNEL|__GFP_NOWARN), order=0, oom_score_adj=0
[   12.055303] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[   12.055416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
[   12.056485] Workqueue: events_highpri fill_page_cache_func
[   12.056485] Call Trace:
[   12.056485]  dump_stack+0x57/0x6a
[   12.056485]  dump_header+0x4c/0x30a
[   12.056485]  ? del_timer_sync+0x20/0x30
[   12.056485]  out_of_memory.cold.47+0xa/0x7e
[   12.056485]  __alloc_pages_slowpath.constprop.123+0x82f/0xc00
[   12.056485]  __alloc_pages_nodemask+0x289/0x2c0
[   12.056485]  __get_free_pages+0x8/0x30
[   12.056485]  fill_page_cache_func+0x39/0xb0
[   12.056485]  process_one_work+0x1ed/0x3b0
[   12.056485]  ? process_one_work+0x3b0/0x3b0
[   12.060485]  worker_thread+0x28/0x3c0
[   12.060485]  ? process_one_work+0x3b0/0x3b0
[   12.060485]  kthread+0x138/0x160
[   12.060485]  ? kthread_park+0x80/0x80
[   12.060485]  ret_from_fork+0x22/0x30
[   12.062156] Mem-Info:
[   12.062350] active_anon:0 inactive_anon:0 isolated_anon:0
[   12.062350]  active_file:0 inactive_file:0 isolated_file:0
[   12.062350]  unevictable:0 dirty:0 writeback:0
[   12.062350]  slab_reclaimable:2797 slab_unreclaimable:80920
[   12.062350]  mapped:1 shmem:2 pagetables:8 bounce:0
[   12.062350]  free:10488 free_pcp:1227 free_cma:0
...
[   12.101610] Out of memory and no killable processes...
[   12.102042] Kernel panic - not syncing: System is deadlocked on memory
[   12.102583] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[   12.102600] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
<snip>

Because kvfree_rcu() has a fallback path, memory allocation failure is
not the end of the world.  Furthermore, the added overhead of aggressive
GFP settings must be balanced against the overhead of the fallback path,
which is a cache miss for double-argument kvfree_rcu() and a call to
synchronize_rcu() for single-argument kvfree_rcu().  The current choice
of GFP_KERNEL|__GFP_NOWARN can result in longer latencies than a call
to synchronize_rcu(), so less-tenacious GFP flags would be helpful.

Here is the tradeoff that must be balanced:
    a) Minimize use of the fallback path,
    b) Avoid pushing the system into OOM,
    c) Bound allocation latency to that of synchronize_rcu(), and
    d) Leave the emergency reserves to use cases lacking fallbacks.

This commit therefore changes GFP flags from GFP_KERNEL|__GFP_NOWARN to
GFP_KERNEL|__GFP_NORETRY|__GFP_NOMEMALLOC|__GFP_NOWARN.  This combination
leaves the emergency reserves alone and can initiate reclaim, but will
not invoke the OOM killer.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:23 +02:00
Vincent Donnefort
661af9371c sched/pelt: Fix task util_est update filtering
[ Upstream commit b89997aa88 ]

Being called for each dequeue, util_est reduces the number of its updates
by filtering out when the EWMA signal is different from the task util_avg
by less than 1%. It is a problem for a sudden util_avg ramp-up. Due to the
decay from a previous high util_avg, EWMA might now be close enough to
the new util_avg. No update would then happen while it would leave
ue.enqueued with an out-of-date value.

Taking into consideration the two util_est members, EWMA and enqueued for
the filtering, ensures, for both, an up-to-date value.

This is for now an issue only for the trace probe that might return the
stale value. Functional-wise, it isn't a problem, as the value is always
accessed through max(enqueued, ewma).

This problem has been observed using LISA's UtilConvergence:test_means on
the sd845c board.

No regression observed with Hackbench on sd845c and Perf-bench sched pipe
on hikey/hikey960.

Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210225165820.1377125-1-vincent.donnefort@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:23 +02:00
Vitaly Kuznetsov
df7452f03b genirq/matrix: Prevent allocation counter corruption
[ Upstream commit c93a5e20c3 ]

When irq_matrix_free() is called for an unallocated vector the
managed_allocated and total_allocated counters get out of sync with the
real state of the matrix. Later, when the last interrupt is freed, these
counters will underflow resulting in UINTMAX because the counters are
unsigned.

While this is certainly a problem of the calling code, this can be catched
in the allocator by checking the allocation bit for the to be freed vector
which simplifies debugging.

An example of the problem described above:
https://lore.kernel.org/lkml/20210318192819.636943062@linutronix.de/

Add the missing sanity check and emit a warning when it triggers.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210319111823.1105248-1-vkuznets@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:17 +02:00