In order to avoid exposing hypervisor (EL2) data structures directly to
the host, generate hyp_constants.h to provide constants such as structure
sizes to the host without dragging in the definitions themselves.
Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211202171048.26924-3-will@kernel.org
(cherry picked from commit ed4ed15d57
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I24957ea3ef1da8863a60dcf53c146b3a78f56fa5
asm/mmu.h refers to cpus_have_const_cap() in the definition of
arm64_kernel_unmapped_at_el0() so include asm/cpufeature.h directly
rather than force all users of the header to do it themselves.
Signed-off-by: Will Deacon <will@kernel.org>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211202171048.26924-2-will@kernel.org
(cherry picked from commit 7e04f05984
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Iaa42070f8f41255406b1031e5a59f58c06f47f5d
The only usage of kvm_io_gic_ops is to make a comparison with its
address and to pass its address to kvm_iodevice_init() which takes a
pointer to const kvm_io_device_ops as input. Make it const to allow the
compiler to put it in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211204213518.83642-1-rikard.falkeborn@gmail.com
(cherry picked from commit 636dcd0204
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I057a166181bea5855dd19be14971ac086e02ec12
When running a KVM guest hosted on an ARMv8.7 machine, the host
kernel complains that it doesn't know about the architected number
of events.
Fix it by adding the PMUver code corresponding to PMUv3 for ARMv8.7.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211126115533.217903-1-maz@kernel.org
(cherry picked from commit 00e228b315
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I705efed6bcdd2000a57901bd04ba080a36527ad4
With the transition to kvm_arch_vcpu_run_pid_change() to handle
the "run once" activities, it becomes obvious that has_run_once
is now an exact shadow of vcpu->pid.
Replace vcpu->arch.has_run_once with a new vcpu_has_run_once()
helper that directly checks for vcpu->pid, and get rid of the
now unused field.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit cc5705fb1b
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Iaecd0c5440ae929775fd43b7e9cfe71168b45911
The kvm_arch_vcpu_run_pid_change() helper gets called on each PID
change. The kvm_vcpu_first_run_init() helper gets run on the...
first run(!) of a vcpu.
As it turns out, the first run of a vcpu also triggers a PID change
event (vcpu->pid is initially NULL).
Use this property to merge these two helpers and get rid of another
arm64-specific oddity.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit b5aa368abf
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ie65247a0f1fb3bef49c2cdc1d6226836071554f0
Restructure kvm_vcpu_first_run_init() to set the has_run_once
flag after having completed all the "run once" activities.
This includes moving the flip of the userspace irqchip static key
to a point where nothing can fail.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 1408e73d21
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I034562031b0ad89815d2623da1fff8930b964694
Having kvm_arch_vcpu_run_pid_change() inline doesn't bring anything
to the table. Move it next to kvm_vcpu_first_run_init(), which will
be convenient for what is next to come.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 052f064d42
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I78e24d8bbfa44a4ebd96f6e1f1441079a627476a
We currently map the SVE state to HYP on detection of a PID change.
Although this matches what we do for FPSIMD, this is pretty pointless
for SVE, as the buffer is per-vcpu and has nothing to do with the
thread that is being run.
Move the mapping of the SVE state to finalize-time, which is where
we allocate the state memory, and thus the most logical place to
do this.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit bff01a61af
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
[willdeacon@: Fixed context conflict due to removal of EL2 thread_info mapping]
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I672f411b50a827a45d30ac5fb154c7f1a5102d7d
The bit of documentation that talks about TIF_FOREIGN_FPSTATE
does not mention the ungodly tricks that KVM plays with this flag.
Try and document this for the posterity.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 31aa126de8
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Iec0b06e35ad286d6bcea15745f2a1b160ff967cc
Now that we can track an equivalent of TIF_FOREIGN_FPSTATE, drop
the mapping of current's thread_info at EL2.
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit bee14bca73
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I8d113a0f7551302a03446f9cfac1248b0a975184
We currently have to maintain a mapping the thread_info structure
at EL2 in order to be able to check the TIF_FOREIGN_FPSTATE flag.
In order to eventually get rid of this, start with a vcpu flag that
shadows the thread flag on each entry into the hypervisor.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit af9a0e21d8
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I3a59991de7eca3a08fc3de9ddb11213d889165b5
Now that we don't have any users left for __sve_save_state, remove
it altogether. Should we ever need to save the SVE state from the
hypervisor again, we can always re-introduce it.
Suggested-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit e66425fc9b
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
[willdeacon@: Resolved conflict due to different __sve_save_state code]
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ie6a95dfad3e510361730713fa92a61fcf9f22a7e
The SVE host tracking in KVM is pretty involved. It relies on a
set of flags tracking the ownership of the SVE register, as well
as that of the EL0 access.
It is also pretty scary: __hyp_sve_save_host() computes
a thread_struct pointer and obtains a sve_state which gets directly
accessed without further ado, even on nVHE. How can this even work?
The answer to that is that it doesn't, and that this is mostly dead
code. Closer examination shows that on executing a syscall, userspace
loses its SVE state entirely. This is part of the ABI. Another
thing to notice is that although the kernel provides helpers such as
kernel_neon_begin()/end(), they only deal with the FP/NEON state,
and not SVE.
Given that you can only execute a guest as the result of a syscall,
and that the kernel cannot use SVE by itself, it becomes pretty
obvious that there is never any host SVE state to save, and that
this code is only there to increase confusion.
Get rid of the TIF_SVE tracking and host save infrastructure altogether.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 8383741ab2
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I8f26c83393bac40056ce849a1082b7516130cb0a
The vcpu arch flags are in an interesting, semi random order.
As I have made the mistake of reusing a flag once, let's rework
this in an order that I find a bit less confusing.
Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 892fd259cb
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next)
Bug: 209777660
[willdeacon@: Remove KVM_GUESTDBG_VALID_MASK definition from guest.c]
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I79f0f8de29bb111d95a923a744055d69e0dbad60
If the UFS Device WLUN is runtime suspended and is in the same power mode,
link state, and b_rpm_dev_flush_capable (BKOP or WB buffer flush etc)
state, then it can remain runtime suspended instead of being runtime
resumed and then system suspended.
The following patch has cleared the way for that to happen:
scsi: core: pm: Only runtime resume if necessary
So amend the logic accordingly.
Note, the ufs-hisi driver uses different RPM and SPM, but it is made
explicit by a new parameter to suspend prepare.
Link: https://lore.kernel.org/r/20211027130614.406985-2-adrian.hunter@intel.com
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ddba1cf7a5)
Bug: 209633402
Change-Id: I05bfc6091323dc5283763f1e192fce58dbda217a
Signed-off-by: Bart Van Assche <bvanassche@google.com>
The following query shows which drivers define callbacks that are called by
the power management support code in the SCSI core (scsi_pm.c):
$ git grep -nHEwA16 "$(echo $(git grep -h 'scsi_register_driver(&' |
sed 's/.*&//;s/\..*//') | sed 's/ /|/g')" |
grep '\.pm[[:blank:]]*=[[:blank:]]'
drivers/scsi/sd.c-620- .pm = &sd_pm_ops,
drivers/scsi/sr.c-100- .pm = &sr_pm_ops,
drivers/scsi/ufs/ufshcd.c-9765- .pm = &ufshcd_wl_pm_ops,
Since unconditionally runtime resuming a device during system resume is not
necessary, remove that code. Modify the SCSI disk (sd) driver such that it
follows the same approach as the UFS driver, namely to skip system suspend
and resume for devices that are runtime suspended. The CD-ROM code does not
need to be updated since its PM callbacks do not affect the device power
state.
This patch has been tested as follows:
[ shell 1 ]
cd /sys/kernel/debug/tracing
grep -E 'blk_(pre|post)_runtime|runtime_(suspend|resume)|autosuspend_delay|pm_runtime_(get|put)' available_filter_functions |
while read a b; do echo "$a"; done |
grep -v __pm_runtime_resume >set_ftrace_filter
echo function > current_tracer
echo 1 > tracing_on
cat trace_pipe
[ shell 2 ]
cd /sys/block/sr0
# Increase the event poll interval to make it easier to derive from the
# tracing output whether runtime power actions are the result of sg_inq.
echo 30000 > events_poll_msecs
cd device/power
# Enable runtime power management.
echo auto > control
echo 1000 > autosuspend_delay_ms
sleep 1
# Verify in shell 1 that sr0 has been runtime suspended
sg_inq /dev/sr0
eject /dev/sr0
sg_inq /dev/sr0
# Disable runtime power management.
echo on > control
cd /sys/block/sda/device/power
echo auto > control
echo 1000 > autosuspend_delay_ms
sleep 1
# Verify in shell 1 that sr0 has been runtime suspended
sg_inq /dev/sda
Link: https://lore.kernel.org/r/20211006215453.3318929-4-bvanassche@acm.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9131bff6a9)
Bug: 209633402
Test: cd /sys/devices/platform/14700000.ufs/host0/target0:0:0 && for f in */power/control; do echo auto >$f; done && for f in */power/autosuspend_delay_ms; do echo 100 >$f; done && cd /sys/kernel/tracing/events/rpm && for e in rpm_idle rpm_suspend rpm_resume; do echo 1 > $e/enable && echo 'name ~ "target*" || name ~ "?:*"' > $e/filter; done && cd ../.. && echo 1 > tracing_on && cat trace_pipe
Change-Id: I4f43b7df426395009cc555509575e58967b5e040
Signed-off-by: Bart Van Assche <bvanassche@google.com>
For SD card reader devices that have the BLIST_IGN_MEDIA_CHANGE flag
set, a MEDIUM MAY HAVE CHANGED unit attention is established after
resuming from runtime suspend. Send a REQUEST SENSE to consume the UA.
The "downside" is that for these devices we now rely on users to not
change the medium (SD card) *during* a runtime suspend/resume cycle,
i.e. when not unmounting.
To enable runtime PM for an SD cardreader (device number 0:0:0:0), do:
echo 0 > /sys/module/block/parameters/events_dfl_poll_msecs
echo 1000 > /sys/bus/scsi/devices/0:0:0:0/power/autosuspend_delay_ms
echo auto > /sys/bus/scsi/devices/0:0:0:0/power/control
[mkp: use scsi_device flag instead of poking at BLIST]
Link: https://lore.kernel.org/r/20210704075403.147114-3-martin.kepplinger@puri.sm
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ed4246d37f)
Bug: 209633402
Change-Id: I2eb06ce9d972870bc4cf7d3fe3a1a586af6f7242
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Patch https://android-review.googlesource.com/c/kernel/common/+/1763479
reverted an upstream patch because at the time of the revert the UFS
driver did not yet support runtime power management (RPM) correctly.
Now that the UFS driver supports RPM, switch back to the upstream
implementation of RPM.
Bug: 209633402
Change-Id: I0b51043f2cd13550c511cbdc80c5684917259d01
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Changes in 5.10.84
NFSv42: Fix pagecache invalidation after COPY/CLONE
can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM
ovl: simplify file splice
ovl: fix deadlock in splice write
gfs2: release iopen glock early in evict
gfs2: Fix length of holes reported at end-of-file
powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window for persistent memory"
drm/sun4i: fix unmet dependency on RESET_CONTROLLER for PHY_SUN6I_MIPI_DPHY
mac80211: do not access the IV when it was stripped
net/smc: Transfer remaining wait queue entries during fallback
atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_wait
net: return correct error code
platform/x86: thinkpad_acpi: Add support for dual fan control
platform/x86: thinkpad_acpi: Fix WWAN device disabled issue after S3 deep
s390/setup: avoid using memblock_enforce_memory_limit
btrfs: check-integrity: fix a warning on write caching disabled disk
thermal: core: Reset previous low and high trip during thermal zone init
scsi: iscsi: Unblock session then wake up error handler
drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered again
drm/amd/amdgpu: fix potential memleak
ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile
ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
ipv6: check return value of ipv6_skip_exthdr
net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
perf inject: Fix ARM SPE handling
perf hist: Fix memory leak of a perf_hpp_fmt
perf report: Fix memory leaks around perf_tip()
net/smc: Avoid warning of possible recursive locking
ACPI: Add stubs for wakeup handler functions
vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
kprobes: Limit max data_size of the kretprobe instances
rt2x00: do not mark device gone on EPROTO errors during start
ipmi: Move remove_work to dedicated workqueue
cpufreq: Fix get_cpu_device() failure in add_cpu_dev_symlink()
s390/pci: move pseudo-MMIO to prevent MIO overlap
fget: check that the fd still exists after getting a ref to it
sata_fsl: fix UAF in sata_fsl_port_stop when rmmod sata_fsl
sata_fsl: fix warning in remove_proc_entry when rmmod sata_fsl
ipv6: fix memory leak in fib6_rule_suppress
drm/amd/display: Allow DSC on supported MST branch devices
KVM: Disallow user memslot with size that exceeds "unsigned long"
KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST
KVM: x86: Use a stable condition around all VT-d PI paths
KVM: arm64: Avoid setting the upper 32 bits of TCR_EL2 and CPTR_EL2 to 1
KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
tracing/histograms: String compares should not care about signed values
wireguard: selftests: increase default dmesg log size
wireguard: allowedips: add missing __rcu annotation to satisfy sparse
wireguard: selftests: actually test for routing loops
wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLIST
wireguard: device: reset peer src endpoint when netns exits
wireguard: receive: use ring buffer for incoming handshakes
wireguard: receive: drop handshakes if queue lock is contended
wireguard: ratelimiter: use kvcalloc() instead of kvzalloc()
i2c: stm32f7: flush TX FIFO upon transfer errors
i2c: stm32f7: recover the bus on access timeout
i2c: stm32f7: stop dma transfer in case of NACK
i2c: cbus-gpio: set atomic transfer callback
natsemi: xtensa: fix section mismatch warnings
tcp: fix page frag corruption on page fault
net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
net: mpls: Fix notifications when deleting a device
siphash: use _unaligned version by default
arm64: ftrace: add missing BTIs
net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
selftests: net: Correct case name
mt76: mt7915: fix NULL pointer dereference in mt7915_get_phy_mode
ASoC: tegra: Fix wrong value type in ADMAIF
ASoC: tegra: Fix wrong value type in I2S
ASoC: tegra: Fix wrong value type in DMIC
ASoC: tegra: Fix wrong value type in DSPK
ASoC: tegra: Fix kcontrol put callback in ADMAIF
ASoC: tegra: Fix kcontrol put callback in I2S
ASoC: tegra: Fix kcontrol put callback in DMIC
ASoC: tegra: Fix kcontrol put callback in DSPK
ASoC: tegra: Fix kcontrol put callback in AHUB
rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle()
rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer()
ALSA: intel-dsp-config: add quirk for CML devices based on ES8336 codec
net: usb: lan78xx: lan78xx_phy_init(): use PHY_POLL instead of "0" if no IRQ is available
net: marvell: mvpp2: Fix the computation of shared CPUs
dpaa2-eth: destroy workqueue at the end of remove function
net: annotate data-races on txq->xmit_lock_owner
ipv4: convert fib_num_tclassid_users to atomic_t
net/smc: fix wrong list_del in smc_lgr_cleanup_early
net/rds: correct socket tunable error in rds_tcp_tune()
net/smc: Keep smc_close_final rc during active close
drm/msm/a6xx: Allocate enough space for GMU registers
drm/msm: Do hw_init() before capturing GPU state
atlantic: Increase delay for fw transactions
atlatnic: enable Nbase-t speeds with base-t
atlantic: Fix to display FW bundle version instead of FW mac version.
atlantic: Add missing DIDs and fix 115c.
Remove Half duplex mode speed capabilities.
atlantic: Fix statistics logic for production hardware
atlantic: Remove warn trace message.
KVM: x86/pmu: Fix reserved bits for AMD PerfEvtSeln register
KVM: VMX: Set failure code in prepare_vmcs02()
x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword
x86/entry: Use the correct fence macro after swapgs in kernel CR3
x86/xen: Add xenpv_restore_regs_and_return_to_usermode()
sched/uclamp: Fix rq->uclamp_max not set on first enqueue
x86/pv: Switch SWAPGS to ALTERNATIVE
x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry()
parisc: Fix KBUILD_IMAGE for self-extracting kernel
parisc: Fix "make install" on newer debian releases
vgacon: Propagate console boot parameters before calling `vc_resize'
xhci: Fix commad ring abort, write all 64 bits to CRCR register.
USB: NO_LPM quirk Lenovo Powered USB-C Travel Hub
usb: typec: tcpm: Wait in SNK_DEBOUNCED until disconnect
x86/tsc: Add a timer to make sure TSC_adjust is always checked
x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
x86/64/mm: Map all kernel memory into trampoline_pgd
tty: serial: msm_serial: Deactivate RX DMA for polling support
serial: pl011: Add ACPI SBSA UART match id
serial: tegra: Change lower tolerance baud rate limit for tegra20 and tegra30
serial: core: fix transmit-buffer reset and memleak
serial: 8250_pci: Fix ACCES entries in pci_serial_quirks array
serial: 8250_pci: rewrite pericom_do_set_divisor()
serial: 8250: Fix RTS modem control while in rs485 mode
iwlwifi: mvm: retry init flow if failed
parisc: Mark cr16 CPU clocksource unstable on all SMP machines
net/tls: Fix authentication failure in CCM mode
ipmi: msghandler: Make symbol 'remove_work_wq' static
Linux 5.10.84
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I90caaa6bd343e4180abcf8904a06c7ccc7b7b582
commit 5961060692 upstream.
When the TLS cipher suite uses CCM mode, including AES CCM and
SM4 CCM, the first byte of the B0 block is flags, and the real
IV starts from the second byte. The XOR operation of the IV and
rec_seq should be skip this byte, that is, add the iv_offset.
Fixes: f295b3ae9f ("net/tls: Add support of AES128-CCM based ciphers")
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Cc: Vakul Garg <vakul.garg@nxp.com>
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit afdb4a5b1d upstream.
In commit c8c3735997 ("parisc: Enhance detection of synchronous cr16
clocksources") I assumed that CPUs on the same physical core are syncronous.
While booting up the kernel on two different C8000 machines, one with a
dual-core PA8800 and one with a dual-core PA8900 CPU, this turned out to be
wrong. The symptom was that I saw a jump in the internal clocks printed to the
syslog and strange overall behaviour. On machines which have 4 cores (2
dual-cores) the problem isn't visible, because the current logic already marked
the cr16 clocksource unstable in this case.
This patch now marks the cr16 interval timers unstable if we have more than one
CPU in the system, and it fixes this issue.
Fixes: c8c3735997 ("parisc: Enhance detection of synchronous cr16 clocksources")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f85e04503f upstream.
Commit f45709df77 ("serial: 8250: Don't touch RTS modem control while
in rs485 mode") sought to prevent user space from interfering with rs485
communication by ignoring a TIOCMSET ioctl() which changes RTS polarity.
It did so in serial8250_do_set_mctrl(), which turns out to be too deep
in the call stack: When a uart_port is opened, RTS polarity is set by
the rs485-aware function uart_port_dtr_rts(). It calls down to
serial8250_do_set_mctrl() and that particular RTS polarity change should
*not* be ignored.
The user-visible result is that on 8250_omap ports which use rs485 with
inverse polarity (RTS bit in MCR register is 1 to receive, 0 to send),
a newly opened port initially sets up RTS for sending instead of
receiving. That's because omap_8250_startup() sets the cached value
up->mcr to 0 and omap_8250_restore_regs() subsequently writes it to the
MCR register. Due to the commit, serial8250_do_set_mctrl() preserves
that incorrect register value:
do_sys_openat2
do_filp_open
path_openat
vfs_open
do_dentry_open
chrdev_open
tty_open
uart_open
tty_port_open
uart_port_activate
uart_startup
uart_port_startup
serial8250_startup
omap_8250_startup # up->mcr = 0
uart_change_speed
serial8250_set_termios
omap_8250_set_termios
omap_8250_restore_regs
serial8250_out_MCR # up->mcr written
tty_port_block_til_ready
uart_dtr_rts
uart_port_dtr_rts
serial8250_set_mctrl
omap8250_set_mctrl
serial8250_do_set_mctrl # mcr[1] = 1 ignored
Fix by intercepting RTS changes from user space in uart_tiocmset()
instead.
Link: https://lore.kernel.org/linux-serial/20211027111644.1996921-1-baocheng.su@siemens.com/
Fixes: f45709df77 ("serial: 8250: Don't touch RTS modem control while in rs485 mode")
Cc: Chao Zeng <chao.zeng@siemens.com>
Cc: stable@vger.kernel.org # v5.7+
Reported-by: Su Bao Cheng <baocheng.su@siemens.com>
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Tested-by: Su Bao Cheng <baocheng.su@siemens.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/21170e622a1aaf842a50b32146008b5374b3dd1d.1637596432.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00de977f9e upstream.
Commit 761ed4a945 ("tty: serial_core: convert uart_close to use
tty_port_close") converted serial core to use tty_port_close() but
failed to notice that the transmit buffer still needs to be freed on
final close.
Not freeing the transmit buffer means that the buffer is no longer
cleared on next open so that any ioctl() waiting for the buffer to drain
might wait indefinitely (e.g. on termios changes) or that stale data can
end up being transmitted in case tx is restarted.
Furthermore, the buffer of any port that has been opened would leak on
driver unbind.
Note that the port lock is held when clearing the buffer pointer due to
the ldisc race worked around by commit a5ba1d95e4 ("uart: fix race
between uart_put_char() and uart_shutdown()").
Also note that the tty-port shutdown() callback is not called for
console ports so it is not strictly necessary to free the buffer page
after releasing the lock (cf. d72402145a ("tty/serial: do not free
trasnmit buffer page under port lock")).
Link: https://lore.kernel.org/r/319321886d97c456203d5c6a576a5480d07c3478.1635781688.git.baruch@tkos.co.il
Fixes: 761ed4a945 ("tty: serial_core: convert uart_close to use tty_port_close")
Cc: stable@vger.kernel.org # 4.9
Cc: Rob Herring <robh@kernel.org>
Reported-by: Baruch Siach <baruch@tkos.co.il>
Tested-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211108085431.12637-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b40de7469e upstream.
The current implementation uses 0 as lower limit for the baud rate
tolerance for tegra20 and tegra30 chips which causes isses on UART
initialization as soon as baud rate clock is lower than required even
when within the standard UART tolerance of +/- 4%.
This fix aligns the implementation with the initial commit description
of +/- 4% tolerance for tegra chips other than tegra186 and
tegra194.
Fixes: d781ec21ba ("serial: tegra: report clk rate errors")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Patrik John <patrik.john@u-blox.com>
Link: https://lore.kernel.org/r/sig.19614244f8.20211123132737.88341-1-patrik.john@u-blox.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac442a077a upstream.
The document 'ACPI for Arm Components 1.0' defines the following
_HID mappings:
-'Prime cell UART (PL011)': ARMH0011
-'SBSA UART': ARMHB000
Use the sbsa-uart driver when a device is described with
the 'ARMHB000' _HID.
Note:
PL011 devices currently use the sbsa-uart driver instead of the
uart-pl011 driver. Indeed, PL011 devices are not bound to a clock
in ACPI. It is not possible to change their baudrate.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
Link: https://lore.kernel.org/r/20211109172248.19061-1-Pierre.Gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7492ffc90f upstream.
The CONSOLE_POLLING mode is used for tools like k(g)db. In this kind of
setup, it is often sharing a serial device with the normal system console.
This is usually no problem because the polling helpers can consume input
values directly (when in kgdb context) and the normal Linux handlers can
only consume new input values after kgdb switched back.
This is not true anymore when RX DMA is enabled for UARTDM controllers.
Single input values can no longer be received correctly. Instead following
seems to happen:
* on 1. input, some old input is read (continuously)
* on 2. input, two old inputs are read (continuously)
* on 3. input, three old input values are read (continuously)
* on 4. input, 4 previous inputs are received
This repeats then for each group of 4 input values.
This behavior changes slightly depending on what state the controller was
when the first input was received. But this makes working with kgdb
basically impossible because control messages are always corrupted when
kgdboc tries to parse them.
RX DMA should therefore be off when CONSOLE_POLLING is enabled to avoid
these kind of problems. No such problem was noticed for TX DMA.
Fixes: 9969394501 ("tty: serial: msm: Add RX DMA support")
Cc: stable@vger.kernel.org
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Link: https://lore.kernel.org/r/20211113121050.7266-1-sven@narfation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51523ed1c2 upstream.
The trampoline_pgd only maps the 0xfffffff000000000-0xffffffffffffffff
range of kernel memory (with 4-level paging). This range contains the
kernel's text+data+bss mappings and the module mapping space but not the
direct mapping and the vmalloc area.
This is enough to get the application processors out of real-mode, but
for code that switches back to real-mode the trampoline_pgd is missing
important parts of the address space. For example, consider this code
from arch/x86/kernel/reboot.c, function machine_real_restart() for a
64-bit kernel:
#ifdef CONFIG_X86_32
load_cr3(initial_page_table);
#else
write_cr3(real_mode_header->trampoline_pgd);
/* Exiting long mode will fail if CR4.PCIDE is set. */
if (boot_cpu_has(X86_FEATURE_PCID))
cr4_clear_bits(X86_CR4_PCIDE);
#endif
/* Jump to the identity-mapped low memory code */
#ifdef CONFIG_X86_32
asm volatile("jmpl *%0" : :
"rm" (real_mode_header->machine_real_restart_asm),
"a" (type));
#else
asm volatile("ljmpl *%0" : :
"m" (real_mode_header->machine_real_restart_asm),
"D" (type));
#endif
The code switches to the trampoline_pgd, which unmaps the direct mapping
and also the kernel stack. The call to cr4_clear_bits() will find no
stack and crash the machine. The real_mode_header pointer below points
into the direct mapping, and dereferencing it also causes a crash.
The reason this does not crash always is only that kernel mappings are
global and the CR3 switch does not flush those mappings. But if theses
mappings are not in the TLB already, the above code will crash before it
can jump to the real-mode stub.
Extend the trampoline_pgd to contain all kernel mappings to prevent
these crashes and to make code which runs on this page-table more
robust.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20211202153226.22946-5-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b50db7095f upstream.
There are cases that the TSC clocksource is wrongly judged as unstable by
the clocksource watchdog mechanism which tries to validate the TSC against
HPET, PM_TIMER or jiffies. While there is hardly a general reliable way to
check the validity of a watchdog, Thomas Gleixner proposed [1]:
"I'm inclined to lift that requirement when the CPU has:
1) X86_FEATURE_CONSTANT_TSC
2) X86_FEATURE_NONSTOP_TSC
3) X86_FEATURE_NONSTOP_TSC_S3
4) X86_FEATURE_TSC_ADJUST
5) At max. 4 sockets
After two decades of horrors we're finally at a point where TSC seems
to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
was really key as we can now detect even small modifications reliably
and the important point is that we can cure them as well (not pretty
but better than all other options)."
As feature #3 X86_FEATURE_NONSTOP_TSC_S3 only exists on several generations
of Atom processorz, and is always coupled with X86_FEATURE_CONSTANT_TSC
and X86_FEATURE_NONSTOP_TSC, skip checking it, and also be more defensive
to use maximal 2 sockets.
The check is done inside tsc_init() before registering 'tsc-early' and
'tsc' clocksources, as there were cases that both of them had been
wrongly judged as unreliable.
For more background of tsc/watchdog, there is a good summary in [2]
[tglx} Update vs. jiffies:
On systems where the only remaining clocksource aside of TSC is jiffies
there is no way to make this work because that creates a circular
dependency. Jiffies accuracy depends on not missing a periodic timer
interrupt, which is not guaranteed. That could be detected by TSC, but as
TSC is not trusted this cannot be compensated. The consequence is a
circulus vitiosus which results in shutting down TSC and falling back to
the jiffies clocksource which is even more unreliable.
[1]. https://lore.kernel.org/lkml/87eekfk8bd.fsf@nanos.tec.linutronix.de/
[2]. https://lore.kernel.org/lkml/87a6pimt1f.ffs@nanos.tec.linutronix.de/
[ tglx: Refine comment and amend changelog ]
Fixes: 6e3cd95234 ("x86/hpet: Use another crystalball to evaluate HPET usability")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211117023751.24190-2-feng.tang@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09f736aa95 upstream.
Turns out some xHC controllers require all 64 bits in the CRCR register
to be written to execute a command abort.
The lower 32 bits containing the command abort bit is written first.
In case the command ring stops before we write the upper 32 bits then
hardware may use these upper bits to set the commnd ring dequeue pointer.
Solve this by making sure the upper 32 bits contain a valid command
ring dequeue pointer.
The original patch that only wrote the first 32 to stop the ring went
to stable, so this fix should go there as well.
Fixes: ff0e50d356 ("xhci: Fix command ring pointer corruption while aborting a command")
Cc: stable@vger.kernel.org
Tested-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211126122340.1193239-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3dfac26e2e upstream.
Fix a division by zero in `vgacon_resize' with a backtrace like:
vgacon_resize
vc_do_resize
vgacon_init
do_bind_con_driver
do_unbind_con_driver
fbcon_fb_unbind
do_unregister_framebuffer
do_register_framebuffer
register_framebuffer
__drm_fb_helper_initial_config_and_unlock
drm_helper_hpd_irq_event
dw_hdmi_irq
irq_thread
kthread
caused by `c->vc_cell_height' not having been initialized. This has
only started to trigger with commit 860dafa902 ("vt: Fix character
height handling with VT_RESIZEX"), however the ultimate offender is
commit 50ec42edd9 ("[PATCH] Detaching fbcon: fix vgacon to allow
retaking of the console").
Said commit has added a call to `vc_resize' whenever `vgacon_init' is
called with the `init' argument set to 0, which did not happen before.
And the call is made before a key vgacon boot parameter retrieved in
`vgacon_startup' has been propagated in `vgacon_init' for `vc_resize' to
use to the console structure being worked on. Previously the parameter
was `c->vc_font.height' and now it is `c->vc_cell_height'.
In this particular scenario the registration of fbcon has failed and vt
resorts to vgacon. Now fbcon does have initialized `c->vc_font.height'
somehow, unlike `c->vc_cell_height', which is why this code did not
crash before, but either way the boot parameters should have been copied
to the console structure ahead of the call to `vc_resize' rather than
afterwards, so that first the call has a chance to use them and second
they do not change the console structure to something possibly different
from what was used by `vc_resize'.
Move the propagation of the vgacon boot parameters ahead of the call to
`vc_resize' then. Adjust the comment accordingly.
Fixes: 50ec42edd9 ("[PATCH] Detaching fbcon: fix vgacon to allow retaking of the console")
Cc: stable@vger.kernel.org # v2.6.18+
Reported-by: Wim Osterholt <wim@djo.tudelft.nl>
Reported-by: Pavel V. Panteleev <panteleev_p@mcst.ru>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2110252317110.58149@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f9fee4cde upstream.
On newer debian releases the debian-provided "installkernel" script is
installed in /usr/sbin. Fix the kernel install.sh script to look for the
script in this directory as well.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d7c29b777 upstream.
Default KBUILD_IMAGE to $(boot)/bzImage if a self-extracting
(CONFIG_PARISC_SELF_EXTRACT=y) kernel is to be built.
This fixes the bindeb-pkg make target.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c07e45553d ]
Commit
18ec54fdd6 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
added FENCE_SWAPGS_{KERNEL|USER}_ENTRY for conditional SWAPGS. In
paranoid_entry(), it uses only FENCE_SWAPGS_KERNEL_ENTRY for both
branches. This is because the fence is required for both cases since the
CR3 write is conditional even when PTI is enabled.
But
96b2371413 ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry")
changed the order of SWAPGS and the CR3 write. And it missed the needed
FENCE_SWAPGS_KERNEL_ENTRY for the user gsbase case.
Add it back by changing the branches so that FENCE_SWAPGS_KERNEL_ENTRY
can cover both branches.
[ bp: Massage, fix typos, remove obsolete comment while at it. ]
Fixes: 96b2371413 ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211126101209.8613-2-jiangshanlai@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53c9d92409 ]
SWAPGS is used only for interrupts coming from user mode or for
returning to user mode. So there is no reason to use the PARAVIRT
framework, as it can easily be replaced by an ALTERNATIVE depending
on X86_FEATURE_XENPV.
There are several instances using the PV-aware SWAPGS macro in paths
which are never executed in a Xen PV guest. Replace those with the
plain swapgs instruction. For SWAPGS_UNSAFE_STACK the same applies.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210120135555.32594-5-jgross@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 315c4f8848 ]
Commit d81ae8aac8 ("sched/uclamp: Fix initialization of struct
uclamp_rq") introduced a bug where uclamp_max of the rq is not reset to
match the woken up task's uclamp_max when the rq is idle.
The code was relying on rq->uclamp_max initialized to zero, so on first
enqueue
static inline void uclamp_rq_inc_id(struct rq *rq, struct task_struct *p,
enum uclamp_id clamp_id)
{
...
if (uc_se->value > READ_ONCE(uc_rq->value))
WRITE_ONCE(uc_rq->value, uc_se->value);
}
was actually resetting it. But since commit d81ae8aac8 changed the
default to 1024, this no longer works. And since rq->uclamp_flags is
also initialized to 0, neither above code path nor uclamp_idle_reset()
update the rq->uclamp_max on first wake up from idle.
This is only visible from first wake up(s) until the first dequeue to
idle after enabling the static key. And it only matters if the
uclamp_max of this task is < 1024 since only then its uclamp_max will be
effectively ignored.
Fix it by properly initializing rq->uclamp_flags = UCLAMP_FLAG_IDLE to
ensure uclamp_idle_reset() is called which then will update the rq
uclamp_max value as expected.
Fixes: d81ae8aac8 ("sched/uclamp: Fix initialization of struct uclamp_rq")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20211202112033.1705279-1-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c8f6a2e31 ]
In the native case, PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is the
trampoline stack. But XEN pv doesn't use trampoline stack, so
PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is also the kernel stack.
In that case, source and destination stacks are identical, which means
that reusing swapgs_restore_regs_and_return_to_usermode() in XEN pv
would cause %rsp to move up to the top of the kernel stack and leave the
IRET frame below %rsp.
This is dangerous as it can be corrupted if #NMI / #MC hit as either of
these events occurring in the middle of the stack pushing would clobber
data on the (original) stack.
And, with XEN pv, swapgs_restore_regs_and_return_to_usermode() pushing
the IRET frame on to the original address is useless and error-prone
when there is any future attempt to modify the code.
[ bp: Massage commit message. ]
Fixes: 7f2590a110 ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lkml.kernel.org/r/20211126101209.8613-4-jiangshanlai@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>