linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-03 17:51:57 +09:00

Author	SHA1	Message	Date
Linus Torvalds	42e7a03d3b	Merge tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Correctly propagate coherence information for VMbus devices (Michael Kelley) - Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng) - Use barrier to prevent reording when reading ring buffer (Michael Kelley) - Use virt_store_mb in favour of smp_store_mb (Andrea Parri) - Fix VMbus device object initialization (Andrea Parri) - Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri) - Fix a crash when unloading VMbus module (Guilherme G. Piccoli) * tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb() Drivers: hv: balloon: Disable balloon and hot-add accordingly Drivers: hv: balloon: Support status report for larger page sizes Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer PCI: hv: Propagate coherence from VMbus device to PCI device Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device Drivers: hv: vmbus: Fix potential crash on module unload Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register() Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests	2022-04-07 06:35:34 -10:00
Linus Torvalds	3638bd90df	Merge tag 'random-5.18-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fixes from Jason Donenfeld: - Another fixup to the fast_init/crng_init split, this time in how much entropy is being credited, from Jan Varho. - As discussed, we now opportunistically call try_to_generate_entropy() in /dev/urandom reads, as a replacement for the reverted commit. I opted to not do the more invasive wait_for_random_bytes() change at least for now, preferring to do something smaller and more obvious for the time being, but maybe that can be revisited as things evolve later. - Userspace can use FUSE or userfaultfd or simply move a process to idle priority in order to make a read from the random device never complete, which breaks forward secrecy, fixed by overwriting sensitive bytes early on in the function. - Jann Horn noticed that /dev/urandom reads were only checking for pending signals if need_resched() was true, a bug going back to the genesis commit, now fixed by always checking for signal_pending() and calling cond_resched(). This explains various noticeable signal delivery delays I've seen in programs over the years that do long reads from /dev/urandom. - In order to be more like other devices (e.g. /dev/zero) and to mitigate the impact of fixing the above bug, which has been around forever (users have never really needed to check the return value of read() for medium-sized reads and so perhaps many didn't), we now move signal checking to the bottom part of the loop, and do so every PAGE_SIZE-bytes. * tag 'random-5.18-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: check for signals every PAGE_SIZE chunk of /dev/[u]random random: check for signal_pending() outside of need_resched() check random: do not allow user to keep crng key around on stack random: opportunistically initialize on /dev/urandom reads random: do not split fast init input in add_hwgenerator_randomness()	2022-04-07 06:02:55 -10:00
Linus Torvalds	640b5037da	Merge tag 'ata-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: - Fix a compilation warning due to an uninitialized variable in ata_sff_lost_interrupt(), from me. - Fix invalid internal command tag handling in the sata_dwc_460ex driver, from Christian. - Disable READ LOG DMA EXT with Samsung 840 EVO SSDs as this command causes the drives to hang, from Christian. - Change the config option CONFIG_SATA_LPM_POLICY back to its original name CONFIG_SATA_LPM_MOBILE_POLICY to avoid potential problems with users losing their configuration (as discussed during the merge window), from Mario. * tag 'ata-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: ahci: Rename CONFIG_SATA_LPM_POLICY configuration item back ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs ata: sata_dwc_460ex: Fix crash due to OOB write ata: libata-sff: Fix compilation warning in ata_sff_lost_interrupt()	2022-04-07 05:56:54 -10:00
Jason A. Donenfeld	e3c1c4fd9e	random: check for signals every PAGE_SIZE chunk of /dev/[u]random In `1448769c9c` ("random: check for signal_pending() outside of need_resched() check"), Jann pointed out that we previously were only checking the TIF_NOTIFY_SIGNAL and TIF_SIGPENDING flags if the process had TIF_NEED_RESCHED set, which meant in practice, super long reads to /dev/[u]random would delay signal handling by a long time. I tried this using the below program, and indeed I wasn't able to interrupt a /dev/urandom read until after several megabytes had been read. The bug he fixed has always been there, and so code that reads from /dev/urandom without checking the return value of read() has mostly worked for a long time, for most sizes, not just for <= 256. Maybe it makes sense to keep that code working. The reason it was so small prior, ignoring the fact that it didn't work anyway, was likely because /dev/random used to block, and that could happen for pretty large lengths of time while entropy was gathered. But now, it's just a chacha20 call, which is extremely fast and is just operating on pure data, without having to wait for some external event. In that sense, /dev/[u]random is a lot more like /dev/zero. Taking a page out of /dev/zero's read_zero() function, it always returns at least one chunk, and then checks for signals after each chunk. Chunk sizes there are of length PAGE_SIZE. Let's just copy the same thing for /dev/[u]random, and check for signals and cond_resched() for every PAGE_SIZE amount of data. This makes the behavior more consistent with expectations, and should mitigate the impact of Jann's fix for the age-old signal check bug. ---- test program ---- #include <unistd.h> #include <signal.h> #include <stdio.h> #include <sys/random.h> static unsigned char x[~0U]; static void handle(int) { } int main(int argc, char *argv[]) { pid_t pid = getpid(), child; signal(SIGUSR1, handle); if (!(child = fork())) { for (;;) kill(pid, SIGUSR1); } pause(); printf("interrupted after reading %zd bytes\n", getrandom(x, sizeof(x), 0)); kill(child, SIGTERM); return 0; } Cc: Jann Horn <jannh@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-07 01:36:37 +02:00
Andrea Parri (Microsoft)	eaa03d3453	Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb() Following the recommendation in Documentation/memory-barriers.txt for virtual machine guests. Fixes: `8b6a877c06` ("Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels") Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20220328154457.100872-1-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-04-06 13:31:58 +00:00
Boqun Feng	be5802795c	Drivers: hv: balloon: Disable balloon and hot-add accordingly Currently there are known potential issues for balloon and hot-add on ARM64: * Unballoon requests from Hyper-V should only unballoon ranges that are guest page size aligned, otherwise guests cannot handle because it's impossible to partially free a page. This is a problem when guest page size > 4096 bytes. * Memory hot-add requests from Hyper-V should provide the NUMA node id of the added ranges or ARM64 should have a functional memory_add_physaddr_to_nid(), otherwise the node id is missing for add_memory(). These issues require discussions on design and implementation. In the meanwhile, post_status() is working and essential to guest monitoring. Therefore instead of disabling the entire hv_balloon driver, the ballooning (when page size > 4096 bytes) and hot-add are disabled accordingly for now. Once the issues are fixed, they can be re-enable in these cases. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-3-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-04-06 13:15:06 +00:00
Boqun Feng	b3d6dd09ff	Drivers: hv: balloon: Support status report for larger page sizes DM_STATUS_REPORT expects the numbers of pages in the unit of 4k pages (HV_HYP_PAGE) instead of guest pages, so to make it work when guest page sizes are larger than 4k, convert the numbers of guest pages into the numbers of HV_HYP_PAGEs. Note that the numbers of guest pages are still used for tracing because tracing is internal to the guest kernel. Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-2-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-04-06 13:15:06 +00:00
Jann Horn	1448769c9c	random: check for signal_pending() outside of need_resched() check signal_pending() checks TIF_NOTIFY_SIGNAL and TIF_SIGPENDING, which signal that the task should bail out of the syscall when possible. This is a separate concept from need_resched(), which checks TIF_NEED_RESCHED, signaling that the task should preempt. In particular, with the current code, the signal_pending() bailout probably won't work reliably. Change this to look like other functions that read lots of data, such as read_zero(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-06 15:09:33 +02:00
Jason A. Donenfeld	aba120cc10	random: do not allow user to keep crng key around on stack The fast key erasure RNG design relies on the key that's used to be used and then discarded. We do this, making judicious use of memzero_explicit(). However, reads to /dev/urandom and calls to getrandom() involve a copy_to_user(), and userspace can use FUSE or userfaultfd, or make a massive call, dynamically remap memory addresses as it goes, and set the process priority to idle, in order to keep a kernel stack alive indefinitely. By probing /proc/sys/kernel/random/entropy_avail to learn when the crng key is refreshed, a malicious userspace could mount this attack every 5 minutes thereafter, breaking the crng's forward secrecy. In order to fix this, we just overwrite the stack's key with the first 32 bytes of the "free" fast key erasure output. If we're returning <= 32 bytes to the user, then we can still return those bytes directly, so that short reads don't become slower. And for long reads, the difference is hopefully lost in the amortization, so it doesn't change much, with that amortization helping variously for medium reads. We don't need to do this for get_random_bytes() and the various kernel-space callers, and later, if we ever switch to always batching, this won't be necessary either, so there's no need to change the API of these functions. Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jann Horn <jannh@google.com> Fixes: `c92e040d57` ("random: add backtracking protection to the CRNG") Fixes: `186873c549` ("random: use simpler fast key erasure flow on per-cpu keys") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-06 15:05:10 +02:00
Mario Limonciello	55b014159e	ata: ahci: Rename CONFIG_SATA_LPM_POLICY configuration item back CONFIG_SATA_LPM_MOBILE_POLICY was renamed to CONFIG_SATA_LPM_POLICY in commit `4dd4d3deb5` ("ata: ahci: Rename CONFIG_SATA_LPM_MOBILE_POLICY configuration item"). This can potentially cause problems as users would invisibly lose configuration policy defaults when they built the new kernel. To avoid such problems, switch back to the old name (even if it's wrong). Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-04-06 11:08:04 +09:00
Linus Torvalds	3e732ebf73	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fixes from Michael Tsirkin: "Fixes and cleanups: - A couple of mlx5 fixes related to cvq - A couple of reverts dropping useless code (code that used it got reverted earlier)" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa: mlx5: synchronize driver status with CVQ vdpa: mlx5: prevent cvq work from hogging CPU Revert "virtio_config: introduce a new .enable_cbs method" Revert "virtio: use virtio_device_ready() in virtio_device_restore()"	2022-04-05 10:40:52 -07:00
Pawan Gupta	e2a1256b17	x86/speculation: Restore speculation related MSRs during S3 resume After resuming from suspend-to-RAM, the MSRs that control CPU's speculative execution behavior are not being restored on the boot CPU. These MSRs are used to mitigate speculative execution vulnerabilities. Not restoring them correctly may leave the CPU vulnerable. Secondary CPU's MSRs are correctly being restored at S3 resume by identify_secondary_cpu(). During S3 resume, restore these MSRs for boot CPU when restoring its processor state. Fixes: `772439717d` ("x86/bugs/intel: Set proper CPU features and setup RDS") Reported-by: Neelima Krishnan <neelima.krishnan@intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-05 10:18:31 -07:00
Pawan Gupta	73924ec4d5	x86/pm: Save the MSR validity status at context setup The mechanism to save/restore MSRs during S3 suspend/resume checks for the MSR validity during suspend, and only restores the MSR if its a valid MSR. This is not optimal, as an invalid MSR will unnecessarily throw an exception for every suspend cycle. The more invalid MSRs, higher the impact will be. Check and save the MSR validity at setup. This ensures that only valid MSRs that are guaranteed to not throw an exception will be attempted during suspend. Fixes: `7a9c2dd08e` ("x86/pm: Introduce quirk framework to save/restore extra MSR registers around suspend/resume") Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-05 10:18:31 -07:00
Linus Torvalds	ce4c854ee8	Merge tag 'for-5.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - prevent deleting subvolume with active swapfile - fix qgroup reserve limit calculation overflow - remove device count in superblock and its item in one transaction so they cant't get out of sync - skip defragmenting an isolated sector, this could cause some extra IO - unify handling of mtime/permissions in hole punch with fallocate - zoned mode fixes: - remove assert checking for only single mode, we have the DUP mode implemented - fix potential lockdep warning while traversing devices when checking for zone activation * tag 'for-5.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: prevent subvol with swapfile from being deleted btrfs: do not warn for free space inode in cow_file_range btrfs: avoid defragging extents whose next extents are not targets btrfs: fix fallocate to use file_modified to update permissions consistently btrfs: remove device item and update super block in the same transaction btrfs: fix qgroup reserve overflow the qgroup limit btrfs: zoned: remove left over ASSERT checking for single profile btrfs: zoned: traverse devices under chunk_mutex in btrfs_can_activate_zone	2022-04-05 08:59:37 -07:00
Jason A. Donenfeld	48bff1053c	random: opportunistically initialize on /dev/urandom reads In `6f98a4bfee` ("random: block in /dev/urandom"), we tried to make a successful try_to_generate_entropy() call required if the RNG was not already initialized. Unfortunately, weird architectures and old userspaces combined in TCG test harnesses, making that change still not realistic, so it was reverted in `0313bc278d` ("Revert "random: block in /dev/urandom""). However, rather than making a successful try_to_generate_entropy() call required, we can instead make it best-effort. If try_to_generate_entropy() fails, it fails, and nothing changes from the current behavior. If it succeeds, then /dev/urandom becomes safe to use for free. This way, we don't risk the regression potential that led to us reverting the required-try_to_generate_entropy() call before. Practically speaking, this means that at least on x86, /dev/urandom becomes safe. Probably other architectures with working cycle counters will also become safe. And architectures with slow or broken cycle counters at least won't be affected at all by this change. So it may not be the glorious "all things are unified!" change we were hoping for initially, but practically speaking, it makes a positive impact. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-05 16:13:13 +02:00
Jan Varho	527a9867af	random: do not split fast init input in add_hwgenerator_randomness() add_hwgenerator_randomness() tries to only use the required amount of input for fast init, but credits all the entropy, rather than a fraction of it. Since it's hard to determine how much entropy is left over out of a non-unformly random sample, either give it all to fast init or credit it, but don't attempt to do both. In the process, we can clean up the injection code to no longer need to return a value. Signed-off-by: Jan Varho <jan.varho@gmail.com> [Jason: expanded commit message] Fixes: `73c7733f12` ("random: do not throw away excess input to crng_fast_load") Cc: stable@vger.kernel.org # 5.17+, requires `af704c856e` Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-04 19:34:49 +02:00
Christian Lamparter	5399752299	ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with the a message: "READ LOG DMA EXT failed, trying PIO" during boot. Initially this was discovered because it caused a crash with the sata_dwc_460ex controller on a WD MyBook Live DUO. The reporter "Tice Rex" which has the unique opportunity that he has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q" which booted fine and didn't expose "READ LOG DMA EXT". But the newer/latest firmware "EXT0DB6Q" caused the headaches. BugLink: https://github.com/openwrt/openwrt/issues/9505 Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-04-04 09:56:37 +09:00
Christian Lamparter	7aa8104a55	ata: sata_dwc_460ex: Fix crash due to OOB write the driver uses libata's "tag" values from in various arrays. Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32, the value of the SATA_DWC_QCMD_MAX needs to account for that. Otherwise ATA_TAG_INTERNAL usage cause similar crashes like this as reported by Tice Rex on the OpenWrt Forum and reproduced (with symbols) here: \| BUG: Kernel NULL pointer dereference at 0x00000000 \| Faulting instruction address: 0xc03ed4b8 \| Oops: Kernel access of bad area, sig: 11 [#1] \| BE PAGE_SIZE=4K PowerPC 44x Platform \| CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0 \| NIP: c03ed4b8 LR: c03d27e8 CTR: c03ed36c \| REGS: cfa59950 TRAP: 0300 Not tainted (5.4.163) \| MSR: 00021000 <CE,ME> CR: 42000222 XER: 00000000 \| DEAR: 00000000 ESR: 00000000 \| GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...] \| [..] \| NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254 \| LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc \| Call Trace: \| [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable) \| [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc \| [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524 \| [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0 \| [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204 \| [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130 \| [...] This is because sata_dwc_dma_xfer_complete() NULLs the dma_pending's next neighbour "chan" (a *dma_chan struct) in this '32' case right here (line ~735): > hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE; Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes the NULL'd hsdevp->chan to the dmaengine_slave_config() which then causes the crash. With this patch, SATA_DWC_QCMD_MAX is now set to ATA_MAX_QUEUE + 1. This avoids the OOB. But please note, there was a worthwhile discussion on what ATA_TAG_INTERNAL and ATA_MAX_QUEUE is. And why there should not be a "fake" 33 command-long queue size. Ideally, the dw driver should account for the ATA_TAG_INTERNAL. In Damien Le Moal's words: "... having looked at the driver, it is a bigger change than just faking a 33rd "tag" that is in fact not a command tag at all." Fixes: `28361c4036` ("libata: add extra internal command") Cc: stable@kernel.org # 4.18+ BugLink: https://github.com/openwrt/openwrt/issues/9505 Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-04-04 09:56:34 +09:00
Damien Le Moal	76ed2f61ae	ata: libata-sff: Fix compilation warning in ata_sff_lost_interrupt() When returning false, ata_sff_altstatus() does not return any status value, resulting in a compilation warning in ata_sff_lost_interrupt() ("uninitialized symbol 'status'"). Fix this by initializing the local variable "status" to 0. Fixes: `03c0e84f9c` ("ata: libata-sff: refactor ata_sff_altstatus()") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-04-04 09:49:42 +09:00
Linus Torvalds	3123109284	Linux 5.18-rc1	2022-04-03 14:08:21 -07:00
Linus Torvalds	09bb8856d4	Merge tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: - Rename the staging files to give them some meaning. Just stage1,stag2,etc, does not show what they are for - Check for NULL from allocation in bootconfig - Hold event mutex for dyn_event call in user events - Mark user events to broken (to work on the API) - Remove eBPF updates from user events - Remove user events from uapi header to keep it from being installed. - Move ftrace_graph_is_dead() into inline as it is called from hot paths and also convert it into a static branch. * tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Move user_events.h temporarily out of include/uapi ftrace: Make ftrace_graph_is_dead() a static branch tracing: Set user_events to BROKEN tracing/user_events: Remove eBPF interfaces tracing/user_events: Hold event_mutex during dyn_event_add proc: bootconfig: Add null pointer check tracing: Rename the staging files for trace_events	2022-04-03 12:26:01 -07:00
Linus Torvalds	34a53ff911	Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fix from Stephen Boyd: "A single revert to fix a boot regression seen when clk_put() started dropping rate range requests. It's best to keep various systems booting so we'll kick this out and try again next time" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: Revert "clk: Drop the rate range on clk_put()"	2022-04-03 12:21:14 -07:00
Linus Torvalds	8b5656bc4e	Merge tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of x86 fixes and updates: - Make the prctl() for enabling dynamic XSTATE components correct so it adds the newly requested feature to the permission bitmap instead of overwriting it. Add a selftest which validates that. - Unroll string MMIO for encrypted SEV guests as the hypervisor cannot emulate it. - Handle supervisor states correctly in the FPU/XSTATE code so it takes the feature set of the fpstate buffer into account. The feature sets can differ between host and guest buffers. Guest buffers do not contain supervisor states. So far this was not an issue, but with enabling PASID it needs to be handled in the buffer offset calculation and in the permission bitmaps. - Avoid a gazillion of repeated CPUID invocations in by caching the values early in the FPU/XSTATE code. - Enable CONFIG_WERROR in x86 defconfig. - Make the X86 defconfigs more useful by adapting them to Y2022 reality" * tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu/xstate: Consolidate size calculations x86/fpu/xstate: Handle supervisor states in XSTATE permissions x86/fpu/xsave: Handle compacted offsets correctly with supervisor states x86/fpu: Cache xfeature flags from CPUID x86/fpu/xsave: Initialize offset/size cache early x86/fpu: Remove unused supervisor only offsets x86/fpu: Remove redundant XCOMP_BV initialization x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO x86/config: Make the x86 defconfigs a bit more usable x86/defconfig: Enable WERROR selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation	2022-04-03 12:15:47 -07:00
Linus Torvalds	e235f4192f	Merge tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RT signal fix from Thomas Gleixner: "Revert the RT related signal changes. They need to be reworked and generalized" * tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"	2022-04-03 12:08:26 -07:00
Linus Torvalds	63d12cc305	Merge tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping Pull more dma-mapping updates from Christoph Hellwig: - fix a regression in dma remap handling vs AMD memory encryption (me) - finally kill off the legacy PCI DMA API (Christophe JAILLET) * tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: move pgprot_decrypted out of dma_pgprot PCI/doc: cleanup references to the legacy PCI DMA API PCI: Remove the deprecated "pci-dma-compat.h" API	2022-04-03 10:31:00 -07:00
Linus Torvalds	5dee87215b	Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM fixes from Russell King: - avoid unnecessary rebuilds for library objects - fix return value of __setup handlers - fix invalid input check for "crashkernel=" kernel option - silence KASAN warnings in unwind_frame * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9191/1: arm/stacktrace, kasan: Silence KASAN warnings in unwind_frame() ARM: 9190/1: kdump: add invalid input check for 'crashkernel=0' ARM: 9187/1: JIVE: fix return value of __setup handler ARM: 9189/1: decompressor: fix unneeded rebuilds of library objects	2022-04-03 10:17:48 -07:00
Stephen Boyd	859c2c7b1d	Revert "clk: Drop the rate range on clk_put()" This reverts commit `7dabfa2bc4`. There are multiple reports that this breaks boot on various systems. The common theme is that orphan clks are having rates set on them when that isn't expected. Let's revert it out for now so that -rc1 boots. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Reported-by: Tony Lindgren <tony@atomide.com> Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://lore.kernel.org/r/366a0232-bb4a-c357-6aa8-636e398e05eb@samsung.com Cc: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20220403022818.39572-1-sboyd@kernel.org	2022-04-02 19:28:53 -07:00
Linus Torvalds	be2d3ecedd	Merge tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Avoid SEGV if core.cpus isn't set in 'perf stat'. - Stop depending on .git files for building PERF-VERSION-FILE, used in 'perf --version', fixing some perf tools build scenarios. - Convert tracepoint.py example to python3. - Update UAPI header copies from the kernel sources: socket, mman-common, msr-index, KVM, i915 and cpufeatures. - Update copy of libbpf's hashmap.c. - Directly return instead of using local ret variable in evlist__create_syswide_maps(), found by coccinelle. * tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf python: Convert tracepoint.py example to python3 perf evlist: Directly return instead of using local ret variable perf cpumap: More cpu map reuse by merge. perf cpumap: Add is_subset function perf evlist: Rename cpus to user_requested_cpus perf tools: Stop depending on .git files for building PERF-VERSION-FILE tools headers cpufeatures: Sync with the kernel sources tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools headers UAPI: Sync linux/kvm.h with the kernel sources tools kvm headers arm64: Update KVM headers from the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers UAPI: Sync asm-generic/mman-common.h with the kernel perf beauty: Update copy of linux/socket.h with the kernel sources perf tools: Update copy of libbpf's hashmap.c perf stat: Avoid SEGV if core.cpus isn't set	2022-04-02 12:57:17 -07:00
Linus Torvalds	d897b68041	Merge tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix empty $(PYTHON) expansion. - Fix UML, which got broken by the attempt to suppress Clang warnings. - Fix warning message in modpost. * tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: restore the warning message for missing symbol versions Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS" kbuild: Remove '-mno-global-merge' kbuild: fix empty ${PYTHON} in scripts/link-vmlinux.sh kconfig: remove stale comment about removed kconfig_print_symbol()	2022-04-02 12:33:31 -07:00
Linus Torvalds	0b0fa57a27	Merge tag 'mips_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - build fix for gpio - fix crc32 build problems - check for failed memory allocations * tag 'mips_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: crypto: Fix CRC32 code MIPS: rb532: move GPIOD definition into C-files MIPS: lantiq: check the return value of kzalloc() mips: sgi-ip22: add a check for the return of kzalloc()	2022-04-02 12:14:38 -07:00
Linus Torvalds	38904911e8	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr - Documentation improvements - Prevent module exit until all VMs are freed - PMU Virtualization fixes - Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences - Other miscellaneous bugfixes * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) KVM: x86: fix sending PV IPI KVM: x86/mmu: do compare-and-exchange of gPTE via the user address KVM: x86: Remove redundant vm_entry_controls_clearbit() call KVM: x86: cleanup enter_rmode() KVM: x86: SVM: fix tsc scaling when the host doesn't support it kvm: x86: SVM: remove unused defines KVM: x86: SVM: move tsc ratio definitions to svm.h KVM: x86: SVM: fix avic spec based definitions again KVM: MIPS: remove reference to trap&emulate virtualization KVM: x86: document limitations of MSR filtering KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr KVM: x86/emulator: Emulate RDPID only if it is enabled in guest KVM: x86/pmu: Fix and isolate TSX-specific performance event logic KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs KVM: x86: Trace all APICv inhibit changes and capture overall status KVM: x86: Add wrappers for setting/clearing APICv inhibits KVM: x86: Make APICv inhibit reasons an enum and cleanup naming KVM: X86: Handle implicit supervisor access with SMAP KVM: X86: Rename variable smap to not_smap in permission_fault() ...	2022-04-02 12:09:02 -07:00
Masahiro Yamada	bf5c0c2231	modpost: restore the warning message for missing symbol versions This log message was accidentally chopped off. I was wondering why this happened, but checking the ML log, Mark precisely followed my suggestion [1]. I just used "..." because I was too lazy to type the sentence fully. Sorry for the confusion. [1]: https://lore.kernel.org/all/CAK7LNAR6bXXk9-ZzZYpTqzFqdYbQsZHmiWspu27rtsFxvfRuVA@mail.gmail.com/ Fixes: `4a6795933a` ("kbuild: modpost: Explicitly warn about unprototyped symbols") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>	2022-04-03 03:11:51 +09:00
Linus Torvalds	6f34f8c3d6	Merge tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block Pull block driver fix from Jens Axboe: "Got two reports on nbd spewing warnings on load now, which is a regression from a commit that went into your tree yesterday. Revert the problematic change for now" * tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block: Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"	2022-04-02 11:03:03 -07:00
Linus Torvalds	9a212aaf95	Merge tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fix from Bjorn Helgaas: - Fix Hyper-V "defined but not used" build issue added during merge window (YueHaibing) * tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: hv: Remove unused hv_set_msi_entry_from_desc()	2022-04-02 10:54:52 -07:00
Linus Torvalds	02d4f8a3e0	Merge tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform updates from Benson Leung: "cros_ec_typec: - Check for EC device - Fix a crash when using the cros_ec_typec driver on older hardware not capable of typec commands - Make try power role optional - Mux configuration reorganization series from Prashant cros_ec_debugfs: - Fix use after free. Thanks Tzung-bi sensorhub: - cros_ec_sensorhub fixup - Split trace include file misc: - Add new mailing list for chrome-platform development: chrome-platform@lists.linux.dev Now with patchwork!" * tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_debugfs: detach log reader wq from devm platform: chrome: Split trace include file platform/chrome: cros_ec_typec: Update mux flags during partner removal platform/chrome: cros_ec_typec: Configure muxes at start of port update platform/chrome: cros_ec_typec: Get mux state inside configure_mux platform/chrome: cros_ec_typec: Move mux flag checks platform/chrome: cros_ec_typec: Check for EC device platform/chrome: cros_ec_typec: Make try power role optional MAINTAINERS: platform-chrome: Add new chrome-platform@lists.linux.dev list	2022-04-02 10:44:18 -07:00
Jens Axboe	7198bfc201	Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()" This reverts commit `6d35d04a9e`. Both Gabriel and Borislav report that this commit casues a regression with nbd: sysfs: cannot create duplicate filename '/dev/block/43:0' Revert it before 5.18-rc1 and we'll investigage this separately in due time. Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/ Reported-by: Gabriel L. Somlo <somlo@cmu.edu> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-02 11:40:23 -06:00
Eric Dumazet	b490207017	watch_queue: Free the page array when watch_queue is dismantled Commit `7ea1a0124b` ("watch_queue: Free the alloc bitmap when the watch_queue is torn down") took care of the bitmap, but not the page array. BUG: memory leak unreferenced object 0xffff88810d9bc140 (size 32): comm "syz-executor335", pid 3603, jiffies 4294946994 (age 12.840s) hex dump (first 32 bytes): 40 a7 40 04 00 ea ff ff 00 00 00 00 00 00 00 00 @.@............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: kmalloc_array include/linux/slab.h:621 [inline] kcalloc include/linux/slab.h:652 [inline] watch_queue_set_size+0x12f/0x2e0 kernel/watch_queue.c:251 pipe_ioctl+0x82/0x140 fs/pipe.c:632 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] Reported-by: syzbot+25ea042ae28f3888727a@syzkaller.appspotmail.com Fixes: `c73be61ced` ("pipe: Add general notification queue support") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20220322004654.618274-1-eric.dumazet@gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-02 10:37:39 -07:00
Steven Rostedt (Google)	1cd927ad6f	tracing: mark user_events as BROKEN After being merged, user_events become more visible to a wider audience that have concerns with the current API. It is too late to fix this for this release, but instead of a full revert, just mark it as BROKEN (which prevents it from being selected in make config). Then we can work finding a better API. If that fails, then it will need to be completely reverted. To not have the code silently bitrot, still allow building it with COMPILE_TEST. And to prevent the uapi header from being installed, then later changed, and then have an old distro user space see the old version, move the header file out of the uapi directory. Surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-02 10:32:14 -07:00
Steven Rostedt (Google)	5cfff569ca	tracing: Move user_events.h temporarily out of include/uapi While user_events API is under development and has been marked for broken to not let the API become fixed, move the header file out of the uapi directory. This is to prevent it from being installed, then later changed, and then have an old distro user space update with a new kernel, where applications see the user_events being available, but the old header is in place, and then they get compiled incorrectly. Also, surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.home Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:10 -04:00
Christophe Leroy	18bfee3216	ftrace: Make ftrace_graph_is_dead() a static branch ftrace_graph_is_dead() is used on hot paths, it just reads a variable in memory and is not worth suffering function call constraints. For instance, at entry of prepare_ftrace_return(), inlining it avoids saving prepare_ftrace_return() parameters to stack and restoring them after calling ftrace_graph_is_dead(). While at it using a static branch is even more performant and is rather well adapted considering that the returned value will almost never change. Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool by a static branch. The performance improvement is noticeable. Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:09 -04:00
Steven Rostedt (Google)	fcbf591ced	tracing: Set user_events to BROKEN After being merged, user_events become more visible to a wider audience that have concerns with the current API. It is too late to fix this for this release, but instead of a full revert, just mark it as BROKEN (which prevents it from being selected in make config). Then we can work finding a better API. If that fails, then it will need to be completely reverted. Link: https://lore.kernel.org/all/2059213643.196683.1648499088753.JavaMail.zimbra@efficios.com/ Link: https://lkml.kernel.org/r/20220330155835.5e1f6669@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:09 -04:00
Beau Belgrave	768c1e7f1d	tracing/user_events: Remove eBPF interfaces Remove eBPF interfaces within user_events to ensure they are fully reviewed. Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/ Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.com Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:09 -04:00
Beau Belgrave	efe34e99fc	tracing/user_events: Hold event_mutex during dyn_event_add Make sure the event_mutex is properly held during dyn_event_add call. This is required when adding dynamic events. Link: https://lkml.kernel.org/r/20220328223225.1992-1-beaub@linux.microsoft.com Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:09 -04:00
Lv Ruyi	bed5b60bf6	proc: bootconfig: Add null pointer check kzalloc is a memory allocation function which can return NULL when some internal memory errors happen. It is safer to add null pointer check. Link: https://lkml.kernel.org/r/20220329104004.2376879-1-lv.ruyi@zte.com.cn Cc: stable@vger.kernel.org Fixes: `c1a3c36017` ("proc: bootconfig: Add /proc/bootconfig to show boot config list") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:09 -04:00
Steven Rostedt (Google)	84055411d8	tracing: Rename the staging files for trace_events When looking for implementation of different phases of the creation of the TRACE_EVENT() macro, it is pretty useless when all helper macro redefinitions are in files labeled "stageX_defines.h". Rename them to state which phase the files are for. For instance, when looking for the defines that are used to create the event fields, seeing "stage4_event_fields.h" gives the developer a good idea that the defines are in that file. Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2022-04-02 08:40:04 -04:00
Li RongQing	c15e0ae42c	KVM: x86: fix sending PV IPI If apic_id is less than min, and (max - apic_id) is greater than KVM_IPI_CLUSTER_SIZE, then the third check condition is satisfied but the new apic_id does not fit the bitmask. In this case __send_ipi_mask should send the IPI. This is mostly theoretical, but it can happen if the apic_ids on three iterations of the loop are for example 1, KVM_IPI_CLUSTER_SIZE, 0. Fixes: `aaffcfd1e8` ("KVM: X86: Implement PV IPIs in linux guest") Signed-off-by: Li RongQing <lirongqing@baidu.com> Message-Id: <1646814944-51801-1-git-send-email-lirongqing@baidu.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-02 05:37:27 -04:00
Paolo Bonzini	2a8859f373	KVM: x86/mmu: do compare-and-exchange of gPTE via the user address FNAME(cmpxchg_gpte) is an inefficient mess. It is at least decent if it can go through get_user_pages_fast(), but if it cannot then it tries to use memremap(); that is not just terribly slow, it is also wrong because it assumes that the VM_PFNMAP VMA is contiguous. The right way to do it would be to do the same thing as hva_to_pfn_remapped() does since commit `add6a0cd1c` ("KVM: MMU: try to fix up page faults before giving up", 2016-07-05), using follow_pte() and fixup_user_fault() to determine the correct address to use for memremap(). To do this, one could for example extract hva_to_pfn() for use outside virt/kvm/kvm_main.c. But really there is no reason to do that either, because there is already a perfectly valid address to do the cmpxchg() on, only it is a userspace address. That means doing user_access_begin()/user_access_end() and writing the code in assembly to handle exceptions correctly. Worse, the guest PTE can be 8-byte even on i686 so there is the extra complication of using cmpxchg8b to account for. But at least it is an efficient mess. (Thanks to Linus for suggesting improvement on the inline assembly). Reported-by: Qiuhao Li <qiuhao@sysec.org> Reported-by: Gaoning Pan <pgn@zju.edu.cn> Reported-by: Yongkang Jia <kangel@zju.edu.cn> Reported-by: syzbot+6cde2282daa792c49ab8@syzkaller.appspotmail.com Debugged-by: Tadeusz Struk <tadeusz.struk@linaro.org> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Fixes: `bd53cb35a3` ("X86/KVM: Handle PFNs outside of kernel reach when touching GPTEs") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-02 05:37:27 -04:00
Zhenzhong Duan	4335edbbc1	KVM: x86: Remove redundant vm_entry_controls_clearbit() call When emulating exit from long mode, EFER_LMA is cleared with vmx_set_efer(). This will already unset the VM_ENTRY_IA32E_MODE control bit as requested by SDM, so there is no need to unset VM_ENTRY_IA32E_MODE again in exit_lmode() explicitly. In case EFER isn't supported by hardware, long mode isn't supported, so exit_lmode() cannot be reached. Note that, thanks to the shadow controls mechanism, this change doesn't eliminate vmread or vmwrite. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20220311102643.807507-3-zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-02 05:37:26 -04:00
Zhenzhong Duan	b76edfe91a	KVM: x86: cleanup enter_rmode() vmx_set_efer() sets uret->data but, in fact if the value of uret->data will be used vmx_setup_uret_msrs() will have rewritten it with the value returned by update_transition_efer(). uret->data is consumed if and only if uret->load_into_hardware is true, and vmx_setup_uret_msrs() takes care of (a) updating uret->data before setting uret->load_into_hardware to true (b) setting uret->load_into_hardware to false if uret->data isn't updated. Opportunistically use "vmx" directly instead of redoing to_vmx(). Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20220311102643.807507-2-zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-02 05:37:26 -04:00
Maxim Levitsky	8809931383	KVM: x86: SVM: fix tsc scaling when the host doesn't support it It was decided that when TSC scaling is not supported, the virtual MSR_AMD64_TSC_RATIO should still have the default '1.0' value. However in this case kvm_max_tsc_scaling_ratio is not set, which breaks various assumptions. Fix this by always calculating kvm_max_tsc_scaling_ratio regardless of host support. For consistency, do the same for VMX. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220322172449.235575-8-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-02 05:37:26 -04:00

1 2 3 4 5 ...

1088649 Commits