Merge 5.4-rc1-prerelease into android-mainline

To make the 5.4-rc1 merge easier, merge at a prerelease point in time
before the final release happens.

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I052c6a28528e10cdda89b6a20d320ac7562266b8
This commit is contained in:
Greg Kroah-Hartman
2019-10-02 18:36:30 +02:00
468 changed files with 9311 additions and 5592 deletions

View File

@@ -72,3 +72,37 @@ Description:
It is a read/write file. When read, the currently assigned
pretimeout governor is returned. When written, it sets
the pretimeout governor.
What: /sys/class/watchdog/watchdog1/access_cs0
Date: August 2019
Contact: Ivan Mikhaylov <i.mikhaylov@yadro.com>,
Alexander Amelkin <a.amelkin@yadro.com>
Description:
It is a read/write file. This attribute exists only if the
system has booted from the alternate flash chip due to
expiration of a watchdog timer of AST2400/AST2500 when
alternate boot function was enabled with 'aspeed,alt-boot'
devicetree option for that watchdog or with an appropriate
h/w strapping (for WDT2 only).
At alternate flash the 'access_cs0' sysfs node provides:
ast2400: a way to get access to the primary SPI flash
chip at CS0 after booting from the alternate
chip at CS1.
ast2500: a way to restore the normal address mapping
from (CS0->CS1, CS1->CS0) to (CS0->CS0,
CS1->CS1).
Clearing the boot code selection and timeout counter also
resets to the initial state the chip select line mapping. When
the SoC is in normal mapping state (i.e. booted from CS0),
clearing those bits does nothing for both versions of the SoC.
For alternate boot mode (booted from CS1 due to wdt2
expiration) the behavior differs as described above.
This option can be used with wdt2 (watchdog1) only.
When read, the current status of the boot code selection is
shown. When written with any non-zero value, it clears
the boot code selection and the timeout counter, which results
in chipselect reset for AST2400/AST2500.

View File

@@ -6,6 +6,8 @@ Required properties:
- "mediatek,mt7622-pwm": found on mt7622 SoC.
- "mediatek,mt7623-pwm": found on mt7623 SoC.
- "mediatek,mt7628-pwm": found on mt7628 SoC.
- "mediatek,mt7629-pwm", "mediatek,mt7622-pwm": found on mt7629 SoC.
- "mediatek,mt8516-pwm": found on mt8516 SoC.
- reg: physical base address and length of the controller's registers.
- #pwm-cells: must be 2. See pwm.txt in this directory for a description of
the cell format.

View File

@@ -0,0 +1,40 @@
Spreadtrum PWM controller
Spreadtrum SoCs PWM controller provides 4 PWM channels.
Required properties:
- compatible : Should be "sprd,ums512-pwm".
- reg: Physical base address and length of the controller's registers.
- clocks: The phandle and specifier referencing the controller's clocks.
- clock-names: Should contain following entries:
"pwmn": used to derive the functional clock for PWM channel n (n range: 0 ~ 3).
"enablen": for PWM channel n enable clock (n range: 0 ~ 3).
- #pwm-cells: Should be 2. See pwm.txt in this directory for a description of
the cells format.
Optional properties:
- assigned-clocks: Reference to the PWM clock entries.
- assigned-clock-parents: The phandle of the parent clock of PWM clock.
Example:
pwms: pwm@32260000 {
compatible = "sprd,ums512-pwm";
reg = <0 0x32260000 0 0x10000>;
clock-names = "pwm0", "enable0",
"pwm1", "enable1",
"pwm2", "enable2",
"pwm3", "enable3";
clocks = <&aon_clk CLK_PWM0>, <&aonapb_gate CLK_PWM0_EB>,
<&aon_clk CLK_PWM1>, <&aonapb_gate CLK_PWM1_EB>,
<&aon_clk CLK_PWM2>, <&aonapb_gate CLK_PWM2_EB>,
<&aon_clk CLK_PWM3>, <&aonapb_gate CLK_PWM3_EB>;
assigned-clocks = <&aon_clk CLK_PWM0>,
<&aon_clk CLK_PWM1>,
<&aon_clk CLK_PWM2>,
<&aon_clk CLK_PWM3>;
assigned-clock-parents = <&ext_26m>,
<&ext_26m>,
<&ext_26m>,
<&ext_26m>;
#pwm-cells = <2>;
};

View File

@@ -23,6 +23,7 @@ Required properties:
Optional property:
- little-endian : If present, the TMU registers are little endian. If absent,
the default is big endian.
- clocks : the clock for clocking the TMU silicon.
Example:

View File

@@ -0,0 +1,58 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/watchdog/allwinner,sun4i-a10-wdt.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Allwinner A10 Watchdog Device Tree Bindings
allOf:
- $ref: "watchdog.yaml#"
maintainers:
- Chen-Yu Tsai <wens@csie.org>
- Maxime Ripard <maxime.ripard@bootlin.com>
properties:
compatible:
oneOf:
- const: allwinner,sun4i-a10-wdt
- const: allwinner,sun6i-a31-wdt
- items:
- const: allwinner,sun50i-a64-wdt
- const: allwinner,sun6i-a31-wdt
- items:
- const: allwinner,sun50i-h6-wdt
- const: allwinner,sun6i-a31-wdt
- items:
- const: allwinner,suniv-f1c100s-wdt
- const: allwinner,sun4i-a10-wdt
reg:
maxItems: 1
clocks:
maxItems: 1
interrupts:
maxItems: 1
required:
- compatible
- reg
- clocks
- interrupts
unevaluatedProperties: false
examples:
- |
wdt: watchdog@1c20c90 {
compatible = "allwinner,sun4i-a10-wdt";
reg = <0x01c20c90 0x10>;
interrupts = <24>;
clocks = <&osc24M>;
timeout-sec = <10>;
};
...

View File

@@ -4,6 +4,7 @@ Required properties:
- compatible: must be one of:
- "aspeed,ast2400-wdt"
- "aspeed,ast2500-wdt"
- "aspeed,ast2600-wdt"
- reg: physical base address of the controller and length of memory mapped
region

View File

@@ -0,0 +1,22 @@
* Freescale i.MX7ULP Watchdog Timer (WDT) Controller
Required properties:
- compatible : Should be "fsl,imx7ulp-wdt"
- reg : Should contain WDT registers location and length
- interrupts : Should contain WDT interrupt
- clocks: Should contain a phandle pointing to the gated peripheral clock.
Optional properties:
- timeout-sec : Contains the watchdog timeout in seconds
Examples:
wdog1: watchdog@403d0000 {
compatible = "fsl,imx7ulp-wdt";
reg = <0x403d0000 0x10000>;
interrupts = <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&pcc2 IMX7ULP_CLK_WDG1>;
assigned-clocks = <&pcc2 IMX7ULP_CLK_WDG1>;
assigned-clocks-parents = <&scg1 IMX7ULP_CLK_FIRC_BUS_CLK>;
timeout-sec = <40>;
};

View File

@@ -1,22 +0,0 @@
Allwinner SoCs Watchdog timer
Required properties:
- compatible : should be one of
"allwinner,sun4i-a10-wdt"
"allwinner,sun6i-a31-wdt"
"allwinner,sun50i-a64-wdt","allwinner,sun6i-a31-wdt"
"allwinner,sun50i-h6-wdt","allwinner,sun6i-a31-wdt"
"allwinner,suniv-f1c100s-wdt", "allwinner,sun4i-a10-wdt"
- reg : Specifies base physical address and size of the registers.
Optional properties:
- timeout-sec : Contains the watchdog timeout in seconds
Example:
wdt: watchdog@1c20c90 {
compatible = "allwinner,sun4i-a10-wdt";
reg = <0x01c20c90 0x10>;
timeout-sec = <10>;
};

View File

@@ -0,0 +1,26 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/watchdog/watchdog.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Watchdog Generic Bindings
maintainers:
- Guenter Roeck <linux@roeck-us.net>
- Wim Van Sebroeck <wim@linux-watchdog.org>
description: |
This document describes generic bindings which can be used to
describe watchdog devices in a device tree.
properties:
$nodename:
pattern: "^watchdog(@.*|-[0-9a-f])?$"
timeout-sec:
$ref: /schemas/types.yaml#/definitions/uint32
description:
Contains the watchdog timeout in seconds.
...

View File

@@ -37,3 +37,13 @@ filesystem implementations.
journalling
fscrypt
fsverity
Filesystems
===========
Documentation for filesystem implementations.
.. toctree::
:maxdepth: 2
virtiofs

View File

@@ -0,0 +1,60 @@
.. SPDX-License-Identifier: GPL-2.0
===================================================
virtiofs: virtio-fs host<->guest shared file system
===================================================
- Copyright (C) 2019 Red Hat, Inc.
Introduction
============
The virtiofs file system for Linux implements a driver for the paravirtualized
VIRTIO "virtio-fs" device for guest<->host file system sharing. It allows a
guest to mount a directory that has been exported on the host.
Guests often require access to files residing on the host or remote systems.
Use cases include making files available to new guests during installation,
booting from a root file system located on the host, persistent storage for
stateless or ephemeral guests, and sharing a directory between guests.
Although it is possible to use existing network file systems for some of these
tasks, they require configuration steps that are hard to automate and they
expose the storage network to the guest. The virtio-fs device was designed to
solve these problems by providing file system access without networking.
Furthermore the virtio-fs device takes advantage of the co-location of the
guest and host to increase performance and provide semantics that are not
possible with network file systems.
Usage
=====
Mount file system with tag ``myfs`` on ``/mnt``:
.. code-block:: sh
guest# mount -t virtiofs myfs /mnt
Please see https://virtio-fs.gitlab.io/ for details on how to configure QEMU
and the virtiofsd daemon.
Internals
=========
Since the virtio-fs device uses the FUSE protocol for file system requests, the
virtiofs file system for Linux is integrated closely with the FUSE file system
client. The guest acts as the FUSE client while the host acts as the FUSE
server. The /dev/fuse interface between the kernel and userspace is replaced
with the virtio-fs device interface.
FUSE requests are placed into a virtqueue and processed by the host. The
response portion of the buffer is filled in by the host and the guest handles
the request completion.
Mapping /dev/fuse to virtqueues requires solving differences in semantics
between /dev/fuse and virtqueues. Each time the /dev/fuse device is read, the
FUSE client may choose which request to transfer, making it possible to
prioritize certain requests over others. Virtqueues have queue semantics and
it is not possible to change the order of requests that have been enqueued.
This is especially important if the virtqueue becomes full since it is then
impossible to add high priority requests. In order to address this difference,
the virtio-fs device uses a "hiprio" virtqueue specifically for requests that
have priority over normal requests.

View File

@@ -5309,3 +5309,16 @@ Architectures: x86
This capability indicates that KVM supports paravirtualized Hyper-V IPI send
hypercalls:
HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
8.21 KVM_CAP_HYPERV_DIRECT_TLBFLUSH
Architecture: x86
This capability indicates that KVM running on top of Hyper-V hypervisor
enables Direct TLB flush for its guests meaning that TLB flush
hypercalls are handled by Level 0 hypervisor (Hyper-V) bypassing KVM.
Due to the different ABI for hypercall parameters between Hyper-V and
KVM, enabling this capability effectively disables all hypercall
handling by KVM (as some KVM hypercall may be mistakenly treated as TLB
flush hypercalls by Hyper-V) so userspace should disable KVM identification
in CPUID and only exposes Hyper-V identification. In this case, guest
thinks it's running on Hyper-V and only use Hyper-V hypercalls.

View File

@@ -301,15 +301,6 @@ ixp4xx_wdt:
-------------------------------------------------
ks8695_wdt:
wdt_time:
Watchdog time in seconds. (default=5)
nowayout:
Watchdog cannot be stopped once started
(default=kernel config parameter)
-------------------------------------------------
machzwd:
nowayout:
Watchdog cannot be stopped once started
@@ -375,16 +366,6 @@ nic7018_wdt:
-------------------------------------------------
nuc900_wdt:
heartbeat:
Watchdog heartbeats in seconds.
(default = 15)
nowayout:
Watchdog cannot be stopped once started
(default=kernel config parameter)
-------------------------------------------------
omap_wdt:
timer_margin:
initial watchdog timeout (in seconds)

View File

@@ -9060,6 +9060,7 @@ F: include/keys/trusted.h
KEYS/KEYRINGS:
M: David Howells <dhowells@redhat.com>
M: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
L: keyrings@vger.kernel.org
S: Maintained
F: Documentation/security/keys/core.rst
@@ -13245,9 +13246,11 @@ F: drivers/media/rc/pwm-ir-tx.c
PWM SUBSYSTEM
M: Thierry Reding <thierry.reding@gmail.com>
R: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
L: linux-pwm@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm.git
Q: https://patchwork.ozlabs.org/project/linux-pwm/list/
F: Documentation/driver-api/pwm.rst
F: Documentation/devicetree/bindings/pwm/
F: include/linux/pwm.h
@@ -13256,6 +13259,7 @@ F: drivers/video/backlight/pwm_bl.c
F: include/linux/pwm_backlight.h
F: drivers/gpio/gpio-mvebu.c
F: Documentation/devicetree/bindings/gpio/gpio-mvebu.txt
K: pwm_(config|apply_state|ops)
PXA GPIO DRIVER
M: Robert Jarzmik <robert.jarzmik@free.fr>
@@ -16071,6 +16075,7 @@ THERMAL
M: Zhang Rui <rui.zhang@intel.com>
M: Eduardo Valentin <edubezval@gmail.com>
R: Daniel Lezcano <daniel.lezcano@linaro.org>
R: Amit Kucheria <amit.kucheria@verdurent.com>
L: linux-pm@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal.git
@@ -17275,6 +17280,18 @@ S: Supported
F: drivers/s390/virtio/
F: arch/s390/include/uapi/asm/virtio-ccw.h
VIRTIO FILE SYSTEM
M: Vivek Goyal <vgoyal@redhat.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
M: Miklos Szeredi <miklos@szeredi.hu>
L: virtualization@lists.linux-foundation.org
L: linux-fsdevel@vger.kernel.org
W: https://virtio-fs.gitlab.io/
S: Supported
F: fs/fuse/virtio_fs.c
F: include/uapi/linux/virtio_fs.h
F: Documentation/filesystems/virtiofs.rst
VIRTIO GPU DRIVER
M: David Airlie <airlied@linux.ie>
M: Gerd Hoffmann <kraxel@redhat.com>

View File

@@ -123,7 +123,7 @@ asmlinkage void __init nios2_boot_init(unsigned r4, unsigned r5, unsigned r6,
dtb_passed = r6;
if (r7)
strncpy(cmdline_passed, (char *)r7, COMMAND_LINE_SIZE);
strlcpy(cmdline_passed, (char *)r7, COMMAND_LINE_SIZE);
}
#endif
@@ -131,10 +131,10 @@ asmlinkage void __init nios2_boot_init(unsigned r4, unsigned r5, unsigned r6,
#ifndef CONFIG_CMDLINE_FORCE
if (cmdline_passed[0])
strncpy(boot_command_line, cmdline_passed, COMMAND_LINE_SIZE);
strlcpy(boot_command_line, cmdline_passed, COMMAND_LINE_SIZE);
#ifdef CONFIG_NIOS2_CMDLINE_IGNORE_DTB
else
strncpy(boot_command_line, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
strlcpy(boot_command_line, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
#endif
#endif

View File

@@ -13,6 +13,7 @@
aliases {
serial0 = &uart0;
serial1 = &uart1;
ethernet0 = &eth0;
};
chosen {
@@ -60,7 +61,6 @@
};
};
cpu2: cpu@2 {
clock-frequency = <0>;
compatible = "sifive,u54-mc", "sifive,rocket0", "riscv";
d-cache-block-size = <64>;
d-cache-sets = <64>;
@@ -84,7 +84,6 @@
};
};
cpu3: cpu@3 {
clock-frequency = <0>;
compatible = "sifive,u54-mc", "sifive,rocket0", "riscv";
d-cache-block-size = <64>;
d-cache-sets = <64>;
@@ -108,7 +107,6 @@
};
};
cpu4: cpu@4 {
clock-frequency = <0>;
compatible = "sifive,u54-mc", "sifive,rocket0", "riscv";
d-cache-block-size = <64>;
d-cache-sets = <64>;
@@ -230,6 +228,24 @@
#size-cells = <0>;
status = "disabled";
};
pwm0: pwm@10020000 {
compatible = "sifive,fu540-c000-pwm", "sifive,pwm0";
reg = <0x0 0x10020000 0x0 0x1000>;
interrupt-parent = <&plic0>;
interrupts = <42 43 44 45>;
clocks = <&prci PRCI_CLK_TLCLK>;
#pwm-cells = <3>;
status = "disabled";
};
pwm1: pwm@10021000 {
compatible = "sifive,fu540-c000-pwm", "sifive,pwm0";
reg = <0x0 0x10021000 0x0 0x1000>;
interrupt-parent = <&plic0>;
interrupts = <46 47 48 49>;
clocks = <&prci PRCI_CLK_TLCLK>;
#pwm-cells = <3>;
status = "disabled";
};
};
};

View File

@@ -85,3 +85,11 @@
reg = <0>;
};
};
&pwm0 {
status = "okay";
};
&pwm1 {
status = "okay";
};

View File

@@ -29,6 +29,8 @@ CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
CONFIG_NETLINK_DIAG=y
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
CONFIG_PCI=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCI_HOST_GENERIC=y
@@ -39,6 +41,7 @@ CONFIG_BLK_DEV_LOOP=y
CONFIG_VIRTIO_BLK=y
CONFIG_BLK_DEV_SD=y
CONFIG_BLK_DEV_SR=y
CONFIG_SCSI_VIRTIO=y
CONFIG_ATA=y
CONFIG_SATA_AHCI=y
CONFIG_SATA_AHCI_PLATFORM=y
@@ -54,6 +57,7 @@ CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_OF_PLATFORM=y
CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
CONFIG_HVC_RISCV_SBI=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_SPI=y
@@ -61,6 +65,7 @@ CONFIG_SPI_SIFIVE=y
# CONFIG_PTP_1588_CLOCK is not set
CONFIG_DRM=y
CONFIG_DRM_RADEON=y
CONFIG_DRM_VIRTIO_GPU=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_USB=y
CONFIG_USB_XHCI_HCD=y
@@ -73,7 +78,12 @@ CONFIG_USB_STORAGE=y
CONFIG_USB_UAS=y
CONFIG_MMC=y
CONFIG_MMC_SPI=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_VIRTIO_INPUT=y
CONFIG_VIRTIO_MMIO=y
CONFIG_RPMSG_CHAR=y
CONFIG_RPMSG_VIRTIO=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_AUTOFS4_FS=y
@@ -86,6 +96,7 @@ CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_ROOT_NFS=y
CONFIG_9P_FS=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CRYPTO_DEV_VIRTIO=y
CONFIG_PRINTK_TIME=y

View File

@@ -29,6 +29,8 @@ CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
CONFIG_NETLINK_DIAG=y
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
CONFIG_PCI=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCI_HOST_GENERIC=y
@@ -39,6 +41,7 @@ CONFIG_BLK_DEV_LOOP=y
CONFIG_VIRTIO_BLK=y
CONFIG_BLK_DEV_SD=y
CONFIG_BLK_DEV_SR=y
CONFIG_SCSI_VIRTIO=y
CONFIG_ATA=y
CONFIG_SATA_AHCI=y
CONFIG_SATA_AHCI_PLATFORM=y
@@ -54,11 +57,13 @@ CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_OF_PLATFORM=y
CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
CONFIG_HVC_RISCV_SBI=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_VIRTIO=y
# CONFIG_PTP_1588_CLOCK is not set
CONFIG_DRM=y
CONFIG_DRM_RADEON=y
CONFIG_DRM_VIRTIO_GPU=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_USB=y
CONFIG_USB_XHCI_HCD=y
@@ -69,7 +74,12 @@ CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PLATFORM=y
CONFIG_USB_STORAGE=y
CONFIG_USB_UAS=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_VIRTIO_INPUT=y
CONFIG_VIRTIO_MMIO=y
CONFIG_RPMSG_CHAR=y
CONFIG_RPMSG_VIRTIO=y
CONFIG_SIFIVE_PLIC=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
@@ -83,6 +93,7 @@ CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_ROOT_NFS=y
CONFIG_9P_FS=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CRYPTO_DEV_VIRTIO=y
CONFIG_PRINTK_TIME=y

View File

@@ -83,6 +83,18 @@ extern pgd_t swapper_pg_dir[];
#define __S110 PAGE_SHARED_EXEC
#define __S111 PAGE_SHARED_EXEC
#define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
#define VMALLOC_END (PAGE_OFFSET - 1)
#define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE)
#define FIXADDR_TOP VMALLOC_START
#ifdef CONFIG_64BIT
#define FIXADDR_SIZE PMD_SIZE
#else
#define FIXADDR_SIZE PGDIR_SIZE
#endif
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
/*
* Roughly size the vmemmap space to be large enough to fit enough
* struct pages to map half the virtual address space. Then
@@ -424,18 +436,6 @@ extern void *dtb_early_va;
extern void setup_bootmem(void);
extern void paging_init(void);
#define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
#define VMALLOC_END (PAGE_OFFSET - 1)
#define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE)
#define FIXADDR_TOP VMALLOC_START
#ifdef CONFIG_64BIT
#define FIXADDR_SIZE PMD_SIZE
#else
#define FIXADDR_SIZE PGDIR_SIZE
#endif
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
/*
* Task size is 0x4000000000 for RV64 or 0x9fc00000 for RV32.
* Note that PGDIR_SIZE must evenly divide TASK_SIZE.

View File

@@ -166,9 +166,13 @@ ENTRY(handle_exception)
move a0, sp /* pt_regs */
tail do_IRQ
1:
/* Exceptions run with interrupts enabled */
/* Exceptions run with interrupts enabled or disabled
depending on the state of sstatus.SR_SPIE */
andi t0, s1, SR_SPIE
beqz t0, 1f
csrs CSR_SSTATUS, SR_SIE
1:
/* Handle syscalls */
li t0, EXC_SYSCALL
beq s4, t0, handle_syscall

View File

@@ -63,6 +63,11 @@ _start_kernel:
li t0, SR_FS
csrc CSR_SSTATUS, t0
#ifdef CONFIG_SMP
li t0, CONFIG_NR_CPUS
bgeu a0, t0, .Lsecondary_park
#endif
/* Pick one hart to run the main boot sequence */
la a3, hart_lottery
li a2, 1
@@ -154,9 +159,6 @@ relocate:
.Lsecondary_start:
#ifdef CONFIG_SMP
li a1, CONFIG_NR_CPUS
bgeu a0, a1, .Lsecondary_park
/* Set trap vector to spin forever to help debug */
la a3, .Lsecondary_park
csrw CSR_STVEC, a3

View File

@@ -206,3 +206,4 @@ void smp_send_reschedule(int cpu)
{
send_ipi_single(cpu, IPI_RESCHEDULE);
}
EXPORT_SYMBOL_GPL(smp_send_reschedule);

View File

@@ -9,6 +9,7 @@
#include <asm/sbi.h>
unsigned long riscv_timebase;
EXPORT_SYMBOL_GPL(riscv_timebase);
void __init time_init(void)
{

View File

@@ -180,7 +180,15 @@
/* Recommend using enlightened VMCS */
#define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED BIT(14)
/*
* Virtual processor will never share a physical core with another virtual
* processor, except for virtual processors that are reported as sibling SMT
* threads.
*/
#define HV_X64_NO_NONARCH_CORESHARING BIT(18)
/* Nested features. These are HYPERV_CPUID_NESTED_FEATURES.EAX bits. */
#define HV_X64_NESTED_DIRECT_FLUSH BIT(17)
#define HV_X64_NESTED_GUEST_MAPPING_FLUSH BIT(18)
#define HV_X64_NESTED_MSR_BITMAP BIT(19)
@@ -524,14 +532,24 @@ struct hv_timer_message_payload {
__u64 delivery_time; /* When the message was delivered */
} __packed;
struct hv_nested_enlightenments_control {
struct {
__u32 directhypercall:1;
__u32 reserved:31;
} features;
struct {
__u32 reserved;
} hypercallControls;
} __packed;
/* Define virtual processor assist page structure. */
struct hv_vp_assist_page {
__u32 apic_assist;
__u32 reserved;
__u64 vtl_control[2];
__u64 nested_enlightenments_control[2];
__u32 enlighten_vmentry;
__u32 padding;
__u32 reserved1;
__u64 vtl_control[3];
struct hv_nested_enlightenments_control nested_control;
__u8 enlighten_vmentry;
__u8 reserved2[7];
__u64 current_nested_vmcs;
} __packed;
@@ -882,4 +900,7 @@ struct hv_tlb_flush_ex {
u64 gva_list[];
} __packed;
struct hv_partition_assist_pg {
u32 tlb_lock_count;
};
#endif

View File

@@ -320,6 +320,7 @@ struct kvm_mmu_page {
struct list_head link;
struct hlist_node hash_link;
bool unsync;
u8 mmu_valid_gen;
bool mmio_cached;
/*
@@ -335,7 +336,6 @@ struct kvm_mmu_page {
int root_count; /* Currently serving as active root */
unsigned int unsync_children;
struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
unsigned long mmu_valid_gen;
DECLARE_BITMAP(unsync_child_bitmap, 512);
#ifdef CONFIG_X86_32
@@ -844,6 +844,8 @@ struct kvm_hv {
/* How many vCPUs have VP index != vCPU index */
atomic_t num_mismatched_vp_indexes;
struct hv_partition_assist_pg *hv_pa_pg;
};
enum kvm_irqchip_mode {
@@ -857,12 +859,13 @@ struct kvm_arch {
unsigned long n_requested_mmu_pages;
unsigned long n_max_mmu_pages;
unsigned int indirect_shadow_pages;
unsigned long mmu_valid_gen;
u8 mmu_valid_gen;
struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
/*
* Hash table of struct kvm_mmu_page.
*/
struct list_head active_mmu_pages;
struct list_head zapped_obsolete_pages;
struct kvm_page_track_notifier_node mmu_sp_tracker;
struct kvm_page_track_notifier_head track_notifier_head;
@@ -1213,6 +1216,7 @@ struct kvm_x86_ops {
bool (*need_emulation_on_page_fault)(struct kvm_vcpu *vcpu);
bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
};
struct kvm_arch_async_pf {
@@ -1312,18 +1316,42 @@ extern u64 kvm_default_tsc_scaling_ratio;
extern u64 kvm_mce_cap_supported;
enum emulation_result {
EMULATE_DONE, /* no further processing */
EMULATE_USER_EXIT, /* kvm_run ready for userspace exit */
EMULATE_FAIL, /* can't emulate this instruction */
};
/*
* EMULTYPE_NO_DECODE - Set when re-emulating an instruction (after completing
* userspace I/O) to indicate that the emulation context
* should be resued as is, i.e. skip initialization of
* emulation context, instruction fetch and decode.
*
* EMULTYPE_TRAP_UD - Set when emulating an intercepted #UD from hardware.
* Indicates that only select instructions (tagged with
* EmulateOnUD) should be emulated (to minimize the emulator
* attack surface). See also EMULTYPE_TRAP_UD_FORCED.
*
* EMULTYPE_SKIP - Set when emulating solely to skip an instruction, i.e. to
* decode the instruction length. For use *only* by
* kvm_x86_ops->skip_emulated_instruction() implementations.
*
* EMULTYPE_ALLOW_RETRY - Set when the emulator should resume the guest to
* retry native execution under certain conditions.
*
* EMULTYPE_TRAP_UD_FORCED - Set when emulating an intercepted #UD that was
* triggered by KVM's magic "force emulation" prefix,
* which is opt in via module param (off by default).
* Bypasses EmulateOnUD restriction despite emulating
* due to an intercepted #UD (see EMULTYPE_TRAP_UD).
* Used to test the full emulator from userspace.
*
* EMULTYPE_VMWARE_GP - Set when emulating an intercepted #GP for VMware
* backdoor emulation, which is opt in via module param.
* VMware backoor emulation handles select instructions
* and reinjects the #GP for all other cases.
*/
#define EMULTYPE_NO_DECODE (1 << 0)
#define EMULTYPE_TRAP_UD (1 << 1)
#define EMULTYPE_SKIP (1 << 2)
#define EMULTYPE_ALLOW_RETRY (1 << 3)
#define EMULTYPE_NO_UD_ON_FAIL (1 << 4)
#define EMULTYPE_VMWARE (1 << 5)
#define EMULTYPE_TRAP_UD_FORCED (1 << 4)
#define EMULTYPE_VMWARE_GP (1 << 5)
int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type);
int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu,
void *insn, int insn_len);
@@ -1506,7 +1534,7 @@ enum {
#define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0)
#define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
asmlinkage void __noreturn kvm_spurious_fault(void);
asmlinkage void kvm_spurious_fault(void);
/*
* Hardware virtualization extension instructions may fault if a
@@ -1514,24 +1542,14 @@ asmlinkage void __noreturn kvm_spurious_fault(void);
* Usually after catching the fault we just panic; during reboot
* instead the instruction is ignored.
*/
#define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \
#define __kvm_handle_fault_on_reboot(insn) \
"666: \n\t" \
insn "\n\t" \
"jmp 668f \n\t" \
"667: \n\t" \
"call kvm_spurious_fault \n\t" \
"668: \n\t" \
".pushsection .fixup, \"ax\" \n\t" \
"700: \n\t" \
cleanup_insn "\n\t" \
"cmpb $0, kvm_rebooting\n\t" \
"je 667b \n\t" \
"jmp 668b \n\t" \
".popsection \n\t" \
_ASM_EXTABLE(666b, 700b)
#define __kvm_handle_fault_on_reboot(insn) \
____kvm_handle_fault_on_reboot(insn, "")
_ASM_EXTABLE(666b, 667b)
#define KVM_ARCH_WANT_MMU_NOTIFIER
int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end);

View File

@@ -52,6 +52,7 @@ enum {
INTERCEPT_MWAIT,
INTERCEPT_MWAIT_COND,
INTERCEPT_XSETBV,
INTERCEPT_RDPRU,
};

View File

@@ -69,6 +69,7 @@
#define SECONDARY_EXEC_PT_USE_GPA 0x01000000
#define SECONDARY_EXEC_MODE_BASED_EPT_EXEC 0x00400000
#define SECONDARY_EXEC_TSC_SCALING 0x02000000
#define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE 0x04000000
#define PIN_BASED_EXT_INTR_MASK 0x00000001
#define PIN_BASED_NMI_EXITING 0x00000008
@@ -110,6 +111,7 @@
#define VMX_MISC_SAVE_EFER_LMA 0x00000020
#define VMX_MISC_ACTIVITY_HLT 0x00000040
#define VMX_MISC_ZERO_LEN_INS 0x40000000
#define VMX_MISC_MSR_LIST_MULTIPLIER 512
/* VMFUNC functions */
#define VMX_VMFUNC_EPTP_SWITCHING 0x00000001

View File

@@ -75,6 +75,7 @@
#define SVM_EXIT_MWAIT 0x08b
#define SVM_EXIT_MWAIT_COND 0x08c
#define SVM_EXIT_XSETBV 0x08d
#define SVM_EXIT_RDPRU 0x08e
#define SVM_EXIT_NPF 0x400
#define SVM_EXIT_AVIC_INCOMPLETE_IPI 0x401
#define SVM_EXIT_AVIC_UNACCELERATED_ACCESS 0x402

View File

@@ -86,6 +86,8 @@
#define EXIT_REASON_PML_FULL 62
#define EXIT_REASON_XSAVES 63
#define EXIT_REASON_XRSTORS 64
#define EXIT_REASON_UMWAIT 67
#define EXIT_REASON_TPAUSE 68
#define VMX_EXIT_REASONS \
{ EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \
@@ -144,7 +146,9 @@
{ EXIT_REASON_RDSEED, "RDSEED" }, \
{ EXIT_REASON_PML_FULL, "PML_FULL" }, \
{ EXIT_REASON_XSAVES, "XSAVES" }, \
{ EXIT_REASON_XRSTORS, "XRSTORS" }
{ EXIT_REASON_XRSTORS, "XRSTORS" }, \
{ EXIT_REASON_UMWAIT, "UMWAIT" }, \
{ EXIT_REASON_TPAUSE, "TPAUSE" }
#define VMX_ABORT_SAVE_GUEST_MSR_FAIL 1
#define VMX_ABORT_LOAD_HOST_PDPTE_FAIL 2

View File

@@ -17,6 +17,12 @@
*/
static u32 umwait_control_cached = UMWAIT_CTRL_VAL(100000, UMWAIT_C02_ENABLE);
u32 get_umwait_control_msr(void)
{
return umwait_control_cached;
}
EXPORT_SYMBOL_GPL(get_umwait_control_msr);
/*
* Cache the original IA32_UMWAIT_CONTROL MSR value which is configured by
* hardware or BIOS before kernel boot.

View File

@@ -304,7 +304,13 @@ static void do_host_cpuid(struct kvm_cpuid_entry2 *entry, u32 function,
case 7:
case 0xb:
case 0xd:
case 0xf:
case 0x10:
case 0x12:
case 0x14:
case 0x17:
case 0x18:
case 0x1f:
case 0x8000001d:
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
break;
@@ -360,7 +366,7 @@ static inline void do_cpuid_7_mask(struct kvm_cpuid_entry2 *entry, int index)
F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B);
F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/;
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =

View File

@@ -23,6 +23,7 @@
#include "ioapic.h"
#include "hyperv.h"
#include <linux/cpu.h>
#include <linux/kvm_host.h>
#include <linux/highmem.h>
#include <linux/sched/cputime.h>
@@ -645,7 +646,9 @@ static int stimer_notify_direct(struct kvm_vcpu_hv_stimer *stimer)
.vector = stimer->config.apic_vector
};
return !kvm_apic_set_irq(vcpu, &irq, NULL);
if (lapic_in_kernel(vcpu))
return !kvm_apic_set_irq(vcpu, &irq, NULL);
return 0;
}
static void stimer_expiration(struct kvm_vcpu_hv_stimer *stimer)
@@ -1852,7 +1855,13 @@ int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
ent->edx |= HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
/*
* Direct Synthetic timers only make sense with in-kernel
* LAPIC
*/
if (lapic_in_kernel(vcpu))
ent->edx |= HV_STIMER_DIRECT_MODE_AVAILABLE;
break;
@@ -1864,7 +1873,8 @@ int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;
if (evmcs_ver)
ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
if (!cpu_smt_possible())
ent->eax |= HV_X64_NO_NONARCH_CORESHARING;
/*
* Default number of spinlock retry attempts, matches
* HyperV 2016.

View File

@@ -65,7 +65,9 @@
#define APIC_BROADCAST 0xFF
#define X2APIC_BROADCAST 0xFFFFFFFFul
#define LAPIC_TIMER_ADVANCE_ADJUST_DONE 100
static bool lapic_timer_advance_dynamic __read_mostly;
#define LAPIC_TIMER_ADVANCE_ADJUST_MIN 100
#define LAPIC_TIMER_ADVANCE_ADJUST_MAX 5000
#define LAPIC_TIMER_ADVANCE_ADJUST_INIT 1000
/* step-by-step approximation to mitigate fluctuation */
#define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
@@ -1485,26 +1487,25 @@ static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns;
u64 ns;
/* Do not adjust for tiny fluctuations or large random spikes. */
if (abs(advance_expire_delta) > LAPIC_TIMER_ADVANCE_ADJUST_MAX ||
abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_MIN)
return;
/* too early */
if (advance_expire_delta < 0) {
ns = -advance_expire_delta * 1000000ULL;
do_div(ns, vcpu->arch.virtual_tsc_khz);
timer_advance_ns -= min((u32)ns,
timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
timer_advance_ns -= ns/LAPIC_TIMER_ADVANCE_ADJUST_STEP;
} else {
/* too late */
ns = advance_expire_delta * 1000000ULL;
do_div(ns, vcpu->arch.virtual_tsc_khz);
timer_advance_ns += min((u32)ns,
timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
timer_advance_ns += ns/LAPIC_TIMER_ADVANCE_ADJUST_STEP;
}
if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE)
apic->lapic_timer.timer_advance_adjust_done = true;
if (unlikely(timer_advance_ns > 5000)) {
if (unlikely(timer_advance_ns > LAPIC_TIMER_ADVANCE_ADJUST_MAX))
timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
apic->lapic_timer.timer_advance_adjust_done = false;
}
apic->lapic_timer.timer_advance_ns = timer_advance_ns;
}
@@ -1524,7 +1525,7 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
if (guest_tsc < tsc_deadline)
__wait_lapic_expire(vcpu, tsc_deadline - guest_tsc);
if (unlikely(!apic->lapic_timer.timer_advance_adjust_done))
if (lapic_timer_advance_dynamic)
adjust_lapic_timer_advance(vcpu, apic->lapic_timer.advance_expire_delta);
}
@@ -2302,13 +2303,12 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
apic->lapic_timer.timer.function = apic_timer_fn;
if (timer_advance_ns == -1) {
apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
apic->lapic_timer.timer_advance_adjust_done = false;
lapic_timer_advance_dynamic = true;
} else {
apic->lapic_timer.timer_advance_ns = timer_advance_ns;
apic->lapic_timer.timer_advance_adjust_done = true;
lapic_timer_advance_dynamic = false;
}
/*
* APIC is created enabled. This will prevent kvm_lapic_set_base from
* thinking that APIC state has changed.

View File

@@ -35,7 +35,6 @@ struct kvm_timer {
s64 advance_expire_delta;
atomic_t pending; /* accumulated triggered timers */
bool hv_timer_in_use;
bool timer_advance_adjust_done;
};
struct kvm_lapic {

View File

@@ -403,8 +403,6 @@ static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
mask |= (gpa & shadow_nonpresent_or_rsvd_mask)
<< shadow_nonpresent_or_rsvd_mask_len;
page_header(__pa(sptep))->mmio_cached = true;
trace_mark_mmio_spte(sptep, gfn, access, gen);
mmu_spte_set(sptep, mask);
}
@@ -2103,6 +2101,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct
* depends on valid pages being added to the head of the list. See
* comments in kvm_zap_obsolete_pages().
*/
sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen;
list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
kvm_mod_used_mmu_pages(vcpu->kvm, +1);
return sp;
@@ -2252,7 +2251,7 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm,
#define for_each_valid_sp(_kvm, _sp, _gfn) \
hlist_for_each_entry(_sp, \
&(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)], hash_link) \
if (is_obsolete_sp((_kvm), (_sp)) || (_sp)->role.invalid) { \
if (is_obsolete_sp((_kvm), (_sp))) { \
} else
#define for_each_gfn_indirect_valid_sp(_kvm, _sp, _gfn) \
@@ -2311,7 +2310,8 @@ static void mmu_audit_disable(void) { }
static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
{
return unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
return sp->role.invalid ||
unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
}
static bool kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
@@ -2538,7 +2538,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
if (level > PT_PAGE_TABLE_LEVEL && need_sync)
flush |= kvm_sync_pages(vcpu, gfn, &invalid_list);
}
sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen;
clear_page(sp->spt);
trace_kvm_mmu_get_page(sp, true);
@@ -2753,7 +2752,12 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm,
} else {
list_move(&sp->link, &kvm->arch.active_mmu_pages);
if (!sp->role.invalid)
/*
* Obsolete pages cannot be used on any vCPUs, see the comment
* in kvm_mmu_zap_all_fast(). Note, is_obsolete_sp() also
* treats invalid shadow pages as being obsolete.
*/
if (!is_obsolete_sp(kvm, sp))
kvm_reload_remote_mmus(kvm);
}
@@ -5383,7 +5387,6 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u64 error_code,
void *insn, int insn_len)
{
int r, emulation_type = 0;
enum emulation_result er;
bool direct = vcpu->arch.mmu->direct_map;
/* With shadow page tables, fault_address contains a GVA or nGPA. */
@@ -5450,19 +5453,8 @@ emulate:
return 1;
}
er = x86_emulate_instruction(vcpu, cr2, emulation_type, insn, insn_len);
switch (er) {
case EMULATE_DONE:
return 1;
case EMULATE_USER_EXIT:
++vcpu->stat.mmio_exits;
/* fall through */
case EMULATE_FAIL:
return 0;
default:
BUG();
}
return x86_emulate_instruction(vcpu, cr2, emulation_type, insn,
insn_len);
}
EXPORT_SYMBOL_GPL(kvm_mmu_page_fault);
@@ -5684,12 +5676,11 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
return ret;
}
#define BATCH_ZAP_PAGES 10
static void kvm_zap_obsolete_pages(struct kvm *kvm)
{
struct kvm_mmu_page *sp, *node;
LIST_HEAD(invalid_list);
int ign;
int nr_zapped, batch = 0;
restart:
list_for_each_entry_safe_reverse(sp, node,
@@ -5702,46 +5693,39 @@ restart:
break;
/*
* Do not repeatedly zap a root page to avoid unnecessary
* KVM_REQ_MMU_RELOAD, otherwise we may not be able to
* progress:
* vcpu 0 vcpu 1
* call vcpu_enter_guest():
* 1): handle KVM_REQ_MMU_RELOAD
* and require mmu-lock to
* load mmu
* repeat:
* 1): zap root page and
* send KVM_REQ_MMU_RELOAD
*
* 2): if (cond_resched_lock(mmu-lock))
*
* 2): hold mmu-lock and load mmu
*
* 3): see KVM_REQ_MMU_RELOAD bit
* on vcpu->requests is set
* then return 1 to call
* vcpu_enter_guest() again.
* goto repeat;
*
* Since we are reversely walking the list and the invalid
* list will be moved to the head, skip the invalid page
* can help us to avoid the infinity list walking.
* Skip invalid pages with a non-zero root count, zapping pages
* with a non-zero root count will never succeed, i.e. the page
* will get thrown back on active_mmu_pages and we'll get stuck
* in an infinite loop.
*/
if (sp->role.invalid)
if (sp->role.invalid && sp->root_count)
continue;
if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
kvm_mmu_commit_zap_page(kvm, &invalid_list);
cond_resched_lock(&kvm->mmu_lock);
/*
* No need to flush the TLB since we're only zapping shadow
* pages with an obsolete generation number and all vCPUS have
* loaded a new root, i.e. the shadow pages being zapped cannot
* be in active use by the guest.
*/
if (batch >= BATCH_ZAP_PAGES &&
cond_resched_lock(&kvm->mmu_lock)) {
batch = 0;
goto restart;
}
if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
if (__kvm_mmu_prepare_zap_page(kvm, sp,
&kvm->arch.zapped_obsolete_pages, &nr_zapped)) {
batch += nr_zapped;
goto restart;
}
}
kvm_mmu_commit_zap_page(kvm, &invalid_list);
/*
* Trigger a remote TLB flush before freeing the page tables to ensure
* KVM is not in the middle of a lockless shadow page table walk, which
* may reference the pages.
*/
kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
}
/*
@@ -5755,13 +5739,39 @@ restart:
*/
static void kvm_mmu_zap_all_fast(struct kvm *kvm)
{
lockdep_assert_held(&kvm->slots_lock);
spin_lock(&kvm->mmu_lock);
kvm->arch.mmu_valid_gen++;
trace_kvm_mmu_zap_all_fast(kvm);
/*
* Toggle mmu_valid_gen between '0' and '1'. Because slots_lock is
* held for the entire duration of zapping obsolete pages, it's
* impossible for there to be multiple invalid generations associated
* with *valid* shadow pages at any given time, i.e. there is exactly
* one valid generation and (at most) one invalid generation.
*/
kvm->arch.mmu_valid_gen = kvm->arch.mmu_valid_gen ? 0 : 1;
/*
* Notify all vcpus to reload its shadow page table and flush TLB.
* Then all vcpus will switch to new shadow page table with the new
* mmu_valid_gen.
*
* Note: we need to do this under the protection of mmu_lock,
* otherwise, vcpu would purge shadow page but miss tlb flush.
*/
kvm_reload_remote_mmus(kvm);
kvm_zap_obsolete_pages(kvm);
spin_unlock(&kvm->mmu_lock);
}
static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
{
return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
}
static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot,
struct kvm_page_track_notifier_node *node)
@@ -5959,7 +5969,7 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm,
}
EXPORT_SYMBOL_GPL(kvm_mmu_slot_set_dirty);
static void __kvm_mmu_zap_all(struct kvm *kvm, bool mmio_only)
void kvm_mmu_zap_all(struct kvm *kvm)
{
struct kvm_mmu_page *sp, *node;
LIST_HEAD(invalid_list);
@@ -5968,14 +5978,10 @@ static void __kvm_mmu_zap_all(struct kvm *kvm, bool mmio_only)
spin_lock(&kvm->mmu_lock);
restart:
list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
if (mmio_only && !sp->mmio_cached)
continue;
if (sp->role.invalid && sp->root_count)
continue;
if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign)) {
WARN_ON_ONCE(mmio_only);
if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
goto restart;
}
if (cond_resched_lock(&kvm->mmu_lock))
goto restart;
}
@@ -5984,11 +5990,6 @@ restart:
spin_unlock(&kvm->mmu_lock);
}
void kvm_mmu_zap_all(struct kvm *kvm)
{
return __kvm_mmu_zap_all(kvm, false);
}
void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
{
WARN_ON(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS);
@@ -6010,7 +6011,7 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
*/
if (unlikely(gen == 0)) {
kvm_debug_ratelimited("kvm: zapping shadow pages for mmio generation wraparound\n");
__kvm_mmu_zap_all(kvm, true);
kvm_mmu_zap_all_fast(kvm);
}
}
@@ -6041,16 +6042,24 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
* want to shrink a VM that only started to populate its MMU
* anyway.
*/
if (!kvm->arch.n_used_mmu_pages)
if (!kvm->arch.n_used_mmu_pages &&
!kvm_has_zapped_obsolete_pages(kvm))
continue;
idx = srcu_read_lock(&kvm->srcu);
spin_lock(&kvm->mmu_lock);
if (kvm_has_zapped_obsolete_pages(kvm)) {
kvm_mmu_commit_zap_page(kvm,
&kvm->arch.zapped_obsolete_pages);
goto unlock;
}
if (prepare_zap_oldest_mmu_page(kvm, &invalid_list))
freed++;
kvm_mmu_commit_zap_page(kvm, &invalid_list);
unlock:
spin_unlock(&kvm->mmu_lock);
srcu_read_unlock(&kvm->srcu, idx);

View File

@@ -8,16 +8,18 @@
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvmmmu
#define KVM_MMU_PAGE_FIELDS \
__field(__u64, gfn) \
__field(__u32, role) \
__field(__u32, root_count) \
#define KVM_MMU_PAGE_FIELDS \
__field(__u8, mmu_valid_gen) \
__field(__u64, gfn) \
__field(__u32, role) \
__field(__u32, root_count) \
__field(bool, unsync)
#define KVM_MMU_PAGE_ASSIGN(sp) \
__entry->gfn = sp->gfn; \
__entry->role = sp->role.word; \
__entry->root_count = sp->root_count; \
#define KVM_MMU_PAGE_ASSIGN(sp) \
__entry->mmu_valid_gen = sp->mmu_valid_gen; \
__entry->gfn = sp->gfn; \
__entry->role = sp->role.word; \
__entry->root_count = sp->root_count; \
__entry->unsync = sp->unsync;
#define KVM_MMU_PAGE_PRINTK() ({ \
@@ -29,8 +31,9 @@
\
role.word = __entry->role; \
\
trace_seq_printf(p, "sp gfn %llx l%u %u-byte q%u%s %s%s" \
trace_seq_printf(p, "sp gen %u gfn %llx l%u %u-byte q%u%s %s%s" \
" %snxe %sad root %u %s%c", \
__entry->mmu_valid_gen, \
__entry->gfn, role.level, \
role.gpte_is_8_bytes ? 8 : 4, \
role.quadrant, \
@@ -279,6 +282,27 @@ TRACE_EVENT(
)
);
TRACE_EVENT(
kvm_mmu_zap_all_fast,
TP_PROTO(struct kvm *kvm),
TP_ARGS(kvm),
TP_STRUCT__entry(
__field(__u8, mmu_valid_gen)
__field(unsigned int, mmu_used_pages)
),
TP_fast_assign(
__entry->mmu_valid_gen = kvm->arch.mmu_valid_gen;
__entry->mmu_used_pages = kvm->arch.n_used_mmu_pages;
),
TP_printk("kvm-mmu-valid-gen %u used_pages %x",
__entry->mmu_valid_gen, __entry->mmu_used_pages
)
);
TRACE_EVENT(
check_mmio_spte,
TP_PROTO(u64 spte, unsigned int kvm_gen, unsigned int spte_gen),

View File

@@ -777,17 +777,18 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
svm->next_rip = svm->vmcb->control.next_rip;
}
if (!svm->next_rip)
return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
printk(KERN_ERR "%s: ip 0x%lx next 0x%llx\n",
__func__, kvm_rip_read(vcpu), svm->next_rip);
kvm_rip_write(vcpu, svm->next_rip);
if (!svm->next_rip) {
if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP))
return 0;
} else {
if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
pr_err("%s: ip 0x%lx next 0x%llx\n",
__func__, kvm_rip_read(vcpu), svm->next_rip);
kvm_rip_write(vcpu, svm->next_rip);
}
svm_set_interrupt_shadow(vcpu, 0);
return EMULATE_DONE;
return 1;
}
static void svm_queue_exception(struct kvm_vcpu *vcpu)
@@ -1539,6 +1540,7 @@ static void init_vmcb(struct vcpu_svm *svm)
set_intercept(svm, INTERCEPT_SKINIT);
set_intercept(svm, INTERCEPT_WBINVD);
set_intercept(svm, INTERCEPT_XSETBV);
set_intercept(svm, INTERCEPT_RDPRU);
set_intercept(svm, INTERCEPT_RSM);
if (!kvm_mwait_in_guest(svm->vcpu.kvm)) {
@@ -2768,17 +2770,18 @@ static int gp_interception(struct vcpu_svm *svm)
{
struct kvm_vcpu *vcpu = &svm->vcpu;
u32 error_code = svm->vmcb->control.exit_info_1;
int er;
WARN_ON_ONCE(!enable_vmware_backdoor);
er = kvm_emulate_instruction(vcpu,
EMULTYPE_VMWARE | EMULTYPE_NO_UD_ON_FAIL);
if (er == EMULATE_USER_EXIT)
return 0;
else if (er != EMULATE_DONE)
/*
* VMware backdoor emulation on #GP interception only handles IN{S},
* OUT{S}, and RDPMC, none of which generate a non-zero error code.
*/
if (error_code) {
kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
return 1;
return 1;
}
return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP);
}
static bool is_erratum_383(void)
@@ -2876,7 +2879,7 @@ static int io_interception(struct vcpu_svm *svm)
string = (io_info & SVM_IOIO_STR_MASK) != 0;
in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
if (string)
return kvm_emulate_instruction(vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(vcpu, 0);
port = io_info >> 16;
size = (io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT;
@@ -3830,6 +3833,12 @@ static int xsetbv_interception(struct vcpu_svm *svm)
return 1;
}
static int rdpru_interception(struct vcpu_svm *svm)
{
kvm_queue_exception(&svm->vcpu, UD_VECTOR);
return 1;
}
static int task_switch_interception(struct vcpu_svm *svm)
{
u16 tss_selector;
@@ -3883,24 +3892,15 @@ static int task_switch_interception(struct vcpu_svm *svm)
int_type == SVM_EXITINTINFO_TYPE_SOFT ||
(int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
(int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)
goto fail;
if (!skip_emulated_instruction(&svm->vcpu))
return 0;
}
if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
int_vec = -1;
if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
has_error_code, error_code) == EMULATE_FAIL)
goto fail;
return 1;
fail:
svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
svm->vcpu.run->internal.ndata = 0;
return 0;
return kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
has_error_code, error_code);
}
static int cpuid_interception(struct vcpu_svm *svm)
@@ -3921,7 +3921,7 @@ static int iret_interception(struct vcpu_svm *svm)
static int invlpg_interception(struct vcpu_svm *svm)
{
if (!static_cpu_has(X86_FEATURE_DECODEASSISTS))
return kvm_emulate_instruction(&svm->vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(&svm->vcpu, 0);
kvm_mmu_invlpg(&svm->vcpu, svm->vmcb->control.exit_info_1);
return kvm_skip_emulated_instruction(&svm->vcpu);
@@ -3929,13 +3929,12 @@ static int invlpg_interception(struct vcpu_svm *svm)
static int emulate_on_interception(struct vcpu_svm *svm)
{
return kvm_emulate_instruction(&svm->vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(&svm->vcpu, 0);
}
static int rsm_interception(struct vcpu_svm *svm)
{
return kvm_emulate_instruction_from_buffer(&svm->vcpu,
rsm_ins_bytes, 2) == EMULATE_DONE;
return kvm_emulate_instruction_from_buffer(&svm->vcpu, rsm_ins_bytes, 2);
}
static int rdpmc_interception(struct vcpu_svm *svm)
@@ -4724,7 +4723,7 @@ static int avic_unaccelerated_access_interception(struct vcpu_svm *svm)
ret = avic_unaccel_trap_write(svm);
} else {
/* Handling Fault */
ret = (kvm_emulate_instruction(&svm->vcpu, 0) == EMULATE_DONE);
ret = kvm_emulate_instruction(&svm->vcpu, 0);
}
return ret;
@@ -4791,6 +4790,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_MONITOR] = monitor_interception,
[SVM_EXIT_MWAIT] = mwait_interception,
[SVM_EXIT_XSETBV] = xsetbv_interception,
[SVM_EXIT_RDPRU] = rdpru_interception,
[SVM_EXIT_NPF] = npf_interception,
[SVM_EXIT_RSM] = rsm_interception,
[SVM_EXIT_AVIC_INCOMPLETE_IPI] = avic_incomplete_ipi_interception,
@@ -7099,13 +7099,6 @@ failed:
return ret;
}
static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version)
{
/* Intel-only feature */
return -ENODEV;
}
static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
{
unsigned long cr4 = kvm_read_cr4(vcpu);
@@ -7311,7 +7304,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.mem_enc_reg_region = svm_register_enc_region,
.mem_enc_unreg_region = svm_unregister_enc_region,
.nested_enable_evmcs = nested_enable_evmcs,
.nested_enable_evmcs = NULL,
.nested_get_evmcs_version = NULL,
.need_emulation_on_page_fault = svm_need_emulation_on_page_fault,

View File

@@ -247,6 +247,12 @@ static inline bool vmx_xsaves_supported(void)
SECONDARY_EXEC_XSAVES;
}
static inline bool vmx_waitpkg_supported(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
}
static inline bool cpu_has_vmx_tsc_scaling(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &

View File

@@ -178,6 +178,8 @@ static inline void evmcs_load(u64 phys_addr)
struct hv_vp_assist_page *vp_ap =
hv_get_vp_assist_page(smp_processor_id());
if (current_evmcs->hv_enlightenments_control.nested_flush_hypercall)
vp_ap->nested_control.features.directhypercall = 1;
vp_ap->current_nested_vmcs = phys_addr;
vp_ap->enlighten_vmentry = 1;
}

View File

@@ -198,6 +198,16 @@ static void nested_vmx_abort(struct kvm_vcpu *vcpu, u32 indicator)
pr_debug_ratelimited("kvm: nested vmx abort, indicator %d\n", indicator);
}
static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
{
return fixed_bits_valid(control, low, high);
}
static inline u64 vmx_control_msr(u32 low, u32 high)
{
return low | ((u64)high << 32);
}
static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx)
{
secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_SHADOW_VMCS);
@@ -866,16 +876,34 @@ static int nested_vmx_store_msr_check(struct kvm_vcpu *vcpu,
return 0;
}
static u32 nested_vmx_max_atomic_switch_msrs(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 vmx_misc = vmx_control_msr(vmx->nested.msrs.misc_low,
vmx->nested.msrs.misc_high);
return (vmx_misc_max_msr(vmx_misc) + 1) * VMX_MISC_MSR_LIST_MULTIPLIER;
}
/*
* Load guest's/host's msr at nested entry/exit.
* return 0 for success, entry index for failure.
*
* One of the failure modes for MSR load/store is when a list exceeds the
* virtual hardware's capacity. To maintain compatibility with hardware inasmuch
* as possible, process all valid entries before failing rather than precheck
* for a capacity violation.
*/
static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
{
u32 i;
struct vmx_msr_entry e;
u32 max_msr_list_size = nested_vmx_max_atomic_switch_msrs(vcpu);
for (i = 0; i < count; i++) {
if (unlikely(i >= max_msr_list_size))
goto fail;
if (kvm_vcpu_read_guest(vcpu, gpa + i * sizeof(e),
&e, sizeof(e))) {
pr_debug_ratelimited(
@@ -906,8 +934,12 @@ static int nested_vmx_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
u64 data;
u32 i;
struct vmx_msr_entry e;
u32 max_msr_list_size = nested_vmx_max_atomic_switch_msrs(vcpu);
for (i = 0; i < count; i++) {
if (unlikely(i >= max_msr_list_size))
return -EINVAL;
if (kvm_vcpu_read_guest(vcpu,
gpa + i * sizeof(e),
&e, 2 * sizeof(u32))) {
@@ -1013,17 +1045,6 @@ static u16 nested_get_vpid02(struct kvm_vcpu *vcpu)
return vmx->nested.vpid02 ? vmx->nested.vpid02 : vmx->vpid;
}
static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
{
return fixed_bits_valid(control, low, high);
}
static inline u64 vmx_control_msr(u32 low, u32 high)
{
return low | ((u64)high << 32);
}
static bool is_bitwise_subset(u64 superset, u64 subset, u64 mask)
{
superset &= mask;
@@ -2089,6 +2110,7 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
SECONDARY_EXEC_ENABLE_INVPCID |
SECONDARY_EXEC_RDTSCP |
SECONDARY_EXEC_XSAVES |
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_APIC_REGISTER_VIRT |
SECONDARY_EXEC_ENABLE_VMFUNC);
@@ -2642,8 +2664,23 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
CC(!kvm_pat_valid(vmcs12->host_ia32_pat)))
return -EINVAL;
ia32e = (vmcs12->vm_exit_controls &
VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;
#ifdef CONFIG_X86_64
ia32e = !!(vcpu->arch.efer & EFER_LMA);
#else
ia32e = false;
#endif
if (ia32e) {
if (CC(!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE)) ||
CC(!(vmcs12->host_cr4 & X86_CR4_PAE)))
return -EINVAL;
} else {
if (CC(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) ||
CC(vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) ||
CC(vmcs12->host_cr4 & X86_CR4_PCIDE) ||
CC((vmcs12->host_rip) >> 32))
return -EINVAL;
}
if (CC(vmcs12->host_cs_selector & (SEGMENT_RPL_MASK | SEGMENT_TI_MASK)) ||
CC(vmcs12->host_ss_selector & (SEGMENT_RPL_MASK | SEGMENT_TI_MASK)) ||
@@ -2662,7 +2699,8 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
CC(is_noncanonical_address(vmcs12->host_gs_base, vcpu)) ||
CC(is_noncanonical_address(vmcs12->host_gdtr_base, vcpu)) ||
CC(is_noncanonical_address(vmcs12->host_idtr_base, vcpu)) ||
CC(is_noncanonical_address(vmcs12->host_tr_base, vcpu)))
CC(is_noncanonical_address(vmcs12->host_tr_base, vcpu)) ||
CC(is_noncanonical_address(vmcs12->host_rip, vcpu)))
return -EINVAL;
#endif
@@ -5441,6 +5479,10 @@ bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason)
case EXIT_REASON_ENCLS:
/* SGX is never exposed to L1 */
return false;
case EXIT_REASON_UMWAIT:
case EXIT_REASON_TPAUSE:
return nested_cpu_has2(vmcs12,
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE);
default:
return true;
}

View File

@@ -11,8 +11,13 @@
#include "vmcs.h"
#define __ex(x) __kvm_handle_fault_on_reboot(x)
#define __ex_clear(x, reg) \
____kvm_handle_fault_on_reboot(x, "xor " reg ", " reg)
asmlinkage void vmread_error(unsigned long field, bool fault);
void vmwrite_error(unsigned long field, unsigned long value);
void vmclear_error(struct vmcs *vmcs, u64 phys_addr);
void vmptrld_error(struct vmcs *vmcs, u64 phys_addr);
void invvpid_error(unsigned long ext, u16 vpid, gva_t gva);
void invept_error(unsigned long ext, u64 eptp, gpa_t gpa);
static __always_inline void vmcs_check16(unsigned long field)
{
@@ -62,8 +67,22 @@ static __always_inline unsigned long __vmcs_readl(unsigned long field)
{
unsigned long value;
asm volatile (__ex_clear("vmread %1, %0", "%k0")
: "=r"(value) : "r"(field));
asm volatile("1: vmread %2, %1\n\t"
".byte 0x3e\n\t" /* branch taken hint */
"ja 3f\n\t"
"mov %2, %%" _ASM_ARG1 "\n\t"
"xor %%" _ASM_ARG2 ", %%" _ASM_ARG2 "\n\t"
"2: call vmread_error\n\t"
"xor %k1, %k1\n\t"
"3:\n\t"
".pushsection .fixup, \"ax\"\n\t"
"4: mov %2, %%" _ASM_ARG1 "\n\t"
"mov $1, %%" _ASM_ARG2 "\n\t"
"jmp 2b\n\t"
".popsection\n\t"
_ASM_EXTABLE(1b, 4b)
: ASM_CALL_CONSTRAINT, "=r"(value) : "r"(field) : "cc");
return value;
}
@@ -103,21 +122,39 @@ static __always_inline unsigned long vmcs_readl(unsigned long field)
return __vmcs_readl(field);
}
static noinline void vmwrite_error(unsigned long field, unsigned long value)
{
printk(KERN_ERR "vmwrite error: reg %lx value %lx (err %d)\n",
field, value, vmcs_read32(VM_INSTRUCTION_ERROR));
dump_stack();
}
#define vmx_asm1(insn, op1, error_args...) \
do { \
asm_volatile_goto("1: " __stringify(insn) " %0\n\t" \
".byte 0x2e\n\t" /* branch not taken hint */ \
"jna %l[error]\n\t" \
_ASM_EXTABLE(1b, %l[fault]) \
: : op1 : "cc" : error, fault); \
return; \
error: \
insn##_error(error_args); \
return; \
fault: \
kvm_spurious_fault(); \
} while (0)
#define vmx_asm2(insn, op1, op2, error_args...) \
do { \
asm_volatile_goto("1: " __stringify(insn) " %1, %0\n\t" \
".byte 0x2e\n\t" /* branch not taken hint */ \
"jna %l[error]\n\t" \
_ASM_EXTABLE(1b, %l[fault]) \
: : op1, op2 : "cc" : error, fault); \
return; \
error: \
insn##_error(error_args); \
return; \
fault: \
kvm_spurious_fault(); \
} while (0)
static __always_inline void __vmcs_writel(unsigned long field, unsigned long value)
{
bool error;
asm volatile (__ex("vmwrite %2, %1") CC_SET(na)
: CC_OUT(na) (error) : "r"(field), "rm"(value));
if (unlikely(error))
vmwrite_error(field, value);
vmx_asm2(vmwrite, "r"(field), "rm"(value), field, value);
}
static __always_inline void vmcs_write16(unsigned long field, u16 value)
@@ -182,28 +219,18 @@ static __always_inline void vmcs_set_bits(unsigned long field, u32 mask)
static inline void vmcs_clear(struct vmcs *vmcs)
{
u64 phys_addr = __pa(vmcs);
bool error;
asm volatile (__ex("vmclear %1") CC_SET(na)
: CC_OUT(na) (error) : "m"(phys_addr));
if (unlikely(error))
printk(KERN_ERR "kvm: vmclear fail: %p/%llx\n",
vmcs, phys_addr);
vmx_asm1(vmclear, "m"(phys_addr), vmcs, phys_addr);
}
static inline void vmcs_load(struct vmcs *vmcs)
{
u64 phys_addr = __pa(vmcs);
bool error;
if (static_branch_unlikely(&enable_evmcs))
return evmcs_load(phys_addr);
asm volatile (__ex("vmptrld %1") CC_SET(na)
: CC_OUT(na) (error) : "m"(phys_addr));
if (unlikely(error))
printk(KERN_ERR "kvm: vmptrld %p/%llx failed\n",
vmcs, phys_addr);
vmx_asm1(vmptrld, "m"(phys_addr), vmcs, phys_addr);
}
static inline void __invvpid(unsigned long ext, u16 vpid, gva_t gva)
@@ -213,11 +240,8 @@ static inline void __invvpid(unsigned long ext, u16 vpid, gva_t gva)
u64 rsvd : 48;
u64 gva;
} operand = { vpid, 0, gva };
bool error;
asm volatile (__ex("invvpid %2, %1") CC_SET(na)
: CC_OUT(na) (error) : "r"(ext), "m"(operand));
BUG_ON(error);
vmx_asm2(invvpid, "r"(ext), "m"(operand), ext, vpid, gva);
}
static inline void __invept(unsigned long ext, u64 eptp, gpa_t gpa)
@@ -225,11 +249,8 @@ static inline void __invept(unsigned long ext, u64 eptp, gpa_t gpa)
struct {
u64 eptp, gpa;
} operand = {eptp, gpa};
bool error;
asm volatile (__ex("invept %2, %1") CC_SET(na)
: CC_OUT(na) (error) : "r"(ext), "m"(operand));
BUG_ON(error);
vmx_asm2(invept, "r"(ext), "m"(operand), ext, eptp, gpa);
}
static inline bool vpid_sync_vcpu_addr(int vpid, gva_t addr)

View File

@@ -343,6 +343,48 @@ static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bit
void vmx_vmexit(void);
#define vmx_insn_failed(fmt...) \
do { \
WARN_ONCE(1, fmt); \
pr_warn_ratelimited(fmt); \
} while (0)
asmlinkage void vmread_error(unsigned long field, bool fault)
{
if (fault)
kvm_spurious_fault();
else
vmx_insn_failed("kvm: vmread failed: field=%lx\n", field);
}
noinline void vmwrite_error(unsigned long field, unsigned long value)
{
vmx_insn_failed("kvm: vmwrite failed: field=%lx val=%lx err=%d\n",
field, value, vmcs_read32(VM_INSTRUCTION_ERROR));
}
noinline void vmclear_error(struct vmcs *vmcs, u64 phys_addr)
{
vmx_insn_failed("kvm: vmclear failed: %p/%llx\n", vmcs, phys_addr);
}
noinline void vmptrld_error(struct vmcs *vmcs, u64 phys_addr)
{
vmx_insn_failed("kvm: vmptrld failed: %p/%llx\n", vmcs, phys_addr);
}
noinline void invvpid_error(unsigned long ext, u16 vpid, gva_t gva)
{
vmx_insn_failed("kvm: invvpid failed: ext=0x%lx vpid=%u gva=0x%lx\n",
ext, vpid, gva);
}
noinline void invept_error(unsigned long ext, u64 eptp, gpa_t gpa)
{
vmx_insn_failed("kvm: invept failed: ext=0x%lx eptp=%llx gpa=0x%llx\n",
ext, eptp, gpa);
}
static DEFINE_PER_CPU(struct vmcs *, vmxarea);
DEFINE_PER_CPU(struct vmcs *, current_vmcs);
/*
@@ -486,6 +528,31 @@ static int hv_remote_flush_tlb(struct kvm *kvm)
return hv_remote_flush_tlb_with_range(kvm, NULL);
}
static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
{
struct hv_enlightened_vmcs *evmcs;
struct hv_partition_assist_pg **p_hv_pa_pg =
&vcpu->kvm->arch.hyperv.hv_pa_pg;
/*
* Synthetic VM-Exit is not enabled in current code and so All
* evmcs in singe VM shares same assist page.
*/
if (!*p_hv_pa_pg)
*p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
if (!*p_hv_pa_pg)
return -ENOMEM;
evmcs = (struct hv_enlightened_vmcs *)to_vmx(vcpu)->loaded_vmcs->vmcs;
evmcs->partition_assist_page =
__pa(*p_hv_pa_pg);
evmcs->hv_vm_id = (unsigned long)vcpu->kvm;
evmcs->hv_enlightenments_control.nested_flush_hypercall = 1;
return 0;
}
#endif /* IS_ENABLED(CONFIG_HYPERV) */
/*
@@ -1472,27 +1539,32 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
return 0;
}
/*
* Returns an int to be compatible with SVM implementation (which can fail).
* Do not use directly, use skip_emulated_instruction() instead.
*/
static int __skip_emulated_instruction(struct kvm_vcpu *vcpu)
static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
{
unsigned long rip;
rip = kvm_rip_read(vcpu);
rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
kvm_rip_write(vcpu, rip);
/*
* Using VMCS.VM_EXIT_INSTRUCTION_LEN on EPT misconfig depends on
* undefined behavior: Intel's SDM doesn't mandate the VMCS field be
* set when EPT misconfig occurs. In practice, real hardware updates
* VM_EXIT_INSTRUCTION_LEN on EPT misconfig, but other hypervisors
* (namely Hyper-V) don't set it due to it being undefined behavior,
* i.e. we end up advancing IP with some random value.
*/
if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
to_vmx(vcpu)->exit_reason != EXIT_REASON_EPT_MISCONFIG) {
rip = kvm_rip_read(vcpu);
rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
kvm_rip_write(vcpu, rip);
} else {
if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP))
return 0;
}
/* skipping an emulated instruction also counts */
vmx_set_interrupt_shadow(vcpu, 0);
return EMULATE_DONE;
}
static inline void skip_emulated_instruction(struct kvm_vcpu *vcpu)
{
(void)__skip_emulated_instruction(vcpu);
return 1;
}
static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
@@ -1527,8 +1599,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
int inc_eip = 0;
if (kvm_exception_is_soft(nr))
inc_eip = vcpu->arch.event_exit_inst_len;
if (kvm_inject_realmode_interrupt(vcpu, nr, inc_eip) != EMULATE_DONE)
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
kvm_inject_realmode_interrupt(vcpu, nr, inc_eip);
return;
}
@@ -1700,6 +1771,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
#endif
case MSR_EFER:
return kvm_get_msr_common(vcpu, msr_info);
case MSR_IA32_UMWAIT_CONTROL:
if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
return 1;
msr_info->data = vmx->msr_ia32_umwait_control;
break;
case MSR_IA32_SPEC_CTRL:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
@@ -1873,6 +1950,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
vmcs_write64(GUEST_BNDCFGS, data);
break;
case MSR_IA32_UMWAIT_CONTROL:
if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
return 1;
/* The reserved bit 1 and non-32 bit [63:32] should be zero */
if (data & (BIT_ULL(1) | GENMASK_ULL(63, 32)))
return 1;
vmx->msr_ia32_umwait_control = data;
break;
case MSR_IA32_SPEC_CTRL:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
@@ -2290,6 +2377,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
SECONDARY_EXEC_RDRAND_EXITING |
SECONDARY_EXEC_ENABLE_PML |
SECONDARY_EXEC_TSC_SCALING |
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
SECONDARY_EXEC_PT_USE_GPA |
SECONDARY_EXEC_PT_CONCEAL_VMX |
SECONDARY_EXEC_ENABLE_VMFUNC |
@@ -4026,6 +4114,23 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
}
}
if (vmx_waitpkg_supported()) {
bool waitpkg_enabled =
guest_cpuid_has(vcpu, X86_FEATURE_WAITPKG);
if (!waitpkg_enabled)
exec_control &= ~SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
if (nested) {
if (waitpkg_enabled)
vmx->nested.msrs.secondary_ctls_high |=
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
else
vmx->nested.msrs.secondary_ctls_high &=
~SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
}
}
vmx->secondary_exec_control = exec_control;
}
@@ -4160,6 +4265,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vmx->rmode.vm86_active = 0;
vmx->spec_ctrl = 0;
vmx->msr_ia32_umwait_control = 0;
vcpu->arch.microcode_version = 0x100000000ULL;
vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
vmx->hv_deadline_tsc = -1;
@@ -4277,8 +4384,7 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
int inc_eip = 0;
if (vcpu->arch.interrupt.soft)
inc_eip = vcpu->arch.event_exit_inst_len;
if (kvm_inject_realmode_interrupt(vcpu, irq, inc_eip) != EMULATE_DONE)
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
kvm_inject_realmode_interrupt(vcpu, irq, inc_eip);
return;
}
intr = irq | INTR_INFO_VALID_MASK;
@@ -4314,8 +4420,7 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->nmi_known_unmasked = false;
if (vmx->rmode.vm86_active) {
if (kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR, 0) != EMULATE_DONE)
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR, 0);
return;
}
@@ -4442,7 +4547,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
* Cause the #SS fault with 0 error code in VM86 mode.
*/
if (((vec == GP_VECTOR) || (vec == SS_VECTOR)) && err_code == 0) {
if (kvm_emulate_instruction(vcpu, 0) == EMULATE_DONE) {
if (kvm_emulate_instruction(vcpu, 0)) {
if (vcpu->arch.halt_request) {
vcpu->arch.halt_request = 0;
return kvm_vcpu_halt(vcpu);
@@ -4493,7 +4598,6 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
u32 intr_info, ex_no, error_code;
unsigned long cr2, rip, dr6;
u32 vect_info;
enum emulation_result er;
vect_info = vmx->idt_vectoring_info;
intr_info = vmx->exit_intr_info;
@@ -4510,13 +4614,17 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
if (!vmx->rmode.vm86_active && is_gp_fault(intr_info)) {
WARN_ON_ONCE(!enable_vmware_backdoor);
er = kvm_emulate_instruction(vcpu,
EMULTYPE_VMWARE | EMULTYPE_NO_UD_ON_FAIL);
if (er == EMULATE_USER_EXIT)
return 0;
else if (er != EMULATE_DONE)
/*
* VMware backdoor emulation on #GP interception only handles
* IN{S}, OUT{S}, and RDPMC, none of which generate a non-zero
* error code on #GP.
*/
if (error_code) {
kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
return 1;
return 1;
}
return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP);
}
/*
@@ -4558,7 +4666,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
vcpu->arch.dr6 &= ~DR_TRAP_BITS;
vcpu->arch.dr6 |= dr6 | DR6_RTM;
if (is_icebp(intr_info))
skip_emulated_instruction(vcpu);
WARN_ON(!skip_emulated_instruction(vcpu));
kvm_queue_exception(vcpu, DB_VECTOR);
return 1;
@@ -4613,7 +4721,7 @@ static int handle_io(struct kvm_vcpu *vcpu)
++vcpu->stat.io_exits;
if (string)
return kvm_emulate_instruction(vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(vcpu, 0);
port = exit_qualification >> 16;
size = (exit_qualification & 7) + 1;
@@ -4687,7 +4795,7 @@ static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
static int handle_desc(struct kvm_vcpu *vcpu)
{
WARN_ON(!(vcpu->arch.cr4 & X86_CR4_UMIP));
return kvm_emulate_instruction(vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(vcpu, 0);
}
static int handle_cr(struct kvm_vcpu *vcpu)
@@ -4903,7 +5011,7 @@ static int handle_vmcall(struct kvm_vcpu *vcpu)
static int handle_invd(struct kvm_vcpu *vcpu)
{
return kvm_emulate_instruction(vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(vcpu, 0);
}
static int handle_invlpg(struct kvm_vcpu *vcpu)
@@ -4937,20 +5045,6 @@ static int handle_xsetbv(struct kvm_vcpu *vcpu)
return 1;
}
static int handle_xsaves(struct kvm_vcpu *vcpu)
{
kvm_skip_emulated_instruction(vcpu);
WARN(1, "this should never happen\n");
return 1;
}
static int handle_xrstors(struct kvm_vcpu *vcpu)
{
kvm_skip_emulated_instruction(vcpu);
WARN(1, "this should never happen\n");
return 1;
}
static int handle_apic_access(struct kvm_vcpu *vcpu)
{
if (likely(fasteoi)) {
@@ -4970,7 +5064,7 @@ static int handle_apic_access(struct kvm_vcpu *vcpu)
return kvm_skip_emulated_instruction(vcpu);
}
}
return kvm_emulate_instruction(vcpu, 0) == EMULATE_DONE;
return kvm_emulate_instruction(vcpu, 0);
}
static int handle_apic_eoi_induced(struct kvm_vcpu *vcpu)
@@ -5039,23 +5133,15 @@ static int handle_task_switch(struct kvm_vcpu *vcpu)
if (!idt_v || (type != INTR_TYPE_HARD_EXCEPTION &&
type != INTR_TYPE_EXT_INTR &&
type != INTR_TYPE_NMI_INTR))
skip_emulated_instruction(vcpu);
if (kvm_task_switch(vcpu, tss_selector,
type == INTR_TYPE_SOFT_INTR ? idt_index : -1, reason,
has_error_code, error_code) == EMULATE_FAIL) {
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
return 0;
}
WARN_ON(!skip_emulated_instruction(vcpu));
/*
* TODO: What about debug traps on tss switch?
* Are we supposed to inject them and update dr6?
*/
return 1;
return kvm_task_switch(vcpu, tss_selector,
type == INTR_TYPE_SOFT_INTR ? idt_index : -1,
reason, has_error_code, error_code);
}
static int handle_ept_violation(struct kvm_vcpu *vcpu)
@@ -5114,21 +5200,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
if (!is_guest_mode(vcpu) &&
!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
trace_kvm_fast_mmio(gpa);
/*
* Doing kvm_skip_emulated_instruction() depends on undefined
* behavior: Intel's manual doesn't mandate
* VM_EXIT_INSTRUCTION_LEN to be set in VMCS when EPT MISCONFIG
* occurs and while on real hardware it was observed to be set,
* other hypervisors (namely Hyper-V) don't set it, we end up
* advancing IP with some random value. Disable fast mmio when
* running nested and keep it for real hardware in hope that
* VM_EXIT_INSTRUCTION_LEN will always be set correctly.
*/
if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
return kvm_skip_emulated_instruction(vcpu);
else
return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP) ==
EMULATE_DONE;
return kvm_skip_emulated_instruction(vcpu);
}
return kvm_mmu_page_fault(vcpu, gpa, PFERR_RSVD_MASK, NULL, 0);
@@ -5147,8 +5219,6 @@ static int handle_nmi_window(struct kvm_vcpu *vcpu)
static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
enum emulation_result err = EMULATE_DONE;
int ret = 1;
bool intr_window_requested;
unsigned count = 130;
@@ -5169,41 +5239,35 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
if (kvm_test_request(KVM_REQ_EVENT, vcpu))
return 1;
err = kvm_emulate_instruction(vcpu, 0);
if (err == EMULATE_USER_EXIT) {
++vcpu->stat.mmio_exits;
ret = 0;
goto out;
}
if (err != EMULATE_DONE)
goto emulation_error;
if (!kvm_emulate_instruction(vcpu, 0))
return 0;
if (vmx->emulation_required && !vmx->rmode.vm86_active &&
vcpu->arch.exception.pending)
goto emulation_error;
vcpu->arch.exception.pending) {
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror =
KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
return 0;
}
if (vcpu->arch.halt_request) {
vcpu->arch.halt_request = 0;
ret = kvm_vcpu_halt(vcpu);
goto out;
return kvm_vcpu_halt(vcpu);
}
/*
* Note, return 1 and not 0, vcpu_run() is responsible for
* morphing the pending signal into the proper return code.
*/
if (signal_pending(current))
goto out;
return 1;
if (need_resched())
schedule();
}
out:
return ret;
emulation_error:
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
return 0;
return 1;
}
static void grow_ple_window(struct kvm_vcpu *vcpu)
@@ -5474,6 +5538,14 @@ static int handle_encls(struct kvm_vcpu *vcpu)
return 1;
}
static int handle_unexpected_vmexit(struct kvm_vcpu *vcpu)
{
kvm_skip_emulated_instruction(vcpu);
WARN_ONCE(1, "Unexpected VM-Exit Reason = 0x%x",
vmcs_read32(VM_EXIT_REASON));
return 1;
}
/*
* The exit handlers return 1 if the exit was handled fully and guest execution
* may resume. Otherwise they set the kvm_run parameter to indicate what needs
@@ -5525,13 +5597,15 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[EXIT_REASON_INVVPID] = handle_vmx_instruction,
[EXIT_REASON_RDRAND] = handle_invalid_op,
[EXIT_REASON_RDSEED] = handle_invalid_op,
[EXIT_REASON_XSAVES] = handle_xsaves,
[EXIT_REASON_XRSTORS] = handle_xrstors,
[EXIT_REASON_XSAVES] = handle_unexpected_vmexit,
[EXIT_REASON_XRSTORS] = handle_unexpected_vmexit,
[EXIT_REASON_PML_FULL] = handle_pml_full,
[EXIT_REASON_INVPCID] = handle_invpcid,
[EXIT_REASON_VMFUNC] = handle_vmx_instruction,
[EXIT_REASON_PREEMPTION_TIMER] = handle_preemption_timer,
[EXIT_REASON_ENCLS] = handle_encls,
[EXIT_REASON_UMWAIT] = handle_unexpected_vmexit,
[EXIT_REASON_TPAUSE] = handle_unexpected_vmexit,
};
static const int kvm_vmx_max_exit_handlers =
@@ -6362,6 +6436,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
msrs[i].host, false);
}
static void atomic_switch_umwait_control_msr(struct vcpu_vmx *vmx)
{
u32 host_umwait_control;
if (!vmx_has_waitpkg(vmx))
return;
host_umwait_control = get_umwait_control_msr();
if (vmx->msr_ia32_umwait_control != host_umwait_control)
add_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL,
vmx->msr_ia32_umwait_control,
host_umwait_control, false);
else
clear_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL);
}
static void vmx_update_hv_timer(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -6456,6 +6547,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
pt_guest_enter(vmx);
atomic_switch_perf_msrs(vmx);
atomic_switch_umwait_control_msr(vmx);
if (enable_preemption_timer)
vmx_update_hv_timer(vcpu);
@@ -6511,6 +6603,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
current_evmcs->hv_clean_fields |=
HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL;
if (static_branch_unlikely(&enable_evmcs))
current_evmcs->hv_vp_id = vcpu->arch.hyperv.vp_index;
/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
if (vmx->host_debugctlmsr)
update_debugctlmsr(vmx->host_debugctlmsr);
@@ -6578,6 +6673,7 @@ static struct kvm *vmx_vm_alloc(void)
static void vmx_vm_free(struct kvm *kvm)
{
kfree(kvm->arch.hyperv.hv_pa_pg);
vfree(to_kvm_vmx(kvm));
}
@@ -7706,7 +7802,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.run = vmx_vcpu_run,
.handle_exit = vmx_handle_exit,
.skip_emulated_instruction = __skip_emulated_instruction,
.skip_emulated_instruction = skip_emulated_instruction,
.set_interrupt_shadow = vmx_set_interrupt_shadow,
.get_interrupt_shadow = vmx_get_interrupt_shadow,
.patch_hypercall = vmx_patch_hypercall,
@@ -7837,6 +7933,7 @@ static void vmx_exit(void)
if (!vp_ap)
continue;
vp_ap->nested_control.features.directhypercall = 0;
vp_ap->current_nested_vmcs = 0;
vp_ap->enlighten_vmentry = 0;
}
@@ -7876,6 +7973,11 @@ static int __init vmx_init(void)
pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
static_branch_enable(&enable_evmcs);
}
if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
vmx_x86_ops.enable_direct_tlbflush
= hv_enable_direct_tlbflush;
} else {
enlightened_vmcs = false;
}

View File

@@ -14,6 +14,8 @@
extern const u32 vmx_msr_index[];
extern u64 host_efer;
extern u32 get_umwait_control_msr(void);
#define MSR_TYPE_R 1
#define MSR_TYPE_W 2
#define MSR_TYPE_RW 3
@@ -211,6 +213,7 @@ struct vcpu_vmx {
#endif
u64 spec_ctrl;
u32 msr_ia32_umwait_control;
u32 secondary_exec_control;
@@ -497,6 +500,12 @@ static inline void decache_tsc_multiplier(struct vcpu_vmx *vmx)
vmcs_write64(TSC_MULTIPLIER, vmx->current_tsc_ratio);
}
static inline bool vmx_has_waitpkg(struct vcpu_vmx *vmx)
{
return vmx->secondary_exec_control &
SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
}
void dump_vmcs(void);
#endif /* __KVM_X86_VMX_H */

View File

@@ -360,7 +360,8 @@ EXPORT_SYMBOL_GPL(kvm_set_apic_base);
asmlinkage __visible void kvm_spurious_fault(void)
{
/* Fault while not rebooting. We want the trace. */
BUG();
if (!kvm_rebooting)
BUG();
}
EXPORT_SYMBOL_GPL(kvm_spurious_fault);
@@ -1145,6 +1146,44 @@ static u32 msrs_to_save[] = {
MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
MSR_IA32_UMWAIT_CONTROL,
MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
MSR_ARCH_PERFMON_FIXED_CTR0 + 2, MSR_ARCH_PERFMON_FIXED_CTR0 + 3,
MSR_CORE_PERF_FIXED_CTR_CTRL, MSR_CORE_PERF_GLOBAL_STATUS,
MSR_CORE_PERF_GLOBAL_CTRL, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
MSR_ARCH_PERFMON_PERFCTR0, MSR_ARCH_PERFMON_PERFCTR1,
MSR_ARCH_PERFMON_PERFCTR0 + 2, MSR_ARCH_PERFMON_PERFCTR0 + 3,
MSR_ARCH_PERFMON_PERFCTR0 + 4, MSR_ARCH_PERFMON_PERFCTR0 + 5,
MSR_ARCH_PERFMON_PERFCTR0 + 6, MSR_ARCH_PERFMON_PERFCTR0 + 7,
MSR_ARCH_PERFMON_PERFCTR0 + 8, MSR_ARCH_PERFMON_PERFCTR0 + 9,
MSR_ARCH_PERFMON_PERFCTR0 + 10, MSR_ARCH_PERFMON_PERFCTR0 + 11,
MSR_ARCH_PERFMON_PERFCTR0 + 12, MSR_ARCH_PERFMON_PERFCTR0 + 13,
MSR_ARCH_PERFMON_PERFCTR0 + 14, MSR_ARCH_PERFMON_PERFCTR0 + 15,
MSR_ARCH_PERFMON_PERFCTR0 + 16, MSR_ARCH_PERFMON_PERFCTR0 + 17,
MSR_ARCH_PERFMON_PERFCTR0 + 18, MSR_ARCH_PERFMON_PERFCTR0 + 19,
MSR_ARCH_PERFMON_PERFCTR0 + 20, MSR_ARCH_PERFMON_PERFCTR0 + 21,
MSR_ARCH_PERFMON_PERFCTR0 + 22, MSR_ARCH_PERFMON_PERFCTR0 + 23,
MSR_ARCH_PERFMON_PERFCTR0 + 24, MSR_ARCH_PERFMON_PERFCTR0 + 25,
MSR_ARCH_PERFMON_PERFCTR0 + 26, MSR_ARCH_PERFMON_PERFCTR0 + 27,
MSR_ARCH_PERFMON_PERFCTR0 + 28, MSR_ARCH_PERFMON_PERFCTR0 + 29,
MSR_ARCH_PERFMON_PERFCTR0 + 30, MSR_ARCH_PERFMON_PERFCTR0 + 31,
MSR_ARCH_PERFMON_EVENTSEL0, MSR_ARCH_PERFMON_EVENTSEL1,
MSR_ARCH_PERFMON_EVENTSEL0 + 2, MSR_ARCH_PERFMON_EVENTSEL0 + 3,
MSR_ARCH_PERFMON_EVENTSEL0 + 4, MSR_ARCH_PERFMON_EVENTSEL0 + 5,
MSR_ARCH_PERFMON_EVENTSEL0 + 6, MSR_ARCH_PERFMON_EVENTSEL0 + 7,
MSR_ARCH_PERFMON_EVENTSEL0 + 8, MSR_ARCH_PERFMON_EVENTSEL0 + 9,
MSR_ARCH_PERFMON_EVENTSEL0 + 10, MSR_ARCH_PERFMON_EVENTSEL0 + 11,
MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
MSR_ARCH_PERFMON_EVENTSEL0 + 18, MSR_ARCH_PERFMON_EVENTSEL0 + 19,
MSR_ARCH_PERFMON_EVENTSEL0 + 20, MSR_ARCH_PERFMON_EVENTSEL0 + 21,
MSR_ARCH_PERFMON_EVENTSEL0 + 22, MSR_ARCH_PERFMON_EVENTSEL0 + 23,
MSR_ARCH_PERFMON_EVENTSEL0 + 24, MSR_ARCH_PERFMON_EVENTSEL0 + 25,
MSR_ARCH_PERFMON_EVENTSEL0 + 26, MSR_ARCH_PERFMON_EVENTSEL0 + 27,
MSR_ARCH_PERFMON_EVENTSEL0 + 28, MSR_ARCH_PERFMON_EVENTSEL0 + 29,
MSR_ARCH_PERFMON_EVENTSEL0 + 30, MSR_ARCH_PERFMON_EVENTSEL0 + 31,
};
static unsigned num_msrs_to_save;
@@ -3169,7 +3208,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_EVENTFD:
case KVM_CAP_HYPERV_TLBFLUSH:
case KVM_CAP_HYPERV_SEND_IPI:
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
case KVM_CAP_HYPERV_CPUID:
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
@@ -3246,6 +3284,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = kvm_x86_ops->get_nested_state ?
kvm_x86_ops->get_nested_state(NULL, NULL, 0) : 0;
break;
case KVM_CAP_HYPERV_DIRECT_TLBFLUSH:
r = kvm_x86_ops->enable_direct_tlbflush != NULL;
break;
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
r = kvm_x86_ops->nested_enable_evmcs != NULL;
break;
default:
break;
}
@@ -4019,6 +4063,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
r = -EFAULT;
}
return r;
case KVM_CAP_HYPERV_DIRECT_TLBFLUSH:
if (!kvm_x86_ops->enable_direct_tlbflush)
return -ENOTTY;
return kvm_x86_ops->enable_direct_tlbflush(vcpu);
default:
return -EINVAL;
@@ -5051,6 +5100,11 @@ static void kvm_init_msr_list(void)
u32 dummy[2];
unsigned i, j;
BUILD_BUG_ON_MSG(INTEL_PMC_MAX_FIXED != 4,
"Please update the fixed PMCs in msrs_to_save[]");
BUILD_BUG_ON_MSG(INTEL_PMC_MAX_GENERIC != 32,
"Please update the generic perfctr/eventsel MSRs in msrs_to_save[]");
for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
continue;
@@ -5389,7 +5443,6 @@ EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
int handle_ud(struct kvm_vcpu *vcpu)
{
int emul_type = EMULTYPE_TRAP_UD;
enum emulation_result er;
char sig[5]; /* ud2; .ascii "kvm" */
struct x86_exception e;
@@ -5398,15 +5451,10 @@ int handle_ud(struct kvm_vcpu *vcpu)
sig, sizeof(sig), &e) == 0 &&
memcmp(sig, "\xf\xbkvm", sizeof(sig)) == 0) {
kvm_rip_write(vcpu, kvm_rip_read(vcpu) + sizeof(sig));
emul_type = 0;
emul_type = EMULTYPE_TRAP_UD_FORCED;
}
er = kvm_emulate_instruction(vcpu, emul_type);
if (er == EMULATE_USER_EXIT)
return 0;
if (er != EMULATE_DONE)
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
return kvm_emulate_instruction(vcpu, emul_type);
}
EXPORT_SYMBOL_GPL(handle_ud);
@@ -6228,7 +6276,7 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
vcpu->arch.emulate_regs_need_sync_from_vcpu = false;
}
int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
{
struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
int ret;
@@ -6240,37 +6288,43 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
ctxt->_eip = ctxt->eip + inc_eip;
ret = emulate_int_real(ctxt, irq);
if (ret != X86EMUL_CONTINUE)
return EMULATE_FAIL;
ctxt->eip = ctxt->_eip;
kvm_rip_write(vcpu, ctxt->eip);
kvm_set_rflags(vcpu, ctxt->eflags);
return EMULATE_DONE;
if (ret != X86EMUL_CONTINUE) {
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
} else {
ctxt->eip = ctxt->_eip;
kvm_rip_write(vcpu, ctxt->eip);
kvm_set_rflags(vcpu, ctxt->eflags);
}
}
EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
{
int r = EMULATE_DONE;
++vcpu->stat.insn_emulation_fail;
trace_kvm_emulate_insn_failed(vcpu);
if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
return EMULATE_FAIL;
if (emulation_type & EMULTYPE_VMWARE_GP) {
kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
return 1;
}
if (emulation_type & EMULTYPE_SKIP) {
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
return 0;
}
kvm_queue_exception(vcpu, UD_VECTOR);
if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
r = EMULATE_USER_EXIT;
return 0;
}
kvm_queue_exception(vcpu, UD_VECTOR);
return r;
return 1;
}
static bool reexecute_instruction(struct kvm_vcpu *vcpu, gva_t cr2,
@@ -6425,7 +6479,7 @@ static int kvm_vcpu_check_hw_bp(unsigned long addr, u32 type, u32 dr7,
return dr6;
}
static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu)
{
struct kvm_run *kvm_run = vcpu->run;
@@ -6434,10 +6488,10 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
kvm_run->debug.arch.exception = DB_VECTOR;
kvm_run->exit_reason = KVM_EXIT_DEBUG;
*r = EMULATE_USER_EXIT;
} else {
kvm_queue_exception_p(vcpu, DB_VECTOR, DR6_BS);
return 0;
}
kvm_queue_exception_p(vcpu, DB_VECTOR, DR6_BS);
return 1;
}
int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
@@ -6446,7 +6500,7 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
int r;
r = kvm_x86_ops->skip_emulated_instruction(vcpu);
if (unlikely(r != EMULATE_DONE))
if (unlikely(!r))
return 0;
/*
@@ -6458,8 +6512,8 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
* that sets the TF flag".
*/
if (unlikely(rflags & X86_EFLAGS_TF))
kvm_vcpu_do_singlestep(vcpu, &r);
return r == EMULATE_DONE;
r = kvm_vcpu_do_singlestep(vcpu);
return r;
}
EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
@@ -6478,7 +6532,7 @@ static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
kvm_run->debug.arch.pc = eip;
kvm_run->debug.arch.exception = DB_VECTOR;
kvm_run->exit_reason = KVM_EXIT_DEBUG;
*r = EMULATE_USER_EXIT;
*r = 0;
return true;
}
}
@@ -6494,7 +6548,7 @@ static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
vcpu->arch.dr6 &= ~DR_TRAP_BITS;
vcpu->arch.dr6 |= dr6 | DR6_RTM;
kvm_queue_exception(vcpu, DB_VECTOR);
*r = EMULATE_DONE;
*r = 1;
return true;
}
}
@@ -6578,11 +6632,14 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
trace_kvm_emulate_insn_start(vcpu);
++vcpu->stat.insn_emulation;
if (r != EMULATION_OK) {
if (emulation_type & EMULTYPE_TRAP_UD)
return EMULATE_FAIL;
if ((emulation_type & EMULTYPE_TRAP_UD) ||
(emulation_type & EMULTYPE_TRAP_UD_FORCED)) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
emulation_type))
return EMULATE_DONE;
return 1;
if (ctxt->have_exception) {
/*
* #UD should result in just EMULATION_FAILED, and trap-like
@@ -6591,28 +6648,32 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
WARN_ON_ONCE(ctxt->exception.vector == UD_VECTOR ||
exception_type(ctxt->exception.vector) == EXCPT_TRAP);
inject_emulated_exception(vcpu);
return EMULATE_DONE;
return 1;
}
if (emulation_type & EMULTYPE_SKIP)
return EMULATE_FAIL;
return handle_emulation_failure(vcpu, emulation_type);
}
}
if ((emulation_type & EMULTYPE_VMWARE) &&
!is_vmware_backdoor_opcode(ctxt))
return EMULATE_FAIL;
if ((emulation_type & EMULTYPE_VMWARE_GP) &&
!is_vmware_backdoor_opcode(ctxt)) {
kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
return 1;
}
/*
* Note, EMULTYPE_SKIP is intended for use *only* by vendor callbacks
* for kvm_skip_emulated_instruction(). The caller is responsible for
* updating interruptibility state and injecting single-step #DBs.
*/
if (emulation_type & EMULTYPE_SKIP) {
kvm_rip_write(vcpu, ctxt->_eip);
if (ctxt->eflags & X86_EFLAGS_RF)
kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
kvm_x86_ops->set_interrupt_shadow(vcpu, 0);
return EMULATE_DONE;
return 1;
}
if (retry_instruction(ctxt, cr2, emulation_type))
return EMULATE_DONE;
return 1;
/* this is needed for vmware backdoor interface to work since it
changes registers values during IO operation */
@@ -6628,18 +6689,18 @@ restart:
r = x86_emulate_insn(ctxt);
if (r == EMULATION_INTERCEPTED)
return EMULATE_DONE;
return 1;
if (r == EMULATION_FAILED) {
if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
emulation_type))
return EMULATE_DONE;
return 1;
return handle_emulation_failure(vcpu, emulation_type);
}
if (ctxt->have_exception) {
r = EMULATE_DONE;
r = 1;
if (inject_emulated_exception(vcpu))
return r;
} else if (vcpu->arch.pio.count) {
@@ -6650,16 +6711,18 @@ restart:
writeback = false;
vcpu->arch.complete_userspace_io = complete_emulated_pio;
}
r = EMULATE_USER_EXIT;
r = 0;
} else if (vcpu->mmio_needed) {
++vcpu->stat.mmio_exits;
if (!vcpu->mmio_is_write)
writeback = false;
r = EMULATE_USER_EXIT;
r = 0;
vcpu->arch.complete_userspace_io = complete_emulated_mmio;
} else if (r == EMULATION_RESTART)
goto restart;
else
r = EMULATE_DONE;
r = 1;
if (writeback) {
unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
@@ -6668,8 +6731,8 @@ restart:
if (!ctxt->have_exception ||
exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
kvm_rip_write(vcpu, ctxt->eip);
if (r == EMULATE_DONE && ctxt->tf)
kvm_vcpu_do_singlestep(vcpu, &r);
if (r && ctxt->tf)
r = kvm_vcpu_do_singlestep(vcpu);
__kvm_set_rflags(vcpu, ctxt->eflags);
}
@@ -8263,12 +8326,11 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
static inline int complete_emulated_io(struct kvm_vcpu *vcpu)
{
int r;
vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
r = kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE);
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
if (r != EMULATE_DONE)
return 0;
return 1;
return r;
}
static int complete_emulated_pio(struct kvm_vcpu *vcpu)
@@ -8636,14 +8698,17 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
ret = emulator_task_switch(ctxt, tss_selector, idt_index, reason,
has_error_code, error_code);
if (ret)
return EMULATE_FAIL;
if (ret) {
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
return 0;
}
kvm_rip_write(vcpu, ctxt->eip);
kvm_set_rflags(vcpu, ctxt->eflags);
kvm_make_request(KVM_REQ_EVENT, vcpu);
return EMULATE_DONE;
return 1;
}
EXPORT_SYMBOL_GPL(kvm_task_switch);
@@ -9361,6 +9426,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
atomic_set(&kvm->arch.noncoherent_dma_count, 0);
@@ -9690,8 +9756,13 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
* Scan sptes if dirty logging has been stopped, dropping those
* which can be collapsed into a single large-page spte. Later
* page faults will create the large-page sptes.
*
* There is no need to do this in any of the following cases:
* CREATE: No dirty mappings will already exist.
* MOVE/DELETE: The old mappings will already have been cleaned up by
* kvm_arch_flush_shadow_memslot()
*/
if ((change != KVM_MR_DELETE) &&
if (change == KVM_MR_FLAGS_ONLY &&
(old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
kvm_mmu_zap_collapsible_sptes(kvm, new);

View File

@@ -261,7 +261,7 @@ static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
}
void kvm_set_pending_timer(struct kvm_vcpu *vcpu);
int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr);
u64 get_kvmclock_ns(struct kvm *kvm);

View File

@@ -214,6 +214,16 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
/* release the tag's ownership to the req cloned from */
spin_lock_irqsave(&fq->mq_flush_lock, flags);
if (!refcount_dec_and_test(&flush_rq->ref)) {
fq->rq_status = error;
spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
return;
}
if (fq->rq_status != BLK_STS_OK)
error = fq->rq_status;
hctx = flush_rq->mq_hctx;
if (!q->elevator) {
blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);

View File

@@ -529,8 +529,8 @@ struct iocg_wake_ctx {
static const struct ioc_params autop[] = {
[AUTOP_HDD] = {
.qos = {
[QOS_RLAT] = 50000, /* 50ms */
[QOS_WLAT] = 50000,
[QOS_RLAT] = 250000, /* 250ms */
[QOS_WLAT] = 250000,
[QOS_MIN] = VRATE_MIN_PPM,
[QOS_MAX] = VRATE_MAX_PPM,
},
@@ -1343,7 +1343,7 @@ static void ioc_timer_fn(struct timer_list *timer)
u32 ppm_wthr = MILLION - ioc->params.qos[QOS_WPPM];
u32 missed_ppm[2], rq_wait_pct;
u64 period_vtime;
int i;
int prev_busy_level, i;
/* how were the latencies during the period? */
ioc_lat_stat(ioc, missed_ppm, &rq_wait_pct);
@@ -1407,7 +1407,8 @@ static void ioc_timer_fn(struct timer_list *timer)
* comparing vdone against period start. If lagging behind
* IOs from past periods, don't increase vrate.
*/
if (!atomic_read(&iocg_to_blkg(iocg)->use_delay) &&
if ((ppm_rthr != MILLION || ppm_wthr != MILLION) &&
!atomic_read(&iocg_to_blkg(iocg)->use_delay) &&
time_after64(vtime, vdone) &&
time_after64(vtime, now.vnow -
MAX_LAGGING_PERIODS * period_vtime) &&
@@ -1531,26 +1532,29 @@ skip_surplus_transfers:
* and experiencing shortages but not surpluses, we're too stingy
* and should increase vtime rate.
*/
prev_busy_level = ioc->busy_level;
if (rq_wait_pct > RQ_WAIT_BUSY_PCT ||
missed_ppm[READ] > ppm_rthr ||
missed_ppm[WRITE] > ppm_wthr) {
ioc->busy_level = max(ioc->busy_level, 0);
ioc->busy_level++;
} else if (nr_lagging) {
ioc->busy_level = max(ioc->busy_level, 0);
} else if (nr_shortages && !nr_surpluses &&
rq_wait_pct <= RQ_WAIT_BUSY_PCT * UNBUSY_THR_PCT / 100 &&
} else if (rq_wait_pct <= RQ_WAIT_BUSY_PCT * UNBUSY_THR_PCT / 100 &&
missed_ppm[READ] <= ppm_rthr * UNBUSY_THR_PCT / 100 &&
missed_ppm[WRITE] <= ppm_wthr * UNBUSY_THR_PCT / 100) {
ioc->busy_level = min(ioc->busy_level, 0);
ioc->busy_level--;
/* take action iff there is contention */
if (nr_shortages && !nr_lagging) {
ioc->busy_level = min(ioc->busy_level, 0);
/* redistribute surpluses first */
if (!nr_surpluses)
ioc->busy_level--;
}
} else {
ioc->busy_level = 0;
}
ioc->busy_level = clamp(ioc->busy_level, -1000, 1000);
if (ioc->busy_level) {
if (ioc->busy_level > 0 || (ioc->busy_level < 0 && !nr_lagging)) {
u64 vrate = atomic64_read(&ioc->vtime_rate);
u64 vrate_min = ioc->vrate_min, vrate_max = ioc->vrate_max;
@@ -1592,6 +1596,10 @@ skip_surplus_transfers:
atomic64_set(&ioc->vtime_rate, vrate);
ioc->inuse_margin_vtime = DIV64_U64_ROUND_UP(
ioc->period_us * vrate * INUSE_MARGIN_PCT, 100);
} else if (ioc->busy_level != prev_busy_level || nr_lagging) {
trace_iocost_ioc_vrate_adj(ioc, atomic64_read(&ioc->vtime_rate),
&missed_ppm, rq_wait_pct, nr_lagging,
nr_shortages, nr_surpluses);
}
ioc_refresh_params(ioc, false);

View File

@@ -555,8 +555,6 @@ void blk_mq_sched_free_requests(struct request_queue *q)
struct blk_mq_hw_ctx *hctx;
int i;
lockdep_assert_held(&q->sysfs_lock);
queue_for_each_hw_ctx(q, hctx, i) {
if (hctx->sched_tags)
blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i);

View File

@@ -918,7 +918,10 @@ static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
*/
if (blk_mq_req_expired(rq, next))
blk_mq_rq_timed_out(rq, reserved);
if (refcount_dec_and_test(&rq->ref))
if (is_flush_rq(rq, hctx))
rq->end_io(rq, 0);
else if (refcount_dec_and_test(&rq->ref))
__blk_mq_free_request(rq);
return true;

View File

@@ -482,7 +482,6 @@ static ssize_t queue_wb_lat_store(struct request_queue *q, const char *page,
blk_mq_quiesce_queue(q);
wbt_set_min_lat(q, val);
wbt_update_limits(q);
blk_mq_unquiesce_queue(q);
blk_mq_unfreeze_queue(q);
@@ -989,13 +988,11 @@ int blk_register_queue(struct gendisk *disk)
blk_mq_debugfs_register(q);
}
/*
* The flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator
* switch won't happen at all.
*/
mutex_lock(&q->sysfs_lock);
if (q->elevator) {
ret = elv_register_queue(q, false);
if (ret) {
mutex_unlock(&q->sysfs_lock);
mutex_unlock(&q->sysfs_dir_lock);
kobject_del(&q->kobj);
blk_trace_remove_sysfs(dev);
@@ -1005,7 +1002,6 @@ int blk_register_queue(struct gendisk *disk)
has_elevator = true;
}
mutex_lock(&q->sysfs_lock);
blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);
wbt_enable_default(q);
blk_throtl_register_queue(q);
@@ -1062,12 +1058,10 @@ void blk_unregister_queue(struct gendisk *disk)
kobject_del(&q->kobj);
blk_trace_remove_sysfs(disk_to_dev(disk));
/*
* q->kobj has been removed, so it is safe to check if elevator
* exists without holding q->sysfs_lock.
*/
mutex_lock(&q->sysfs_lock);
if (q->elevator)
elv_unregister_queue(q);
mutex_unlock(&q->sysfs_lock);
mutex_unlock(&q->sysfs_dir_lock);
kobject_put(&disk_to_dev(disk)->kobj);

View File

@@ -19,6 +19,7 @@ struct blk_flush_queue {
unsigned int flush_queue_delayed:1;
unsigned int flush_pending_idx:1;
unsigned int flush_running_idx:1;
blk_status_t rq_status;
unsigned long flush_pending_since;
struct list_head flush_queue[2];
struct list_head flush_data_in_flight;
@@ -47,6 +48,12 @@ static inline void __blk_get_queue(struct request_queue *q)
kobject_get(&q->kobj);
}
static inline bool
is_flush_rq(struct request *req, struct blk_mq_hw_ctx *hctx)
{
return hctx->fq->flush_rq == req;
}
struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
int node, int cmd_size, gfp_t flags);
void blk_free_flush_queue(struct blk_flush_queue *q);
@@ -194,6 +201,8 @@ void elv_unregister_queue(struct request_queue *q);
static inline void elevator_exit(struct request_queue *q,
struct elevator_queue *e)
{
lockdep_assert_held(&q->sysfs_lock);
blk_mq_sched_free_requests(q);
__elevator_exit(q, e);
}

View File

@@ -503,9 +503,7 @@ int elv_register_queue(struct request_queue *q, bool uevent)
if (uevent)
kobject_uevent(&e->kobj, KOBJ_ADD);
mutex_lock(&q->sysfs_lock);
e->registered = 1;
mutex_unlock(&q->sysfs_lock);
}
return error;
}
@@ -523,11 +521,9 @@ void elv_unregister_queue(struct request_queue *q)
kobject_uevent(&e->kobj, KOBJ_REMOVE);
kobject_del(&e->kobj);
mutex_lock(&q->sysfs_lock);
e->registered = 0;
/* Re-enable throttling in case elevator disabled it */
wbt_enable_default(q);
mutex_unlock(&q->sysfs_lock);
}
}
@@ -590,32 +586,11 @@ int elevator_switch_mq(struct request_queue *q,
lockdep_assert_held(&q->sysfs_lock);
if (q->elevator) {
if (q->elevator->registered) {
mutex_unlock(&q->sysfs_lock);
/*
* Concurrent elevator switch can't happen becasue
* sysfs write is always exclusively on same file.
*
* Also the elevator queue won't be freed after
* sysfs_lock is released becasue kobject_del() in
* blk_unregister_queue() waits for completion of
* .store & .show on its attributes.
*/
if (q->elevator->registered)
elv_unregister_queue(q);
mutex_lock(&q->sysfs_lock);
}
ioc_clear_queue(q);
elevator_exit(q, q->elevator);
/*
* sysfs_lock may be dropped, so re-check if queue is
* unregistered. If yes, don't switch to new elevator
* any more
*/
if (!blk_queue_registered(q))
return 0;
}
ret = blk_mq_init_sched(q, new_e);
@@ -623,11 +598,7 @@ int elevator_switch_mq(struct request_queue *q,
goto out;
if (new_e) {
mutex_unlock(&q->sysfs_lock);
ret = elv_register_queue(q, true);
mutex_lock(&q->sysfs_lock);
if (ret) {
elevator_exit(q, q->elevator);
goto out;

View File

@@ -694,7 +694,7 @@ static void mvebu_pwm_get_state(struct pwm_chip *chip,
}
static int mvebu_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
struct pwm_state *state)
const struct pwm_state *state)
{
struct mvebu_pwm *mvpwm = to_mvebu_pwm(chip);
struct mvebu_gpio_chip *mvchip = mvpwm->mvchip;

View File

@@ -948,6 +948,7 @@ int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block
case AMD_IP_BLOCK_TYPE_UVD:
case AMD_IP_BLOCK_TYPE_VCN:
case AMD_IP_BLOCK_TYPE_VCE:
case AMD_IP_BLOCK_TYPE_SDMA:
if (swsmu)
ret = smu_dpm_set_power_gate(&adev->smu, block_type, gate);
else
@@ -956,7 +957,6 @@ int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block
break;
case AMD_IP_BLOCK_TYPE_GMC:
case AMD_IP_BLOCK_TYPE_ACP:
case AMD_IP_BLOCK_TYPE_SDMA:
ret = ((adev)->powerplay.pp_funcs->set_powergating_by_smu(
(adev)->powerplay.pp_handle, block_type, gate));
break;

View File

@@ -1012,11 +1012,16 @@ static const struct pci_device_id pciidlist[] = {
{0x1002, 0x731B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x731F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
/* Navi14 */
{0x1002, 0x7340, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14},
{0x1002, 0x7340, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
{0x1002, 0x7341, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
{0x1002, 0x7347, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
/* Renoir */
{0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU|AMD_EXP_HW_SUPPORT},
/* Navi12 */
{0x1002, 0x7360, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12|AMD_EXP_HW_SUPPORT},
{0, 0, 0}
};

View File

@@ -143,7 +143,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
/* ring tests don't use a job */
if (job) {
vm = job->vm;
fence_ctx = job->base.s_fence->scheduled.context;
fence_ctx = job->base.s_fence ?
job->base.s_fence->scheduled.context : 0;
} else {
vm = NULL;
fence_ctx = 0;

View File

@@ -677,6 +677,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
if (sh_num == AMDGPU_INFO_MMR_SH_INDEX_MASK)
sh_num = 0xffffffff;
if (info->read_mmr_reg.count > 128)
return -EINVAL;
regs = kmalloc_array(info->read_mmr_reg.count, sizeof(*regs), GFP_KERNEL);
if (!regs)
return -ENOMEM;

View File

@@ -70,6 +70,11 @@ MODULE_FIRMWARE("amdgpu/navi10_mec.bin");
MODULE_FIRMWARE("amdgpu/navi10_mec2.bin");
MODULE_FIRMWARE("amdgpu/navi10_rlc.bin");
MODULE_FIRMWARE("amdgpu/navi14_ce_wks.bin");
MODULE_FIRMWARE("amdgpu/navi14_pfp_wks.bin");
MODULE_FIRMWARE("amdgpu/navi14_me_wks.bin");
MODULE_FIRMWARE("amdgpu/navi14_mec_wks.bin");
MODULE_FIRMWARE("amdgpu/navi14_mec2_wks.bin");
MODULE_FIRMWARE("amdgpu/navi14_ce.bin");
MODULE_FIRMWARE("amdgpu/navi14_pfp.bin");
MODULE_FIRMWARE("amdgpu/navi14_me.bin");
@@ -594,7 +599,8 @@ static void gfx_v10_0_check_gfxoff_flag(struct amdgpu_device *adev)
static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
{
const char *chip_name;
char fw_name[30];
char fw_name[40];
char wks[10];
int err;
struct amdgpu_firmware_info *info = NULL;
const struct common_firmware_header *header = NULL;
@@ -607,12 +613,16 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
DRM_DEBUG("\n");
memset(wks, 0, sizeof(wks));
switch (adev->asic_type) {
case CHIP_NAVI10:
chip_name = "navi10";
break;
case CHIP_NAVI14:
chip_name = "navi14";
if (!(adev->pdev->device == 0x7340 &&
adev->pdev->revision != 0x00))
snprintf(wks, sizeof(wks), "_wks");
break;
case CHIP_NAVI12:
chip_name = "navi12";
@@ -621,7 +631,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
BUG();
}
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_pfp.bin", chip_name);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_pfp%s.bin", chip_name, wks);
err = request_firmware(&adev->gfx.pfp_fw, fw_name, adev->dev);
if (err)
goto out;
@@ -632,7 +642,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
adev->gfx.pfp_fw_version = le32_to_cpu(cp_hdr->header.ucode_version);
adev->gfx.pfp_feature_version = le32_to_cpu(cp_hdr->ucode_feature_version);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_me.bin", chip_name);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_me%s.bin", chip_name, wks);
err = request_firmware(&adev->gfx.me_fw, fw_name, adev->dev);
if (err)
goto out;
@@ -643,7 +653,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
adev->gfx.me_fw_version = le32_to_cpu(cp_hdr->header.ucode_version);
adev->gfx.me_feature_version = le32_to_cpu(cp_hdr->ucode_feature_version);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_ce.bin", chip_name);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_ce%s.bin", chip_name, wks);
err = request_firmware(&adev->gfx.ce_fw, fw_name, adev->dev);
if (err)
goto out;
@@ -708,7 +718,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
if (adev->gfx.rlc.is_rlc_v2_1)
gfx_v10_0_init_rlc_ext_microcode(adev);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec%s.bin", chip_name, wks);
err = request_firmware(&adev->gfx.mec_fw, fw_name, adev->dev);
if (err)
goto out;
@@ -719,7 +729,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
adev->gfx.mec_fw_version = le32_to_cpu(cp_hdr->header.ucode_version);
adev->gfx.mec_feature_version = le32_to_cpu(cp_hdr->ucode_feature_version);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec2.bin", chip_name);
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec2%s.bin", chip_name, wks);
err = request_firmware(&adev->gfx.mec2_fw, fw_name, adev->dev);
if (!err) {
err = amdgpu_ucode_validate(adev->gfx.mec2_fw);

View File

@@ -1650,7 +1650,6 @@ static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
switch (adev->asic_type) {
case CHIP_RAVEN:
case CHIP_RENOIR:
gfx_v9_0_init_lbpw(adev);
break;
case CHIP_VEGA20:
@@ -3026,7 +3025,6 @@ static int gfx_v9_0_rlc_resume(struct amdgpu_device *adev)
switch (adev->asic_type) {
case CHIP_RAVEN:
case CHIP_RENOIR:
if (amdgpu_lbpw == 0)
gfx_v9_0_enable_lbpw(adev, false);
else

View File

@@ -1889,8 +1889,9 @@ static int sdma_v4_0_hw_init(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs &&
adev->powerplay.pp_funcs->set_powergating_by_smu)
if ((adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs &&
adev->powerplay.pp_funcs->set_powergating_by_smu) ||
adev->asic_type == CHIP_RENOIR)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, false);
if (!amdgpu_sriov_vf(adev))
@@ -1917,8 +1918,9 @@ static int sdma_v4_0_hw_fini(void *handle)
sdma_v4_0_ctx_switch_enable(adev, false);
sdma_v4_0_enable(adev, false);
if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs
&& adev->powerplay.pp_funcs->set_powergating_by_smu)
if ((adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs
&& adev->powerplay.pp_funcs->set_powergating_by_smu) ||
adev->asic_type == CHIP_RENOIR)
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_SDMA, true);
return 0;

View File

@@ -493,7 +493,15 @@ static void smu_v11_0_i2c_fini(struct i2c_adapter *control)
}
/* Restore clock gating */
smu_v11_0_i2c_set_clock_gating(control, true);
/*
* TODO Reenabling clock gating seems to break subsequent SMU operation
* on the I2C bus. My guess is that SMU doesn't disable clock gating like
* we do here before working with the bus. So for now just don't restore
* it but later work with SMU to see if they have this issue and can
* update their code appropriately
*/
/* smu_v11_0_i2c_set_clock_gating(control, true); */
}

View File

@@ -694,10 +694,10 @@ static const uint32_t cwsr_trap_gfx10_hex[] = {
0x003f8000, 0x8f6f896f,
0x88776f77, 0x8a6eff6e,
0x023f8000, 0xb9eef807,
0xb970f812, 0xb971f813,
0x8ff08870, 0xf4051bb8,
0xb97af812, 0xb97bf813,
0x8ffa887a, 0xf4051bbd,
0xfa000000, 0xbf8cc07f,
0xf4051c38, 0xfa000008,
0xf4051ebd, 0xfa000008,
0xbf8cc07f, 0x87ee6e6e,
0xbf840001, 0xbe80206e,
0xb971f803, 0x8771ff71,

View File

@@ -187,12 +187,12 @@ L_FETCH_2ND_TRAP:
// Read second-level TBA/TMA from first-level TMA and jump if available.
// ttmp[2:5] and ttmp12 can be used (others hold SPI-initialized debug data)
// ttmp12 holds SQ_WAVE_STATUS
s_getreg_b32 ttmp4, hwreg(HW_REG_SHADER_TMA_LO)
s_getreg_b32 ttmp5, hwreg(HW_REG_SHADER_TMA_HI)
s_lshl_b64 [ttmp4, ttmp5], [ttmp4, ttmp5], 0x8
s_load_dwordx2 [ttmp2, ttmp3], [ttmp4, ttmp5], 0x0 glc:1 // second-level TBA
s_getreg_b32 ttmp14, hwreg(HW_REG_SHADER_TMA_LO)
s_getreg_b32 ttmp15, hwreg(HW_REG_SHADER_TMA_HI)
s_lshl_b64 [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8
s_load_dwordx2 [ttmp2, ttmp3], [ttmp14, ttmp15], 0x0 glc:1 // second-level TBA
s_waitcnt lgkmcnt(0)
s_load_dwordx2 [ttmp4, ttmp5], [ttmp4, ttmp5], 0x8 glc:1 // second-level TMA
s_load_dwordx2 [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8 glc:1 // second-level TMA
s_waitcnt lgkmcnt(0)
s_and_b64 [ttmp2, ttmp3], [ttmp2, ttmp3], [ttmp2, ttmp3]
s_cbranch_scc0 L_NO_NEXT_TRAP // second-level trap handler not been set

View File

@@ -2113,6 +2113,7 @@ static int amdgpu_dm_backlight_get_brightness(struct backlight_device *bd)
}
static const struct backlight_ops amdgpu_dm_backlight_ops = {
.options = BL_CORE_SUSPENDRESUME,
.get_brightness = amdgpu_dm_backlight_get_brightness,
.update_status = amdgpu_dm_backlight_update_status,
};
@@ -2384,6 +2385,8 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
if (adev->asic_type != CHIP_CARRIZO && adev->asic_type != CHIP_STONEY)
dm->dc->debug.disable_stutter = amdgpu_pp_feature_mask & PP_STUTTER_MODE ? false : true;
if (adev->asic_type == CHIP_RENOIR)
dm->dc->debug.disable_stutter = true;
return 0;
fail:
@@ -5770,8 +5773,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
* change FB pitch, DCC state, rotation or mirroing.
*/
bundle->flip_addrs[planes_count].flip_immediate =
(crtc->state->pageflip_flags &
DRM_MODE_PAGE_FLIP_ASYNC) != 0 &&
crtc->state->async_flip &&
acrtc_state->update_type == UPDATE_TYPE_FAST;
timestamp_ns = ktime_get_ns();
@@ -6348,7 +6350,7 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
amdgpu_dm_enable_crtc_interrupts(dev, state, true);
for_each_new_crtc_in_state(state, crtc, new_crtc_state, j)
if (new_crtc_state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC)
if (new_crtc_state->async_flip)
wait_for_vblank = false;
/* update planes when needed per crtc*/

View File

@@ -708,6 +708,10 @@ static void hack_bounding_box(struct dcn_bw_internal_vars *v,
unsigned int get_highest_allowed_voltage_level(uint32_t hw_internal_rev)
{
/* for dali, the highest voltage level we want is 0 */
if (ASICREV_IS_DALI(hw_internal_rev))
return 0;
/* we are ok with all levels */
return 4;
}

View File

@@ -98,11 +98,14 @@ uint32_t dce110_get_min_vblank_time_us(const struct dc_state *context)
struct dc_stream_state *stream = context->streams[j];
uint32_t vertical_blank_in_pixels = 0;
uint32_t vertical_blank_time = 0;
uint32_t vertical_total_min = stream->timing.v_total;
struct dc_crtc_timing_adjust adjust = stream->adjust;
if (adjust.v_total_max != adjust.v_total_min)
vertical_total_min = adjust.v_total_min;
vertical_blank_in_pixels = stream->timing.h_total *
(stream->timing.v_total
(vertical_total_min
- stream->timing.v_addressable);
vertical_blank_time = vertical_blank_in_pixels
* 10000 / stream->timing.pix_clk_100hz;
@@ -171,6 +174,10 @@ void dce11_pplib_apply_display_requirements(
struct dc_state *context)
{
struct dm_pp_display_configuration *pp_display_cfg = &context->pp_display_cfg;
int memory_type_multiplier = MEMORY_TYPE_MULTIPLIER_CZ;
if (dc->bw_vbios && dc->bw_vbios->memory_type == bw_def_hbm)
memory_type_multiplier = MEMORY_TYPE_HBM;
pp_display_cfg->all_displays_in_sync =
context->bw_ctx.bw.dce.all_displays_in_sync;
@@ -183,8 +190,20 @@ void dce11_pplib_apply_display_requirements(
pp_display_cfg->cpu_pstate_separation_time =
context->bw_ctx.bw.dce.blackout_recovery_time_us;
pp_display_cfg->min_memory_clock_khz = context->bw_ctx.bw.dce.yclk_khz
/ MEMORY_TYPE_MULTIPLIER_CZ;
/*
* TODO: determine whether the bandwidth has reached memory's limitation
* , then change minimum memory clock based on real-time bandwidth
* limitation.
*/
if (ASICREV_IS_VEGA20_P(dc->ctx->asic_id.hw_internal_rev) && (context->stream_count >= 2)) {
pp_display_cfg->min_memory_clock_khz = max(pp_display_cfg->min_memory_clock_khz,
(uint32_t) div64_s64(
div64_s64(dc->bw_vbios->high_yclk.value,
memory_type_multiplier), 10000));
} else {
pp_display_cfg->min_memory_clock_khz = context->bw_ctx.bw.dce.yclk_khz
/ memory_type_multiplier;
}
pp_display_cfg->min_engine_clock_khz = determine_sclk_from_bounding_box(
dc,

View File

@@ -148,7 +148,7 @@ static void dce_mi_program_pte_vm(
pte->min_pte_before_flip_horiz_scan;
REG_UPDATE(GRPH_PIPE_OUTSTANDING_REQUEST_LIMIT,
GRPH_PIPE_OUTSTANDING_REQUEST_LIMIT, 0xff);
GRPH_PIPE_OUTSTANDING_REQUEST_LIMIT, 0x7f);
REG_UPDATE_3(DVMM_PTE_CONTROL,
DVMM_PAGE_WIDTH, page_width,
@@ -157,7 +157,7 @@ static void dce_mi_program_pte_vm(
REG_UPDATE_2(DVMM_PTE_ARB_CONTROL,
DVMM_PTE_REQ_PER_CHUNK, pte->pte_req_per_chunk,
DVMM_MAX_PTE_REQ_OUTSTANDING, 0xff);
DVMM_MAX_PTE_REQ_OUTSTANDING, 0x7f);
}
static void program_urgency_watermark(

View File

@@ -1091,6 +1091,7 @@ struct resource_pool *dce100_create_resource_pool(
if (construct(num_virtual_links, dc, pool))
return &pool->base;
kfree(pool);
BREAK_TO_DEBUGGER();
return NULL;
}

View File

@@ -1462,6 +1462,7 @@ struct resource_pool *dce110_create_resource_pool(
if (construct(num_virtual_links, dc, pool, asic_id))
return &pool->base;
kfree(pool);
BREAK_TO_DEBUGGER();
return NULL;
}

View File

@@ -987,6 +987,10 @@ static void bw_calcs_data_update_from_pplib(struct dc *dc)
struct dm_pp_clock_levels_with_latency mem_clks = {0};
struct dm_pp_wm_sets_with_clock_ranges clk_ranges = {0};
struct dm_pp_clock_levels clks = {0};
int memory_type_multiplier = MEMORY_TYPE_MULTIPLIER_CZ;
if (dc->bw_vbios && dc->bw_vbios->memory_type == bw_def_hbm)
memory_type_multiplier = MEMORY_TYPE_HBM;
/*do system clock TODO PPLIB: after PPLIB implement,
* then remove old way
@@ -1026,12 +1030,12 @@ static void bw_calcs_data_update_from_pplib(struct dc *dc)
&clks);
dc->bw_vbios->low_yclk = bw_frc_to_fixed(
clks.clocks_in_khz[0] * MEMORY_TYPE_MULTIPLIER_CZ, 1000);
clks.clocks_in_khz[0] * memory_type_multiplier, 1000);
dc->bw_vbios->mid_yclk = bw_frc_to_fixed(
clks.clocks_in_khz[clks.num_levels>>1] * MEMORY_TYPE_MULTIPLIER_CZ,
clks.clocks_in_khz[clks.num_levels>>1] * memory_type_multiplier,
1000);
dc->bw_vbios->high_yclk = bw_frc_to_fixed(
clks.clocks_in_khz[clks.num_levels-1] * MEMORY_TYPE_MULTIPLIER_CZ,
clks.clocks_in_khz[clks.num_levels-1] * memory_type_multiplier,
1000);
return;
@@ -1067,12 +1071,12 @@ static void bw_calcs_data_update_from_pplib(struct dc *dc)
* YCLK = UMACLK*m_memoryTypeMultiplier
*/
dc->bw_vbios->low_yclk = bw_frc_to_fixed(
mem_clks.data[0].clocks_in_khz * MEMORY_TYPE_MULTIPLIER_CZ, 1000);
mem_clks.data[0].clocks_in_khz * memory_type_multiplier, 1000);
dc->bw_vbios->mid_yclk = bw_frc_to_fixed(
mem_clks.data[mem_clks.num_levels>>1].clocks_in_khz * MEMORY_TYPE_MULTIPLIER_CZ,
mem_clks.data[mem_clks.num_levels>>1].clocks_in_khz * memory_type_multiplier,
1000);
dc->bw_vbios->high_yclk = bw_frc_to_fixed(
mem_clks.data[mem_clks.num_levels-1].clocks_in_khz * MEMORY_TYPE_MULTIPLIER_CZ,
mem_clks.data[mem_clks.num_levels-1].clocks_in_khz * memory_type_multiplier,
1000);
/* Now notify PPLib/SMU about which Watermarks sets they should select
@@ -1338,6 +1342,7 @@ struct resource_pool *dce112_create_resource_pool(
if (construct(num_virtual_links, dc, pool))
return &pool->base;
kfree(pool);
BREAK_TO_DEBUGGER();
return NULL;
}

View File

@@ -847,6 +847,8 @@ static void bw_calcs_data_update_from_pplib(struct dc *dc)
int i;
unsigned int clk;
unsigned int latency;
/*original logic in dal3*/
int memory_type_multiplier = MEMORY_TYPE_MULTIPLIER_CZ;
/*do system clock*/
if (!dm_pp_get_clock_levels_by_type_with_latency(
@@ -905,13 +907,16 @@ static void bw_calcs_data_update_from_pplib(struct dc *dc)
* ALSO always convert UMA clock (from PPLIB) to YCLK (HW formula):
* YCLK = UMACLK*m_memoryTypeMultiplier
*/
if (dc->bw_vbios->memory_type == bw_def_hbm)
memory_type_multiplier = MEMORY_TYPE_HBM;
dc->bw_vbios->low_yclk = bw_frc_to_fixed(
mem_clks.data[0].clocks_in_khz * MEMORY_TYPE_MULTIPLIER_CZ, 1000);
mem_clks.data[0].clocks_in_khz * memory_type_multiplier, 1000);
dc->bw_vbios->mid_yclk = bw_frc_to_fixed(
mem_clks.data[mem_clks.num_levels>>1].clocks_in_khz * MEMORY_TYPE_MULTIPLIER_CZ,
mem_clks.data[mem_clks.num_levels>>1].clocks_in_khz * memory_type_multiplier,
1000);
dc->bw_vbios->high_yclk = bw_frc_to_fixed(
mem_clks.data[mem_clks.num_levels-1].clocks_in_khz * MEMORY_TYPE_MULTIPLIER_CZ,
mem_clks.data[mem_clks.num_levels-1].clocks_in_khz * memory_type_multiplier,
1000);
/* Now notify PPLib/SMU about which Watermarks sets they should select
@@ -1203,6 +1208,7 @@ struct resource_pool *dce120_create_resource_pool(
if (construct(num_virtual_links, dc, pool))
return &pool->base;
kfree(pool);
BREAK_TO_DEBUGGER();
return NULL;
}

View File

@@ -1570,6 +1570,7 @@ struct resource_pool *dcn10_create_resource_pool(
if (construct(init_data->num_virtual_links, dc, pool))
return &pool->base;
kfree(pool);
BREAK_TO_DEBUGGER();
return NULL;
}

View File

@@ -23,6 +23,8 @@
*
*/
#include <linux/slab.h>
#include "dm_services.h"
#include "dc.h"

View File

@@ -35,12 +35,10 @@
#include "hw_factory_dcn21.h"
#include "dcn/dcn_2_1_0_offset.h"
#include "dcn/dcn_2_1_0_sh_mask.h"
#include "renoir_ip_offset.h"
#include "reg_helper.h"
#include "../hpd_regs.h"
/* begin *********************
@@ -136,6 +134,39 @@ static const struct ddc_sh_mask ddc_mask[] = {
DDC_MASK_SH_LIST_DCN2(_MASK, 6)
};
#include "../generic_regs.h"
/* set field name */
#define SF_GENERIC(reg_name, field_name, post_fix)\
.field_name = reg_name ## __ ## field_name ## post_fix
#define generic_regs(id) \
{\
GENERIC_REG_LIST(id)\
}
static const struct generic_registers generic_regs[] = {
generic_regs(A),
};
static const struct generic_sh_mask generic_shift[] = {
GENERIC_MASK_SH_LIST(__SHIFT, A),
};
static const struct generic_sh_mask generic_mask[] = {
GENERIC_MASK_SH_LIST(_MASK, A),
};
static void define_generic_registers(struct hw_gpio_pin *pin, uint32_t en)
{
struct hw_generic *generic = HW_GENERIC_FROM_BASE(pin);
generic->regs = &generic_regs[en];
generic->shifts = &generic_shift[en];
generic->masks = &generic_mask[en];
generic->base.regs = &generic_regs[en].gpio;
}
static void define_ddc_registers(
struct hw_gpio_pin *pin,
uint32_t en)
@@ -181,7 +212,8 @@ static const struct hw_factory_funcs funcs = {
.get_hpd_pin = dal_hw_hpd_get_pin,
.get_generic_pin = dal_hw_generic_get_pin,
.define_hpd_registers = define_hpd_registers,
.define_ddc_registers = define_ddc_registers
.define_ddc_registers = define_ddc_registers,
.define_generic_registers = define_generic_registers
};
/*
* dal_hw_factory_dcn10_init

View File

@@ -58,7 +58,6 @@
#define SF_HPD(reg_name, field_name, post_fix)\
.field_name = reg_name ## __ ## field_name ## post_fix
/* macros to expend register list macro defined in HW object header file
* end *********************/
@@ -71,7 +70,7 @@ static bool offset_to_id(
{
switch (offset) {
/* GENERIC */
case REG(DC_GENERICA):
case REG(DC_GPIO_GENERIC_A):
*id = GPIO_ID_GENERIC;
switch (mask) {
case DC_GPIO_GENERIC_A__DC_GPIO_GENERICA_A_MASK:

View File

@@ -31,6 +31,8 @@
#include "dm_pp_smu.h"
#define MEMORY_TYPE_MULTIPLIER_CZ 4
#define MEMORY_TYPE_HBM 2
enum dce_version resource_parse_asic_id(
struct hw_asic_id asic_id);

View File

@@ -137,10 +137,13 @@
#define RAVEN1_F0 0xF0
#define RAVEN_UNKNOWN 0xFF
#define PICASSO_15D8_REV_E3 0xE3
#define PICASSO_15D8_REV_E4 0xE4
#define ASICREV_IS_RAVEN(eChipRev) ((eChipRev >= RAVEN_A0) && eChipRev < RAVEN_UNKNOWN)
#define ASICREV_IS_PICASSO(eChipRev) ((eChipRev >= PICASSO_A0) && (eChipRev < RAVEN2_A0))
#define ASICREV_IS_RAVEN2(eChipRev) ((eChipRev >= RAVEN2_A0) && (eChipRev < 0xF0))
#define ASICREV_IS_RAVEN2(eChipRev) ((eChipRev >= RAVEN2_A0) && (eChipRev < PICASSO_15D8_REV_E3))
#define ASICREV_IS_DALI(eChipRev) ((eChipRev >= PICASSO_15D8_REV_E3) && (eChipRev < RAVEN1_F0))
#define ASICREV_IS_RV1_F0(eChipRev) ((eChipRev >= RAVEN1_F0) && (eChipRev < RAVEN_UNKNOWN))

View File

@@ -155,7 +155,7 @@ static const struct IP_BASE MP0_BASE ={ { { { 0x00016000, 0x0243FC00, 0x00DC0000
{ { 0, 0, 0, 0, 0 } },
{ { 0, 0, 0, 0, 0 } },
{ { 0, 0, 0, 0, 0 } } } };
static const struct IP_BASE MP1_BASE ={ { { { 0x00016200, 0x02400400, 0x00E80000, 0x00EC0000, 0x00F00000 } },
static const struct IP_BASE MP1_BASE ={ { { { 0x00016000, 0x02400400, 0x00E80000, 0x00EC0000, 0x00F00000 } },
{ { 0, 0, 0, 0, 0 } },
{ { 0, 0, 0, 0, 0 } },
{ { 0, 0, 0, 0, 0 } },

View File

@@ -1531,6 +1531,7 @@ static int pp_asic_reset_mode_2(void *handle)
static int pp_smu_i2c_bus_access(void *handle, bool acquire)
{
struct pp_hwmgr *hwmgr = handle;
int ret = 0;
if (!hwmgr || !hwmgr->pm_en)
return -EINVAL;
@@ -1540,7 +1541,11 @@ static int pp_smu_i2c_bus_access(void *handle, bool acquire)
return -EINVAL;
}
return hwmgr->hwmgr_func->smu_i2c_bus_access(hwmgr, acquire);
mutex_lock(&hwmgr->smu_lock);
ret = hwmgr->hwmgr_func->smu_i2c_bus_access(hwmgr, acquire);
mutex_unlock(&hwmgr->smu_lock);
return ret;
}
static const struct amd_pm_funcs pp_dpm_funcs = {

View File

@@ -354,6 +354,9 @@ int smu_dpm_set_power_gate(struct smu_context *smu, uint32_t block_type,
case AMD_IP_BLOCK_TYPE_GFX:
ret = smu_gfx_off_control(smu, gate);
break;
case AMD_IP_BLOCK_TYPE_SDMA:
ret = smu_powergate_sdma(smu, gate);
break;
default:
break;
}

View File

@@ -177,12 +177,82 @@ static int renoir_get_dpm_uclk_limited(struct smu_context *smu, uint32_t *clock,
}
static int renoir_print_clk_levels(struct smu_context *smu,
enum smu_clk_type clk_type, char *buf)
{
int i, size = 0, ret = 0;
uint32_t cur_value = 0, value = 0, count = 0, min = 0, max = 0;
DpmClocks_t *clk_table = smu->smu_table.clocks_table;
SmuMetrics_t metrics = {0};
if (!clk_table || clk_type >= SMU_CLK_COUNT)
return -EINVAL;
ret = smu_update_table(smu, SMU_TABLE_SMU_METRICS, 0,
(void *)&metrics, false);
if (ret)
return ret;
switch (clk_type) {
case SMU_GFXCLK:
case SMU_SCLK:
/* retirve table returned paramters unit is MHz */
cur_value = metrics.ClockFrequency[CLOCK_GFXCLK];
ret = smu_get_dpm_freq_range(smu, SMU_GFXCLK, &min, &max);
if (!ret) {
/* driver only know min/max gfx_clk, Add level 1 for all other gfx clks */
if (cur_value == max)
i = 2;
else if (cur_value == min)
i = 0;
else
i = 1;
size += sprintf(buf + size, "0: %uMhz %s\n", min,
i == 0 ? "*" : "");
size += sprintf(buf + size, "1: %uMhz %s\n",
i == 1 ? cur_value : RENOIR_UMD_PSTATE_GFXCLK,
i == 1 ? "*" : "");
size += sprintf(buf + size, "2: %uMhz %s\n", max,
i == 2 ? "*" : "");
}
return size;
case SMU_SOCCLK:
count = NUM_SOCCLK_DPM_LEVELS;
cur_value = metrics.ClockFrequency[CLOCK_SOCCLK];
break;
case SMU_MCLK:
count = NUM_MEMCLK_DPM_LEVELS;
cur_value = metrics.ClockFrequency[CLOCK_UMCCLK];
break;
case SMU_DCEFCLK:
count = NUM_DCFCLK_DPM_LEVELS;
cur_value = metrics.ClockFrequency[CLOCK_DCFCLK];
break;
case SMU_FCLK:
count = NUM_FCLK_DPM_LEVELS;
cur_value = metrics.ClockFrequency[CLOCK_FCLK];
break;
default:
return -EINVAL;
}
for (i = 0; i < count; i++) {
GET_DPM_CUR_FREQ(clk_table, clk_type, i, value);
size += sprintf(buf + size, "%d: %uMhz %s\n", i, value,
cur_value == value ? "*" : "");
}
return size;
}
static const struct pptable_funcs renoir_ppt_funcs = {
.get_smu_msg_index = renoir_get_smu_msg_index,
.get_smu_table_index = renoir_get_smu_table_index,
.tables_init = renoir_tables_init,
.set_power_state = NULL,
.get_dpm_uclk_limited = renoir_get_dpm_uclk_limited,
.print_clk_levels = renoir_print_clk_levels,
};
void renoir_set_ppt_funcs(struct smu_context *smu)

View File

@@ -25,4 +25,29 @@
extern void renoir_set_ppt_funcs(struct smu_context *smu);
/* UMD PState Renoir Msg Parameters in MHz */
#define RENOIR_UMD_PSTATE_GFXCLK 700
#define RENOIR_UMD_PSTATE_SOCCLK 678
#define RENOIR_UMD_PSTATE_FCLK 800
#define GET_DPM_CUR_FREQ(table, clk_type, dpm_level, freq) \
do { \
switch (clk_type) { \
case SMU_SOCCLK: \
freq = table->SocClocks[dpm_level].Freq; \
break; \
case SMU_MCLK: \
freq = table->MemClocks[dpm_level].Freq; \
break; \
case SMU_DCEFCLK: \
freq = table->DcfClocks[dpm_level].Freq; \
break; \
case SMU_FCLK: \
freq = table->FClocks[dpm_level].Freq; \
break; \
default: \
break; \
} \
} while (0)
#endif

View File

@@ -874,6 +874,9 @@ static int adv7511_bridge_attach(struct drm_bridge *bridge)
&adv7511_connector_helper_funcs);
drm_connector_attach_encoder(&adv->connector, bridge->encoder);
if (adv->type == ADV7533)
ret = adv7533_attach_dsi(adv);
if (adv->i2c_main->irq)
regmap_write(adv->regmap, ADV7511_REG_INT_ENABLE(0),
ADV7511_INT0_HPD);
@@ -1219,17 +1222,8 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id)
drm_bridge_add(&adv7511->bridge);
adv7511_audio_init(dev, adv7511);
if (adv7511->type == ADV7533) {
ret = adv7533_attach_dsi(adv7511);
if (ret)
goto err_remove_bridge;
}
return 0;
err_remove_bridge:
drm_bridge_remove(&adv7511->bridge);
err_unregister_cec:
i2c_unregister_device(adv7511->i2c_cec);
if (adv7511->cec_clk)

View File

@@ -26,6 +26,7 @@
*/
#include <linux/dma-fence.h>
#include <linux/ktime.h>
#include <drm/drm_atomic.h>
#include <drm/drm_atomic_helper.h>
@@ -1580,9 +1581,23 @@ static void commit_tail(struct drm_atomic_state *old_state)
{
struct drm_device *dev = old_state->dev;
const struct drm_mode_config_helper_funcs *funcs;
ktime_t start;
s64 commit_time_ms;
funcs = dev->mode_config.helper_private;
/*
* We're measuring the _entire_ commit, so the time will vary depending
* on how many fences and objects are involved. For the purposes of self
* refresh, this is desirable since it'll give us an idea of how
* congested things are. This will inform our decision on how often we
* should enter self refresh after idle.
*
* These times will be averaged out in the self refresh helpers to avoid
* overreacting over one outlier frame
*/
start = ktime_get();
drm_atomic_helper_wait_for_fences(dev, old_state, false);
drm_atomic_helper_wait_for_dependencies(old_state);
@@ -1592,6 +1607,11 @@ static void commit_tail(struct drm_atomic_state *old_state)
else
drm_atomic_helper_commit_tail(old_state);
commit_time_ms = ktime_ms_delta(ktime_get(), start);
if (commit_time_ms > 0)
drm_self_refresh_helper_update_avg_times(old_state,
(unsigned long)commit_time_ms);
drm_atomic_helper_commit_cleanup_done(old_state);
drm_atomic_state_put(old_state);
@@ -3275,7 +3295,7 @@ static int page_flip_common(struct drm_atomic_state *state,
return PTR_ERR(crtc_state);
crtc_state->event = event;
crtc_state->pageflip_flags = flags;
crtc_state->async_flip = flags & DRM_MODE_PAGE_FLIP_ASYNC;
plane_state = drm_atomic_get_plane_state(state, plane);
if (IS_ERR(plane_state))

View File

@@ -128,7 +128,7 @@ void __drm_atomic_helper_crtc_duplicate_state(struct drm_crtc *crtc,
state->zpos_changed = false;
state->commit = NULL;
state->event = NULL;
state->pageflip_flags = 0;
state->async_flip = false;
/* Self refresh should be canceled when a new update is available */
state->active = drm_atomic_crtc_effectively_active(state);

View File

@@ -1305,8 +1305,7 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
if (arg->reserved)
return -EINVAL;
if ((arg->flags & DRM_MODE_PAGE_FLIP_ASYNC) &&
!dev->mode_config.async_page_flip)
if (arg->flags & DRM_MODE_PAGE_FLIP_ASYNC)
return -EINVAL;
/* can't test and expect an event at the same time. */

View File

@@ -976,14 +976,14 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
if (ret)
goto err_minors;
dev->registered = true;
if (dev->driver->load) {
ret = dev->driver->load(dev, flags);
if (ret)
goto err_minors;
}
dev->registered = true;
if (drm_core_check_feature(dev, DRIVER_MODESET))
drm_modeset_register_all(dev);

View File

@@ -336,7 +336,12 @@ drm_setclientcap(struct drm_device *dev, void *data, struct drm_file *file_priv)
case DRM_CLIENT_CAP_ATOMIC:
if (!drm_core_check_feature(dev, DRIVER_ATOMIC))
return -EOPNOTSUPP;
if (req->value > 1)
/* The modesetting DDX has a totally broken idea of atomic. */
if (current->comm[0] == 'X' && req->value == 1) {
pr_info("broken atomic modeset userspace detected, disabling atomic\n");
return -EOPNOTSUPP;
}
if (req->value > 2)
return -EINVAL;
file_priv->atomic = req->value;
file_priv->universal_planes = req->value;

View File

@@ -42,7 +42,7 @@ int __drm_mode_object_add(struct drm_device *dev, struct drm_mode_object *obj,
{
int ret;
WARN_ON(dev->registered && !obj_free_cb);
WARN_ON(!dev->driver->load && dev->registered && !obj_free_cb);
mutex_lock(&dev->mode_config.idr_mutex);
ret = idr_alloc(&dev->mode_config.object_idr, register_obj ? obj : NULL,
@@ -104,7 +104,7 @@ void drm_mode_object_register(struct drm_device *dev,
void drm_mode_object_unregister(struct drm_device *dev,
struct drm_mode_object *object)
{
WARN_ON(dev->registered && !object->free_cb);
WARN_ON(!dev->driver->load && dev->registered && !object->free_cb);
mutex_lock(&dev->mode_config.idr_mutex);
if (object->id) {

View File

@@ -5,6 +5,7 @@
* Authors:
* Sean Paul <seanpaul@chromium.org>
*/
#include <linux/average.h>
#include <linux/bitops.h>
#include <linux/slab.h>
#include <linux/workqueue.h>
@@ -50,11 +51,17 @@
* atomic_check when &drm_crtc_state.self_refresh_active is true.
*/
#define SELF_REFRESH_AVG_SEED_MS 200
DECLARE_EWMA(psr_time, 4, 4)
struct drm_self_refresh_data {
struct drm_crtc *crtc;
struct delayed_work entry_work;
struct drm_atomic_state *save_state;
unsigned int entry_delay_ms;
struct mutex avg_mutex;
struct ewma_psr_time entry_avg_ms;
struct ewma_psr_time exit_avg_ms;
};
static void drm_self_refresh_helper_entry_work(struct work_struct *work)
@@ -122,6 +129,44 @@ out_drop_locks:
drm_modeset_acquire_fini(&ctx);
}
/**
* drm_self_refresh_helper_update_avg_times - Updates a crtc's SR time averages
* @state: the state which has just been applied to hardware
* @commit_time_ms: the amount of time in ms that this commit took to complete
*
* Called after &drm_mode_config_funcs.atomic_commit_tail, this function will
* update the average entry/exit self refresh times on self refresh transitions.
* These averages will be used when calculating how long to delay before
* entering self refresh mode after activity.
*/
void drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
unsigned int commit_time_ms)
{
struct drm_crtc *crtc;
struct drm_crtc_state *old_crtc_state, *new_crtc_state;
int i;
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
struct drm_self_refresh_data *sr_data = crtc->self_refresh_data;
struct ewma_psr_time *time;
if (old_crtc_state->self_refresh_active ==
new_crtc_state->self_refresh_active)
continue;
if (new_crtc_state->self_refresh_active)
time = &sr_data->entry_avg_ms;
else
time = &sr_data->exit_avg_ms;
mutex_lock(&sr_data->avg_mutex);
ewma_psr_time_add(time, commit_time_ms);
mutex_unlock(&sr_data->avg_mutex);
}
}
EXPORT_SYMBOL(drm_self_refresh_helper_update_avg_times);
/**
* drm_self_refresh_helper_alter_state - Alters the atomic state for SR exit
* @state: the state currently being checked
@@ -153,6 +198,7 @@ void drm_self_refresh_helper_alter_state(struct drm_atomic_state *state)
for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
struct drm_self_refresh_data *sr_data;
unsigned int delay;
/* Don't trigger the entry timer when we're already in SR */
if (crtc_state->self_refresh_active)
@@ -162,8 +208,13 @@ void drm_self_refresh_helper_alter_state(struct drm_atomic_state *state)
if (!sr_data)
continue;
mutex_lock(&sr_data->avg_mutex);
delay = (ewma_psr_time_read(&sr_data->entry_avg_ms) +
ewma_psr_time_read(&sr_data->exit_avg_ms)) * 2;
mutex_unlock(&sr_data->avg_mutex);
mod_delayed_work(system_wq, &sr_data->entry_work,
msecs_to_jiffies(sr_data->entry_delay_ms));
msecs_to_jiffies(delay));
}
}
EXPORT_SYMBOL(drm_self_refresh_helper_alter_state);
@@ -171,12 +222,10 @@ EXPORT_SYMBOL(drm_self_refresh_helper_alter_state);
/**
* drm_self_refresh_helper_init - Initializes self refresh helpers for a crtc
* @crtc: the crtc which supports self refresh supported displays
* @entry_delay_ms: amount of inactivity to wait before entering self refresh
*
* Returns zero if successful or -errno on failure
*/
int drm_self_refresh_helper_init(struct drm_crtc *crtc,
unsigned int entry_delay_ms)
int drm_self_refresh_helper_init(struct drm_crtc *crtc)
{
struct drm_self_refresh_data *sr_data = crtc->self_refresh_data;
@@ -190,8 +239,18 @@ int drm_self_refresh_helper_init(struct drm_crtc *crtc,
INIT_DELAYED_WORK(&sr_data->entry_work,
drm_self_refresh_helper_entry_work);
sr_data->entry_delay_ms = entry_delay_ms;
sr_data->crtc = crtc;
mutex_init(&sr_data->avg_mutex);
ewma_psr_time_init(&sr_data->entry_avg_ms);
ewma_psr_time_init(&sr_data->exit_avg_ms);
/*
* Seed the averages so they're non-zero (and sufficiently large
* for even poorly performing panels). As time goes on, this will be
* averaged out and the values will trend to their true value.
*/
ewma_psr_time_add(&sr_data->entry_avg_ms, SELF_REFRESH_AVG_SEED_MS);
ewma_psr_time_add(&sr_data->exit_avg_ms, SELF_REFRESH_AVG_SEED_MS);
crtc->self_refresh_data = sr_data;
return 0;

View File

@@ -267,7 +267,7 @@ nv50_wndw_atomic_check_acquire(struct nv50_wndw *wndw, bool modeset,
asyw->image.pitch[0] = fb->base.pitches[0];
}
if (!(asyh->state.pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC))
if (!asyh->state.async_flip)
asyw->image.interval = 1;
else
asyw->image.interval = 0;
@@ -383,7 +383,7 @@ nv50_wndw_atomic_check_lut(struct nv50_wndw *wndw,
}
/* Can't do an immediate flip while changing the LUT. */
asyh->state.pageflip_flags &= ~DRM_MODE_PAGE_FLIP_ASYNC;
asyh->state.async_flip = false;
}
static int

View File

@@ -39,7 +39,7 @@ static int panfrost_devfreq_target(struct device *dev, unsigned long *freq,
* If frequency scaling from low to high, adjust voltage first.
* If frequency scaling from high to low, adjust frequency first.
*/
if (old_clk_rate < target_rate && pfdev->regulator) {
if (old_clk_rate < target_rate) {
err = regulator_set_voltage(pfdev->regulator, target_volt,
target_volt);
if (err) {
@@ -53,14 +53,12 @@ static int panfrost_devfreq_target(struct device *dev, unsigned long *freq,
if (err) {
dev_err(dev, "Cannot set frequency %lu (%d)\n", target_rate,
err);
if (pfdev->regulator)
regulator_set_voltage(pfdev->regulator,
pfdev->devfreq.cur_volt,
pfdev->devfreq.cur_volt);
regulator_set_voltage(pfdev->regulator, pfdev->devfreq.cur_volt,
pfdev->devfreq.cur_volt);
return err;
}
if (old_clk_rate > target_rate && pfdev->regulator) {
if (old_clk_rate > target_rate) {
err = regulator_set_voltage(pfdev->regulator, target_volt,
target_volt);
if (err)

View File

@@ -89,12 +89,9 @@ static int panfrost_regulator_init(struct panfrost_device *pfdev)
{
int ret;
pfdev->regulator = devm_regulator_get_optional(pfdev->dev, "mali");
pfdev->regulator = devm_regulator_get(pfdev->dev, "mali");
if (IS_ERR(pfdev->regulator)) {
ret = PTR_ERR(pfdev->regulator);
pfdev->regulator = NULL;
if (ret == -ENODEV)
return 0;
dev_err(pfdev->dev, "failed to get regulator: %d\n", ret);
return ret;
}
@@ -110,8 +107,7 @@ static int panfrost_regulator_init(struct panfrost_device *pfdev)
static void panfrost_regulator_fini(struct panfrost_device *pfdev)
{
if (pfdev->regulator)
regulator_disable(pfdev->regulator);
regulator_disable(pfdev->regulator);
}
int panfrost_device_init(struct panfrost_device *pfdev)

View File

@@ -394,28 +394,40 @@ void panfrost_mmu_pgtable_free(struct panfrost_file_priv *priv)
free_io_pgtable_ops(mmu->pgtbl_ops);
}
static struct drm_mm_node *addr_to_drm_mm_node(struct panfrost_device *pfdev, int as, u64 addr)
static struct panfrost_gem_object *
addr_to_drm_mm_node(struct panfrost_device *pfdev, int as, u64 addr)
{
struct drm_mm_node *node = NULL;
struct panfrost_gem_object *bo = NULL;
struct panfrost_file_priv *priv;
struct drm_mm_node *node;
u64 offset = addr >> PAGE_SHIFT;
struct panfrost_mmu *mmu;
spin_lock(&pfdev->as_lock);
list_for_each_entry(mmu, &pfdev->as_lru_list, list) {
struct panfrost_file_priv *priv;
if (as != mmu->as)
continue;
if (as == mmu->as)
break;
}
if (as != mmu->as)
goto out;
priv = container_of(mmu, struct panfrost_file_priv, mmu);
drm_mm_for_each_node(node, &priv->mm) {
if (offset >= node->start && offset < (node->start + node->size))
goto out;
priv = container_of(mmu, struct panfrost_file_priv, mmu);
spin_lock(&priv->mm_lock);
drm_mm_for_each_node(node, &priv->mm) {
if (offset >= node->start &&
offset < (node->start + node->size)) {
bo = drm_mm_node_to_panfrost_bo(node);
drm_gem_object_get(&bo->base.base);
break;
}
}
spin_unlock(&priv->mm_lock);
out:
spin_unlock(&pfdev->as_lock);
return node;
return bo;
}
#define NUM_FAULT_PAGES (SZ_2M / PAGE_SIZE)
@@ -423,29 +435,28 @@ out:
int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, u64 addr)
{
int ret, i;
struct drm_mm_node *node;
struct panfrost_gem_object *bo;
struct address_space *mapping;
pgoff_t page_offset;
struct sg_table *sgt;
struct page **pages;
node = addr_to_drm_mm_node(pfdev, as, addr);
if (!node)
bo = addr_to_drm_mm_node(pfdev, as, addr);
if (!bo)
return -ENOENT;
bo = drm_mm_node_to_panfrost_bo(node);
if (!bo->is_heap) {
dev_WARN(pfdev->dev, "matching BO is not heap type (GPU VA = %llx)",
node->start << PAGE_SHIFT);
return -EINVAL;
bo->node.start << PAGE_SHIFT);
ret = -EINVAL;
goto err_bo;
}
WARN_ON(bo->mmu->as != as);
/* Assume 2MB alignment and size multiple */
addr &= ~((u64)SZ_2M - 1);
page_offset = addr >> PAGE_SHIFT;
page_offset -= node->start;
page_offset -= bo->node.start;
mutex_lock(&bo->base.pages_lock);
@@ -454,7 +465,8 @@ int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, u64 addr)
sizeof(struct sg_table), GFP_KERNEL | __GFP_ZERO);
if (!bo->sgts) {
mutex_unlock(&bo->base.pages_lock);
return -ENOMEM;
ret = -ENOMEM;
goto err_bo;
}
pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
@@ -463,7 +475,8 @@ int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, u64 addr)
kfree(bo->sgts);
bo->sgts = NULL;
mutex_unlock(&bo->base.pages_lock);
return -ENOMEM;
ret = -ENOMEM;
goto err_bo;
}
bo->base.pages = pages;
bo->base.pages_use_count = 1;
@@ -501,12 +514,16 @@ int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, u64 addr)
dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr);
drm_gem_object_put_unlocked(&bo->base.base);
return 0;
err_map:
sg_free_table(sgt);
err_pages:
drm_gem_shmem_put_pages(&bo->base);
err_bo:
drm_gem_object_put_unlocked(&bo->base.base);
return ret;
}

View File

@@ -324,8 +324,39 @@ bool radeon_device_is_virtual(void);
static int radeon_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
unsigned long flags = 0;
int ret;
if (!ent)
return -ENODEV; /* Avoid NULL-ptr deref in drm_get_pci_dev */
flags = ent->driver_data;
if (!radeon_si_support) {
switch (flags & RADEON_FAMILY_MASK) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
dev_info(&pdev->dev,
"SI support disabled by module param\n");
return -ENODEV;
}
}
if (!radeon_cik_support) {
switch (flags & RADEON_FAMILY_MASK) {
case CHIP_KAVERI:
case CHIP_BONAIRE:
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
dev_info(&pdev->dev,
"CIK support disabled by module param\n");
return -ENODEV;
}
}
if (vga_switcheroo_client_probe_defer(pdev))
return -EPROBE_DEFER;

View File

@@ -100,31 +100,6 @@ int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags)
struct radeon_device *rdev;
int r, acpi_status;
if (!radeon_si_support) {
switch (flags & RADEON_FAMILY_MASK) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
dev_info(dev->dev,
"SI support disabled by module param\n");
return -ENODEV;
}
}
if (!radeon_cik_support) {
switch (flags & RADEON_FAMILY_MASK) {
case CHIP_KAVERI:
case CHIP_BONAIRE:
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
dev_info(dev->dev,
"CIK support disabled by module param\n");
return -ENODEV;
}
}
rdev = kzalloc(sizeof(struct radeon_device), GFP_KERNEL);
if (rdev == NULL) {
return -ENOMEM;

View File

@@ -39,8 +39,6 @@
#include "rockchip_drm_vop.h"
#include "rockchip_rgb.h"
#define VOP_SELF_REFRESH_ENTRY_DELAY_MS 100
#define VOP_WIN_SET(vop, win, name, v) \
vop_reg_set(vop, &win->phy->name, win->base, ~0, v, #name)
#define VOP_SCL_SET(vop, win, name, v) \
@@ -1563,8 +1561,7 @@ static int vop_create_crtc(struct vop *vop)
init_completion(&vop->line_flag_completion);
crtc->port = port;
ret = drm_self_refresh_helper_init(crtc,
VOP_SELF_REFRESH_ENTRY_DELAY_MS);
ret = drm_self_refresh_helper_init(crtc);
if (ret)
DRM_DEV_DEBUG_KMS(vop->dev,
"Failed to init %s with SR helpers %d, ignoring\n",

View File

@@ -78,7 +78,7 @@ static int ndev_mw_to_bar(struct amd_ntb_dev *ndev, int idx)
if (idx < 0 || idx > ndev->mw_count)
return -EINVAL;
return 1 << idx;
return ndev->dev_data->mw_idx << idx;
}
static int amd_ntb_mw_count(struct ntb_dev *ntb, int pidx)
@@ -909,7 +909,7 @@ static int amd_init_ntb(struct amd_ntb_dev *ndev)
{
void __iomem *mmio = ndev->self_mmio;
ndev->mw_count = AMD_MW_CNT;
ndev->mw_count = ndev->dev_data->mw_count;
ndev->spad_count = AMD_SPADS_CNT;
ndev->db_count = AMD_DB_CNT;
@@ -1069,6 +1069,8 @@ static int amd_ntb_pci_probe(struct pci_dev *pdev,
goto err_ndev;
}
ndev->dev_data = (struct ntb_dev_data *)id->driver_data;
ndev_init_struct(ndev, pdev);
rc = amd_ntb_init_pci(ndev, pdev);
@@ -1123,9 +1125,21 @@ static const struct file_operations amd_ntb_debugfs_info = {
.read = ndev_debugfs_read,
};
static const struct ntb_dev_data dev_data[] = {
{ /* for device 145b */
.mw_count = 3,
.mw_idx = 1,
},
{ /* for device 148b */
.mw_count = 2,
.mw_idx = 2,
},
};
static const struct pci_device_id amd_ntb_pci_tbl[] = {
{PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_NTB)},
{0}
{ PCI_VDEVICE(AMD, 0x145b), (kernel_ulong_t)&dev_data[0] },
{ PCI_VDEVICE(AMD, 0x148b), (kernel_ulong_t)&dev_data[1] },
{ 0, }
};
MODULE_DEVICE_TABLE(pci, amd_ntb_pci_tbl);

View File

@@ -52,7 +52,6 @@
#include <linux/ntb.h>
#include <linux/pci.h>
#define PCI_DEVICE_ID_AMD_NTB 0x145B
#define AMD_LINK_HB_TIMEOUT msecs_to_jiffies(1000)
#define AMD_LINK_STATUS_OFFSET 0x68
#define NTB_LIN_STA_ACTIVE_BIT 0x00000002
@@ -93,7 +92,6 @@ static inline void _write64(u64 val, void __iomem *mmio)
enum {
/* AMD NTB Capability */
AMD_MW_CNT = 3,
AMD_DB_CNT = 16,
AMD_MSIX_VECTOR_CNT = 24,
AMD_SPADS_CNT = 16,
@@ -170,6 +168,11 @@ enum {
AMD_PEER_OFFSET = 0x400,
};
struct ntb_dev_data {
const unsigned char mw_count;
const unsigned int mw_idx;
};
struct amd_ntb_dev;
struct amd_ntb_vec {
@@ -185,6 +188,7 @@ struct amd_ntb_dev {
u32 cntl_sta;
u32 peer_sta;
struct ntb_dev_data *dev_data;
unsigned char mw_count;
unsigned char spad_count;
unsigned char db_count;

Some files were not shown because too many files have changed in this diff Show More