Commit Graph

378682 Commits

Author SHA1 Message Date
Andy Honig
bc071b4cc7 KVM: Improve create VCPU parameter (CVE-2013-4587)
In multiple functions the vcpu_id is used as an offset into a bitfield.  Ag
malicious user could specify a vcpu_id greater than 255 in order to set or
clear bits in kernel memory.  This could be used to elevate priveges in the
kernel.  This patch verifies that the vcpu_id provided is less than 255.
The api documentation already specifies that the vcpu_id must be less than
max_vcpus, but this is currently not checked.

Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 338c7dbadd)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:21 +02:00
Santosh Shilimkar
4479f3b006 arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings
KVM initialisation fails on architectures implementing virt_to_idmap()
because virt_to_phys() on such architectures won't fetch you the correct
idmap page.

So update the KVM ARM code to use the virt_to_idmap() to fix the issue.
Since the KVM code is shared between arm and arm64, we create
kvm_virt_to_phys() and handle the redirection in respective headers.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 4fda342cc7)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:20 +02:00
Heiko Carstens
4ae68e1ee6 KVM: kvm_clear_guest_page(): fix empty_zero_page usage
Using the address of 'empty_zero_page' as source address in order to
clear a page is wrong. On some architectures empty_zero_page is only the
pointer to the struct page of the empty_zero_page.  Therefore the clear
page operation would copy the contents of a couple of struct pages instead
of clearing a page.  For kvm only arm/arm64 are affected by this bug.

To fix this use the ZERO_PAGE macro instead which will return the struct
page address of the empty_zero_page on all architectures.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 8a3caa6d74)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:20 +02:00
Christoffer Dall
52031ff601 arm/arm64: KVM: Fix hyp mappings of vmalloc regions
Using virt_to_phys on percpu mappings is horribly wrong as it may be
backed by vmalloc.  Introduce kvm_kaddr_to_phys which translates both
types of valid kernel addresses to the corresponding physical address.

At the same time resolves a typing issue where we were storing the
physical address as a 32 bit unsigned long (on arm), truncating the
physical address for addresses above the 4GB limit.  This caused
breakage on Keystone.

Cc: <stable@vger.kernel.org>	[3.10+]
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 40c2729bab)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:20 +02:00
Marc Zyngier
53e3896440 arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
When booting a vcpu using PSCI, make sure we start it with the
endianness of the caller. Otherwise, secondaries can be pretty
unhappy to execute a BE kernel in LE mode...

This conforms to PSCI spec Rev B, 5.13.3.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit ce94fe93d5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:19 +02:00
Marc Zyngier
dd8858820e arm/arm64: KVM: MMIO support for BE guest
Do the necessary byteswap when host and guest have different
views of the universe. Actually, the only case we need to take
care of is when the guest is BE. All the other cases are naturally
handled.

Also be careful about endianness when the data is being memcopy-ed
from/to the run buffer.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit 6d89d2d9b5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:19 +02:00
Marc Zyngier
128a021aa8 arm64: KVM: vgic: byteswap GICv2 access on world switch if BE
Ensure that accesses to the GICH_* registers are byteswapped
when the kernel is compiled as big-endian.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c5b2c0f520)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:19 +02:00
Marc Zyngier
60979ebbbb arm64: KVM: initialize HYP mode following the kernel endianness
Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 18ea3dbc9e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:18 +02:00
Gleb Natapov
d49c7a4473 KVM: remove vm mmap method
It was used in conjunction with KVM_SET_MEMORY_REGION ioctl which was
removed by b74a07beed in 2010, QEMU stopped using it in 2008, so
it is time to remove the code finally.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 80f5b5e700)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:18 +02:00
Michael S. Tsirkin
7f17a13bdd kvm_host: typo fix
fix up typo in comment.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 81e87e2679)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:17 +02:00
Alex Williamson
a7dc5f5535 kvm: Add VFIO device
So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made.  This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction.  The user creates
the device and can add and remove VFIO groups to it via file
descriptors.  When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ec53500fae)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:17 +02:00
Borislav Petkov
b4fe3057a0 kvm: Add KVM_GET_EMULATED_CPUID
Add a kvm ioctl which states which system functionality kvm emulates.
The format used is that of CPUID and we return the corresponding CPUID
bits set for which we do emulate functionality.

Make sure ->padding is being passed on clean from userspace so that we
can use it for something in the future, after the ioctl gets cast in
stone.

s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9c15bb1d0a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:17 +02:00
Paolo Bonzini
e845f9d367 KVM: use a more sensible error number when debugfs directory creation fails
I don't know if this was due to cut and paste, or somebody was really
using a D20 to pick the error code for kvm_init_debugfs as suggested by
Linus (EFAULT is 14, so the possibility cannot be entirely ruled out).

In any case, this patch fixes it.

Reported-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c8eb04a62)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:16 +02:00
Marc Zyngier
c0cdef185a arm64: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit d241aac798)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:16 +02:00
Yang Zhang
b695de8341 KVM: Mapping IOMMU pages after updating memslot
In kvm_iommu_map_pages(), we need to know the page size via call
kvm_host_page_size(). And it will check whether the target slot
is valid before return the right page size.
Currently, we will map the iommu pages when creating a new slot.
But we call kvm_iommu_map_pages() during preparing the new slot.
At that time, the new slot is not visible by domain(still in preparing).
So we cannot get the right page size from kvm_host_page_size() and
this will break the IOMMU super page logic.
The solution is to map the iommu pages after we insert the new slot
into domain.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Tested-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e0230e1327)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:16 +02:00
Marc Zyngier
3d6b7ab302 arm/arm64: KVM: PSCI: use MPIDR to identify a target CPU
The KVM PSCI code blindly assumes that vcpu_id and MPIDR are
the same thing. This is true when vcpus are organized as a flat
topology, but is wrong when trying to emulate any other topology
(such as A15 clusters).

Change the KVM PSCI CPU_ON code to look at the MPIDR instead
of the vcpu_id to pick a target CPU.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 79c648806f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:15 +02:00
Marc Zyngier
c93a79267f ARM: KVM: drop limitation to 4 CPU VMs
Now that the KVM/arm code knows about affinity, remove the hard
limit of 4 vcpus per VM.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 7999b4d182)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:15 +02:00
Marc Zyngier
8c02aa5001 ARM: KVM: fix L2CTLR to be per-cluster
The L2CTLR register contains the number of CPUs in this cluster.

Make sure the register content is actually relevant to the vcpu
that is being configured by computing the number of cores that are
part of its cluster.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 9cbb6d969c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:14 +02:00
Marc Zyngier
c0acda6fd7 ARM: KVM: Fix MPIDR computing to support virtual clusters
In order to be able to support more than 4 A7 or A15 CPUs,
we need to fix the MPIDR computing to reflect the fact that
both A15 and A7 can only exist in clusters of at most 4 CPUs.

Fix the MPIDR computing to allow virtual clusters to be exposed
to the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 2d1d841bd4)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:14 +02:00
Christoffer Dall
ad8b3ca795 KVM: ARM: Transparent huge page (THP) support
Support transparent huge pages in KVM/ARM and KVM/ARM64.  The
transparent_hugepage_adjust is not very pretty, but this is also how
it's solved on x86 and seems to be simply an artifact on how THPs
behave.  This should eventually be shared across architectures if
possible, but that can always be changed down the road.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 9b5fdb9781)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:14 +02:00
Christoffer Dall
cd709bb9fb KVM: ARM: Support hugetlbfs backed huge pages
Support huge pages in KVM/ARM and KVM/ARM64.  The pud_huge checking on
the unmap path may feel a bit silly as the pud_huge check is always
defined to false, but the compiler should be smart about this.

Note: This deals only with VMAs marked as huge which are allocated by
users through hugetlbfs only.  Transparent huge pages can only be
detected by looking at the underlying pages (or the page tables
themselves) and this patch so far simply maps these on a page-by-page
level in the Stage-2 page tables.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit ad361f093c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:13 +02:00
Christoffer Dall
1391c263d2 KVM: ARM: Update comments for kvm_handle_wfi
Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.

Also renames the function to kvm_handle_wfx like is done on arm64 for
consistency and uber-correctness.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 86ed81aa2e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:13 +02:00
Marc Zyngier
46a037fd36 ARM: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.

Quick test to estimate the performance: hackbench 1 process 1000

2xA15 host (baseline):	1.843s

2xA15 guest w/o patch:	2.083s
4xA15 guest w/o patch:	80.212s
8xA15 guest w/o patch:	Could not be bothered to find out

2xA15 guest w/ patch:	2.102s
4xA15 guest w/ patch:	3.205s
8xA15 guest w/ patch:	6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 58d5ec8f8e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:12 +02:00
Aneesh Kumar K.V
8413726807 kvm: Add struct kvm arg to memslot APIs
We will use that in the later patch to find the kvm ops handler

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 5587027ce9)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:12 +02:00
chai wen
0a2e6af7be KVM: Drop FOLL_GET in GUP when doing async page fault
Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP.  Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.

Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit f2e106692d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:12 +02:00
Christoffer Dall
1feff9299a KVM: arm64: Get rid of KVM_HPAGE defines
Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit ef0cfe71c2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:11 +02:00
Christoffer Dall
4b0745341e KVM: ARM: Get rid of KVM_HPAGE defines
The KVM_HPAGE_DEFINES are a little artificial on ARM, since the huge
page size is statically defined at compile time and there is only a
single huge page size.

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit dc6f6763df)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:11 +02:00
Christoffer Dall
c7a9a5f3f0 KVM: Move gfn_to_index to x86 specific code
The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures.  Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 6d9d41e574)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:11 +02:00
Jonathan Austin
5e1ddf60b6 KVM: ARM: Add support for Cortex-A7
This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts.

As Cortex-A7 is architecturally compatible with A15, this patch is largely just
generalising existing code. Areas where 'implementation defined' behaviour
is identical for A7 and A15 is moved to allow it to be used by both cores.

The check to ensure that coprocessor register tables are sorted correctly is
also moved in to 'common' code to avoid each new cpu doing its own check
(and possibly forgetting to do so!)

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit e8c2d99f82)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:10 +02:00
Jonathan Austin
e82866a69b KVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks
The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor
format. Likewise, the T0SZ field of the HTCR is 3-bits. KVM currently
defines TTBCR_T{0,1}SZ as 3, not 7.

The T0SZ mask is used to calculate the value for the HTCR, both to pick out
TTBCR.T0SZ and mask off the equivalent field in the HTCR during
read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value
of HTCR.T0SZ to leak in to the calculated HTCR value. Linux will hang when
initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on
A7/TC2)

Fixing T0SZ allows A7 cores to boot and T1SZ is also fixed for completeness.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 5e497046f0)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:10 +02:00
Jonathan Austin
c1b378c34e KVM: ARM: Fix calculation of virtual CPU ID
KVM does not have a notion of multiple clusters for CPUs, just a linear
array of CPUs. When using a system with cores in more than one cluster, the
current method for calculating the virtual MPIDR will leak the (physical)
cluster information into the virtual MPIDR. One effect of this is that
Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster.

This patch does away with exposing the real MPIDR fields in favour of simply
using the virtual CPU number (but preserving the U bit, as before).

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 1158fca401)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:09 +02:00
Andre Richter
708854902d virt/kvm/iommu.c: Add leading zeros to device's BDF notation in debug messages
When KVM (de)assigns PCI(e) devices to VMs, a debug message is printed
including the BDF notation of the respective device. Currently, the BDF
notation does not have the commonly used leading zeros. This produces
messages like "assign device 0:1:8.0", which look strange at first sight.

The patch fixes this by exchanging the printk(KERN_DEBUG ...) with dev_info()
and also inserts "kvm" into the debug message, so that it is obvious where
the message comes from. Also reduces LoC.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Andre Richter <andre.o.richter@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 29242cb5c6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:09 +02:00
Gleb Natapov
89c774beb2 Fix NULL dereference in gfn_to_hva_prot()
gfn_to_memslot() can return NULL or invalid slot. We need to check slot
validity before accessing it.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit a2ac07fe29)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:09 +02:00
Anup Patel
df50e5da2c ARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl
For implementing CPU=host, we need a mechanism for querying
preferred VCPU target type on underlying Host.

This patch implements KVM_ARM_PREFERRED_TARGET vm ioctl which
returns struct kvm_vcpu_init instance containing information
about preferred VCPU target type and target specific features
available for it.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 42c4e0c77a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:08 +02:00
Anup Patel
9778336fe1 ARM64: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements kvm_vcpu_preferred_target() function for
KVM ARM64 which will help us implement KVM_ARM_PREFERRED_TARGET
ioctl for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 473bdc0e65)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:08 +02:00
Anup Patel
f166457e08 ARM: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements kvm_vcpu_preferred_target() function for
KVM ARM which will help us implement KVM_ARM_PREFERRED_TARGET ioctl
for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 4a6fee805d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:08 +02:00
Anup Patel
5c1d6aafed KVM: ARM: Fix typo in comments of inject_abt()
Very minor typo in comments of inject_abt() when we update fault status
register for injecting prefetch abort.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit b373e492f3)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:07 +02:00
Paolo Bonzini
9c904866d4 KVM: Convert kvm_lock back to non-raw spinlock
In commit e935b8372c ("KVM: Convert kvm_lock to raw_spinlock"),
the kvm_lock was made a raw lock.  However, the kvm mmu_shrink()
function tries to grab the (non-raw) mmu_lock within the scope of
the raw locked kvm_lock being held.  This leads to the following:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0
Preemption disabled at:[<ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm]

Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt
Call Trace:
 [<ffffffff8106f2ad>] __might_sleep+0xfd/0x160
 [<ffffffff817d8d64>] rt_spin_lock+0x24/0x50
 [<ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm]
 [<ffffffff8111455d>] shrink_slab+0x17d/0x3a0
 [<ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260
 [<ffffffff8111824a>] balance_pgdat+0x54a/0x730
 [<ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0
 [<ffffffff811185bf>] kswapd+0x18f/0x490
 [<ffffffff81070961>] ? get_parent_ip+0x11/0x50
 [<ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50
 [<ffffffff81118430>] ? balance_pgdat+0x730/0x730
 [<ffffffff81060d2b>] kthread+0xdb/0xe0
 [<ffffffff8106e122>] ? finish_task_switch+0x52/0x100
 [<ffffffff817e1e94>] kernel_thread_helper+0x4/0x10
 [<ffffffff81060c50>] ? __init_kthread_worker+0x

After the previous patch, kvm_lock need not be a raw spinlock anymore,
so change it back.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2f303b74a6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:07 +02:00
Paolo Bonzini
753f251708 KVM: protect kvm_usage_count with its own spinlock
The VM list need not be protected by a raw spinlock.  Separate the
two so that kvm_lock can be made non-raw.

Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4a937f96f3)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:06 +02:00
Paolo Bonzini
ae0e4b34f8 KVM: cleanup (physical) CPU hotplug
Remove the useless argument, and do not do anything if there are no
VMs running at the time of the hotplug.

Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4fa92fb25a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:06 +02:00
Olof Johansson
c3832c083a ARM: kvm: rename cpu_reset to avoid name clash
cpu_reset is already #defined in <asm/proc-fns.h> as processor.reset,
so it expands here and causes problems.

Cc: <stable@vger.kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit ac570e0493)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:06 +02:00
Radim Krčmář
78169fda11 kvm: remove .done from struct kvm_async_pf
'.done' is used to mark the completion of 'async_pf_execute()', but
'cancel_work_sync()' returns true when the work was canceled, so we
use it instead.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 98fda16929)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:05 +02:00
Radim Krčmář
ed363014a2 kvm: free resources after canceling async_pf
When we cancel 'async_pf_execute()', we should behave as if the work was
never scheduled in 'kvm_setup_async_pf()'.
Fixes a bug when we can't unload module because the vm wasn't destroyed.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 28b441e240)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:05 +02:00
Paolo Bonzini
02ef4f0c0d KVM: mmu: allow page tables to be in read-only slots
Page tables in a read-only memory slot will currently cause a triple
fault because the page walker uses gfn_to_hva and it fails on such a slot.

OVMF uses such a page table; however, real hardware seems to be fine with
that as long as the accessed/dirty bits are set.  Save whether the slot
is readonly, and later check it when updating the accessed and dirty bits.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ba6a354154)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:05 +02:00
Christoffer Dall
215ed7558d ARM: KVM: Add newlines to panic strings
The panic strings are hard to read and on narrow terminals some
characters are simply truncated off the panic message.

Make is slightly prettier with a newline in the Hyp panic strings.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 1fe40f6d39)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:04 +02:00
Christoffer Dall
411b0c9901 ARM: KVM: Work around older compiler bug
Compilers before 4.6 do not behave well with unnamed fields in structure
initializers and therefore produces build errors:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676

By refering to the unnamed union using braces, both older and newer
compilers produce the same result.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Russell King <linux@arm.linux.org.uk>
Tested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 6833d83891)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:04 +02:00
Christoffer Dall
14cffe44b8 ARM: KVM: Simplify tracepoint text
The tracepoint for kvm_guest_fault was extremely long, make it a
slightly bit shorter.

Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 6e72cc5700)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:03 +02:00
Christoffer Dall
07557caabb ARM: KVM: Fix kvm_set_pte assignment
THe kvm_set_pte function was actually assigning the entire struct to the
structure member, which should work because the structure only has that
one member, but it is still not very nice.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 0963e5d0f2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:03 +02:00
Christoffer Dall
de5324b844 ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
The Versatile Express TC2 board, which we use as our main emulated
platform in QEMU, defines 160+32 == 192 interrupts, so limiting the
number of interrupts to 128 is not quite going to cut it for real board
emulation.

Note that this didn't use to be a problem because QEMU was buggy and
only defined 128 interrupts until recently.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 9b2d2e0df8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:03 +02:00
Christoffer Dall
a144ec826c ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
For bytemaps each IRQ field is 1 byte wide, so we pack 4 irq fields in
one word and since there are 32 private (per cpu) irqs, we have 8
private u32 fields on the vgic_bytemap struct.  We shift the offset from
the base of the register group right by 2, giving us the word index
instead of the field index.  But then there are 8 private words, not 4,
which is also why we subtract 8 words from the offset of the shared
words.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 8d98915b6b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-02 17:18:02 +02:00