Commit Graph

1062817 Commits

Author SHA1 Message Date
David Brazdil
c6fb848826 ANDROID: KVM: arm64: Initialize pkvm_pgtable.mm_ops earlier
The `init` callback of an IOMMU driver is called just before
`finalize_host_mappings` so that EL2 mappings created by drivers are
subsequently unmapped from host stage-2. However, at this point hyp has
already switched to the buddy allocator, having reserved pages allocated
by the early allocator, but `pkvm_pgtable.mm_ops` have not been switched
to buddy allocator callbacks. As a result, pages allocated for EL2
mappings of the IOMMU driver are allocated by the obsoleted early
allocator and remain treated as free by the buddy allocator. This likely
leads to a corruption in the free page lists and a later hyp panic.

Move the initialization of `pkvm_pgtable.mm_ops` before
`finalize_host_mappings` and the call to IOMMU's `init`.

Test: run a VM
Test: adb shell cmd jobscheduler run -f android 5132250
Bug: 190463801
Bug: 209004831
Change-Id: I1f6e00bca087d889b0cad4bd43d044895e37006c
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 395d045123)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
b2ba2c2eb4 ANDROID: KVM: arm64: Mark select_iommu_ops static
The function is only used in the compilation unit where it is defined.
Silence a warning by marking it static.

Test: builds
Bug: 190463801
Change-Id: I296cffefdef4639ef2bab644d42f1374ee1a2f60
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 91abc8ece2)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
778b01ae9c ANDROID: Enable KVM_S2MPU in gki_defconfig
Enable the KVM S2MPU driver in GKI.

Test: builds, boots
Bug: 190463801
Change-Id: I653cac7622e8b6e7f6484d7d8d9ee0b192edb705
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 6d44858773)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
96c70f65d3 ANDROID: KVM: arm64: Unmap S2MPU MMIO registers from host stage-2
The S2MPU driver needs to protect its MMIO registers from the host.
Implement the host_stage2_adjust_mmio_range callback and restrict
the address range that is about to be mapped in to avoid the known
S2MPU MMIO regions.

Test: builds, boots
Bug: 190463801
Change-Id: Ib46f5dd651b9368c31940035e4c28a7324fc4160
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 8f23406153)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
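The range restriction described in 96c70f65d3 can be pictured with the following minimal sketch. The function name, the (base, size) tuples and the return convention are illustrative assumptions, not the driver's actual interface.

```python
def adjust_mmio_range(addr, start, end, protected_regions):
    """Shrink [start, end) around addr so that it overlaps no protected region.

    protected_regions is a list of (base, size) tuples (hypothetical layout).
    Returns the adjusted (start, end), or None if addr itself is protected.
    """
    for base, size in protected_regions:
        region_end = base + size
        if base <= addr < region_end:
            # The faulting address is inside a protected region: do not map it.
            return None
        if region_end <= addr:
            # Region lies entirely below addr: raise the lower bound.
            start = max(start, region_end)
        else:
            # Region lies entirely above addr: lower the upper bound.
            end = min(end, base)
    return start, end
```

With protected regions at 0x1000-0x2000 and 0x5000-0x6000, a fault at 0x3000 with a candidate range of 0x0-0x8000 is narrowed to 0x2000-0x5000.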
David Brazdil
0abb73d83a ANDROID: KVM: arm64: Implement MMIO handler in S2MPU driver
The host should not have access to the vast majority of S2MPU MMIO
registers. Currently it only needs access to fault information, in
the future maybe also performance registers.

Implement an MMIO trap handler for the S2MPU, allowing read-only
access to FAULT_* registers, and a write-only access to
INTERRUPT_CLEAR.

Test: builds, boots
Bug: 190463801
Change-Id: Ia482cc65642ba9ec303f443591e8f0fe192d4d27
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 81e70911d6)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
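The access policy described in 0abb73d83a amounts to a small permission check. In the sketch below the register offsets are made up for illustration; only the policy (FAULT_* read-only, INTERRUPT_CLEAR write-only, everything else denied) comes from the commit message.

```python
# Hypothetical register offsets, chosen only for this example.
REG_INTERRUPT_CLEAR = 0x5C
FAULT_REGS = range(0x2000, 0x2020, 4)  # stands in for the FAULT_* registers


def host_access_allowed(offset, is_write):
    """Return True if the host may perform this MMIO access."""
    if offset in FAULT_REGS:
        return not is_write      # FAULT_* registers are read-only
    if offset == REG_INTERRUPT_CLEAR:
        return is_write          # INTERRUPT_CLEAR is write-only
    return False                 # all other registers are hidden from the host
```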
David Brazdil
061e532ecb ANDROID: KVM: arm64: Modify S2MPU MPT in 'host_stage2_set_owner'
The 'host_stage2_set_owner' callback indicates that a range of
PA-contiguous pages changed owner. With all devices owned by the host,
the driver sets the protection bits in the corresponding FMPT/SMPT to
either MPT_PROT_RW if owned by the host or MPT_PROT_NONE otherwise.

For each gigabyte region, the implementation will select between 1G and
4K/64K (depending on PAGE_SIZE) mappings and populate the L1ENTRY_ATTR
register or SMPT bitmap, respectively.

The driver never dynamically switches between two granularities which
both require a SMPT. This is because the L1ENTRY_ATTR and
L1ENTRY_L2TABLE_ADDR registers would need to be set atomically.

Test: builds, boots
Bug: 190463801
Change-Id: Ifb0bdcaa143ef8eb213ba4133ac86d8b610a4bcf
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 4475d993aa)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
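The choice between 1G and page granularity described in 061e532ecb can be modelled roughly as follows. This is a simplified sketch: the numeric protection encodings, the class and function names, and the use of a plain list instead of a packed 2-bit bitmap are all assumptions made for readability.

```python
GIB = 1 << 30
MPT_PROT_NONE = 0  # assumed encoding
MPT_PROT_RW = 3    # assumed encoding


class GigabyteRegion:
    """Protection state for one gigabyte of physical address space."""

    def __init__(self, page_size):
        self.page_size = page_size
        self.uniform_prot = MPT_PROT_NONE  # stands in for L1ENTRY_ATTR at 1G
        self.smpt = None                   # per-page entries at page granularity

    def set_range(self, first_page, nr_pages, prot):
        pages_per_gb = GIB // self.page_size
        if first_page == 0 and nr_pages == pages_per_gb:
            # Whole gigabyte changes owner: a single 1G mapping is enough.
            self.uniform_prot = prot
            self.smpt = None
            return
        if self.smpt is None:
            # Switch to page granularity, seeding the table with the old value.
            self.smpt = [self.uniform_prot] * pages_per_gb
        for page in range(first_page, first_page + nr_pages):
            self.smpt[page] = prot


def set_owner(region, first_page, nr_pages, owned_by_host):
    """Host-owned pages become RW for devices, everything else is blocked."""
    prot = MPT_PROT_RW if owned_by_host else MPT_PROT_NONE
    region.set_range(first_page, nr_pages, prot)
```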
David Brazdil
76b86ca887 ANDROID: KVM: arm64: Set up S2MPU Memory Protection Table
S2MPU Second-level Memory Protection Table is a PA-contiguous buffer
containing an array of 2-bit read/write entries at given granularity
for a given gigabyte physical address space region. The size of SMPT
varies per granularity but at the finest 4K granularity it is 64KB
PA-contiguous, aligned to 64KB.

Allocate sufficient number of SMPT buffers for the S2MPU driver assuming
4K granularity for 4K/16K PAGE_SIZE, and 64K granularity for 64K
PAGE_SIZE. We also assume that all S2MPUs share SMPTs for a given
gigabyte region. There are 34 gigabyte regions that can be set by the
driver (GBs 4-33 always block all traffic).

Hyp takes ownership of the memory in s2mpu_init and assigns pointers to
the buffers to L1ENTRY_L2TABLE_ADDR registers on init and power-on
events. The pointers remain static as the driver will only change
granularity between 1G and 4K/64K (depending on PAGE_SIZE).

Test: builds, boots
Bug: 190463801
Change-Id: I3fcad8b3ce5d194a987b09d042bd56d59bb35e5e
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit f0e1de52ef)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
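The 64KB figure quoted in 76b86ca887 follows from the 2-bit entry width. A short arithmetic sketch (the function name is illustrative and not taken from the driver):

```python
GIB = 1 << 30        # each SMPT covers one gigabyte of physical address space
BITS_PER_ENTRY = 2   # one 2-bit read/write entry per granule


def smpt_size_bytes(granule_bytes):
    """Size in bytes of one SMPT buffer at the given protection granularity."""
    entries = GIB // granule_bytes
    return entries * BITS_PER_ENTRY // 8
```

At 4K granularity this gives 262144 entries and 65536 bytes (64KB), matching the commit message; at 64K granularity the same formula gives 4096 bytes.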
David Brazdil
e1a271f580 ANDROID: KVM: arm64: Reprogram S2MPUs in 'host_smc_handler'
Intercept SMCs known to be used by the host to inform EL3 about power
events, either powering SoC blocks on or off.

Test: builds, boots
Bug: 190463801
Change-Id: I306433c8c1b712df24569cbd4dc346f72b4c9650
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 8ca0b34fe4)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
13149798d2 ANDROID: KVM: arm64: Enable S2MPUs in __pkvm_init_stage2_iommu
Initialize the S2MPU driver in __pkvm_init_stage2_iommu if requested by
the host. The driver sets kvm_iommu_ops and configures all S2MPUs which
are powered on at that point (i.e. all S2MPUs on currently supported
devices).

The S2MPU L1ENTRY registers are set to 1G granularity and R/W access.
CTRL0/CTRL1/CFG are set to reasonable defaults, though the code relies on
the reset state blocking all traffic as well.

On fault the S2MPUs are configured to return SLVERR/DECERR (v8/9) to the
master. Interrupts are enabled for all VIDs and trigger an IRQ handler
if EL1 init registered a handler as a result of a DT interrupts entry.

Because the host can configure the SSMTs freely, all permission bits are
configured for all VIDs. For v9 CONTEXT_CFG_VALID_VIDS is set to the
value precomputed at EL1, allocating a context ID to each VID.

Test: builds, boots
Bug: 190463801
Change-Id: I4a824e90b5d474dd83c97ef53e4df3c8b68da6ba
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 8aa6c440da)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
308eeb0dd6 ANDROID: KVM: arm64: Copy S2MPU configuration to hyp
Create variables in hyp that will hold the DT information about S2MPUs
for use by hyp at runtime. Copy the information from EL1 to EL2.

The EL1 code computes the size of the data and allocates a sufficient
number of pages, which hyp will later take ownership of.

Test: builds, boots
Bug: 190463801
Change-Id: Ic3d4bfa3ec11f7c2e1b4474910e2f57a62139a75
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit bc80f81582)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
6641cb974b ANDROID: KVM: arm64: Implement IRQ handler for S2MPU faults
The S2MPU can be configured to trigger an interrupt on faults: access
permission (both regular and during page table walks) and if no matching
context ID is found for request's VID (v9 only).

When interrupt information is provided in the S2MPU's DT node, parse the
information and enable an IRQ handler. A later patch will enable the
functionality in the S2MPU.

Test: builds, boots
Bug: 190463801
Change-Id: I11d1a896406011cff1506ee1bd124bfc66ffa914
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 2517c4e5f0)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
f4c7329489 ANDROID: KVM: arm64: Allocate context IDs for valid VIDs
S2MPU_CONTEXT_CFG_VALID_VID register must be configured on v9,
allocating a context ID in range 0 to S2MPU_NUM_CONTEXT to each valid
VID. For now assume that all 8 VIDs are valid. This will change once
the hypervisor takes control over SSMT configuration as well.

If there are more VIDs than available context IDs, the driver prints
a warning that DMA may be blocked and continues.

Test: builds, boots
Bug: 190463801
Change-Id: I0c9e0a5c9470b27debaade2c4e02e16c6577fbfe
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 923353be1e)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
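The allocation described in f4c7329489 could look roughly like the sketch below. It assumes context IDs are handed out in ascending order starting at 0 and that the valid VIDs are given as a bitmask; both are assumptions, as is the function name.

```python
NR_VIDS = 8  # the commit assumes all 8 VIDs are valid for now


def alloc_context_ids(valid_vids, num_contexts):
    """Assign the next free context ID to each VID whose bit is set.

    Returns (mapping, exhausted). exhausted is True when at least one valid
    VID could not get a context ID, which is the case where the driver
    warns that DMA may be blocked and continues.
    """
    mapping = {}
    next_ctx = 0
    exhausted = False
    for vid in range(NR_VIDS):
        if not valid_vids & (1 << vid):
            continue
        if next_ctx >= num_contexts:
            exhausted = True
            continue
        mapping[vid] = next_ctx
        next_ctx += 1
    return mapping, exhausted
```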
David Brazdil
b6866a5146 ANDROID: KVM: arm64: Read and check S2MPU_VERSION
Read S2MPU_VERSION during driver init and check it against list of
supported versions. The register fields are as follows:
  - MAJOR_ARCH_VER,
  - MINOR_ARCH_VER,
  - REV_ARCH_VER,
  - RTL_VER.
Their exact use is not documented. For now, we mask out RTL_VER and
expect a match on MAJOR_, MINOR_ and REV_ARCH_VER. This may be tweaked
in the future.

Test: builds, boots
Bug: 190463801
Change-Id: I9709fde5f4d3ca4c23f84919c37b081302846917
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 4a7da93bdb)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
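The version check in b6866a5146 boils down to masking out RTL_VER before comparing. The sketch below uses a hypothetical bit layout and a hypothetical list of supported versions, since the commit message gives neither.

```python
# Hypothetical field layout; the commit message does not give bit positions.
MAJOR_ARCH_VER_MASK = 0xF000_0000
MINOR_ARCH_VER_MASK = 0x0F00_0000
REV_ARCH_VER_MASK = 0x00FF_0000
RTL_VER_MASK = 0x0000_FFFF

# Everything except RTL_VER takes part in the comparison.
ARCH_VER_MASK = MAJOR_ARCH_VER_MASK | MINOR_ARCH_VER_MASK | REV_ARCH_VER_MASK

# Hypothetical supported versions, expressed with RTL_VER cleared.
SUPPORTED_VERSIONS = (0x1100_0000, 0x2000_0000)


def version_supported(version_reg):
    """Match on MAJOR_, MINOR_ and REV_ARCH_VER, ignoring RTL_VER."""
    return (version_reg & ARCH_VER_MASK) in SUPPORTED_VERSIONS
```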
David Brazdil
0762b05d9c ANDROID: KVM: arm64: Parse S2MPU MMIO region
Start EL1 portion of the S2MPU driver with an init function which
probes the Device tree for nodes compatible with 'google,s2mpu'.
Parse and check the base, size and power domain ID.

Test: builds, boots
Bug: 190463801
Change-Id: I5f0b32febb4e922fdfdfe10a9a9c823e20b8e26f
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 4e91a00153)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
0295ee70f1 ANDROID: KVM: arm64: Create empty S2MPU driver
Create a skeleton driver for the S2MPU - an EL1 portion called during
KVM init which will parse the DT and configure the kernel, and an EL2
portion which will program the S2MPUs later at runtime. The code is
behind CONFIG_KVM_S2MPU.

Test: builds, boots
Bug: 190463801
Change-Id: I58206535f3493e1d989576a9db2112d370a1cb4d
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit b2de5483b7)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
4c1082edb0 ANDROID: KVM: arm64: Add 'host_stage2_adjust_mmio_range' to kvm_iommu_ops
Add a new kvm_iommu_ops hook to the lower-EL instruction/data abort
handler, which allows the IOMMU driver to restrict the region of device
memory that is about to be mapped in the host stage-2.

This can be used by the IOMMU driver to restrict access to the MMIO
registers of the IOMMU itself.

Test: builds, boots
Bug: 190463801
Change-Id: I51cf3cfd84c889627e290d74579657447964ca16
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit cc1ad46fb2)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
e3ff71324c ANDROID: KVM: arm64: Add 'host_mmio_dabt_handler' to kvm_iommu_ops
Add a new kvm_iommu_ops hook which allows the IOMMU driver to handle
data aborts in unmapped device memory regions. If the abort is handled
by the driver, the global abort handler will not attempt to map in the
page.

For example, this enables the IOMMU driver to virtualize access to
the underlying IOMMU hardware, or to allow access to a subset of the
functionality, e.g. performance counters.

Test: builds, boots
Bug: 190463801
Change-Id: I84adbc992e577ac6ceb09f4856e1c648df580f76
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 25f81ec77b)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
9092a6ac8b ANDROID: KVM: arm64: Add 'host_stage2_set_owner' to kvm_iommu_ops
Add a new hook to kvm_iommu_ops that is invoked whenever a range of
pages changes their owner in the host stage2. This is currently limited
to finalize_host_mappings, which changes the owner of EL2-mapped pages
from host to hyp.

The driver is expected to apply corresponding changes in the IOMMU it
controls, so that only the new owner can access the page range.

Test: builds, boots
Bug: 190463801
Change-Id: I0809f4859a9117d1a37506b7aa9e19c6bd25ffdb
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 3cd8b5b00b)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
71ecd86274 ANDROID: KVM: arm64: Add 'host_smc_handler' to kvm_iommu_ops
IOMMU drivers need to intercept power management SMCs between the host
and EL3. Add a hook to hyp's 'handle_host_smc'.

Test: builds, boots
Bug: 190463801
Change-Id: Ied34b60d4bb0e5ae0fbf03f8ce1dc22a09679e37
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit d2efcdcb2b)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
David Brazdil
350761f479 ANDROID: KVM: arm64: Introduce IOMMU driver infrastructure
Bootstrap infrastructure for IOMMU drivers by introducing kvm_iommu_ops
struct in EL2 that is populated based on an iommu_driver parameter to
__pkvm_init hypercall and selected in EL1 early init.

An 'init' operation is called in __pkvm_init_finalise, giving the driver
an opportunity to initialize itself in EL2 and create any EL2 mappings
that it will need. 'init' is specifically called before
'finalize_host_mappings' so that:
  (a) pages mapped by the driver change owner to hyp,
  (b) ownership changes in 'finalize_host_mappings' get reflected in
      IOMMU mappings (added in a future patch).

Test: builds, boots
Bug: 190463801
Change-Id: I04c9f32c6eda846e6e377cb3d23330eb143b6242
Signed-off-by: David Brazdil <dbrazdil@google.com>
(cherry picked from commit 79775d0225)
Signed-off-by: Mostafa Saleh <smostafa@google.com>
2022-11-24 12:41:32 +00:00
Will Deacon
1805b3a396 ANDROID: KVM: arm64: Update pKVM hyp state series to v6
aosp/2257747 merged v5 of the pKVM hypervisor state series as FROMLIST.
Since then, version 6 was posted and queued by the upstream maintainer:

  https://lore.kernel.org/r/166819337067.3836113.13147674500457473286.b4-ty@kernel.org

Rather than revert v5 from android (and the dozens of dependent patches),
snap to v6 so that we're in-sync with upstream.

Bug: 233587962
[willdeacon@: Fix conflicts with 'stage2_mc' introduced by accounting work]
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I137bbd611c180cbe03e63a55705150f8f9c2ae31
2022-11-24 09:32:20 +00:00
Suren Baghdasaryan
50d2b75b86 ANDROID: mm: preserve vma->anon_vma after MREMAP_DONTUNMAP for SPF
The optimizations [1] and [2] to reset vma->anon_vma during
MREMAP_DONTUNMAP can affect the speculative page fault handler. If
vma->anon_vma reset happens after do_anonymous_page verified no
changes to the vma and obtained the ptl lock but before it calls
page_add_new_anon_rmap() then __page_set_anon_rmap() will stumble
on BUG_ON(!anon_vma). Disable these optimizations if SPF is enabled
to avoid such situations. As a result the reverse map walk will
consider the old VMA as it did before these optimizations were
introduced.

[1] 1583aa278f ("mm: mremap: unlink anon_vmas when mremap with MREMAP_DONTUNMAP success")
[2] ee8ab1903e ("mm: rmap: explicitly reset vma->anon_vma in unlink_anon_vmas()")

Bug: 257443051
Change-Id: I4e7611137f4a49c94bfe73532b4b06cbb0d2405b
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:27:02 -08:00
Suren Baghdasaryan
5844c8e7aa ANDROID: mm: disable speculative page faults for CONFIG_NUMA
do_numa_page() uses pte_offset_map() directly and needs to implement
additional mechanisms to ensure the mempolicy object used in
numa_migrate_prep() is not destroyed from under it when speculating.
Rather than fixing this, just disable speculation for CONFIG_NUMA
for now and fix it if it's ever needed in Android.

Bug: 257443051
Change-Id: Ib5750b9809979a69a42ebfa6c130e123f416f1aa
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:44 -08:00
Suren Baghdasaryan
4ea18cd059 ANDROID: mm: fix speculative walk which is unsafe under RCU
Speculative page fault handling expects MMU_GATHER_RCU_TABLE_FREE to
guarantee that page tables are stable, however tlb_remove_table() has
a slow-path fall-back case when __get_free_page() returns NULL and
tlb_remove_table_one() gets called. The way synchronization is
implemented in that function is not RCU-safe and requires IRQs to be
disabled (see the comment in tlb_remove_table_sync_one()).
Fix the invalid assumption by disabling IRQs even when
MMU_GATHER_RCU_TABLE_FREE=y.

Bug: 257443051
Change-Id: I227f351607cf73022cb31f6f7a232cab41cf6a5a
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:28 -08:00
Suren Baghdasaryan
ca96bd7bf1 ANDROID: mm: avoid using vmacache in lockless vma search
When searching for a vma under RCU protection, vmacache should be avoided
because a race with munmap() might result in finding a vma and placing it
into vmacache after munmap() removed that vma and called
vmacache_invalidate(). Once that vma is freed, vmacache will be left with
an invalid vma pointer.

Bug: 257443051
Change-Id: I62438305fcf5139974f4f7d3bae5b22c74084a59
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:27 -08:00
Suren Baghdasaryan
533a88fed7 ANDROID: disable page table moves when speculative page faults are enabled
move_page_tables() can move entire pmd or pud without locking individual
ptes. This is problematic for speculative page faults which do not take
mmap_lock because they rely on ptl lock when writing new pte value. To
avoid possible race, disable move_page_tables() optimization when
CONFIG_SPECULATIVE_PAGE_FAULT is enabled.

Bug: 257443051
Change-Id: Ib48dda08ecad1abc60d08fc089a6566a63393c13
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:27 -08:00
Suren Baghdasaryan
a1f65b39ba ANDROID: mm: skip pte_alloc during speculative page fault
Speculative page fault checks pmd to be valid before starting to handle
the page fault and pte_alloc() should do nothing if pmd stays valid.
If pmd gets changed during speculative page fault, we will detect the
change later and retry with mmap_lock. Therefore pte_alloc() can be
safely skipped and this prevents the racy pmd_lock() call which can
access pmd->ptl after pmd was cleared.

Bug: 257443051
Change-Id: Iec57df5530dba6e0e0bdf9f7500f910851c3d3fd
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:27 -08:00
Suren Baghdasaryan
3f311327f9 ANDROID: mm: introduce vma refcounting to protect vma during SPF
Current mechanism to stabilize a vma during speculative page fault
handling makes a copy of the faulting vma under RCU protection. This
makes it hard to protect elements which do not belong to the vma but
are used by the page fault handler like vma->vm_file.
The problem is that a copy of the vma can't be used to safely
protect the file attached to the original vma unless the file is
also released after RCU grace period (which is how SPF was designed
originally but that caused performance regression and had to be
changed).
To avoid these complications, introduce vma refcounting to stabilize
and operate on the original vma during page fault handling. Page
fault handler finds the vma and increases its refcount under RCU
protection, vma is freed after RCU grace period, vma->vm_file is
released only after refcount indicates no users. This mechanism
guarantees that once get_vma returns a vma, both the vma itself and
vma->vm_file are stable.
Additional benefits of this patch are: we don't need to copy the vma
and no additional logic is needed to stabilize vma->vm_file.

Bug: 257443051
Change-Id: I59d373926d687fcbd56847a8c3500c43bf1844c8
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:27 -08:00
Suren Baghdasaryan
50567620db ANDROID: reimplement vm_file protection during speculative page fault
Use vma->vm_file refcounting to protect the file during speculative page
fault handling.

Bug: 258731892
Change-Id: I222c23785391bea7d95c4506d70d6f68029ec45f
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:27 -08:00
Suren Baghdasaryan
c11ef6356b Revert "ANDROID: add vma->file_ref_count to synchronize vma->vm_file destruction"
This reverts commit a3fe25d92303739a0515c92cb1febb46a920d4d9.

File refcounting implemented in this patch is broken and needs to be
redone.
The change in include/linux/mm_types.h which adds file_ref_count into
vm_area_struct is left untouched to keep ABI intact.

Bug: 258731892
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I37984eb2f0981a989f74bcaaa6be42040a2f241e
2022-11-23 10:25:26 -08:00
Suren Baghdasaryan
9fe88266f2 Revert "ANDROID: arm64/mm: protect vm_file during speculative page fault handling"
This reverts commit 0f4ea1e59394908a0c1c7619c7a24fd7f790586f.

File refcounting implemented in this patch is broken and needs to be
redone.

Bug: 258731892
Change-Id: I3ae5a78b871edaf655d1c9a7868c8543e27f39e5
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:26 -08:00
Suren Baghdasaryan
218d6f9d77 Revert "ANDROID: x86/mm: protect vm_file during speculative page fault handling"
This reverts commit 4fc18576ca94ca9620bd03e0fc7a64467c1ea0c2.

File refcounting implemented in this patch is broken and needs to be
redone.

Bug: 258731892
Change-Id: Ibcefaf6aa72c60c9627d0ea7d473a3ec806535f4
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:26 -08:00
Suren Baghdasaryan
d4a5296efa Revert "ANDROID: powerpc/mm: protect vm_file during speculative page fault handling"
This reverts commit 6551a55c4dc5492dcae3dc340c376ed160ab9928.

File refcounting implemented in this patch is broken and needs to be
redone.

Bug: 258731892
Change-Id: I425517a07d1fdcf5cd1842733a4c6c70ef0608b4
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-11-23 10:25:26 -08:00
Vincent Donnefort
e45f4b3c5a ANDROID: KVM: arm64: Add protected_shared_mem statistic
When using nVHE in protected mode, protected memory can be shared between
host and a guest. Tracking this value is interesting from a debug
perspective, to identify potential leaks.

Keeping the count of memory sharing is easy: each share/unshare will return
to the host where the accounting will take place.

Bug: 222044477
Change-Id: I43dcd258789f79dbfe489e5bf721e606c5e6e022
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2022-11-23 17:11:25 +00:00
Vincent Donnefort
9c9e41a043 ANDROID: KVM: arm64: count KVM s2 mmu usage in nVHE protected mode
When using the nVHE protected mode, the stage-2 page tables are handled by
the hypervisor, but are backed by memory donated by the host. That memory
is accounted during the donation (added to the vCPU's hyp_memcache) under
secondary pagetable stats.

On VM teardown, those pages are mixed with others in the teardown_mc, so use
a separate teardown_stage2_mc to deduct them from accounting after
reclaim.

Bug: 222044477
Change-Id: I2a45ce65c5ce9cf96aabd1b66d6f83ffe4808a0c
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2022-11-23 17:11:25 +00:00
Vincent Donnefort
36b536a5a5 ANDROID: KVM: arm64: Add protected_hyp_mem VM statistic
When using nVHE in protected mode, the host allocates memory for the
hypervisor to store shadow structures and the stage-2 page tables. This has
been proven to be an interesting value to follow, for debug and health
purposes. Account for those allocations in bytes, in a newly created VM
statistic "protected_hyp_mem".

On VM teardown, all of that memory is expected to be reclaimed. Raise a
warning if not all the donations are recovered.

Bug: 222044477
Change-Id: I18657d275f2ced67ceb6d0e4bd5ce41cf1d41dc8
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2022-11-23 17:11:25 +00:00
Yosry Ahmed
da27463ad6 UPSTREAM: KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats
Count the pages used by KVM in arm64 for stage2 mmu in memory stats
under secondary pagetable stats (e.g. "SecPageTables" in /proc/meminfo)
to give better visibility into the memory consumption of KVM mmu in a
similar way to how normal user page tables are accounted.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220823004639.2387269-5-yosryahmed@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

Bug: 222044477
(cherry picked from commit d38ba8ccd9)
Change-Id: I042d6804dd542bb0f25c7f1b040f5b1e5260c0e6
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2022-11-23 17:11:25 +00:00
Yosry Ahmed
2af25795b7 BACKPORT: KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats.
Count the pages used by KVM mmu on x86 in memory stats under secondary
pagetable stats (e.g. "SecPageTables" in /proc/meminfo) to give better
visibility into the memory consumption of KVM mmu in a similar way to
how normal user page tables are accounted.

Add the inner helper in common KVM, ARM will also use it to count stats
in a future commit.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Acked-by: Marc Zyngier <maz@kernel.org> # generic KVM changes
Link: https://lore.kernel.org/r/20220823004639.2387269-3-yosryahmed@google.com
Link: https://lore.kernel.org/r/20220823004639.2387269-4-yosryahmed@google.com
[sean: squash x86 usage to workaround modpost issues]
Signed-off-by: Sean Christopherson <seanjc@google.com>

Bug: 222044477
(cherry picked from commit 43a063cab3)
[vdonnefort@: Fix conflicts in mmu.c and tdp_mmu.c]
Change-Id: I9b81155758e513504a87ea2d634f341652ed0630
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2022-11-23 17:11:25 +00:00
Yosry Ahmed
4445b043d4 BACKPORT: mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.
We keep track of several kernel memory stats (total kernel memory, page
tables, stack, vmalloc, etc) on multiple levels (global, per-node,
per-memcg, etc). These stats give users insight into how much memory
is used by the kernel and for what purposes.

Currently, memory used by KVM mmu is not accounted in any of those
kernel memory stats. This patch series accounts the memory pages
used by KVM for page tables in those stats in a new
NR_SECONDARY_PAGETABLE stat. This stat can be later extended to account
for other types of secondary page tables (e.g. iommu page tables).

KVM has a decent number of large allocations that aren't for page
tables, but for most of them, the number/size of those allocations
scales linearly with either the number of vCPUs or the amount of memory
assigned to the VM. KVM's secondary page table allocations do not scale
linearly, especially when nested virtualization is in use.

From a KVM perspective, NR_SECONDARY_PAGETABLE will scale with KVM's
per-VM pages_{4k,2m,1g} stats unless the guest is doing something
bizarre (e.g. accessing only 4kb chunks of 2mb pages so that KVM is
forced to allocate a large number of page tables even though the guest
isn't accessing that much memory). However, someone would need to either
understand how KVM works to make that connection, or know (or be told) to
go look at KVM's stats if they're running VMs to better decipher the stats.

Furthermore, having NR_PAGETABLE side-by-side with NR_SECONDARY_PAGETABLE
is informative. For example, when backing a VM with THP vs. HugeTLB,
NR_SECONDARY_PAGETABLE is roughly the same, but NR_PAGETABLE is an order
of magnitude higher with THP. So having this stat will at the very least
prove to be useful for understanding tradeoffs between VM backing types,
and likely even steer folks towards potential optimizations.

The original discussion with more details about the rationale:
https://lore.kernel.org/all/87ilqoi77b.wl-maz@kernel.org

This stat will be used by subsequent patches to count KVM mmu
memory usage.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220823004639.2387269-2-yosryahmed@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

Bug: 222044477
(cherry picked from commit ebc97a52b5)
[vdonnefort@: Fix trivial documentation conflict]
Change-Id: I16976e21d2e68ebbcd49e9f1275055e81ec82881
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2022-11-23 17:11:25 +00:00
Keir Fraser
b1b649a291 Revert "ANDROID: virtio_balloon: New module parameter "pkvm""
This reverts commit 87bcd3edf3.

Reason for revert: Memory reclaim capability will be checked by the
host before configuring the virtio_balloon device.

Bug: 240239989
Change-Id: I03e7c39ec6d671babeace4040138b416c7e201cf
Signed-off-by: Keir Fraser <keirf@google.com>
2022-11-23 14:42:18 +00:00
Dan Vacura
034f49ebf6 ANDROID: gki_defconfig: enable CONFIG_USB_CONFIGFS_F_UVC
Enable the UVC function driver to allow USB gadgets
to connect as a standard video device to a host.

Bug: 200712777
Bug: 242344221
Signed-off-by: Dan Vacura <w36195@motorola.com>
Change-Id: Ia037f8560664f9e98f28f3fede609764d5d5699d
(cherry picked from commit 8d5dd0a5a4)
(cherry picked from commit 885f16fab68e456b9dc9856641b706ce17551456)
2022-11-23 07:33:30 +00:00
Yifan Hong
3d7c9fdef1 ANDROID: Remove virtgpu_trace.h from DDK unsafe headers.
With the following change merged, the unsafe header
is no longer necessary.

3b72a6405c0f301ed787d899077748f84c8bcafc
("kleaf: enable DDK for virtual devices")

Bug: 254735056

Change-Id: I2e89f5c921d641a486d4c06628d59551f61ba2ba
Signed-off-by: Yifan Hong <elsk@google.com>
2022-11-23 01:23:48 +00:00
Yifan Hong
b5b9d443ba ANDROID: Add ddk_headers for arm architecture.
Similar to aarch64 and x86_64, we also add ddk_headers
for arm so we can build DDK (Driver development kit) modules
for the arm architecture for virtual devices.

Test: Treehugger
Bug: 254735056
Change-Id: I7ade4a6053e59d84b825285fbc6162b6e642682e
Signed-off-by: Yifan Hong <elsk@google.com>
2022-11-23 01:23:48 +00:00
Yifan Hong
023b893955 Revert "ANDROID: kleaf: convert rockpi4 to mixed build."
This reverts commit 6100c90ef5.

Reason for revert: rockpi4 has DEVTMPFS enabled and GKI doesn't

Bug: 258841346
Change-Id: Icefb1bb4cf39004234513d307e385b04cb76e51d
2022-11-22 23:44:09 +00:00
Yifan Hong
6100c90ef5 ANDROID: kleaf: convert rockpi4 to mixed build.
Build the GKI //common:kernel_aarch64, then
build rockpi4 modules on top of it.

As a side effect of this change, rockpi4 will no longer
be able to be built with build.sh because it won't produce
vmlinux, etc.

Test: TH
Test: bazel run //common:rockpi4_dist
Bug: 258841346
Change-Id: I88989a265d0a90daddc85dd45a8736f942350522
Signed-off-by: Yifan Hong <elsk@google.com>
2022-11-22 23:42:33 +00:00
Will Deacon
ed40663592 ANDROID: KVM: arm64: Relax SMCCC version check during FF-A proxy init
Although FF-A claims to require version v1.2 of SMCCC, in reality the
current set of calls work just fine with v1.1 and some devices ship with
EL3 firmware that advertises this configuration.

Allow pKVM to proxy FF-A calls for these devices by relaxing our SMCCC
version check to permit SMCCC v1.1+.

Reported-by: Alan Stokes <alanstokes@google.com>
Bug: 222663556
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I41e9ff35f169df3609acee7bbc67999c1d11c9d1
Signed-off-by: Quentin Perret <qperret@google.com>
2022-11-22 17:50:08 +00:00
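The relaxed check from ed40663592 can be illustrated as below. SMCCC encodes the major version in the upper half-word and the minor version in the lower one, so v1.1 is 0x10001; the function names are illustrative only.

```python
def smccc_version(major, minor):
    """Encode an SMCCC version the way SMCCC_VERSION reports it."""
    return (major << 16) | minor


SMCCC_V1_1 = smccc_version(1, 1)


def ffa_proxy_supported(reported_version):
    """Accept SMCCC v1.1 or later instead of insisting on v1.2."""
    return reported_version >= SMCCC_V1_1
```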
Quentin Perret
c91cd1264a ANDROID: KVM: arm64: Increase size of FF-A buffer
As it turns out, the kernel's DMA code doesn't enforce the
SG_MAX_SEGMENTS limit on the number of elements in an sglist, which can
confuse the pKVM FF-A proxy which has a buffer sized to contain a
descriptor of at most SG_MAX_SEGMENTS constituents.

As the number of elements in an sglist doesn't seem to have an actual
upper bound, let's paper over the issue for now by increasing the size
of the pKVM buffer based on empirical 'measurements'. Longer term we
might need to make this value configurable on the kernel's cmdline, or
to rework the FF-A proxy to sanely handle large descriptors, although
it is not clear how at the time of writing.

Bug: 221256863
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: If252f01bec8ae71c0fe1f7007a3ca7b037924c84
2022-11-22 17:50:08 +00:00
Quentin Perret
d5e4e2b75f BACKPORT: FROMLIST: KVM: arm64: pkvm: Add support for fragmented FF-A descriptors
FF-A memory descriptors may need to be sent in fragments when they don't
fit in the mailboxes. Doing so involves using the FRAG_TX and FRAG_RX
primitives defined in the FF-A protocol.

Add support in the pKVM FF-A relayer for fragmented descriptors by
monitoring outgoing FRAG_TX transactions and by buffering large
descriptors on the reclaim path.

[ qperret: BACKPORT because I removed the erroneous ANDROID tag from the
  patch title posted upstream ]

Bug: 254811097
Co-developed-by: Andrew Walbran <qwandor@google.com>
Change-Id: I701f279cd4820abb0b6d7c2572ee28e0f943edad
Signed-off-by: Andrew Walbran <qwandor@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20221116170335.2341003-13-qperret@google.com
2022-11-22 17:50:08 +00:00
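The buffering of fragmented descriptors mentioned in d5e4e2b75f can be pictured with the toy model below. It only illustrates the general idea of accumulating fragments until the advertised total length is reached; the class, its fields and the error handling are assumptions, not the relayer's actual code.

```python
class DescriptorBuffer:
    """Accumulates fragments of one memory descriptor (illustrative only)."""

    def __init__(self, total_len, capacity):
        if total_len > capacity:
            raise ValueError("descriptor does not fit in the buffer")
        self.total_len = total_len
        self.data = bytearray()

    def add_fragment(self, frag):
        """Append one fragment; return True once the descriptor is complete."""
        if len(self.data) + len(frag) > self.total_len:
            raise ValueError("fragment overruns the advertised length")
        self.data += frag
        return len(self.data) == self.total_len
```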
Will Deacon
6c417d4b04 FROMLIST: KVM: arm64: Handle FFA_MEM_LEND calls from the host
Handle FFA_MEM_LEND calls from the host by treating them identically to
FFA_MEM_SHARE calls for the purposes of the host stage-2 page-table, but
forwarding on the original request to EL3.

Bug: 254811097
Change-Id: I8f53bca6f0865fabd9938eefd8427fa0e78016ed
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20221116170335.2341003-12-qperret@google.com
2022-11-22 17:50:08 +00:00
Will Deacon
8c2dae8b16 FROMLIST: KVM: arm64: Handle FFA_MEM_RECLAIM calls from the host
Intercept FFA_MEM_RECLAIM calls from the host and transition the host
stage-2 page-table entries from the SHARED_OWNED state back to the OWNED
state once EL3 has confirmed that the secure mapping has been reclaimed.

Bug: 254811097
Change-Id: I58365e1b3fafa47f290a292fe57f6d2ed7f9091b
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20221116170335.2341003-11-qperret@google.com
2022-11-22 17:50:08 +00:00
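The ordering described in 8c2dae8b16 (forward to EL3 first, update the host stage-2 state only on success) can be sketched as a tiny state machine. The dictionary of page states and the `el3_reclaim` callable are stand-ins invented for this example.

```python
from enum import Enum


class PageState(Enum):
    OWNED = "owned"
    SHARED_OWNED = "shared_owned"


def handle_mem_reclaim(pages, handle_pfns, el3_reclaim):
    """Return shared pages to OWNED once EL3 confirms the reclaim.

    pages: dict mapping pfn -> PageState (stand-in for host stage-2 state).
    handle_pfns: pfns covered by the memory handle being reclaimed.
    el3_reclaim: callable returning True when EL3 reports success.
    """
    if not el3_reclaim():
        # EL3 refused: leave the host stage-2 state untouched.
        return False
    for pfn in handle_pfns:
        pages[pfn] = PageState.OWNED
    return True
```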