Commit Graph

979689 Commits

Author SHA1 Message Date
Marc Zyngier
dd46b1bb46 UPSTREAM: KVM: arm64: Mark the kvmarm ML as moderated for non-subscribers
The kvmarm mailing list is moderated for non-subscriber, but that
was never advertised. Fix this with the hope that people will
eventually subscribe before posting, saving me the hassle of
letting their post through eventually.

Signed-off-by: Marc Zyngier <maz@kernel.org>
(cherry picked from commit 1a219e08ec)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I9c914e529e88646342c6090b7821056b3a8f21f4
2021-07-02 10:01:01 +01:00
Alexandru Elisei
e7f33ee145 UPSTREAM: Documentation: KVM: Document KVM_GUESTDBG_USE_HW control flag for arm64
Commit 21b6f32f94 ("KVM: arm64: guest debug, define API headers") added
the arm64 KVM_GUESTDBG_USE_HW flag for the KVM_SET_GUEST_DEBUG ioctl and
commit 834bf88726 ("KVM: arm64: enable KVM_CAP_SET_GUEST_DEBUG")
documented and implemented the flag functionality. Since its introduction,
at no point was the flag known by any name other than KVM_GUESTDBG_USE_HW
for the arm64 architecture, so refer to it as such in the documentation.

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210407144857.199746-2-alexandru.elisei@arm.com
(cherry picked from commit feb5dc3de0)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I268c7954f61487ed88605cd02024aa90e2cd15b2
2021-07-02 10:01:01 +01:00
Jianyong Wu
bad9a9d084 UPSTREAM: ptp: arm/arm64: Enable ptp_kvm for arm/arm64
Currently, there is no mechanism to keep time sync between guest and host
in arm/arm64 virtualization environment. Time in guest will drift compared
with host after boot up as they may both use third party time sources
to correct their time respectively. The time deviation will be in order
of milliseconds. But in some scenarios,like in cloud environment, we ask
for higher time precision.

kvm ptp clock, which chooses the host clock source as a reference
clock to sync time between guest and host, has been adopted by x86
which takes the time sync order from milliseconds to nanoseconds.

This patch enables kvm ptp clock for arm/arm64 and improves clock sync precision
significantly.

Test result comparisons between with kvm ptp clock and without it in arm/arm64
are as follows. This test derived from the result of command 'chronyc
sources'. we should take more care of the last sample column which shows
the offset between the local clock and the source at the last measurement.

no kvm ptp in guest:
MS Name/IP address   Stratum Poll Reach LastRx Last sample
========================================================================
^* dns1.synet.edu.cn      2   6   377    13  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    21  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    29  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    37  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    45  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    53  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    61  +1040us[+1581us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377     4   -130us[ +796us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    12   -130us[ +796us] +/-   21ms
^* dns1.synet.edu.cn      2   6   377    20   -130us[ +796us] +/-   21ms

in host:
MS Name/IP address   Stratum Poll Reach LastRx Last sample
========================================================================
^* 120.25.115.20          2   7   377    72   -470us[ -603us] +/-   18ms
^* 120.25.115.20          2   7   377    92   -470us[ -603us] +/-   18ms
^* 120.25.115.20          2   7   377   112   -470us[ -603us] +/-   18ms
^* 120.25.115.20          2   7   377     2   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20          2   7   377    22   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20          2   7   377    43   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20          2   7   377    63   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20          2   7   377    83   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20          2   7   377   103   +872ns[-6808ns] +/-   17ms
^* 120.25.115.20          2   7   377   123   +872ns[-6808ns] +/-   17ms

The dns1.synet.edu.cn is the network reference clock for guest and
120.25.115.20 is the network reference clock for host. we can't get the
clock error between guest and host directly, but a roughly estimated value
will be in order of hundreds of us to ms.

with kvm ptp in guest:
chrony has been disabled in host to remove the disturb by network clock.

MS Name/IP address         Stratum Poll Reach LastRx Last sample
========================================================================
* PHC0                    0   3   377     8     -7ns[   +1ns] +/-    3ns
* PHC0                    0   3   377     8     +1ns[  +16ns] +/-    3ns
* PHC0                    0   3   377     6     -4ns[   -0ns] +/-    6ns
* PHC0                    0   3   377     6     -8ns[  -12ns] +/-    5ns
* PHC0                    0   3   377     5     +2ns[   +4ns] +/-    4ns
* PHC0                    0   3   377    13     +2ns[   +4ns] +/-    4ns
* PHC0                    0   3   377    12     -4ns[   -6ns] +/-    4ns
* PHC0                    0   3   377    11     -8ns[  -11ns] +/-    6ns
* PHC0                    0   3   377    10    -14ns[  -20ns] +/-    4ns
* PHC0                    0   3   377     8     +4ns[   +5ns] +/-    4ns

The PHC0 is the ptp clock which choose the host clock as its source
clock. So we can see that the clock difference between host and guest
is in order of ns.

Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-8-jianyong.wu@arm.com
(cherry picked from commit 300bb1fe76)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I34c8be42218ea909e8d0623836cd93442d1257da
2021-07-02 10:01:01 +01:00
Jianyong Wu
a669f441f8 BACKPORT: KVM: arm64: Add support for the KVM PTP service
Implement the hypervisor side of the KVM PTP interface.

The service offers wall time and cycle count from host to guest.
The caller must specify whether they want the host's view of
either the virtual or physical counter.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-7-jianyong.wu@arm.com
(cherry picked from commit 3bf725699b)
[willdeacon@: Fixed UAPI #define and documentation conflicts]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ib2071bbf4c57d5408f7ad7ab27bf97018b7fe535
2021-07-02 10:01:00 +01:00
Jianyong Wu
3f1c16fefd UPSTREAM: clocksource: Add clocksource id for arm arch counter
Add clocksource id to the ARM generic counter so that it can be easily
identified from callers such as ptp_kvm.

Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-6-jianyong.wu@arm.com
(cherry picked from commit 100148d0fc)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I3a169a6f458889ab97bb344a60dbca5f50357968
2021-07-02 10:01:00 +01:00
Thomas Gleixner
45ac0e635e UPSTREAM: time: Add mechanism to recognize clocksource in time_get_snapshot
System time snapshots are not conveying information about the current
clocksource which was used, but callers like the PTP KVM guest
implementation have the requirement to evaluate the clocksource type to
select the appropriate mechanism.

Introduce a clocksource id field in struct clocksource which is by default
set to CSID_GENERIC (0). Clocksource implementations can set that field to
a value which allows to identify the clocksource.

Store the clocksource id of the current clocksource in the
system_time_snapshot so callers can evaluate which clocksource was used to
take the snapshot and act accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-5-jianyong.wu@arm.com
(cherry picked from commit b2c67cbe9f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I4660b3c4973ead7593fc957aad123ddddc10052a
2021-07-02 10:01:00 +01:00
Jianyong Wu
5cda50cb8d UPSTREAM: ptp: Reorganize ptp_kvm.c to make it arch-independent
Currently, the ptp_kvm module contains a lot of x86-specific code.
Let's move this code into a new arch-specific file in the same directory,
and rename the arch-independent file to ptp_kvm_common.c.

Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-4-jianyong.wu@arm.com
(cherry picked from commit a8cf291bda)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ieb93d04a05dd58a3d7b398d42a99a02ef247537d
2021-07-02 10:01:00 +01:00
Gavin Shan
d23a59f0b5 BACKPORT: KVM: arm64: Don't retrieve memory slot again in page fault handler
We needn't retrieve the memory slot again in user_mem_abort() because
the corresponding memory slot has been passed from the caller. This
would save some CPU cycles. For example, the time used to write 1GB
memory, which is backed by 2MB hugetlb pages and write-protected, is
dropped by 6.8% from 928ms to 864ms.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210316041126.81860-4-gshan@redhat.com
(cherry picked from commit 10ba2d17d2)
[willdeacon@: Drop superfluous arguments to __gfn_to_pfn_memslot() and
mark_page_dirty_in_slot()]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ieddde429a392401e60ff6d171d35d51fde84ed56
2021-07-02 10:01:00 +01:00
Gavin Shan
a851c3f1de UPSTREAM: KVM: arm64: Use find_vma_intersection()
find_vma_intersection() has been existing to search the intersected
vma. This uses the function where it's applicable, to simplify the
code.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210316041126.81860-3-gshan@redhat.com
(cherry picked from commit c728fd4ce7)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ie1b73c28b6fe106157959d81f71c9e4fd16d8766
2021-07-02 10:00:59 +01:00
Gavin Shan
0d9b90d7db UPSTREAM: KVM: arm64: Hide kvm_mmu_wp_memory_region()
We needn't expose the function as it's only used by mmu.c since it
was introduced by commit c64735554c ("KVM: arm: Add initial dirty
page locking support").

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210316041126.81860-2-gshan@redhat.com
(cherry picked from commit eab6214847)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Id87caacf381c9d642d757a23ff86e330e61f1737
2021-07-02 10:00:59 +01:00
Suzuki K Poulose
b6837bfff0 BACKPORT: arm64: KVM: Enable access to TRBE support for host
For a nvhe host, the EL2 must allow the EL1&0 translation
regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must
be saved/restored over a trip to the guest. Also, before
entering the guest, we must flush any trace data if the
TRBE was enabled. And we must prohibit the generation
of trace while we are in EL1 by clearing the TRFCR_EL1.

For vhe, the EL2 must prevent the EL1 access to the Trace
Buffer.

The MDCR_EL2 bit definitions for TRBE are available here :

  https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405164307.1720226-8-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit a1319260bf)
[willdeacon@: Fix conflicts in debug.c comments]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Iaa8b9214740ff3bdfa5dfbd8109673a7d68b6ae4
2021-07-02 10:00:59 +01:00
Suzuki K Poulose
e03444d5ab UPSTREAM: KVM: arm64: Move SPE availability check to VCPU load
At the moment, we check the availability of SPE on the given
CPU (i.e, SPE is implemented and is allowed at the host) during
every guest entry. This can be optimized a bit by moving the
check to vcpu_load time and recording the availability of the
feature on the current CPU via a new flag. This will also be useful
for adding the TRBE support.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Alexandru Elisei <Alexandru.Elisei@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405164307.1720226-7-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit d2602bb4f5)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Id52f103b1a08f15f8bb28a9c472ae69d5ebc16de
2021-07-02 10:00:59 +01:00
Will Deacon
c8a9e5a36e ANDROID: Revert "FROMLIST: arm64: kvm: Enable access to TRBE support for host"
This FROMLIST: commit conflicts with other patches, so drop it for now
and replace it with the UPSTREAM: version in a subsequent commit.

This reverts commit ad5f52dce6.

Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I86b1ff75fb6784d983d5b29203f62aa13ae3ee58
2021-07-02 10:00:59 +01:00
Suzuki K Poulose
722844fff3 UPSTREAM: KVM: arm64: Handle access to TRFCR_EL1
Rather than falling to an "unhandled access", inject add an explicit
"undefined access" for TRFCR_EL1 access from the guest.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405164307.1720226-6-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit cc427cbb15)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I33c07da4784eb34f4ce3f77cc72c2206cac1d1e0
2021-07-02 10:00:59 +01:00
Eric Auger
2a6a71a6b7 UPSTREAM: KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace
Commit 23bde34771 ("KVM: arm64: vgic-v3: Drop the
reporting of GICR_TYPER.Last for userspace") temporarily fixed
a bug identified when attempting to access the GICR_TYPER
register before the redistributor region setting, but dropped
the support of the LAST bit.

Emulating the GICR_TYPER.Last bit still makes sense for
architecture compliance though. This patch restores its support
(if the redistributor region was set) while keeping the code safe.

We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
computes whether a redistributor is the highest one of a series
of redistributor contributor pages.

With this new implementation we do not need to have a uaccess
read accessor anymore.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-9-eric.auger@redhat.com
(cherry picked from commit 28e9d4bce3)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Iec787a002501beb2103f6170a18b034e943ff250
2021-07-02 10:00:59 +01:00
Eric Auger
ef319ee4c9 UPSTREAM: kvm: arm64: vgic-v3: Introduce vgic_v3_free_redist_region()
To improve the readability, we introduce the new
vgic_v3_free_redist_region helper and also rename
vgic_v3_insert_redist_region into vgic_v3_alloc_redist_region

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-8-eric.auger@redhat.com
(cherry picked from commit e5a3563546)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I35d8a26768f6dcda3b872805e34457f9d678c119
2021-07-02 10:00:58 +01:00
Eric Auger
b0f242a66f UPSTREAM: KVM: arm64: Simplify argument passing to vgic_uaccess_[read|write]
vgic_uaccess() takes a struct vgic_io_device argument, converts it
to a struct kvm_io_device and passes it to the read/write accessor
functions, which convert it back to a struct vgic_io_device.
Avoid the indirection by passing the struct vgic_io_device argument
directly to vgic_uaccess_{read,write}.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-7-eric.auger@redhat.com
(cherry picked from commit da38530976)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ie26c66abeb1f09c9f7fadf4a66cda5dc3436a818
2021-07-02 10:00:58 +01:00
Eric Auger
3d8bea9440 UPSTREAM: docs: kvm: devices/arm-vgic-v3: enhance KVM_DEV_ARM_VGIC_CTRL_INIT doc
kvm_arch_vcpu_precreate() returns -EBUSY if the vgic is
already initialized. So let's document that KVM_DEV_ARM_VGIC_CTRL_INIT
must be called after all vcpu creations.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-6-eric.auger@redhat.com
(cherry picked from commit 298c41b8fa)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ifcd71d9c52ca043f8bbaae350f376bc76124e5de
2021-07-02 10:00:58 +01:00
Eric Auger
b89ce4828a UPSTREAM: KVM: arm/arm64: vgic: Reset base address on kvm_vgic_dist_destroy()
On vgic_dist_destroy(), the addresses are not reset. However for
kvm selftest purpose this would allow to continue the test execution
even after a failure when running KVM_RUN. So let's reset the
base addresses.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-5-eric.auger@redhat.com
(cherry picked from commit 3a52116127)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ia56d4ada6152081c1b70c56f64f33fca8ae06bce
2021-07-02 10:00:58 +01:00
Eric Auger
edadc09166 UPSTREAM: KVM: arm64: vgic-v3: Fix error handling in vgic_v3_set_redist_base()
vgic_v3_insert_redist_region() may succeed while
vgic_register_all_redist_iodevs fails. For example this happens
while adding a redistributor region overlapping a dist region. The
failure only is detected on vgic_register_all_redist_iodevs when
vgic_v3_check_base() gets called in vgic_register_redist_iodev().

In such a case, remove the newly added redistributor region and free
it.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-4-eric.auger@redhat.com
(cherry picked from commit 8542a8f95a)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I8d28dd40946c714dd3df902defc267c6c1b9267f
2021-07-02 10:00:58 +01:00
Eric Auger
032cf23b26 UPSTREAM: KVM: arm64: vgic-v3: Fix some error codes when setting RDIST base
KVM_DEV_ARM_VGIC_GRP_ADDR group doc says we should return
-EEXIST in case the base address of the redist is already set.
We currently return -EINVAL.

However we need to return -EINVAL in case a legacy REDIST address
is attempted to be set while REDIST_REGIONS were set. This case
is discriminated by looking at the count field.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405163941.510258-2-eric.auger@redhat.com
(cherry picked from commit d9b201e99c)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I227e7a9ecca105a796d71ea0a7de25f987956bba
2021-07-02 10:00:58 +01:00
Andrew Scull
4599dea4cb BACKPORT: KVM: arm64: Log source when panicking from nVHE hyp
To aid with debugging, add details of the source of a panic from nVHE
hyp. This is done by having nVHE hyp exit to nvhe_hyp_panic_handler()
rather than directly to panic(). The handler will then add the extra
details for debugging before panicking the kernel.

If the panic was due to a BUG(), look up the metadata to log the file
and line, if available, otherwise log an address that can be looked up
in vmlinux. The hyp offset is also logged to allow other hyp VAs to be
converted, similar to how the kernel offset is logged during a panic.

__hyp_panic_string is now inlined since it no longer needs to be
referenced as a symbol and the message is free to diverge between VHE
and nVHE.

The following is an example of the logs generated by a BUG in nVHE hyp.

[   46.754840] kvm [307]: nVHE hyp BUG at: arch/arm64/kvm/hyp/nvhe/switch.c:242!
[   46.755357] kvm [307]: Hyp Offset: 0xfffea6c58e1e0000
[   46.755824] Kernel panic - not syncing: HYP panic:
[   46.755824] PS:400003c9 PC:0000d93a82c705ac ESR:f2000800
[   46.755824] FAR:0000000080080000 HPFAR:0000000000800800 PAR:0000000000000000
[   46.755824] VCPU:0000d93a880d0000
[   46.756960] CPU: 3 PID: 307 Comm: kvm-vcpu-0 Not tainted 5.12.0-rc3-00005-gc572b99cf65b-dirty #133
[   46.757459] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[   46.758366] Call trace:
[   46.758601]  dump_backtrace+0x0/0x1b0
[   46.758856]  show_stack+0x18/0x70
[   46.759057]  dump_stack+0xd0/0x12c
[   46.759236]  panic+0x16c/0x334
[   46.759426]  arm64_kernel_unmapped_at_el0+0x0/0x30
[   46.759661]  kvm_arch_vcpu_ioctl_run+0x134/0x750
[   46.759936]  kvm_vcpu_ioctl+0x2f0/0x970
[   46.760156]  __arm64_sys_ioctl+0xa8/0xec
[   46.760379]  el0_svc_common.constprop.0+0x60/0x120
[   46.760627]  do_el0_svc+0x24/0x90
[   46.760766]  el0_svc+0x2c/0x54
[   46.760915]  el0_sync_handler+0x1a4/0x1b0
[   46.761146]  el0_sync+0x170/0x180
[   46.761889] SMP: stopping secondary CPUs
[   46.762786] Kernel Offset: 0x3e1cd2820000 from 0xffff800010000000
[   46.763142] PHYS_OFFSET: 0xffffa9f680000000
[   46.763359] CPU features: 0x00240022,61806008
[   46.763651] Memory Limit: none
[   46.813867] ---[ end Kernel panic - not syncing: HYP panic:
[   46.813867] PS:400003c9 PC:0000d93a82c705ac ESR:f2000800
[   46.813867] FAR:0000000080080000 HPFAR:0000000000800800 PAR:0000000000000000
[   46.813867] VCPU:0000d93a880d0000 ]---

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210318143311.839894-6-ascull@google.com
(cherry picked from commit aec0fae62e)
[willdeacon@: Resolved __hyp_pa() conflicts in psci-relay.c; aligned BUG() usage
with upstream]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ie1a500ab526de32abb3cb502319e3b88115f8038
2021-07-02 10:00:58 +01:00
Andrew Scull
b75a28595e UPSTREAM: KVM: arm64: Use BUG and BUG_ON in nVHE hyp
hyp_panic() reports the address of the panic by using ELR_EL2, but this
isn't a useful address when hyp_panic() is called directly. Replace such
direct calls with BUG() and BUG_ON() which use BRK to trigger an
exception that then goes to hyp_panic() with the correct address. Also
remove the hyp_panic() declaration from the header file to avoid
accidental misuse.

Signed-off-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210318143311.839894-5-ascull@google.com
(cherry picked from commit f79e616f27)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I1ba5edd4ddcdefd002d744ecd203d9db2b882bf6
2021-07-02 10:00:57 +01:00
Andrew Scull
088e2b0f58 UPSTREAM: bug: Assign values once in bug_get_file_line()
Set bug_get_file_line()'s output parameter values directly rather than
first nullifying them and then conditionally setting new values.

Signed-off-by: Andrew Scull <ascull@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210318143311.839894-4-ascull@google.com
(cherry picked from commit 5b8be5d875)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I95a8b2c98645b647a96adf0e183def8599764b46
2021-07-02 10:00:57 +01:00
Andrew Scull
fa6138d1ef UPSTREAM: bug: Factor out a getter for a bug's file line
There is some non-trivial config-based logic to get the file name and
line number associated with a bug. Factor this out to a getter that can
be resused.

Signed-off-by: Andrew Scull <ascull@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210318143311.839894-3-ascull@google.com
(cherry picked from commit 26dbc7e299)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Id9a0c9b72d162bb1f1b98c97a9bfd86acd32d7c0
2021-07-02 10:00:57 +01:00
Marc Zyngier
0148bc788a UPSTREAM: KVM: arm64: Elect Alexandru as a replacement for Julien as a reviewer
Julien's bandwidth for KVM reviewing has been pretty low lately,
and Alexandru has accepted to step in and help with the reviewing.

Many thanks to both!

Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210331131620.4005931-1-maz@kernel.org
(cherry picked from commit 70f5e4a601)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ibb008e463bee83b6cb8e689e42a0ccbf849a9ce5
2021-07-02 10:00:57 +01:00
Xiaofei Tan
9843b9bdfa UPSTREAM: arm64: sve: Provide sve_cond_update_zcr_vq fallback when !ARM64_SVE
Compilation fails when KVM is selected and ARM64_SVE isn't.

The root cause is that sve_cond_update_zcr_vq is not defined when
ARM64_SVE is not selected. Fix it by adding an empty definition
when CONFIG_ARM64_SVE=n.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
[maz: simplified commit message, fleshed out dummy #define]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1617183879-48748-1-git-send-email-tanxiaofei@huawei.com
(cherry picked from commit a9f8696d4b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I55b8027a6025262e3921561d37d73d2199625be4
2021-07-02 10:00:57 +01:00
Will Deacon
c2f432149b UPSTREAM: KVM: arm64: Advertise KVM UID to guests via SMCCC
We can advertise ourselves to guests as KVM and provide a basic features
bitmap for discoverability of future hypervisor services.

Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-3-jianyong.wu@arm.com
(cherry picked from commit 923961a7ff)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I6dc5cfeb2a1f7f966acf13bf55e346a133219ac5
2021-07-02 10:00:57 +01:00
Will Deacon
1e5e789283 UPSTREAM: arm/arm64: Probe for the presence of KVM hypervisor
Although the SMCCC specification provides some limited functionality for
describing the presence of hypervisor and firmware services, this is
generally applicable only to functions designated as "Arm Architecture
Service Functions" and no portable discovery mechanism is provided for
standard hypervisor services, despite having a designated range of
function identifiers reserved by the specification.

In an attempt to avoid the need for additional firmware changes every
time a new function is added, introduce a UID to identify the service
provider as being compatible with KVM. Once this has been established,
additional services can be discovered via a feature bitmap.

Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
[maz: move code to its own file, plug it into PSCI]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201209060932.212364-2-jianyong.wu@arm.com
(cherry picked from commit 6e085e0ac9)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I820ef716a2316a928d7cc8e5dda5befa543432d5
2021-07-02 10:00:56 +01:00
Ard Biesheuvel
e99de13b9f UPSTREAM: random: avoid arch_get_random_seed_long() when collecting IRQ randomness
When reseeding the CRNG periodically, arch_get_random_seed_long() is
called to obtain entropy from an architecture specific source if one
is implemented. In most cases, these are special instructions, but in
some cases, such as on ARM, we may want to back this using firmware
calls, which are considerably more expensive.

Another call to arch_get_random_seed_long() exists in the CRNG driver,
in add_interrupt_randomness(), which collects entropy by capturing
inter-interrupt timing and relying on interrupt jitter to provide
random bits. This is done by keeping a per-CPU state, and mixing in
the IRQ number, the cycle counter and the return address every time an
interrupt is taken, and mixing this per-CPU state into the entropy pool
every 64 invocations, or at least once per second. The entropy that is
gathered this way is credited as 1 bit of entropy. Every time this
happens, arch_get_random_seed_long() is invoked, and the result is
mixed in as well, and also credited with 1 bit of entropy.

This means that arch_get_random_seed_long() is called at least once
per second on every CPU, which seems excessive, and doesn't really
scale, especially in a virtualization scenario where CPUs may be
oversubscribed: in cases where arch_get_random_seed_long() is backed
by an instruction that actually goes back to a shared hardware entropy
source (such as RNDRRS on ARM), we will end up hitting it hundreds of
times per second.

So let's drop the call to arch_get_random_seed_long() from
add_interrupt_randomness(), and instead, rely on crng_reseed() to call
the arch hook to get random seed material from the platform.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20201105152944.16953-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 390596c995)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I03ef9372cf464a8b5de0e09e93cad1baa059ae40
2021-07-02 10:00:56 +01:00
Andre Przywara
3c4761c700 UPSTREAM: arm64: Add support for SMCCC TRNG entropy source
The ARM architected TRNG firmware interface, described in ARM spec
DEN0098, defines an ARM SMCCC based interface to a true random number
generator, provided by firmware.
This can be discovered via the SMCCC >=v1.1 interface, and provides
up to 192 bits of entropy per call.

Hook this SMC call into arm64's arch_get_random_*() implementation,
coming to the rescue when the CPU does not implement the ARM v8.5 RNG
system registers.

For the detection, we piggy back on the PSCI/SMCCC discovery (which gives
us the conduit to use (hvc/smc)), then try to call the
ARM_SMCCC_TRNG_VERSION function, which returns -1 if this interface is
not implemented.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 38db987316)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I0c0cedbd7053cd322e119cb00ec5c05684458b7e
2021-07-02 10:00:56 +01:00
Andre Przywara
dc28cd7286 UPSTREAM: firmware: smccc: Introduce SMCCC TRNG framework
The ARM DEN0098 document describe an SMCCC based firmware service to
deliver hardware generated random numbers. Its existence is advertised
according to the SMCCC v1.1 specification.

Add a (dummy) call to probe functions implemented in each architecture
(ARM and arm64), to determine the existence of this interface.
For now this return false, but this will be overwritten by each
architecture's support patch.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit a37e31fc97)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I3e91678706efdaa992c2d52a9bcbf1bb994e93e4
2021-07-02 10:00:56 +01:00
Xu Jia
81aa1403c5 UPSTREAM: KVM: arm64: Make symbol '_kvm_host_prot_finalize' static
The sparse tool complains as follows:

arch/arm64/kvm/arm.c:1900:6: warning:
 symbol '_kvm_host_prot_finalize' was not declared. Should it be static?

This symbol is not used outside of arm.c, so this
commit marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1617176179-31931-1-git-send-email-xujia39@huawei.com
(cherry picked from commit b1306fef1f)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I320a0f3780f98dfc20ea382dd75f50ab38a6b308
2021-07-02 10:00:56 +01:00
Shenming Lu
b175373926 UPSTREAM: KVM: arm64: GICv4.1: Give a chance to save VLPI state
Before GICv4.1, we don't have direct access to the VLPI state. So
we simply let it fail early when encountering any VLPI in saving.

But now we don't have to return -EACCES directly if on GICv4.1. Let’s
change the hard code and give a chance to save the VLPI state (and
preserve the UAPI).

Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210322060158.1584-7-lushenming@huawei.com
(cherry picked from commit 8082d50f48)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I5f878c088b42772774d5189d32bf74814e932521
2021-07-02 10:00:56 +01:00
Zenghui Yu
88178a0571 UPSTREAM: KVM: arm64: GICv4.1: Restore VLPI pending state to physical side
When setting the forwarding path of a VLPI (switch to the HW mode),
we can also transfer the pending state from irq->pending_latch to
VPT (especially in migration, the pending states of VLPIs are restored
into kvm’s vgic first). And we currently send "INT+VSYNC" to trigger
a VLPI to pending.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210322060158.1584-6-lushenming@huawei.com
(cherry picked from commit 12df742921)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I7eb7e5db5d6e251c0f65f00c814a887ef44e260f
2021-07-02 10:00:56 +01:00
Shenming Lu
3f50de15c2 UPSTREAM: KVM: arm64: GICv4.1: Try to save VLPI state in save_pending_tables
After pausing all vCPUs and devices capable of interrupting, in order
to save the states of all interrupts, besides flushing the states in
kvm’s vgic, we also try to flush the states of VLPIs in the virtual
pending tables into guest RAM, but we need to have GICv4.1 and safely
unmap the vPEs first.

As for the saving of VSGIs, which needs the vPEs to be mapped and might
conflict with the saving of VLPIs, but since we will map the vPEs back
at the end of save_pending_tables and both savings require the kvm->lock
to be held (thus only happen serially), it will work fine.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210322060158.1584-5-lushenming@huawei.com
(cherry picked from commit f66b7b151e)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: Ia375c8400861ef93cded852c05e4fe169445fe13
2021-07-02 10:00:55 +01:00
Shenming Lu
89cb7159f5 UPSTREAM: KVM: arm64: GICv4.1: Add function to get VLPI state
With GICv4.1 and the vPE unmapped, which indicates the invalidation
of any VPT caches associated with the vPE, we can get the VLPI state
by peeking at the VPT. So we add a function for this.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210322060158.1584-4-lushenming@huawei.com
(cherry picked from commit 80317fe4a6)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I1816d16525bb17962761e0caa683a72dbca6bf3e
2021-07-02 10:00:55 +01:00
Shenming Lu
46c27765a2 UPSTREAM: irqchip/gic-v3-its: Drop the setting of PTZ altogether
GICv4.1 gives a way to get the VLPI state, which needs to map the
vPE first, and after the state read, we may remap the vPE back while
the VPT is not empty. So we can't assume that the VPT is empty at
the first map. Besides, the optimization of PTZ is probably limited
since the HW should be fairly efficient to parse the empty VPT. Let's
drop the setting of PTZ altogether.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210322060158.1584-3-lushenming@huawei.com
(cherry picked from commit c21bc068cd)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I599b10b7b196fba5515d835b340721b23990387c
2021-07-02 10:00:55 +01:00
Marc Zyngier
4761649e56 UPSTREAM: irqchip/gic-v3-its: Add a cache invalidation right after vPE unmapping
In order to be able to manipulate the VPT once a vPE has been
unmapped, perform the required CMO to invalidate the CPU view
of the VPT.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Shenming Lu <lushenming@huawei.com>
Link: https://lore.kernel.org/r/20210322060158.1584-2-lushenming@huawei.com
(cherry picked from commit 301beaf197)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 190594147
Change-Id: I4ef4c5626fe4611995f564853e114da7da615bc7
2021-07-02 10:00:55 +01:00
Maciej Żenczykowski
5f0426076b FROMGIT: bpf: Support all gso types in bpf_skb_change_proto()
Since we no longer modify gso_size, it is now theoretically
safe to not set SKB_GSO_DODGY and reset gso_segs to zero.

This also means the skb_is_gso_tcp() check should no longer
be necessary.

Unfortunately we cannot remove the skb_{decrease,increase}_gso_size()
helpers, as they are still used elsewhere:

  bpf_skb_net_grow() without BPF_F_ADJ_ROOM_FIXED_GSO
  bpf_skb_net_shrink() without BPF_F_ADJ_ROOM_FIXED_GSO
  net/core/lwt_bpf.c's handle_gso_type()

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dongseok Yi <dseok.yi@samsung.com>
Cc: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/bpf/20210617000953.2787453-3-zenczykowski@gmail.com
(cherry picked from commit 0bc919d3e0 https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=0bc919d3e0b8149a60d2444c6a8e2b5974556522)
Test: builds, TreeHugger
Bug: 188690383
Change-Id: I46036bbacae9d1af7364ec0623dd75f0df5845fa
2021-07-01 17:08:45 +00:00
Greg Kroah-Hartman
71759356aa Merge 5.10.47 into android13-5.10
Changes in 5.10.47
	module: limit enabling module.sig_enforce
	Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."
	Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell."
	drm: add a locked version of drm_is_current_master
	drm/nouveau: wait for moving fence after pinning v2
	drm/radeon: wait for moving fence after pinning
	drm/amdgpu: wait for moving fence after pinning
	ARM: 9081/1: fix gcc-10 thumb2-kernel regression
	mmc: meson-gx: use memcpy_to/fromio for dram-access-quirk
	MIPS: generic: Update node names to avoid unit addresses
	arm64: Ignore any DMA offsets in the max_zone_phys() calculation
	arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
	spi: spi-nxp-fspi: move the register operation after the clock enable
	Revert "PCI: PM: Do not read power state in pci_enable_device_flags()"
	drm/vc4: hdmi: Move the HSM clock enable to runtime_pm
	drm/vc4: hdmi: Make sure the controller is powered in detect
	x86/entry: Fix noinstr fail in __do_fast_syscall_32()
	x86/xen: Fix noinstr fail in exc_xen_unknown_trap()
	locking/lockdep: Improve noinstr vs errors
	perf/x86/lbr: Remove cpuc->lbr_xsave allocation from atomic context
	perf/x86/intel/lbr: Zero the xstate buffer on allocation
	dmaengine: zynqmp_dma: Fix PM reference leak in zynqmp_dma_alloc_chan_resourc()
	dmaengine: stm32-mdma: fix PM reference leak in stm32_mdma_alloc_chan_resourc()
	dmaengine: xilinx: dpdma: Add missing dependencies to Kconfig
	dmaengine: xilinx: dpdma: Limit descriptor IDs to 16 bits
	mac80211: remove warning in ieee80211_get_sband()
	mac80211_hwsim: drop pending frames on stop
	cfg80211: call cfg80211_leave_ocb when switching away from OCB
	dmaengine: rcar-dmac: Fix PM reference leak in rcar_dmac_probe()
	dmaengine: mediatek: free the proper desc in desc_free handler
	dmaengine: mediatek: do not issue a new desc if one is still current
	dmaengine: mediatek: use GFP_NOWAIT instead of GFP_ATOMIC in prep_dma
	net: ipv4: Remove unneed BUG() function
	mac80211: drop multicast fragments
	net: ethtool: clear heap allocations for ethtool function
	inet: annotate data race in inet_send_prepare() and inet_dgram_connect()
	ping: Check return value of function 'ping_queue_rcv_skb'
	net: annotate data race in sock_error()
	inet: annotate date races around sk->sk_txhash
	net/packet: annotate data race in packet_sendmsg()
	net: phy: dp83867: perform soft reset and retain established link
	riscv32: Use medany C model for modules
	net: caif: fix memory leak in ldisc_open
	net/packet: annotate accesses to po->bind
	net/packet: annotate accesses to po->ifindex
	r8152: Avoid memcpy() over-reading of ETH_SS_STATS
	sh_eth: Avoid memcpy() over-reading of ETH_SS_STATS
	r8169: Avoid memcpy() over-reading of ETH_SS_STATS
	KVM: selftests: Fix kvm_check_cap() assertion
	net: qed: Fix memcpy() overflow of qed_dcbx_params()
	mac80211: reset profile_periodicity/ema_ap
	mac80211: handle various extensible elements correctly
	recordmcount: Correct st_shndx handling
	PCI: Add AMD RS690 quirk to enable 64-bit DMA
	net: ll_temac: Add memory-barriers for TX BD access
	net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY
	perf/x86: Track pmu in per-CPU cpu_hw_events
	pinctrl: stm32: fix the reported number of GPIO lines per bank
	i2c: i801: Ensure that SMBHSTSTS_INUSE_STS is cleared when leaving i801_access
	gpiolib: cdev: zero padding during conversion to gpioline_info_changed
	scsi: sd: Call sd_revalidate_disk() for ioctl(BLKRRPART)
	nilfs2: fix memory leak in nilfs_sysfs_delete_device_group
	s390/stack: fix possible register corruption with stack switch helper
	KVM: do not allow mapping valid but non-reference-counted pages
	i2c: robotfuzz-osif: fix control-request directions
	ceph: must hold snap_rwsem when filling inode for async create
	kthread_worker: split code for canceling the delayed work timer
	kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
	x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
	x86/fpu: Make init_fpstate correct with optimized XSAVE
	mm: add VM_WARN_ON_ONCE_PAGE() macro
	mm/rmap: remove unneeded semicolon in page_not_mapped()
	mm/rmap: use page_not_mapped in try_to_unmap()
	mm, thp: use head page in __migration_entry_wait()
	mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
	mm/thp: make is_huge_zero_pmd() safe and quicker
	mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
	mm/thp: fix vma_address() if virtual address below file offset
	mm/thp: fix page_address_in_vma() on file THP tails
	mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
	mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
	mm: page_vma_mapped_walk(): use page for pvmw->page
	mm: page_vma_mapped_walk(): settle PageHuge on entry
	mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
	mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
	mm: page_vma_mapped_walk(): crossing page table boundary
	mm: page_vma_mapped_walk(): add a level of indentation
	mm: page_vma_mapped_walk(): use goto instead of while (1)
	mm: page_vma_mapped_walk(): get vma_address_end() earlier
	mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
	mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
	mm, futex: fix shared futex pgoff on shmem huge page
	KVM: SVM: Call SEV Guest Decommission if ASID binding fails
	swiotlb: manipulate orig_addr when tlb_addr has offset
	netfs: fix test for whether we can skip read when writing beyond EOF
	Revert "drm: add a locked version of drm_is_current_master"
	certs: Add EFI_CERT_X509_GUID support for dbx entries
	certs: Move load_system_certificate_list to a common function
	certs: Add ability to preload revocation certs
	integrity: Load mokx variables into the blacklist keyring
	Linux 5.10.47

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I48fc1e94475e27da558c2cbedcd253a3d06e535e
2021-06-30 19:29:52 +02:00
Sasha Levin
4357ae26d4 Linux 5.10.47
Tested-by: Fox Chen <foxhlchen@gmail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Hulk Robot <hulkrobot@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30 09:04:24 -04:00
Eric Snowberg
1573d595e2 integrity: Load mokx variables into the blacklist keyring
[ Upstream commit ebd9c2ae36 ]

During boot the Secure Boot Forbidden Signature Database, dbx,
is loaded into the blacklist keyring.  Systems booted with shim
have an equivalent Forbidden Signature Database called mokx.
Currently mokx is only used by shim and grub, the contents are
ignored by the kernel.

Add the ability to load mokx into the blacklist keyring during boot.

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/c33c8e3839a41e9654f41cc92c7231104931b1d7.camel@HansenPartnership.com/
Link: https://lore.kernel.org/r/20210122181054.32635-5-eric.snowberg@oracle.com/ # v5
Link: https://lore.kernel.org/r/161428674320.677100.12637282414018170743.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/161433313205.902181.2502803393898221637.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161529607422.163428.13530426573612578854.stgit@warthog.procyon.org.uk/ # v3
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30 08:47:30 -04:00
Eric Snowberg
c6ae6f89fc certs: Add ability to preload revocation certs
[ Upstream commit d1f044103d ]

Add a new Kconfig option called SYSTEM_REVOCATION_KEYS. If set,
this option should be the filename of a PEM-formated file containing
X.509 certificates to be included in the default blacklist keyring.

DH Changes:
 - Make the new Kconfig option depend on SYSTEM_REVOCATION_LIST.
 - Fix SYSTEM_REVOCATION_KEYS=n, but CONFIG_SYSTEM_REVOCATION_LIST=y[1][2].
 - Use CONFIG_SYSTEM_REVOCATION_LIST for extract-cert[3].
 - Use CONFIG_SYSTEM_REVOCATION_LIST for revocation_certificates.o[3].

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Randy Dunlap <rdunlap@infradead.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/e1c15c74-82ce-3a69-44de-a33af9b320ea@infradead.org/ [1]
Link: https://lore.kernel.org/r/20210303034418.106762-1-eric.snowberg@oracle.com/ [2]
Link: https://lore.kernel.org/r/20210304175030.184131-1-eric.snowberg@oracle.com/ [3]
Link: https://lore.kernel.org/r/20200930201508.35113-3-eric.snowberg@oracle.com/
Link: https://lore.kernel.org/r/20210122181054.32635-4-eric.snowberg@oracle.com/ # v5
Link: https://lore.kernel.org/r/161428673564.677100.4112098280028451629.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/161433312452.902181.4146169951896577982.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161529606657.163428.3340689182456495390.stgit@warthog.procyon.org.uk/ # v3
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30 08:47:30 -04:00
Eric Snowberg
72d6f5d982 certs: Move load_system_certificate_list to a common function
[ Upstream commit 2565ca7f5e ]

Move functionality within load_system_certificate_list to a common
function, so it can be reused in the future.

DH Changes:
 - Added inclusion of common.h to common.c (Eric [1]).

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/EDA280F9-F72D-4181-93C7-CDBE95976FF7@oracle.com/ [1]
Link: https://lore.kernel.org/r/20200930201508.35113-2-eric.snowberg@oracle.com/
Link: https://lore.kernel.org/r/20210122181054.32635-3-eric.snowberg@oracle.com/ # v5
Link: https://lore.kernel.org/r/161428672825.677100.7545516389752262918.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/161433311696.902181.3599366124784670368.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161529605850.163428.7786675680201528556.stgit@warthog.procyon.org.uk/ # v3
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30 08:47:30 -04:00
Eric Snowberg
45109066f6 certs: Add EFI_CERT_X509_GUID support for dbx entries
[ Upstream commit 56c5812623 ]

This fixes CVE-2020-26541.

The Secure Boot Forbidden Signature Database, dbx, contains a list of now
revoked signatures and keys previously approved to boot with UEFI Secure
Boot enabled.  The dbx is capable of containing any number of
EFI_CERT_X509_SHA256_GUID, EFI_CERT_SHA256_GUID, and EFI_CERT_X509_GUID
entries.

Currently when EFI_CERT_X509_GUID are contained in the dbx, the entries are
skipped.

Add support for EFI_CERT_X509_GUID dbx entries. When a EFI_CERT_X509_GUID
is found, it is added as an asymmetrical key to the .blacklist keyring.
Anytime the .platform keyring is used, the keys in the .blacklist keyring
are referenced, if a matching key is found, the key will be rejected.

[DH: Made the following changes:
 - Added to have a config option to enable the facility.  This allows a
   Kconfig solution to make sure that pkcs7_validate_trust() is
   enabled.[1][2]
 - Moved the functions out from the middle of the blacklist functions.
 - Added kerneldoc comments.]

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
cc: Randy Dunlap <rdunlap@infradead.org>
cc: Mickaël Salaün <mic@digikod.net>
cc: Arnd Bergmann <arnd@kernel.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/20200901165143.10295-1-eric.snowberg@oracle.com/ # rfc
Link: https://lore.kernel.org/r/20200909172736.73003-1-eric.snowberg@oracle.com/ # v2
Link: https://lore.kernel.org/r/20200911182230.62266-1-eric.snowberg@oracle.com/ # v3
Link: https://lore.kernel.org/r/20200916004927.64276-1-eric.snowberg@oracle.com/ # v4
Link: https://lore.kernel.org/r/20210122181054.32635-2-eric.snowberg@oracle.com/ # v5
Link: https://lore.kernel.org/r/161428672051.677100.11064981943343605138.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/161433310942.902181.4901864302675874242.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161529605075.163428.14625520893961300757.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/bc2c24e3-ed68-2521-0bf4-a1f6be4a895d@infradead.org/ [1]
Link: https://lore.kernel.org/r/20210225125638.1841436-1-arnd@kernel.org/ [2]
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30 08:47:30 -04:00
Daniel Vetter
0ba128fa68 Revert "drm: add a locked version of drm_is_current_master"
commit f54b3ca7ea upstream.

This reverts commit 1815d9c86e.

Unfortunately this inverts the locking hierarchy, so back to the
drawing board. Full lockdep splat below:

======================================================
WARNING: possible circular locking dependency detected
5.13.0-rc7-CI-CI_DRM_10254+ #1 Not tainted
------------------------------------------------------
kms_frontbuffer/1087 is trying to acquire lock:
ffff88810dcd01a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1b/0x40
but task is already holding lock:
ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&dev->mode_config.mutex){+.+.}-{3:3}:
       __mutex_lock+0xab/0x970
       drm_client_modeset_probe+0x22e/0xca0
       __drm_fb_helper_initial_config_and_unlock+0x42/0x540
       intel_fbdev_initial_config+0xf/0x20 [i915]
       async_run_entry_fn+0x28/0x130
       process_one_work+0x26d/0x5c0
       worker_thread+0x37/0x380
       kthread+0x144/0x170
       ret_from_fork+0x1f/0x30
-> #1 (&client->modeset_mutex){+.+.}-{3:3}:
       __mutex_lock+0xab/0x970
       drm_client_modeset_commit_locked+0x1c/0x180
       drm_client_modeset_commit+0x1c/0x40
       __drm_fb_helper_restore_fbdev_mode_unlocked+0x88/0xb0
       drm_fb_helper_set_par+0x34/0x40
       intel_fbdev_set_par+0x11/0x40 [i915]
       fbcon_init+0x270/0x4f0
       visual_init+0xc6/0x130
       do_bind_con_driver+0x1e5/0x2d0
       do_take_over_console+0x10e/0x180
       do_fbcon_takeover+0x53/0xb0
       register_framebuffer+0x22d/0x310
       __drm_fb_helper_initial_config_and_unlock+0x36c/0x540
       intel_fbdev_initial_config+0xf/0x20 [i915]
       async_run_entry_fn+0x28/0x130
       process_one_work+0x26d/0x5c0
       worker_thread+0x37/0x380
       kthread+0x144/0x170
       ret_from_fork+0x1f/0x30
-> #0 (&dev->master_mutex){+.+.}-{3:3}:
       __lock_acquire+0x151e/0x2590
       lock_acquire+0xd1/0x3d0
       __mutex_lock+0xab/0x970
       drm_is_current_master+0x1b/0x40
       drm_mode_getconnector+0x37e/0x4a0
       drm_ioctl_kernel+0xa8/0xf0
       drm_ioctl+0x1e8/0x390
       __x64_sys_ioctl+0x6a/0xa0
       do_syscall_64+0x39/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: &dev->master_mutex --> &client->modeset_mutex --> &dev->mode_config.mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&dev->mode_config.mutex);
                               lock(&client->modeset_mutex);
                               lock(&dev->mode_config.mutex);
  lock(&dev->master_mutex);
2021-06-30 08:47:30 -04:00
Jeff Layton
0463b49e02 netfs: fix test for whether we can skip read when writing beyond EOF
commit 827a746f40 upstream.

It's not sufficient to skip reading when the pos is beyond the EOF.
There may be data at the head of the page that we need to fill in
before the write.

Add a new helper function that corrects and clarifies the logic of
when we can skip reads, and have it only zero out the part of the page
that won't have data copied in for the write.

Finally, don't set the page Uptodate after zeroing. It's not up to date
since the write data won't have been copied in yet.

[DH made the following changes:

 - Prefixed the new function with "netfs_".

 - Don't call zero_user_segments() for a full-page write.

 - Altered the beyond-last-page check to avoid a DIV instruction and got
   rid of then-redundant zero-length file check.
]

[ Note: this fix is commit 827a746f40 in mainline kernels. The
	original bug was in ceph, but got lifted into the fs/netfs
	library for v5.13. This backport should apply to stable
	kernels v5.10 though v5.12. ]

Fixes: e1b1240c1f ("netfs: Add write_begin helper")
Reported-by: Andrew W Elble <aweits@rit.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: ceph-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20210613233345.113565-1-jlayton@kernel.org/
Link: https://lore.kernel.org/r/162367683365.460125.4467036947364047314.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162391826758.1173366.11794946719301590013.stgit@warthog.procyon.org.uk/ # v2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-30 08:47:29 -04:00
Bumyong Lee
e6108147dd swiotlb: manipulate orig_addr when tlb_addr has offset
commit 5f89468e2f upstream.

in case of driver wants to sync part of ranges with offset,
swiotlb_tbl_sync_single() copies from orig_addr base to tlb_addr with
offset and ends up with data mismatch.

It was removed from
"swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single",
but said logic has to be added back in.

From Linus's email:
"That commit which the removed the offset calculation entirely, because the old

        (unsigned long)tlb_addr & (IO_TLB_SIZE - 1)

was wrong, but instead of removing it, I think it should have just
fixed it to be

        (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);

instead. That way the slot offset always matches the slot index calculation."

(Unfortunatly that broke NVMe).

The use-case that drivers are hitting is as follow:

1. Get dma_addr_t from dma_map_single()

dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE);

    |<---------------vsize------------->|
    +-----------------------------------+
    |                                   | original buffer
    +-----------------------------------+
  vaddr

 swiotlb_align_offset
     |<----->|<---------------vsize------------->|
     +-------+-----------------------------------+
     |       |                                   | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

2. Do something
3. Sync dma_addr_t through dma_sync_single_for_device(..)

dma_sync_single_for_device(dev, tlb_addr + offset, size, DMA_TO_DEVICE);

  Error case.
    Copy data to original buffer but it is from base addr (instead of
  base addr + offset) in original buffer:

 swiotlb_align_offset
     |<----->|<- offset ->|<- size ->|
     +-------+-----------------------------------+
     |       |            |##########|           | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

    |<- size ->|
    +-----------------------------------+
    |##########|                        | original buffer
    +-----------------------------------+
  vaddr

The fix is to copy the data to the original buffer and take into
account the offset, like so:

 swiotlb_align_offset
     |<----->|<- offset ->|<- size ->|
     +-------+-----------------------------------+
     |       |            |##########|           | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

    |<- offset ->|<- size ->|
    +-----------------------------------+
    |            |##########|           | original buffer
    +-----------------------------------+
  vaddr

[One fix which was Linus's that made more sense to as it created a
symmetry would break NVMe. The reason for that is the:
 unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);

would come up with the proper offset, but it would lose the
alignment (which this patch contains).]

Fixes: 16fc3cef33 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-30 08:47:29 -04:00
Alper Gun
7570a8b5dd KVM: SVM: Call SEV Guest Decommission if ASID binding fails
commit 934002cd66 upstream.

Send SEV_CMD_DECOMMISSION command to PSP firmware if ASID binding
fails. If a failure happens after  a successful LAUNCH_START command,
a decommission command should be executed. Otherwise, guest context
will be unfreed inside the AMD SP. After the firmware will not have
memory to allocate more SEV guest context, LAUNCH_START command will
begin to fail with SEV_RET_RESOURCE_LIMIT error.

The existing code calls decommission inside sev_unbind_asid, but it is
not called if a failure happens before guest activation succeeds. If
sev_bind_asid fails, decommission is never called. PSP firmware has a
limit for the number of guests. If sev_asid_binding fails many times,
PSP firmware will not have resources to create another guest context.

Cc: stable@vger.kernel.org
Fixes: 59414c9892 ("KVM: SVM: Add support for KVM_SEV_LAUNCH_START command")
Reported-by: Peter Gonda <pgonda@google.com>
Signed-off-by: Alper Gun <alpergun@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210610174604.2554090-1-alpergun@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-30 08:47:29 -04:00