Commit Graph

134721 Commits

Author SHA1 Message Date
Oliver Upton
4ddee168f4 BACKPORT: KVM: arm64: Implement PSCI SYSTEM_SUSPEND
ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows
software to request that a system be placed in the deepest possible
low-power state. Effectively, software can use this to suspend itself to
RAM.

Unfortunately, there really is no good way to implement a system-wide
PSCI call in KVM. Any precondition checks done in the kernel will need
to be repeated by userspace since there is no good way to protect a
critical section that spans an exit to userspace. SYSTEM_RESET and
SYSTEM_OFF are equally plagued by this issue, although no users have
seemingly cared for the relatively long time these calls have been
supported.

The solution is to just make the whole implementation userspace's
problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that
indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND.
Additionally, add a CAP to get buy-in from userspace for this new exit
type.

Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in.
If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide
explicit documentation of userspace's responsibilites for the exit and
point to the PSCI specification to describe the actual PSCI call.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220504032446.4133305-8-oupton@google.com
(cherry picked from commit bfbab44568)
[willdeacon@: Fixed conflicts in api.rst and uapi header due to missing KVM_CAP_*]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3b9c5633ca1a65aa1c144db1c1bbc34aebfb0b15
2022-08-10 08:58:47 +01:00
Oliver Upton
df1a2d7289 BACKPORT: KVM: arm64: Add support for userspace to suspend a vCPU
Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU
is in a suspended state. In the suspended state the vCPU will block
until a wakeup event (pending interrupt) is recognized.

Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to
userspace that KVM has recognized one such wakeup event. It is the
responsibility of userspace to then make the vCPU runnable, or leave it
suspended until the next wakeup event.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220504032446.4133305-7-oupton@google.com
(cherry picked from commit 7b33a09d03)
[willdeacon@: Drop riscv hunk from Documentation update]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5d7ad6365ad95ec9554a4857949bf94b3bb1454e
2022-08-10 08:58:46 +01:00
Raghavendra Rao Ananta
acd49814e4 UPSTREAM: KVM: arm64: Setup a framework for hypercall bitmap firmware registers
KVM regularly introduces new hypercall services to the guests without
any consent from the userspace. This means, the guests can observe
hypercall services in and out as they migrate across various host
kernel versions. This could be a major problem if the guest
discovered a hypercall, started using it, and after getting migrated
to an older kernel realizes that it's no longer available. Depending
on how the guest handles the change, there's a potential chance that
the guest would just panic.

As a result, there's a need for the userspace to elect the services
that it wishes the guest to discover. It can elect these services
based on the kernels spread across its (migration) fleet. To remedy
this, extend the existing firmware pseudo-registers, such as
KVM_REG_ARM_PSCI_VERSION, but by creating a new COPROC register space
for all the hypercall services available.

These firmware registers are categorized based on the service call
owners, but unlike the existing firmware pseudo-registers, they hold
the features supported in the form of a bitmap.

During the VM initialization, the registers are set to upper-limit of
the features supported by the corresponding registers. It's expected
that the VMMs discover the features provided by each register via
GET_ONE_REG, and write back the desired values using SET_ONE_REG.
KVM allows this modification only until the VM has started.

Some of the standard features are not mapped to any bits of the
registers. But since they can recreate the original problem of
making it available without userspace's consent, they need to
be explicitly added to the case-list in
kvm_hvc_call_default_allowed(). Any function-id that's not enabled
via the bitmap, or not listed in kvm_hvc_call_default_allowed, will
be returned as SMCCC_RET_NOT_SUPPORTED to the guest.

Older userspace code can simply ignore the feature and the
hypercall services will be exposed unconditionally to the guests,
thus ensuring backward compatibility.

In this patch, the framework adds the register only for ARM's standard
secure services (owner value 4). Currently, this includes support only
for ARM True Random Number Generator (TRNG) service, with bit-0 of the
register representing mandatory features of v1.0. Other services are
momentarily added in the upcoming patches.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
[maz: reduced the scope of some helpers, tidy-up bitmap max values,
 dropped error-only fast path]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220502233853.1233742-3-rananta@google.com
(cherry picked from commit 05714cab7d)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5ea92032540fbc6a2b6f778858d20b1e6272f147
2022-08-10 08:58:37 +01:00
Raghavendra Rao Ananta
1a1038b5fe UPSTREAM: KVM: arm64: Factor out firmware register handling from psci.c
Common hypercall firmware register handing is currently employed
by psci.c. Since the upcoming patches add more of these registers,
it's better to move the generic handling to hypercall.c for a
cleaner presentation.

While we are at it, collect all the firmware registers under
fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and
kvm_arm_copy_fw_reg_indices() in a generic way. Also, define
KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead.

No functional change intended.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
[maz: fixed KVM_REG_FEATURE_LEVEL_MASK]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220502233853.1233742-2-rananta@google.com
(cherry picked from commit 85fbe08e4d)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I6e7e0b35bd5c4a9fb6a2b3f4b3bde9db644c1191
2022-08-10 08:58:36 +01:00
Mark Brown
80c133094b UPSTREAM: arm64/sme: Add ptrace support for ZA
The ZA array can be read and written with the NT_ARM_ZA.  Similarly to
our interface for the SVE vector registers the regset consists of a
header with information on the current vector length followed by an
optional register data payload, represented as for signals as a series
of horizontal vectors from 0 to VL/8 in the endianness independent
format used for vectors.

On get if ZA is enabled then register data will be provided, otherwise
it will be omitted.  On set if register data is provided then ZA is
enabled and initialized using the provided data, otherwise it is
disabled.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-22-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 776b4a1cf3)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ibdf9669b3aafe579850e200d19bc013b66d0d96e
2022-08-10 08:58:19 +01:00
Mark Brown
0b6cbc7e9a UPSTREAM: arm64/sme: Implement ptrace support for streaming mode SVE registers
The streaming mode SVE registers are represented using the same data
structures as for SVE but since the vector lengths supported and in use
may not be the same as SVE we represent them with a new type NT_ARM_SSVE.
Unfortunately we only have a single 16 bit reserved field available in
the header so there is no space to fit the current and maximum vector
length for both standard and streaming SVE mode without redefining the
structure in a way the creates a complicatd and fragile ABI. Since FFR
is not present in streaming mode it is read and written as zero.

Setting NT_ARM_SSVE registers will put the task into streaming mode,
similarly setting NT_ARM_SVE registers will exit it. Reads that do not
correspond to the current mode of the task will return the header with
no register data. For compatibility reasons on write setting no flag for
the register type will be interpreted as setting SVE registers, though
users can provide no register data as an alternative mechanism for doing
so.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-21-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit e12310a0d3)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3174e91e2ad91fb8c7dbb1d00bdf6b5994a94744
2022-08-10 08:58:19 +01:00
Mark Brown
12efc92c13 UPSTREAM: arm64/sme: Implement vector length configuration prctl()s
As for SVE provide a prctl() interface which allows processes to
configure their SME vector length.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-12-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 9e4ab6c891)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I99b778aea77b19bac68f1ddb61d77b51f64b49cb
2022-08-10 08:58:11 +01:00
Marc Zyngier
5f72c1a135 UPSTREAM: KVM: arm64: Simplify kvm_cpu_has_pending_timer()
kvm_cpu_has_pending_timer() ends up checking all the possible
timers for a wake-up cause. However, we already check for
pending interrupts whenever we try to wake-up a vcpu, including
the timer interrupts.

Obviously, doing the same work twice is once too many. Reduce
this helper to almost nothing, but keep it around, as we are
going to make use of it soon.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220419182755.601427-4-maz@kernel.org
(cherry picked from commit b57de4ffd7)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I051492d81a822d21c9434d5a09e8f670fe1d788d
2022-08-10 08:58:02 +01:00
Will Deacon
dd68dbc82b UPSTREAM: KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest
PSCI v1.1 introduces the optional SYSTEM_RESET2 call, which allows the
caller to provide a vendor-specific "reset type" and "cookie" to request
a particular form of reset or shutdown.

Expose this call to the guest and handle it in the same way as PSCI
SYSTEM_RESET, along with some basic range checking on the type argument.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-3-will@kernel.org
(cherry picked from commit d43583b890)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ief05340203492a807a6dec27022947aea8635434
2022-08-10 07:29:13 +00:00
Will Deacon
eba129ae15 BACKPORT: KVM: arm64: Bump guest PSCI version to 1.1
Expose PSCI version v1.1 to the guest by default. The only difference
for now is that an updated version number is reported by PSCI_VERSION.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-2-will@kernel.org
(cherry picked from commit 512865d83f)
[willdeacon@: Resolved conflict in arm64/kvm/psci.c the same way as upstream
 merge commit 1a48ce9264 ("Merge branch kvm-arm64/psci-1.1 into
 kvmarm-master/next")]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Icb7bfc49936ab5214370cb2c53087d6cca1b6154
2022-08-10 07:29:13 +00:00
Alexandru Elisei
064ec137e9 UPSTREAM: KVM: arm64: Keep a list of probed PMUs
The ARM PMU driver calls kvm_host_pmu_init() after probing to tell KVM that
a hardware PMU is available for guest emulation. Heterogeneous systems can
have more than one PMU present, and the callback gets called multiple
times, once for each of them. Keep track of all the PMUs available to KVM,
as they're going to be needed later.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-5-alexandru.elisei@arm.com
(cherry picked from commit db858060b1)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5ec338a629e2d3bc1e6a34d99c21b63b0d4ac029
2022-08-10 07:29:13 +00:00
Alexandru Elisei
9144aafdb6 UPSTREAM: perf: Fix wrong name in comment for struct perf_cpu_context
Commit 0793a61d4d ("performance counters: core code") added the perf
subsystem (then called Performance Counters) to Linux, creating the struct
perf_cpu_context. The comment for the struct referred to it as a "struct
perf_counter_cpu_context".

Commit cdd6c482c9 ("perf: Do the big rename: Performance Counters ->
Performance Events") changed the comment to refer to a "struct
perf_event_cpu_context", which was still the wrong name for the struct.

Change the comment to say "struct perf_cpu_context".

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-3-alexandru.elisei@arm.com
(cherry picked from commit 2093057ab8)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I37fd3f756d3745635dd34d1adf0a638f4057f706
2022-08-10 07:29:13 +00:00
Oliver Upton
8ef2ec618a UPSTREAM: KVM: arm64: Drop unused param from kvm_psci_version()
kvm_psci_version() consumes a pointer to struct kvm in addition to a
vcpu pointer. Drop the kvm pointer as it is unused. While the comment
suggests the explicit kvm pointer was useful for calling from hyp, there
exist no such callsite in hyp.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220208012705.640444-1-oupton@google.com
(cherry picked from commit dfefa04a90)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I7c28e5d30f917b3684abaea71f653cf333ccf220
2022-08-10 07:29:13 +00:00
Sean Christopherson
995a174025 UPSTREAM: perf: Drop guest callback (un)register stubs
Drop perf's stubs for (un)registering guest callbacks now that KVM
registration of callbacks is hidden behind GUEST_PERF_EVENTS=y.  The only
other user is x86 XEN_PV, and x86 unconditionally selects PERF_EVENTS.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-18-seanjc@google.com
(cherry picked from commit a9f4a6e92b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I8bedaf81e5818e910e1de94dc71e3f575d63dab4
2022-08-10 07:29:13 +00:00
Sean Christopherson
6213429c0f UPSTREAM: KVM: arm64: Hide kvm_arm_pmu_available behind CONFIG_HW_PERF_EVENTS=y
Move the definition of kvm_arm_pmu_available to pmu-emul.c and, out of
"necessity", hide it behind CONFIG_HW_PERF_EVENTS.  Provide a stub for
the key's wrapper, kvm_arm_support_pmu_v3().  Moving the key's definition
out of perf.c will allow a future commit to delete perf.c entirely.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211111020738.2512932-16-seanjc@google.com
(cherry picked from commit be399d824b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I3cf3214b05d69a6ddb1f76bad0d08116036356ca
2022-08-10 07:29:13 +00:00
Sean Christopherson
a2aa2a6c94 BACKPORT: KVM: Move x86's perf guest info callbacks to generic KVM
Move x86's perf guest callbacks into common KVM, as they are semantically
identical to arm64's callbacks (the only other such KVM callbacks).
arm64 will convert to the common versions in a future patch.

Implement the necessary arm64 arch hooks now to avoid having to provide
stubs or a temporary #define (from x86) to avoid arm64 compilation errors
when CONFIG_GUEST_PERF_EVENTS=y.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211111020738.2512932-13-seanjc@google.com
(cherry picked from commit e1bfc24577)
[willdeacon@: Fix context conflict in x86 asm/kvm_host.h]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Id8184005f4ccc370512bdbf3d3f18612dcee1cd8
2022-08-10 07:29:13 +00:00
Sean Christopherson
6590d00bbd UPSTREAM: perf/core: Use static_call to optimize perf_guest_info_callbacks
Use static_call to optimize perf's guest callbacks on arm64 and x86,
which are now the only architectures that define the callbacks.  Use
DEFINE_STATIC_CALL_RET0 as the default/NULL for all guest callbacks, as
the callback semantics are that a return value '0' means "not in guest".

static_call obviously avoids the overhead of CONFIG_RETPOLINE=y, but is
also advantageous versus other solutions, e.g. per-cpu callbacks, in that
a per-cpu memory load is not needed to detect the !guest case.

Based on code from Peter and Like.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-10-seanjc@google.com
(cherry picked from commit 87b940a067)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Ia6b15724966e0b07b1d0794490ff067a7319c18a
2022-08-10 07:29:13 +00:00
Sean Christopherson
5945311ca4 UPSTREAM: perf: Force architectures to opt-in to guest callbacks
Introduce GUEST_PERF_EVENTS and require architectures to select it to
allow registering and using guest callbacks in perf.  This will hopefully
make it more difficult for new architectures to add useless "support" for
guest callbacks, e.g. via copy+paste.

Stubbing out the helpers has the happy bonus of avoiding a load of
perf_guest_cbs when GUEST_PERF_EVENTS=n on arm64/x86.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-9-seanjc@google.com
(cherry picked from commit 2aef6f306b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I098ae1986a3749ecef2d318c88094f551f3a128f
2022-08-10 07:29:13 +00:00
Sean Christopherson
d97ef4912b UPSTREAM: perf: Add wrappers for invoking guest callbacks
Add helpers for the guest callbacks to prepare for burying the callbacks
behind a Kconfig (it's a lot easier to provide a few stubs than to #ifdef
piles of code), and also to prepare for converting the callbacks to
static_call().  perf_instruction_pointer() in particular will have subtle
semantics with static_call(), as the "no callbacks" case will return 0 if
the callbacks are unregistered between querying guest state and getting
the IP.  Implement the change now to avoid a functional change when adding
static_call() support, and because the new helper needs to return
_something_ in this case.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-8-seanjc@google.com
(cherry picked from commit 1c3430516b)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: Iab2b8f21081e7e08514637da9da64a853dcf25d3
2022-08-10 07:29:13 +00:00
Like Xu
637e737ad5 UPSTREAM: perf/core: Rework guest callbacks to prepare for static_call support
To prepare for using static_calls to optimize perf's guest callbacks,
replace ->is_in_guest and ->is_user_mode with a new multiplexed hook
->state, tweak ->handle_intel_pt_intr to play nice with being called when
there is no active guest, and drop "guest" from ->get_guest_ip.

Return '0' from ->state and ->handle_intel_pt_intr to indicate "not in
guest" so that DEFINE_STATIC_CALL_RET0 can be used to define the static
calls, i.e. no callback == !guest.

[sean: extracted from static_call patch, fixed get_ip() bug, wrote changelog]
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-7-seanjc@google.com
(cherry picked from commit b9f5621c95)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I6b48d04e791954a83487d58c369cee3093e7fe2b
2022-08-10 07:29:13 +00:00
Sean Christopherson
f2ac8ce7c2 UPSTREAM: perf: Stop pretending that perf can handle multiple guest callbacks
Drop the 'int' return value from the perf (un)register callbacks helpers
and stop pretending perf can support multiple callbacks.  The 'int'
returns are not future proofing anything as none of the callers take
action on an error.  It's also not obvious that there will ever be
co-tenant hypervisors, and if there are, that allowing multiple callbacks
to be registered is desirable or even correct.

Opportunistically rename callbacks=>cbs in the affected declarations to
match their definitions.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20211111020738.2512932-5-seanjc@google.com
(cherry picked from commit 2934e3d093)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I90ad0c84d40327a17e12009f48bd12966ab70d0c
2022-08-10 07:29:13 +00:00
Marc Zyngier
c437aada2a UPSTREAM: KVM: Convert kvm_for_each_vcpu() to using xa_for_each_range()
Now that the vcpu array is backed by an xarray, use the optimised
iterator that matches the underlying data structure.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-8-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 214bd3a6f4)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I262005faec3a52c5f2f698d943920c6136bb8144
2022-08-10 07:29:13 +00:00
Marc Zyngier
e3031d06f7 BACKPORT: KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s index
Everywhere we use kvm_for_each_vpcu(), we use an int as the vcpu
index. Unfortunately, we're about to move rework the iterator,
which requires this to be upgrade to an unsigned long.

Let's bite the bullet and repaint all of it in one go.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-7-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 46808a4cb8)
[willdeacon@: Drop riscv hunks; drop x86 hunks for code that isn't present;
 convert kvm_hyperv_tsc_notifier() as well]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I78a3ae244de22575684dd47a9823b9fc61b6fa7d
2022-08-10 07:29:13 +00:00
Marc Zyngier
37766cea76 UPSTREAM: KVM: Convert the kvm->vcpus array to a xarray
At least on arm64 and x86, the vcpus array is pretty huge (up to
1024 entries on x86) and is mostly empty in the majority of the cases
(running 1k vcpu VMs is not that common).

This mean that we end-up with a 4kB block of unused memory in the
middle of the kvm structure.

Instead of wasting away this memory, let's use an xarray instead,
which gives us almost the same flexibility as a normal array, but
with a reduced memory usage with smaller VMs.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-6-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c5b0775491)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I5c89adb8caf5b167536b0e51590c7ee7ec0363d9
2022-08-10 07:29:13 +00:00
Marc Zyngier
25f5ae814c BACKPORT: KVM: Move wiping of the kvm->vcpus array to common code
All architectures have similar loops iterating over the vcpus,
freeing one vcpu at a time, and eventually wiping the reference
off the vcpus array. They are also inconsistently taking
the kvm->lock mutex when wiping the references from the array.

Make this code common, which will simplify further changes.
The locking is dropped altogether, as this should only be called
when there is no further references on the kvm structure.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-2-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 27592ae8db)
[willdeacon@: Drop riscv changes; fix conflict in arm64/kvm/arm.c the
 same way as upstream merge commit 7fd55a02a4 ("Merge tag 'kvmarm-5.17'
 of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD")]
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: If63c359dc841fca1a9baf73283a7a014058b2c97
2022-08-10 07:29:13 +00:00
Jaegeuk Kim
00aeae956e Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.15.y' into android14-5.15
* aosp/upstream-f2fs-stable-linux-5.15.y:
  f2fs: use onstack pages instead of pvec
  f2fs: intorduce f2fs_all_cluster_page_ready
  f2fs: clean up f2fs_abort_atomic_write()
  f2fs: handle decompress only post processing in softirq
  f2fs: do not allow to decompress files have FI_COMPRESS_RELEASED
  f2fs: do not set compression bit if kernel doesn't support
  f2fs: remove device type check for direct IO
  f2fs: fix null-ptr-deref in f2fs_get_dnode_of_data
  f2fs: revive F2FS_IOC_ABORT_VOLATILE_WRITE
  lib/iov_iter: initialize "flags" in new pipe_buffer
  f2fs: fix to do sanity check on segment type in build_sit_entries()
  f2fs: obsolete unused MAX_DISCARD_BLOCKS
  f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page()
  f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time
  f2fs: introduce sysfs atomic write statistics
  f2fs: don't bother wait_ms by foreground gc
  f2fs: invalidate meta pages only for post_read required inode
  f2fs: allow compression of files without blocks
  f2fs: fix to check inline_data during compressed inode conversion
  f2fs: Delete f2fs_copy_page() and replace with memcpy_page()
  f2fs: fix to invalidate META_MAPPING before DIO write
  f2fs: add a sysfs entry to show zone capacity
  f2fs: adjust zone capacity when considering valid block count
  f2fs: enforce single zone capacity
  f2fs: remove redundant code for gc condition
  f2fs: introduce memory mode
  f2fs: initialize page_array_entry slab only if compression feature is on
  f2fs: optimize error handling in redirty_blocks
  f2fs: do not skip updating inode when retrying to flush node page
  f2fs: use the updated test_dummy_encryption helper functions
  f2fs: do not count ENOENT for error case
  f2fs: fix iostat related lock protection
  f2fs: attach inline_data after setting compression
  BACKPORT: block: simplify calling convention of elv_unregister_queue()
  UPSTREAM: blk-crypto: remove blk_crypto_unregister()
  UPSTREAM: blk-crypto: update inline encryption documentation
  BACKPORT: blk-crypto: rename blk_keyslot_manager to blk_crypto_profile
  UPSTREAM: blk-crypto: rename keyslot-manager files to blk-crypto-profile
  UPSTREAM: blk-crypto-fallback: properly prefix function and struct names
  fscrypt: add new helper functions for test_dummy_encryption
  fscrypt: factor out fscrypt_policy_to_key_spec()
  fscrypt: log when starting to use inline encryption
  fscrypt: split up FS_CRYPTO_BLOCK_SIZE

Bug: 228919347
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I45cd7f2e740d19fa477481906b5b42cfd82c6bc8
2022-08-09 09:04:34 -07:00
Andy Shevchenko
0b65cedea1 UPSTREAM: KVM: arm64: vgic: Replace kernel.h with the necessary inclusions
arm_vgic.h does not require all the stuff that kernel.h provides.
Replace kernel.h inclusion with the list of what is really being used.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220104151940.55399-1-andriy.shevchenko@linux.intel.com
(cherry picked from commit 6c9eeb5f4a)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I90713958b02f5d11e925c27716d478744796acf8
2022-08-08 15:24:07 +01:00
Vitaly Kuznetsov
2adcfdc6a6 UPSTREAM: KVM: Drop stale kvm_is_transparent_hugepage() declaration
kvm_is_transparent_hugepage() was removed in commit 205d76ff06 ("KVM:
Remove kvm_is_transparent_hugepage() and PageTransCompoundMap()") but its
declaration in include/linux/kvm_host.h persisted. Drop it.

Fixes: 205d76ff06 (""KVM: Remove kvm_is_transparent_hugepage() and PageTransCompoundMap()")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211018151407.2107363-1-vkuznets@redhat.com
(cherry picked from commit f0e6e6fa41)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 233587962
Bug: 233588291
Change-Id: I9078ab62be40bc843ca2959f929ed22c1b8888e2
2022-08-08 14:57:53 +01:00
Greg Kroah-Hartman
046ce7a74e Merge 5.15.59 into android14-5.15
Changes in 5.15.59
	Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
	Revert "ocfs2: mount shared volume without ha stack"
	ntfs: fix use-after-free in ntfs_ucsncmp()
	fs: sendfile handles O_NONBLOCK of out_fd
	secretmem: fix unhandled fault in truncate
	mm: fix page leak with multiple threads mapping the same page
	hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
	asm-generic: remove a broken and needless ifdef conditional
	s390/archrandom: prevent CPACF trng invocations in interrupt context
	nouveau/svm: Fix to migrate all requested pages
	drm/simpledrm: Fix return type of simpledrm_simple_display_pipe_mode_valid()
	watch_queue: Fix missing rcu annotation
	watch_queue: Fix missing locking in add_watch_to_object()
	tcp: Fix data-races around sysctl_tcp_dsack.
	tcp: Fix a data-race around sysctl_tcp_app_win.
	tcp: Fix a data-race around sysctl_tcp_adv_win_scale.
	tcp: Fix a data-race around sysctl_tcp_frto.
	tcp: Fix a data-race around sysctl_tcp_nometrics_save.
	tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save.
	ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
	ice: do not setup vlan for loopback VSI
	scsi: ufs: host: Hold reference returned by of_parse_phandle()
	Revert "tcp: change pingpong threshold to 3"
	octeontx2-pf: Fix UDP/TCP src and dst port tc filters
	tcp: Fix data-races around sysctl_tcp_moderate_rcvbuf.
	tcp: Fix a data-race around sysctl_tcp_limit_output_bytes.
	tcp: Fix a data-race around sysctl_tcp_challenge_ack_limit.
	scsi: core: Fix warning in scsi_alloc_sgtables()
	scsi: mpt3sas: Stop fw fault watchdog work item during system shutdown
	net: ping6: Fix memleak in ipv6_renew_options().
	ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
	net/tls: Remove the context from the list in tls_device_down
	igmp: Fix data-races around sysctl_igmp_qrv.
	net: pcs: xpcs: propagate xpcs_read error to xpcs_get_state_c37_sgmii
	net: sungem_phy: Add of_node_put() for reference returned by of_get_parent()
	tcp: Fix a data-race around sysctl_tcp_min_tso_segs.
	tcp: Fix a data-race around sysctl_tcp_min_rtt_wlen.
	tcp: Fix a data-race around sysctl_tcp_autocorking.
	tcp: Fix a data-race around sysctl_tcp_invalid_ratelimit.
	Documentation: fix sctp_wmem in ip-sysctl.rst
	macsec: fix NULL deref in macsec_add_rxsa
	macsec: fix error message in macsec_add_rxsa and _txsa
	macsec: limit replay window size with XPN
	macsec: always read MACSEC_SA_ATTR_PN as a u64
	net: macsec: fix potential resource leak in macsec_add_rxsa() and macsec_add_txsa()
	net: mld: fix reference count leak in mld_{query | report}_work()
	tcp: Fix data-races around sk_pacing_rate.
	net: Fix data-races around sysctl_[rw]mem(_offset)?.
	tcp: Fix a data-race around sysctl_tcp_comp_sack_delay_ns.
	tcp: Fix a data-race around sysctl_tcp_comp_sack_slack_ns.
	tcp: Fix a data-race around sysctl_tcp_comp_sack_nr.
	tcp: Fix data-races around sysctl_tcp_reflect_tos.
	ipv4: Fix data-races around sysctl_fib_notify_on_flag_change.
	i40e: Fix interface init with MSI interrupts (no MSI-X)
	sctp: fix sleep in atomic context bug in timer handlers
	octeontx2-pf: cn10k: Fix egress ratelimit configuration
	netfilter: nf_queue: do not allow packet truncation below transport header offset
	virtio-net: fix the race between refill work and close
	perf symbol: Correct address for bss symbols
	sfc: disable softirqs for ptp TX
	sctp: leave the err path free in sctp_stream_init to sctp_stream_free
	ARM: crypto: comment out gcc warning that breaks clang builds
	mm/hmm: fault non-owner device private entries
	page_alloc: fix invalid watermark check on a negative value
	ARM: 9216/1: Fix MAX_DMA_ADDRESS overflow
	EDAC/ghes: Set the DIMM label unconditionally
	docs/kernel-parameters: Update descriptions for "mitigations=" param with retbleed
	locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
	x86/bugs: Do not enable IBPB at firmware entry when IBPB is not available
	Linux 5.15.59

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4f2002d38aea467e150a912f50d456c41b23de89
2022-08-04 15:18:41 +02:00
Marc Zyngier
d7ddd989d6 ANDROID: mm/vmalloc: Add arch-specific callbacks to track io{remap,unmap} physical pages
Add a pair of hooks (ioremap_phys_range_hook/iounmap_phys_range_hook)
that can be implemented by an architecture. Contrary to the existing
arch_sync_kernel_mappings(), this one tracks things at the physical
address level.

This is specially useful in these virtualised environments where
the guest has to tell the host whether (and how) it intends to use
a MMIO device.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Bug: 233587962
Change-Id: I970c2e632cb2b01060d5e66e4194fa9248188f43
Signed-off-by: Will Deacon <willdeacon@google.com>
2022-08-04 13:03:53 +00:00
Will Deacon
f884005542 Revert "FROMGIT: KVM: Drop stale kvm_is_transparent_hugepage() declaration"
This reverts commit 03761cf7c7.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I8d3741c7173b36b4e8f0d651a3f6600b0bbd9b72
2022-08-04 13:03:53 +00:00
Will Deacon
a7ff4a258c Revert "FROMGIT: KVM: arm64: vgic: Replace kernel.h with the necessary inclusions"
This reverts commit de680fdc4b.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I9bb0ad2b102a3415d5a3e57bb3c07ca5b45985f7
2022-08-04 13:03:53 +00:00
Will Deacon
e93e37edf2 Revert "ANDROID: KVM: arm64: Merge vmcr/apr save/restore"
This reverts commit 4eda3ce580.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ieaf3d86587164d809a780e533226a78b158ca4d6
2022-08-04 13:03:53 +00:00
Will Deacon
0dd604b38e Revert "ANDROID: KVM: arm64: Add MEMINFO and {UN,}SHARE hypercalls for protected guests"
This reverts commit da4c4dc33a.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I5f812a7f92fc3762cb455b779f56aedac2e2cb83
2022-08-04 13:03:53 +00:00
Will Deacon
a3c29e1691 Revert "ANDROID: BACKPORT: KVM: arm64: Introduce KVM_VM_TYPE_ARM_PROTECTED machine type for PVMs"
This reverts commit 3c4b7ff736.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I9009040d4630399b83c96140fcf9ddd27148edfd
2022-08-04 13:03:53 +00:00
Will Deacon
41eae257b0 Revert "ANDROID: KVM: arm64: Define MMIO guard hypercalls"
This reverts commit c1f264d4f0.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I178180fa556f79494f19308f624da389276c6bfe
2022-08-04 13:03:53 +00:00
Will Deacon
432cf24bb2 Revert "ANDROID: mm/vmalloc: Add arch-specific callbacks to track io{remap,unmap} physical pages"
This reverts commit acd8b4b1f1.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ie56327837f823b424f3438d5b974c8086f8ba1e9
2022-08-04 13:03:53 +00:00
Will Deacon
1c13daec02 Revert "ANDROID: firmware: arm_ffa: Move constants to header file"
This reverts commit 905f49d35b.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I057a34c49e5e3fd1f44f1f770aee848df6a19289
2022-08-04 13:03:53 +00:00
Will Deacon
12749a81ca Revert "ANDROID: firmware: arm_ffa: Move comment before the field it is documenting"
This reverts commit 1c0baeb7f0.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ia33d00daec02d055a281b4f80372dabfe6834f3b
2022-08-04 13:03:53 +00:00
Will Deacon
b422db4c25 Revert "ANDROID: KVM: arm64: Handle FFA_RXTX_MAP and FFA_RXTX_UNMAP calls from the host"
This reverts commit 64eaaad40f.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I396947e13f07258c59915e47fb7fee8b619dd9e3
2022-08-04 13:03:53 +00:00
Will Deacon
96c5df4973 Revert "FROMLIST: KVM: arm64: Bump guest PSCI version to 1.1"
This reverts commit b5c4401843.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: Ic91d43d8d7fe81554c3a8c7e1b576a4c280ad611
2022-08-04 13:03:53 +00:00
Will Deacon
4c29fddedd Revert "FROMLIST: KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest"
This reverts commit 25aa354adb.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I2fe4cc1a129872e0cd5d65ae974a358b23895526
2022-08-04 13:03:53 +00:00
Will Deacon
a66888825b Revert "ANDROID: KVM: arm64: Use PSCI MEM_PROTECT to zap guest pages on reset"
This reverts commit 4347917056.

Bug: 233587962
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I3224c53a791029db52821f5333fe0ba9da168f64
2022-08-04 13:03:53 +00:00
Kuniyuki Iwashima
618116a273 net: Fix data-races around sysctl_[rw]mem(_offset)?.
[ Upstream commit 0273954595 ]

While reading these sysctl variables, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - .sysctl_rmem
  - .sysctl_rwmem
  - .sysctl_rmem_offset
  - .sysctl_wmem_offset
  - sysctl_tcp_rmem[1, 2]
  - sysctl_tcp_wmem[1, 2]
  - sysctl_decnet_rmem[1]
  - sysctl_decnet_wmem[1]
  - sysctl_tipc_rmem[1]

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-03 12:03:51 +02:00
Ziyang Xuan
189e370b82 ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
commit 85f0173df3 upstream.

Change net device's MTU to smaller than IPV6_MIN_MTU or unregister
device while matching route. That may trigger null-ptr-deref bug
for ip6_ptr probability as following.

=========================================================
BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134
Read of size 4 at addr 0000000000000308 by task ping6/263

CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14
Call trace:
 dump_backtrace+0x1a8/0x230
 show_stack+0x20/0x70
 dump_stack_lvl+0x68/0x84
 print_report+0xc4/0x120
 kasan_report+0x84/0x120
 __asan_load4+0x94/0xd0
 find_match.part.0+0x70/0x134
 __find_rr_leaf+0x408/0x470
 fib6_table_lookup+0x264/0x540
 ip6_pol_route+0xf4/0x260
 ip6_pol_route_output+0x58/0x70
 fib6_rule_lookup+0x1a8/0x330
 ip6_route_output_flags_noref+0xd8/0x1a0
 ip6_route_output_flags+0x58/0x160
 ip6_dst_lookup_tail+0x5b4/0x85c
 ip6_dst_lookup_flow+0x98/0x120
 rawv6_sendmsg+0x49c/0xc70
 inet_sendmsg+0x68/0x94

Reproducer as following:
Firstly, prepare conditions:
$ip netns add ns1
$ip netns add ns2
$ip link add veth1 type veth peer name veth2
$ip link set veth1 netns ns1
$ip link set veth2 netns ns2
$ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1
$ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2
$ip netns exec ns1 ifconfig veth1 up
$ip netns exec ns2 ifconfig veth2 up
$ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1
$ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1

Secondly, execute the following two commands in two ssh windows
respectively:
$ip netns exec ns1 sh
$while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done

$ip netns exec ns1 sh
$while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done

It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly,
then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check.

	cpu0			cpu1
fib6_table_lookup
__find_rr_leaf
			addrconf_notify [ NETDEV_CHANGEMTU ]
			addrconf_ifdown
			RCU_INIT_POINTER(dev->ip6_ptr, NULL)
find_match
ip6_ignore_linkdown

So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to
fix the null-ptr-deref bug.

Fixes: dcd1f57295 ("net/ipv6: Remove fib6_idev")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:03:47 +02:00
Wei Wang
927c5cf0ba Revert "tcp: change pingpong threshold to 3"
commit 4d8f24eeed upstream.

This reverts commit 4a41f453be.

This to-be-reverted commit was meant to apply a stricter rule for the
stack to enter pingpong mode. However, the condition used to check for
interactive session "before(tp->lsndtime, icsk->icsk_ack.lrcvtime)" is
jiffy based and might be too coarse, which delays the stack entering
pingpong mode.
We revert this patch so that we no longer use the above condition to
determine interactive session, and also reduce pingpong threshold to 1.

Fixes: 4a41f453be ("tcp: change pingpong threshold to 3")
Reported-by: LemmyHuang <hlm3280@163.com>
Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220721204404.388396-1-weiwan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:03:45 +02:00
Kuniyuki Iwashima
0d8fa3c2a4 tcp: Fix a data-race around sysctl_tcp_adv_win_scale.
commit 36eeee75ef upstream.

While reading sysctl_tcp_adv_win_scale, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:03:44 +02:00
Lukas Bulwahn
71f7115011 asm-generic: remove a broken and needless ifdef conditional
commit e2a619ca0b upstream.

Commit 527701eda5 ("lib: Add a generic version of devmem_is_allowed()")
introduces the config symbol GENERIC_LIB_DEVMEM_IS_ALLOWED, but then
falsely refers to CONFIG_GENERIC_DEVMEM_IS_ALLOWED (note the missing LIB
in the reference) in ./include/asm-generic/io.h.

Luckily, ./scripts/checkkconfigsymbols.py warns on non-existing configs:

GENERIC_DEVMEM_IS_ALLOWED
Referencing files: include/asm-generic/io.h

The actual fix, though, is simply to not to make this function declaration
dependent on any kernel config. For architectures that intend to use
the generic version, the arch's 'select GENERIC_LIB_DEVMEM_IS_ALLOWED' will
lead to picking the function definition, and for other architectures, this
function is simply defined elsewhere.

The wrong '#ifndef' on a non-existing config symbol also always had the
same effect (although more by mistake than by intent). So, there is no
functional change.

Remove this broken and needless ifdef conditional.

Fixes: 527701eda5 ("lib: Add a generic version of devmem_is_allowed()")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:03:42 +02:00
Luiz Augusto von Dentz
f32d5615a7 Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
commit d0be8347c6 upstream.

This fixes the following trace which is caused by hci_rx_work starting up
*after* the final channel reference has been put() during sock_close() but
*before* the references to the channel have been destroyed, so instead
the code now rely on kref_get_unless_zero/l2cap_chan_hold_unless_zero to
prevent referencing a channel that is about to be destroyed.

  refcount_t: increment on 0; use-after-free.
  BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
  Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705

  CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W
  4.14.234-00003-g1fb6d0bd49a4-dirty #28
  Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150
  Google Inc. MSM sm8150 Flame DVT (DT)
  Workqueue: hci0 hci_rx_work
  Call trace:
   dump_backtrace+0x0/0x378
   show_stack+0x20/0x2c
   dump_stack+0x124/0x148
   print_address_description+0x80/0x2e8
   __kasan_report+0x168/0x188
   kasan_report+0x10/0x18
   __asan_load4+0x84/0x8c
   refcount_dec_and_test+0x20/0xd0
   l2cap_chan_put+0x48/0x12c
   l2cap_recv_frame+0x4770/0x6550
   l2cap_recv_acldata+0x44c/0x7a4
   hci_acldata_packet+0x100/0x188
   hci_rx_work+0x178/0x23c
   process_one_work+0x35c/0x95c
   worker_thread+0x4cc/0x960
   kthread+0x1a8/0x1c4
   ret_from_fork+0x10/0x18

Cc: stable@kernel.org
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03 12:03:40 +02:00
Greg Kroah-Hartman
6d2ac8a0a4 Merge 5.15.58 into android14-5.15
Changes in 5.15.58
	pinctrl: stm32: fix optional IRQ support to gpios
	riscv: add as-options for modules with assembly compontents
	mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
	lockdown: Fix kexec lockdown bypass with ima policy
	drm/ttm: fix locking in vmap/vunmap TTM GEM helpers
	bus: mhi: host: pci_generic: add Telit FN980 v1 hardware revision
	bus: mhi: host: pci_generic: add Telit FN990
	Revert "selftest/vm: verify remap destination address in mremap_test"
	Revert "selftest/vm: verify mmap addr in mremap_test"
	PCI: hv: Fix multi-MSI to allow more than one MSI vector
	PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
	PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
	PCI: hv: Fix interrupt mapping for multi-MSI
	serial: mvebu-uart: correctly report configured baudrate value
	batman-adv: Use netif_rx_any_context() any.
	Revert "mt76: mt7921: Fix the error handling path of mt7921_pci_probe()"
	Revert "mt76: mt7921e: fix possible probe failure after reboot"
	mt76: mt7921: use physical addr to unify register access
	mt76: mt7921e: fix possible probe failure after reboot
	mt76: mt7921: Fix the error handling path of mt7921_pci_probe()
	xfs: fix maxlevels comparisons in the btree staging code
	xfs: fold perag loop iteration logic into helper function
	xfs: rename the next_agno perag iteration variable
	xfs: terminate perag iteration reliably on agcount
	xfs: fix perag reference leak on iteration race with growfs
	xfs: prevent a WARN_ONCE() in xfs_ioc_attr_list()
	r8152: fix a WOL issue
	ip: Fix data-races around sysctl_ip_default_ttl.
	xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup()
	power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe
	RDMA/irdma: Do not advertise 1GB page size for x722
	RDMA/irdma: Fix sleep from invalid context BUG
	pinctrl: ralink: rename MT7628(an) functions to MT76X8
	pinctrl: ralink: rename pinctrl-rt2880 to pinctrl-ralink
	pinctrl: ralink: Check for null return of devm_kcalloc
	perf/core: Fix data race between perf_event_set_output() and perf_mmap_close()
	ipv4/tcp: do not use per netns ctl sockets
	net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement"
	mm/pagealloc: sysctl: change watermark_scale_factor max limit to 30%
	sysctl: move some boundary constants from sysctl.c to sysctl_vals
	tcp: Fix data-races around sysctl_tcp_ecn.
	drm/amd/display: Support for DMUB HPD interrupt handling
	drm/amd/display: Add option to defer works of hpd_rx_irq
	drm/amd/display: Fork thread to offload work of hpd_rx_irq
	drm/amdgpu/display: add quirk handling for stutter mode
	drm/amd/display: Ignore First MST Sideband Message Return Error
	scsi: megaraid: Clear READ queue map's nr_queues
	scsi: ufs: core: Drop loglevel of WriteBoost message
	nvme: check for duplicate identifiers earlier
	nvme: fix block device naming collision
	e1000e: Enable GPT clock before sending message to CSME
	Revert "e1000e: Fix possible HW unit hang after an s0ix exit"
	igc: Reinstate IGC_REMOVED logic and implement it properly
	ip: Fix data-races around sysctl_ip_no_pmtu_disc.
	ip: Fix data-races around sysctl_ip_fwd_use_pmtu.
	ip: Fix data-races around sysctl_ip_fwd_update_priority.
	ip: Fix data-races around sysctl_ip_nonlocal_bind.
	ip: Fix a data-race around sysctl_ip_autobind_reuse.
	ip: Fix a data-race around sysctl_fwmark_reflect.
	tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept.
	tcp: sk->sk_bound_dev_if once in inet_request_bound_dev_if()
	tcp: Fix data-races around sysctl_tcp_l3mdev_accept.
	tcp: Fix data-races around sysctl_tcp_mtu_probing.
	tcp: Fix data-races around sysctl_tcp_base_mss.
	tcp: Fix data-races around sysctl_tcp_min_snd_mss.
	tcp: Fix a data-race around sysctl_tcp_mtu_probe_floor.
	tcp: Fix a data-race around sysctl_tcp_probe_threshold.
	tcp: Fix a data-race around sysctl_tcp_probe_interval.
	net: stmmac: fix pm runtime issue in stmmac_dvr_remove()
	net: stmmac: fix unbalanced ptp clock issue in suspend/resume flow
	mtd: rawnand: gpmi: validate controller clock rate
	mtd: rawnand: gpmi: Set WAIT_FOR_READY timeout based on program/erase times
	net: dsa: microchip: ksz_common: Fix refcount leak bug
	net: skb: introduce kfree_skb_reason()
	net: skb: use kfree_skb_reason() in tcp_v4_rcv()
	net: skb: use kfree_skb_reason() in __udp4_lib_rcv()
	net: socket: rename SKB_DROP_REASON_SOCKET_FILTER
	net: skb_drop_reason: add document for drop reasons
	net: netfilter: use kfree_drop_reason() for NF_DROP
	net: ipv4: use kfree_skb_reason() in ip_rcv_core()
	net: ipv4: use kfree_skb_reason() in ip_rcv_finish_core()
	i2c: mlxcpld: Fix register setting for 400KHz frequency
	i2c: cadence: Change large transfer count reset logic to be unconditional
	perf tests: Fix Convert perf time to TSC test for hybrid
	net: stmmac: fix dma queue left shift overflow issue
	net/tls: Fix race in TLS device down flow
	igmp: Fix data-races around sysctl_igmp_llm_reports.
	igmp: Fix a data-race around sysctl_igmp_max_memberships.
	igmp: Fix data-races around sysctl_igmp_max_msf.
	tcp: Fix data-races around keepalive sysctl knobs.
	tcp: Fix data-races around sysctl_tcp_syn(ack)?_retries.
	tcp: Fix data-races around sysctl_tcp_syncookies.
	tcp: Fix data-races around sysctl_tcp_migrate_req.
	tcp: Fix data-races around sysctl_tcp_reordering.
	tcp: Fix data-races around some timeout sysctl knobs.
	tcp: Fix a data-race around sysctl_tcp_notsent_lowat.
	tcp: Fix a data-race around sysctl_tcp_tw_reuse.
	tcp: Fix data-races around sysctl_max_syn_backlog.
	tcp: Fix data-races around sysctl_tcp_fastopen.
	tcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout.
	iavf: Fix handling of dummy receive descriptors
	pinctrl: armada-37xx: Use temporary variable for struct device
	pinctrl: armada-37xx: Make use of the devm_platform_ioremap_resource()
	pinctrl: armada-37xx: Convert to use dev_err_probe()
	pinctrl: armada-37xx: use raw spinlocks for regmap to avoid invalid wait context
	i40e: Fix erroneous adapter reinitialization during recovery process
	ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero
	net: stmmac: remove redunctant disable xPCS EEE call
	gpio: pca953x: only use single read/write for No AI mode
	gpio: pca953x: use the correct range when do regmap sync
	gpio: pca953x: use the correct register address when regcache sync during init
	be2net: Fix buffer overflow in be_get_module_eeprom
	net: dsa: sja1105: silent spi_device_id warnings
	net: dsa: vitesse-vsc73xx: silent spi_device_id warnings
	drm/imx/dcss: Add missing of_node_put() in fail path
	ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.
	ipv4: Fix data-races around sysctl_fib_multipath_hash_policy.
	ipv4: Fix data-races around sysctl_fib_multipath_hash_fields.
	ip: Fix data-races around sysctl_ip_prot_sock.
	udp: Fix a data-race around sysctl_udp_l3mdev_accept.
	tcp: Fix data-races around sysctl knobs related to SYN option.
	tcp: Fix a data-race around sysctl_tcp_early_retrans.
	tcp: Fix data-races around sysctl_tcp_recovery.
	tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.
	tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
	tcp: Fix a data-race around sysctl_tcp_retrans_collapse.
	tcp: Fix a data-race around sysctl_tcp_stdurg.
	tcp: Fix a data-race around sysctl_tcp_rfc1337.
	tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.
	tcp: Fix data-races around sysctl_tcp_max_reordering.
	gpio: gpio-xilinx: Fix integer overflow
	KVM: selftests: Fix target thread to be migrated in rseq_test
	spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers
	KVM: Don't null dereference ops->destroy
	mm/mempolicy: fix uninit-value in mpol_rebind_policy()
	bpf: Make sure mac_header was set before using it
	sched/deadline: Fix BUG_ON condition for deboosted tasks
	x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts
	dlm: fix pending remove if msg allocation fails
	x86/uaccess: Implement macros for CMPXCHG on user addresses
	x86/extable: Tidy up redundant handler functions
	x86/extable: Get rid of redundant macros
	x86/mce: Deduplicate exception handling
	x86/extable: Rework the exception table mechanics
	x86/extable: Provide EX_TYPE_DEFAULT_MCE_SAFE and EX_TYPE_FAULT_MCE_SAFE
	bitfield.h: Fix "type of reg too small for mask" test
	x86/entry_32: Remove .fixup usage
	x86/extable: Extend extable functionality
	x86/msr: Remove .fixup usage
	x86/futex: Remove .fixup usage
	KVM: x86: Use __try_cmpxchg_user() to emulate atomic accesses
	xhci: dbc: refactor xhci_dbc_init()
	xhci: dbc: create and remove dbc structure in dbgtty driver.
	xhci: dbc: Rename xhci_dbc_init and xhci_dbc_exit
	xhci: Set HCD flag to defer primary roothub registration
	mt76: fix use-after-free by removing a non-RCU wcid pointer
	iwlwifi: fw: uefi: add missing include guards
	crypto: qat - set to zero DH parameters before free
	crypto: qat - use pre-allocated buffers in datapath
	crypto: qat - refactor submission logic
	crypto: qat - add backlog mechanism
	crypto: qat - fix memory leak in RSA
	crypto: qat - remove dma_free_coherent() for RSA
	crypto: qat - remove dma_free_coherent() for DH
	crypto: qat - add param check for RSA
	crypto: qat - add param check for DH
	crypto: qat - re-enable registration of algorithms
	exfat: fix referencing wrong parent directory information after renaming
	tracing: Have event format check not flag %p* on __get_dynamic_array()
	tracing: Place trace_pid_list logic into abstract functions
	tracing: Fix return value of trace_pid_write()
	um: virtio_uml: Allow probing from devicetree
	um: virtio_uml: Fix broken device handling in time-travel
	Bluetooth: Add bt_skb_sendmsg helper
	Bluetooth: Add bt_skb_sendmmsg helper
	Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg
	Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg
	Bluetooth: Fix passing NULL to PTR_ERR
	Bluetooth: SCO: Fix sco_send_frame returning skb->len
	Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks
	exfat: use updated exfat_chain directly during renaming
	drm/amd/display: Reset DMCUB before HW init
	drm/amd/display: Optimize bandwidth on following fast update
	drm/amd/display: Fix surface optimization regression on Carrizo
	x86/amd: Use IBPB for firmware calls
	x86/alternative: Report missing return thunk details
	watchqueue: make sure to serialize 'wqueue->defunct' properly
	tty: drivers/tty/, stop using tty_schedule_flip()
	tty: the rest, stop using tty_schedule_flip()
	tty: drop tty_schedule_flip()
	tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push()
	tty: use new tty_insert_flip_string_and_push_buffer() in pty_write()
	net: usb: ax88179_178a needs FLAG_SEND_ZLP
	watch-queue: remove spurious double semicolon
	drm/amd/display: Don't lock connection_mutex for DMUB HPD
	drm/amd/display: invalid parameter check in dmub_hpd_callback
	x86/extable: Prefer local labels in .set directives
	KVM: x86: fix typo in __try_cmpxchg_user causing non-atomicness
	x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm()
	drm/amdgpu: Off by one in dm_dmub_outbox1_low_irq()
	x86/entry_32: Fix segment exceptions
	drm/amd/display: Fix wrong format specifier in amdgpu_dm.c
	Linux 5.15.58

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6655a937b4226d3011278d13df84b25e5ab4b9ef
2022-08-02 08:37:15 +02:00