Commit Graph

204849 Commits

Author SHA1 Message Date
Linus Torvalds
150aae354b Merge tag 'perf_urgent_for_v6.2_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:

 - Pass only an initialized perf event attribute to the LSM hook

 - Fix a use-after-free on the perf syscall's error path

 - A potential integer overflow fix in amd_core_pmu_init()

 - Fix the cgroup events tracking after the context handling rewrite

 - Return the proper value from the inherit_event() function on error

* tag 'perf_urgent_for_v6.2_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Call LSM hook after copying perf_event_attr
  perf: Fix use-after-free in error path
  perf/x86/amd: fix potential integer overflow on shift of a int
  perf/core: Fix cgroup events tracking
  perf core: Return error pointer if inherit_event() fails to find pmu_ctx
2023-01-01 11:27:00 -08:00
Linus Torvalds
5b129817ae Merge tag 'x86_urgent_for_v6.2_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Two fixes to correct how kprobes handles INT3 now that they're added
   by other functionality like the rethunks and not only kgdb

 - Remove __init section markings of two functions which are referenced
   by a function in the .text section

* tag 'x86_urgent_for_v6.2_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK
  x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNK
  x86/calldepth: Fix incorrect init section references
2023-01-01 11:19:50 -08:00
Paolo Bonzini
a5496886eb Merge branch 'kvm-late-6.1-fixes' into HEAD
x86:

* several fixes to nested VMX execution controls

* fixes and clarification to the documentation for Xen emulation

* do not unnecessarily release a pmu event with zero period

* MMU fixes

* fix Coverity warning in kvm_hv_flush_tlb()

selftests:

* fixes for the ucall mechanism in selftests

* other fixes mostly related to compilation with clang
2022-12-28 07:19:14 -05:00
Paolo Bonzini
a79b53aaaa KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET
While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running,
if that happened it could cause a deadlock.  This is due to
kvm_xen_eventfd_reset() doing a synchronize_srcu() inside
a kvm->lock critical section.

To avoid this, first collect all the evtchnfd objects in an
array and free all of them once the kvm->lock critical section
is over and th SRCU grace period has expired.

Reported-by: Michal Luczaj <mhal@rbox.co>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-28 05:53:57 -05:00
Masami Hiramatsu (Google)
63dc6325ff x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK
Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping
speculative execution after function return, kprobe jump optimization
always fails on the functions with such INT3 inside the function body.
(It already checks the INT3 padding between functions, but not inside
 the function)

To avoid this issue, as same as kprobes, check whether the INT3 comes
from kgdb or not, and if so, stop decoding and make it fail. The other
INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be
treated as a one-byte instruction.

Fixes: e463a09af2 ("x86: Add straight-line-speculation mitigation")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/167146051929.1374301.7419382929328081706.stgit@devnote3
2022-12-27 12:51:58 +01:00
Masami Hiramatsu (Google)
1993bf9799 x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNK
Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping
speculative execution after RET instruction, kprobes always failes to
check the probed instruction boundary by decoding the function body if
the probed address is after such sequence. (Note that some conditional
code blocks will be placed after function return, if compiler decides
it is not on the hot path.)

This is because kprobes expects kgdb puts the INT3 as a software
breakpoint and it will replace the original instruction.
But these INT3 are not such purpose, it doesn't need to recover the
original instruction.

To avoid this issue, kprobes checks whether the INT3 is owned by
kgdb or not, and if so, stop decoding and make it fail. The other
INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be
treated as a one-byte instruction.

Fixes: e463a09af2 ("x86: Add straight-line-speculation mitigation")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/167146051026.1374301.392728975473572291.stgit@devnote3
2022-12-27 12:51:58 +01:00
Arnd Bergmann
ade8c20847 x86/calldepth: Fix incorrect init section references
The addition of callthunks_translate_call_dest means that
skip_addr() and patch_dest() can no longer be discarded
as part of the __init section freeing:

WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> skip_addr (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> patch_dest (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: is_callthunk.cold (section: .text.unlikely) -> skip_addr (section: .init.text)
ERROR: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.

Fixes: b2e9dfe54b ("x86/bpf: Emit call depth accounting if required")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221215164334.968863-1-arnd@kernel.org
2022-12-27 12:51:58 +01:00
Colin Ian King
08245672cd perf/x86/amd: fix potential integer overflow on shift of a int
The left shift of int 32 bit integer constant 1 is evaluated using 32 bit
arithmetic and then passed as a 64 bit function argument. In the case where
i is 32 or more this can lead to an overflow.  Avoid this by shifting
using the BIT_ULL macro instead.

Fixes: 471af006a7 ("perf/x86/amd: Constrain Large Increment per Cycle events")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lore.kernel.org/r/20221202135149.1797974-1-colin.i.king@gmail.com
2022-12-27 12:44:00 +01:00
David Woodhouse
b0305c1e0e KVM: x86/xen: Add KVM_XEN_INVALID_GPA and KVM_XEN_INVALID_GFN to uapi
These are (uint64_t)-1 magic values are a userspace ABI, allowing the
shared info pages and other enlightenments to be disabled. This isn't
a Xen ABI because Xen doesn't let the guest turn these off except with
the full SHUTDOWN_soft_reset mechanism. Under KVM, the userspace VMM is
expected to handle soft reset, and tear down the kernel parts of the
enlightenments accordingly.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:01:49 -05:00
Michal Luczaj
1c14faa508 KVM: x86/xen: Simplify eventfd IOCTLs
Port number is validated in kvm_xen_setattr_evtchn().
Remove superfluous checks in kvm_xen_eventfd_assign() and
kvm_xen_eventfd_update().

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Message-Id: <20221222203021.1944101-3-mhal@rbox.co>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-4-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:01:49 -05:00
Paolo Bonzini
70eae03087 KVM: x86/xen: Fix SRCU/RCU usage in readers of evtchn_ports
The evtchnfd structure itself must be protected by either kvm->lock or
SRCU. Use the former in kvm_xen_eventfd_update(), since the lock is
being taken anyway; kvm_xen_hcall_evtchn_send() instead is a reader and
does not need kvm->lock, and is called in SRCU critical section from the
kvm_x86_handle_exit function.

It is also important to use rcu_read_{lock,unlock}() in
kvm_xen_hcall_evtchn_send(), because idr_remove() will *not*
use synchronize_srcu() to wait for readers to complete.

Remove a superfluous if (kvm) check before calling synchronize_srcu()
in kvm_xen_eventfd_deassign() where kvm has been dereferenced already.

Co-developed-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:01:49 -05:00
David Woodhouse
92c58965e9 KVM: x86/xen: Use kvm_read_guest_virt() instead of open-coding it badly
In particular, we shouldn't assume that being contiguous in guest virtual
address space means being contiguous in guest *physical* address space.

In dropping the manual calls to kvm_mmu_gva_to_gpa_system(), also drop
the srcu_read_lock() that was around them. All call sites are reached
from kvm_xen_hypercall() which is called from the handle_exit function
with the read lock already held.

       536395260 ("KVM: x86/xen: handle PV timers oneshot mode")
       1a65105a5 ("KVM: x86/xen: handle PV spinlocks slowpath")

Fixes: 2fd6df2f2 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:01:48 -05:00
Michal Luczaj
385407a69d KVM: x86/xen: Fix memory leak in kvm_xen_write_hypercall_page()
Release page irrespectively of kvm_vcpu_write_guest() return value.

Suggested-by: Paul Durrant <paul@xen.org>
Fixes: 23200b7a30 ("KVM: x86/xen: intercept xen hypercalls if enabled")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Message-Id: <20221220151454.712165-1-mhal@rbox.co>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221226120320.1125390-1-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:01:48 -05:00
Lai Jiangshan
562f5bc48a kvm: x86/mmu: Remove duplicated "be split" in spte.h
"be split be split" -> "be split"

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <20221207120505.9175-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-27 06:00:51 -05:00
Steven Rostedt (Google)
292a089d78 treewide: Convert del_timer*() to timer_shutdown*()
Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown".  After a timer is set to this state, then it can no
longer be re-armed.

The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed.  It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.

This was created by using a coccinelle script and the following
commands:

    $ cat timer.cocci
    @@
    expression ptr, slab;
    identifier timer, rfield;
    @@
    (
    -       del_timer(&ptr->timer);
    +       timer_shutdown(&ptr->timer);
    |
    -       del_timer_sync(&ptr->timer);
    +       timer_shutdown_sync(&ptr->timer);
    )
      ... when strict
          when != ptr->timer
    (
            kfree_rcu(ptr, rfield);
    |
            kmem_cache_free(slab, ptr);
    |
            kfree(ptr);
    )

    $ spatch timer.cocci . > /tmp/t.patch
    $ patch -p1 < /tmp/t.patch

Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-25 13:38:09 -08:00
Linus Torvalds
06d65a6f64 Merge tag 'mips_6.2_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
 "Fixes due to DT changes"

* tag 'mips_6.2_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: dts: bcm63268: Add missing properties to the TWD node
  MIPS: ralink: mt7621: avoid to init common ralink reset controller
2022-12-23 10:49:45 -08:00
Sean Christopherson
50a9ac2598 KVM: x86/mmu: Don't install TDP MMU SPTE if SP has unexpected level
Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
match the target level of the fault, and instead have the vCPU retry the
faulting instruction after warning.  Continuing on is completely
unnecessary as the absolute worst case scenario of retrying is DoSing
the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.

  ------------[ cut here ]------------
  kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
  invalid opcode: 0000 [#1] SMP
  CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G        W          6.1.0-rc4+ #64
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
  RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
  RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
  RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
  RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
  R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
  R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
  FS:  00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   kvm_tdp_mmu_map+0x3b0/0x510
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  Modules linked in: kvm_intel
  ---[ end trace 0000000000000000 ]---

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:33:53 -05:00
Sean Christopherson
21a36ac6b6 KVM: x86/mmu: Re-check under lock that TDP MMU SP hugepage is disallowed
Re-check sp->nx_huge_page_disallowed under the tdp_mmu_pages_lock spinlock
when adding a new shadow page in the TDP MMU.  To ensure the NX reclaim
kthread can't see a not-yet-linked shadow page, the page fault path links
the new page table prior to adding the page to possible_nx_huge_pages.

If the page is zapped by different task, e.g. because dirty logging is
disabled, between linking the page and adding it to the list, KVM can end
up triggering use-after-free by adding the zapped SP to the aforementioned
list, as the zapped SP's memory is scheduled for removal via RCU callback.
The bug is detected by the sanity checks guarded by CONFIG_DEBUG_LIST=y,
i.e. the below splat is just one possible signature.

  ------------[ cut here ]------------
  list_add corruption. prev->next should be next (ffffc9000071fa70), but was ffff88811125ee38. (prev=ffff88811125ee38).
  WARNING: CPU: 1 PID: 953 at lib/list_debug.c:30 __list_add_valid+0x79/0xa0
  Modules linked in: kvm_intel
  CPU: 1 PID: 953 Comm: nx_huge_pages_t Tainted: G        W          6.1.0-rc4+ #71
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__list_add_valid+0x79/0xa0
  RSP: 0018:ffffc900006efb68 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff888116cae8a0 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 0000000100001872 RDI: ffff888277c5b4c8
  RBP: ffffc90000717000 R08: ffff888277c5b4c0 R09: ffffc900006efa08
  R10: 0000000000199998 R11: 0000000000199a20 R12: ffff888116cae930
  R13: ffff88811125ee38 R14: ffffc9000071fa70 R15: ffff88810b794f90
  FS:  00007fc0415d2740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000115201006 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   track_possible_nx_huge_page+0x53/0x80
   kvm_tdp_mmu_map+0x242/0x2c0
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  ---[ end trace 0000000000000000 ]---

Fixes: 61f9447854 ("KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE")
Reported-by: Greg Thelen <gthelen@google.com>
Analyzed-by: David Matlack <dmatlack@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:33:53 -05:00
Sean Christopherson
80a3e4ae96 KVM: x86/mmu: Map TDP MMU leaf SPTE iff target level is reached
Map the leaf SPTE when handling a TDP MMU page fault if and only if the
target level is reached.  A recent commit reworked the retry logic and
incorrectly assumed that walking SPTEs would never "fail", as the loop
either bails (retries) or installs parent SPs.  However, the iterator
itself will bail early if it detects a frozen (REMOVED) SPTE when
stepping down.   The TDP iterator also rereads the current SPTE before
stepping down specifically to avoid walking into a part of the tree that
is being removed, which means it's possible to terminate the loop without
the guts of the loop observing the frozen SPTE, e.g. if a different task
zaps a parent SPTE between the initial read and try_step_down()'s refresh.

Mapping a leaf SPTE at the wrong level results in all kinds of badness as
page table walkers interpret the SPTE as a page table, not a leaf, and
walk into the weeds.

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 1025 at arch/x86/kvm/mmu/tdp_mmu.c:1070 kvm_tdp_mmu_map+0x481/0x510
  Modules linked in: kvm_intel
  CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G        W          6.1.0-rc4+ #64
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_tdp_mmu_map+0x481/0x510
  RSP: 0018:ffffc9000072fba8 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffffc9000072fcc0 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
  RBP: ffff888107d45a10 R08: ffff888277c5b4c0 R09: ffffc9000072fa48
  R10: 0000000000000001 R11: 0000000000000001 R12: ffffc9000073a0e0
  R13: ffff88810fc54800 R14: ffff888107d1ae60 R15: ffff88810fc54f90
  FS:  00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  ---[ end trace 0000000000000000 ]---
  Invalid SPTE change: cannot replace a present leaf
  SPTE with another present leaf SPTE mapping a
  different PFN!
  as_id: 0 gfn: 100200 old_spte: 600000112400bf3 new_spte: 6000001126009f3 level: 2
  ------------[ cut here ]------------
  kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
  invalid opcode: 0000 [#1] SMP
  CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G        W          6.1.0-rc4+ #64
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
  RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
  RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
  RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
  RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
  R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
  R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
  FS:  00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   kvm_tdp_mmu_map+0x3b0/0x510
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  Modules linked in: kvm_intel
  ---[ end trace 0000000000000000 ]---

Fixes: 63d28a25e0 ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:33:52 -05:00
Sean Christopherson
f5d16bb9be KVM: x86/mmu: Don't attempt to map leaf if target TDP MMU SPTE is frozen
Hoist the is_removed_spte() check above the "level == goal_level" check
when walking SPTEs during a TDP MMU page fault to avoid attempting to map
a leaf entry if said entry is frozen by a different task/vCPU.

  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 939 at arch/x86/kvm/mmu/tdp_mmu.c:653 kvm_tdp_mmu_map+0x269/0x4b0
  Modules linked in: kvm_intel
  CPU: 3 PID: 939 Comm: nx_huge_pages_t Not tainted 6.1.0-rc4+ #67
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_tdp_mmu_map+0x269/0x4b0
  RSP: 0018:ffffc9000068fba8 EFLAGS: 00010246
  RAX: 00000000000005a0 RBX: ffffc9000068fcc0 RCX: 0000000000000005
  RDX: ffff88810741f000 RSI: ffff888107f04600 RDI: ffffc900006a3000
  RBP: 060000010b000bf3 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 000ffffffffff000 R12: 0000000000000005
  R13: ffff888113670000 R14: ffff888107464958 R15: 0000000000000000
  FS:  00007f01c942c740(0000) GS:ffff888277cc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000117013006 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  ---[ end trace 0000000000000000 ]---

Fixes: 63d28a25e0 ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <20221213033030.83345-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:33:52 -05:00
Sean Christopherson
a0860d68a2 KVM: nVMX: Don't stuff secondary execution control if it's not supported
When stuffing the allowed secondary execution controls for nested VMX in
response to CPUID updates, don't set the allowed-1 bit for a feature that
isn't supported by KVM, i.e. isn't allowed by the canonical vmcs_config.

WARN if KVM attempts to manipulate a feature that isn't supported.  All
features that are currently stuffed are always advertised to L1 for
nested VMX if they are supported in KVM's base configuration, and no
additional features should ever be added to the CPUID-induced stuffing
(updating VMX MSRs in response to CPUID updates is a long-standing KVM
flaw that is slowly being fixed).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213062306.667649-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:32:03 -05:00
Sean Christopherson
31de69f4ee KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the
feature is supported in hardware and enabled in KVM's base, non-nested
configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported.
This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail
if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and
obviously allows L1 to enable the feature for L2.

KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing
the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when
updating secondary controls in response to KVM_SET_CPUID(2), but (a) that
depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID
updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction
that the guest value must be a strict subset of the supported host value.

Although no past commit explicitly enabled nested support for WAITPKG,
doing so is safe and functionally correct from an architectural
perspective as no additional KVM support is needed to virtualize TPAUSE,
UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards
VM-Exits to L1 as necessary (commit bf653b78f9, "KVM: vmx: Introduce
handle_unexpected_vmexit and handle WAITPKG vmexit").

Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in
hardware, i.e. always runs both L1 and L2 with the host's power management
settings for TPAUSE and UMWAIT.  See commit bf09fb6cba ("KVM: VMX: Stop
context switching MSR_IA32_UMWAIT_CONTROL") for more details.

Fixes: e69e72faa3 ("KVM: x86: Add support for user wait instructions")
Cc: stable@vger.kernel.org
Reported-by: Aaron Lewis <aaronlewis@google.com>
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20221213062306.667649-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:22:37 -05:00
Sean Christopherson
057b18756b KVM: nVMX: Document that ignoring memory failures for VMCLEAR is deliberate
Explicitly drop the result of kvm_vcpu_write_guest() when writing the
"launch state" as part of VMCLEAR emulation, and add a comment to call
out that KVM's behavior is architecturally valid.  Intel's pseudocode
effectively says that VMCLEAR is a nop if the target VMCS address isn't
in memory, e.g. if the address points at MMIO.

Add a FIXME to call out that suppressing failures on __copy_to_user() is
wrong, as memory (a memslot) does exist in that case.  Punt the issue to
the future as open coding kvm_vcpu_write_guest() just to make sure the
guest dies with -EFAULT isn't worth the extra complexity.  The flaw will
need to be addressed if KVM ever does something intelligent on uaccess
failures, e.g. to support post-copy demand paging, but in that case KVM
will need a more thorough overhaul, i.e. VMCLEAR shouldn't need to open
code a core KVM helper.

No functional change intended.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527765 ("Error handling issues")
Fixes: 587d7e72ae ("kvm: nVMX: VMCLEAR should not cause the vCPU to shut down")
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221220154224.526568-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:16:49 -05:00
Sean Christopherson
77b1908e10 KVM: x86: Sanity check inputs to kvm_handle_memory_failure()
Add a sanity check in kvm_handle_memory_failure() to assert that a valid
x86_exception structure is provided if the memory "failure" wants to
propagate a fault into the guest.  If a memory failure happens during a
direct guest physical memory access, e.g. for nested VMX, KVM hardcodes
the failure to X86EMUL_IO_NEEDED and doesn't provide an exception pointer
(because the exception struct would just be filled with garbage).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221220153427.514032-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:15:25 -05:00
Peng Hao
3c649918b7 KVM: x86: Simplify kvm_apic_hw_enabled
kvm_apic_hw_enabled() only needs to return bool, there is no place
to use the return value of MSR_IA32_APICBASE_ENABLE.

Signed-off-by: Peng Hao <flyingpeng@tencent.com>
Message-Id: <CAPm50aJ=BLXNWT11+j36Dd6d7nz2JmOBk4u7o_NPQ0N61ODu1g@mail.gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:09:28 -05:00
Vitaly Kuznetsov
8b9e13d2de KVM: x86: hyper-v: Fix 'using uninitialized value' Coverity warning
In kvm_hv_flush_tlb(), 'data_offset' and 'consumed_xmm_halves' variables
are used in a mutually exclusive way: in 'hc->fast' we count in 'XMM
halves' and increase 'data_offset' otherwise. Coverity discovered, that in
one case both variables are incremented unconditionally. This doesn't seem
to cause any issues as the only user of 'data_offset'/'consumed_xmm_halves'
data is kvm_hv_get_tlb_flush_entries() -> kvm_hv_get_hc_data() which also
takes into account 'hc->fast' but is still worth fixing.

To make things explicit, put 'data_offset' and 'consumed_xmm_halves' to
'struct kvm_hv_hcall' as a union and use at call sites. This allows to
remove explicit 'data_offset'/'consumed_xmm_halves' parameters from
kvm_hv_get_hc_data()/kvm_get_sparse_vp_set()/kvm_hv_get_tlb_flush_entries()
helpers.

Note: 'struct kvm_hv_hcall' is allocated on stack in kvm_hv_hypercall() and
is not zeroed, consumers are supposed to initialize the appropriate field
if needed.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527764 ("Uninitialized variables")
Fixes: 260970862c ("KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221208102700.959630-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:08:16 -05:00
Adamos Ttofari
fceb3a36c2 KVM: x86: ioapic: Fix level-triggered EOI and userspace I/OAPIC reconfigure race
When scanning userspace I/OAPIC entries, intercept EOI for level-triggered
IRQs if the current vCPU has a pending and/or in-service IRQ for the
vector in its local API, even if the vCPU doesn't match the new entry's
destination.  This fixes a race between userspace I/OAPIC reconfiguration
and IRQ delivery that results in the vector's bit being left set in the
remote IRR due to the eventual EOI not being forwarded to the userspace
I/OAPIC.

Commit 0fc5a36dd6 ("KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC
reconfigure race") fixed the in-kernel IOAPIC, but not the userspace
IOAPIC configuration, which has a similar race.

Fixes: 0fc5a36dd6 ("KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race")

Signed-off-by: Adamos Ttofari <attofari@amazon.de>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221208094415.12723-1-attofari@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:07:40 -05:00
Like Xu
55c590adfe KVM: x86/pmu: Prevent zero period event from being repeatedly released
The current vPMU can reuse the same pmc->perf_event for the same
hardware event via pmc_pause/resume_counter(), but this optimization
does not apply to a portion of the TSX events (e.g., "event=0x3c,in_tx=1,
in_tx_cp=1"), where event->attr.sample_period is legally zero at creation,
thus making the perf call to perf_event_period() meaningless (no need to
adjust sample period in this case), and instead causing such reusable
perf_events to be repeatedly released and created.

Avoid releasing zero sample_period events by checking is_sampling_event()
to follow the previously enable/disable optimization.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20221207071506.15733-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:06:45 -05:00
Linus Torvalds
7a5189c58b Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull RISC-V kvm updates from Paolo Bonzini:

 - Allow unloading KVM module

 - Allow KVM user-space to set mvendorid, marchid, and mimpid

 - Several fixes and cleanups

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  RISC-V: KVM: Add ONE_REG interface for mvendorid, marchid, and mimpid
  RISC-V: KVM: Save mvendorid, marchid, and mimpid when creating VCPU
  RISC-V: Export sbi_get_mvendorid() and friends
  RISC-V: KVM: Move sbi related struct and functions to kvm_vcpu_sbi.h
  RISC-V: KVM: Use switch-case in kvm_riscv_vcpu_set/get_reg()
  RISC-V: KVM: Remove redundant includes of asm/csr.h
  RISC-V: KVM: Remove redundant includes of asm/kvm_vcpu_timer.h
  RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config()
  RISC-V: KVM: Simplify kvm_arch_prepare_memory_region()
  RISC-V: KVM: Exit run-loop immediately if xfer_to_guest fails
  RISC-V: KVM: use vma_lookup() instead of find_vma_intersection()
  RISC-V: KVM: Add exit logic to main.c
2022-12-21 18:52:15 -08:00
Linus Torvalds
9cf5b508bd Merge tag 'rproc-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
 "rproc-virtio device names are now auto generated, to avoid conflicts
  between remoteproc instances.

  The imx_rproc driver is extended with support for communicating with
  and attaching to a running M4 on i.MX8QXP, as well as support for
  attaching to the M4 after self-recovering from a crash. Support is
  added for i.MX8QM and mailbox channels are reconnected during the
  recovery process, in order to avoid data corruption.

  The Xilinx Zynqmp firmware interface is extended and support for the
  Xilinx R5 RPU is introduced.

  Various resources leaks, primarily in error paths, throughout the
  Qualcomm drivers are corrected.

  Lastly a fix to ensure that pm_relax is invoked even if the remoteproc
  instance is stopped between a crash is being reported and the recovery
  handler is scheduled"

* tag 'rproc-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (25 commits)
  remoteproc: core: Do pm_relax when in RPROC_OFFLINE state
  remoteproc: qcom: q6v5: Fix missing clk_disable_unprepare() in q6v5_wcss_qcs404_power_on()
  remoteproc: qcom_q6v5_pas: Fix missing of_node_put() in adsp_alloc_memory_region()
  remoteproc: qcom_q6v5_pas: detach power domains on remove
  remoteproc: qcom_q6v5_pas: disable wakeup on probe fail or remove
  remoteproc: qcom: q6v5: Fix potential null-ptr-deref in q6v5_wcss_init_mmio()
  remoteproc: sysmon: fix memory leak in qcom_add_sysmon_subdev()
  remoteproc: sysmon: Make QMI message rules const
  drivers: remoteproc: Add Xilinx r5 remoteproc driver
  firmware: xilinx: Add RPU configuration APIs
  firmware: xilinx: Add shutdown/wakeup APIs
  firmware: xilinx: Add ZynqMP firmware ioctl enums for RPU configuration.
  arm64: dts: xilinx: zynqmp: Add RPU subsystem device node
  dt-bindings: remoteproc: Add Xilinx RPU subsystem bindings
  remoteproc: core: Use device_match_of_node()
  remoteproc: imx_rproc: Correct i.MX93 DRAM mapping
  remoteproc: imx_rproc: Enable attach recovery for i.MX8QM/QXP
  remoteproc: imx_rproc: Request mbox channel later
  remoteproc: imx_rproc: Support i.MX8QM
  remoteproc: imx_rproc: Support kicking Mcore from Linux for i.MX8QXP
  ...
2022-12-21 09:37:14 -08:00
Linus Torvalds
7c08461253 m68k: remove broken strcmp implementation
The m68 hand-written assembler version of strcmp() has always been
broken: it returns the difference between the first non-matching byte
done as a 8-bit subtraction.

That is _almost_ right, but is broken for the overflow case.  The
strcmp() function should indeed return the sign of the difference
between the first byte that differs, but the subtraction needs to be
done in a wider type than 'char'.  Otherwise the ordering isn't actually
stable.

This went unnoticed for basically forever, because nobody ever cares
about non-US-ASCII orderings in the kernel (in fact, most users only
care about "exact match or not"), so overflows don't really happen in
practice, even if it was very very wrong.

But that mostly unnoticeable bug becomes very noticeable by the recent
change to make 'char' be unsigned in the kernel across all architectures
(commit 3bc753c06d: "kbuild: treat char as always unsigned"). Because
the code not only did the subtraction in the wrong type width, it also
used 'char' to then make the compiler expand the result from an 8-bit
difference to the 'int' return value.

So now with an unsigned char that incorrect arithmetic width was then
not even sign-expanded, and always returned just a positive integer.

We could re-instate the old broken code by just turning the 'char' into
'signed char' as has been done elsewhere where people depended on the
signedness of 'char', but since the whole function was broken to begin
with, and we have a non-broken default fallback implementation, let's
just remove this broken function entirely.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/20221221145332.GA2399037@roeck-us.net/
Cc: Jason Donenfeld <Jason@zx2c4.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-21 08:56:43 -08:00
Linus Torvalds
222882c2ab Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull more random number generator updates from Jason Donenfeld:
 "Two remaining changes that are now possible after you merged a few
  other trees:

   - #include <asm/archrandom.h> can be removed from random.h now,
     making the direct use of the arch_random_* API more of a private
     implementation detail between the archs and random.c, rather than
     something for general consumers.

   - Two additional uses of prandom_u32_max() snuck in during the
     initial phase of pulls, so these have been converted to
     get_random_u32_below(), and now the deprecated prandom_u32_max()
     alias -- which was just a wrapper around get_random_u32_below() --
     can be removed.

  In addition, there is one fix:

   - Check efi_rt_services_supported() before attempting to use an EFI
     runtime function.

     This affected EFI systems that disable runtime services yet still
     boot via EFI (e.g. the reporter's Lenovo Thinkpad X13s laptop), as
     well systems where EFI runtime services have been forcibly
     disabled, such as on PREEMPT_RT.

     On those machines, a very early and hard to diagnose crash would
     happen, preventing boot"

* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove prandom_u32_max()
  efi: random: fix NULL-deref when refreshing seed
  random: do not include <asm/archrandom.h> from random.h
2022-12-21 08:02:30 -08:00
Florian Fainelli
24b333a866 MIPS: dts: bcm63268: Add missing properties to the TWD node
We currently have a DTC warning with the current DTS due to the lack of
a suitable #address-cells and #size-cells property:

  DTC     arch/mips/boot/dts/brcm/bcm63268-comtrend-vr-3032u.dtb
arch/mips/boot/dts/brcm/bcm63268.dtsi:115.5-22: Warning (reg_format): /ubus/timer-mfd@10000080/timer@0:reg: property has invalid length (8 bytes) (#address-cells == 2, #size-cells == 1)
arch/mips/boot/dts/brcm/bcm63268.dtsi:120.5-22: Warning (reg_format): /ubus/timer-mfd@10000080/watchdog@1c:reg: property has invalid length (8 bytes) (#address-cells == 2, #size-cells == 1)
arch/mips/boot/dts/brcm/bcm63268.dtsi:111.4-35: Warning (ranges_format): /ubus/timer-mfd@10000080:ranges: "ranges" property has invalid length (12 bytes) (parent #address-cells == 1, child #address-cells == 2, #size-cells == 1)

Fixes: d3db4b96ab ("mips: dts: bcm63268: add TWD block timer")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-12-21 10:46:10 +01:00
Sergio Paracuellos
76ce51798c MIPS: ralink: mt7621: avoid to init common ralink reset controller
Commit 38a8553b0a ("clk: ralink: make system controller node a reset provider")
make system controller a reset provider for mt7621 ralink SoCs. Ralink init code
also tries to start previous common reset controller which at the end tries to
find device tree node 'ralink,rt2880-reset'. mt7621 device tree file is not
using at all this node anymore. Hence avoid to init this common reset controller
for mt7621 ralink SoCs to avoid 'Failed to find reset controller node' boot
error trace error.

Fixes: 64b2d6ffff ("staging: mt7621-dts: align resets with binding documentation")
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-12-21 10:45:56 +01:00
Linus Torvalds
b6bb9676f2 Merge tag 'm68knommu-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu update from Greg Ungerer:
 "Only a single change to use the safer strscpy() instead of strncpy()
  when setting up the cmdline"

* tag 'm68knommu-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: use strscpy() to instead of strncpy()
2022-12-20 08:56:35 -06:00
Linus Torvalds
35f79d0e2c Merge tag 'parisc-for-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "There is one noteable patch, which allows the parisc kernel to use the
  same MADV_xxx constants as the other architectures going forward. With
  that change only alpha has one entry left (MADV_DONTNEED is 6 vs 4 on
  others) which is different. To prevent an ABI breakage, a wrapper is
  included which translates old MADV values to the new ones, so existing
  userspace isn't affected. Reason for that patch is, that some
  applications wrongly used the standard MADV_xxx values even on some
  non-x86 platforms and as such those programs failed to run correctly
  on parisc (examples are qemu-user, tor browser and boringssl).

  Then the kgdb console and the LED code received some fixes, and some
  0-day warnings are now gone. Finally, the very last compile warning
  which was visible during a kernel build is now fixed too (in the vDSO
  code).

  The majority of the patches are tagged for stable series and in
  summary this patchset is quite small and drops more code than it adds:

Fixes:
   - Fix potential null-ptr-deref in start_task()
   - Fix kgdb console on serial port
   - Add missing FORCE prerequisites in Makefile
   - Drop PMD_SHIFT from calculation in pgtable.h

  Enhancements:
   - Implement a wrapper to align madvise() MADV_* constants with other
     architectures
   - If machine supports running MPE/XL, show the MPE model string

  Cleanups:
   - Drop duplicate kgdb console code
   - Indenting fixes in setup_cmdline()"

* tag 'parisc-for-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Show MPE/iX model string at bootup
  parisc: Add missing FORCE prerequisites in Makefile
  parisc: Move pdc_result struct to firmware.c
  parisc: Drop locking in pdc console code
  parisc: Drop duplicate kgdb_pdc console
  parisc: Fix locking in pdc_iodc_print() firmware call
  parisc: Drop PMD_SHIFT from calculation in pgtable.h
  parisc: Align parisc MADV_XXX constants with all other architectures
  parisc: led: Fix potential null-ptr-deref in start_task()
  parisc: Fix inconsistent indenting in setup_cmdline()
2022-12-20 08:43:53 -06:00
Jason A. Donenfeld
3c202d14a9 prandom: remove prandom_u32_max()
Convert the final two users of prandom_u32_max() that slipped in during
6.2-rc1 to use get_random_u32_below().

Then, with no more users left, we can finally remove the deprecated
function.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-12-20 03:13:45 +01:00
Jason A. Donenfeld
6bb20c152b random: do not include <asm/archrandom.h> from random.h
The <asm/archrandom.h> header is a random.c private detail, not
something to be called by other code. As such, don't make it
automatically available by way of random.h.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-12-20 03:13:45 +01:00
Linus Torvalds
850f7a5cab Merge tag 'soc-fixes-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "These are a couple of build fixes from randconfig testing, plus a set
  of Mediatek SoC specific fixes, all trivial"

* tag 'soc-fixes-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  soc: tegra: fix CPU_BIG_ENDIAN dependencies
  ARM: disallow pre-ARMv5 builds with ld.lld
  ARM: pxa: fix building with clang
  MAINTAINERS: add related dts to IXP4xx
  ARM: dts: spear: drop 0x from unit address
  arm64: dts: mt8183: Fix Mali GPU clock
  arm64: dts: mediatek: mt8195-demo: fix the memory size of node secmon
  soc: mediatek: pm-domains: Fix the power glitch issue
2022-12-19 16:07:59 -06:00
Linus Torvalds
6feb57c2fd Merge tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - Support zstd-compressed debug info

 - Allow W=1 builds to detect objects shared among multiple modules

 - Add srcrpm-pkg target to generate a source RPM package

 - Make the -s option detection work for future GNU Make versions

 - Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y

 - Allow W=1 builds to detect -Wundef warnings in any preprocessed files

 - Raise the minimum supported version of binutils to 2.25

 - Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used

 - Use $(file ...) to read a file if GNU Make >= 4.2 is used

 - Print error if GNU Make older than 3.82 is used

 - Allow modpost to detect section mismatches with Clang LTO

 - Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y

* tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits)
  buildtar: fix tarballs with EFI_ZBOOT enabled
  modpost: Include '.text.*' in TEXT_SECTIONS
  padata: Mark padata_work_init() as __ref
  kbuild: ensure Make >= 3.82 is used
  kbuild: refactor the prerequisites of the modpost rule
  kbuild: change module.order to list *.o instead of *.ko
  kbuild: use .NOTINTERMEDIATE for future GNU Make versions
  kconfig: refactor Makefile to reduce process forks
  kbuild: add read-file macro
  kbuild: do not sort after reading modules.order
  kbuild: add test-{ge,gt,le,lt} macros
  Documentation: raise minimum supported version of binutils to 2.25
  kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds
  kbuild: move -Werror from KBUILD_CFLAGS to KBUILD_CPPFLAGS
  kbuild: Port silent mode detection to future gnu make.
  init/version.c: remove #include <generated/utsrelease.h>
  firmware_loader: remove #include <generated/utsrelease.h>
  modpost: Mark uuid_le type to be suitable only for MEI
  kbuild: add ability to make source rpm buildable using koji
  kbuild: warn objects shared among multiple modules
  ...
2022-12-19 12:33:32 -06:00
Arnd Bergmann
b9cb6be06b Merge tag 'v6.1-dts64-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into arm/fixes
MT8183: fix phandle for GPU clock
MT8195 demo: fix size of secmon reserved memory area

* tag 'v6.1-dts64-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux:
  arm64: dts: mt8183: Fix Mali GPU clock
  arm64: dts: mediatek: mt8195-demo: fix the memory size of node secmon

Link: https://lore.kernel.org/r/af4c45ce-a150-438f-dab4-e47b120c32c4@suse.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-12-19 16:47:18 +01:00
Arnd Bergmann
6a7ee50f8f ARM: disallow pre-ARMv5 builds with ld.lld
lld cannot build for ARMv4/v4T targets because it inserts 'blx' instructions
that are unsupported there:

  ld.lld: warning: lld uses blx instruction, no object with architecture supporting feature detected

Add a Kconfig time dependency to prevent those targets from being
selected in randconfig builds.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/llvm/llvm-project/issues/50764
Link: https://github.com/ClangBuiltLinux/linux/issues/964
Link: https://lore.kernel.org/r/20221215162635.3750763-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-12-19 16:46:50 +01:00
Arnd Bergmann
4b88615950 ARM: pxa: fix building with clang
The integrated assembler in clang does not understand the xscale
specific mra/mar instructions:

arch/arm/mach-pxa/pxa27x.c:136:15: error: unsupported architectural extension: xscale
        asm volatile(".arch_extension xscale\n\t"
arch/arm/mach-pxa/pxa27x.c:136:40: error: invalid instruction, did you mean: mcr, mla, mrc, mrs, msr?
        mra r2, r3, acc0

Since these are coprocessor features, the same can be expressed using
mrrc/mcrr, so use that for builds with IAS.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20221215162529.3659187-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-12-19 16:46:32 +01:00
Krzysztof Kozlowski
2b76cfe190 ARM: dts: spear: drop 0x from unit address
By coding style, unit address should not start with 0x.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20221210113347.63939-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-12-19 16:45:45 +01:00
Helge Deller
4934fbfb3f parisc: Show MPE/iX model string at bootup
Some (mostly 64-bit machines) machines allow to run MPE/iX and report the MPE
model string via firmware call. Enhance the pdc_model_sysmodel() function to
report that model string.
Note that some 32-bit machines like the B160L wrongly report success for the
firmware call, so include a check to prevent showing wrong info.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-19 16:08:52 +01:00
Linus Torvalds
b8fd76f418 Merge tag 'iommu-updates-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
 "Core code:
   - map/unmap_pages() cleanup
   - SVA and IOPF refactoring
   - Clean up and document return codes from device/domain attachment

  AMD driver:
   - Rework and extend parsing code for ivrs_ioapic, ivrs_hpet and
     ivrs_acpihid command line options
   - Some smaller cleanups

  Intel driver:
   - Blocking domain support
   - Cleanups

  S390 driver:
   - Fixes and improvements for attach and aperture handling

  PAMU driver:
   - Resource leak fix and cleanup

  Rockchip driver:
   - Page table permission bit fix

  Mediatek driver:
   - Improve safety from invalid dts input
   - Smaller fixes and improvements

  Exynos driver:
   - Fix driver initialization sequence

  Sun50i driver:
   - Remove IOMMU_DOMAIN_IDENTITY as it has not been working forever
   - Various other fixes"

* tag 'iommu-updates-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (74 commits)
  iommu/mediatek: Fix forever loop in error handling
  iommu/mediatek: Fix crash on isr after kexec()
  iommu/sun50i: Remove IOMMU_DOMAIN_IDENTITY
  iommu/amd: Fix typo in macro parameter name
  iommu/mediatek: Remove unused "mapping" member from mtk_iommu_data
  iommu/mediatek: Improve safety for mediatek,smi property in larb nodes
  iommu/mediatek: Validate number of phandles associated with "mediatek,larbs"
  iommu/mediatek: Add error path for loop of mm_dts_parse
  iommu/mediatek: Use component_match_add
  iommu/mediatek: Add platform_device_put for recovering the device refcnt
  iommu/fsl_pamu: Fix resource leak in fsl_pamu_probe()
  iommu/vt-d: Use real field for indication of first level
  iommu/vt-d: Remove unnecessary domain_context_mapped()
  iommu/vt-d: Rename domain_add_dev_info()
  iommu/vt-d: Rename iommu_disable_dev_iotlb()
  iommu/vt-d: Add blocking domain support
  iommu/vt-d: Add device_block_translation() helper
  iommu/vt-d: Allocate pasid table in device probe path
  iommu/amd: Check return value of mmu_notifier_register()
  iommu/amd: Fix pci device refcount leak in ppr_notifier()
  ...
2022-12-19 08:34:39 -06:00
Linus Torvalds
2f26e42455 Merge tag 'loongarch-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:

 - Switch to relative exception tables

 - Add unaligned access support

 - Add alternative runtime patching mechanism

 - Add FDT booting support from efi system table

 - Add suspend/hibernation (ACPI S3/S4) support

 - Add basic STACKPROTECTOR support

 - Add ftrace (function tracer) support

 - Update the default config file

* tag 'loongarch-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (24 commits)
  LoongArch: Update Loongson-3 default config file
  LoongArch: modules/ftrace: Initialize PLT at load time
  LoongArch/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support
  LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_ARGS support
  LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_REGS support
  LoongArch/ftrace: Add dynamic function graph tracer support
  LoongArch/ftrace: Add dynamic function tracer support
  LoongArch/ftrace: Add recordmcount support
  LoongArch/ftrace: Add basic support
  LoongArch: module: Use got/plt section indices for relocations
  LoongArch: Add basic STACKPROTECTOR support
  LoongArch: Add hibernation (ACPI S4) support
  LoongArch: Add suspend (ACPI S3) support
  LoongArch: Add processing ISA Node in DeviceTree
  LoongArch: Add FDT booting support from efi system table
  LoongArch: Use alternative to optimize libraries
  LoongArch: Add alternative runtime patching mechanism
  LoongArch: Add unaligned access support
  LoongArch: BPF: Add BPF exception tables
  LoongArch: Remove the .fixup section usage
  ...
2022-12-19 08:23:27 -06:00
Linus Torvalds
96bab5b926 Merge tag 'csky-for-linus-6.2-rc1' of https://github.com/c-sky/csky-linux
Pull arch/csky updates from Guo Ren:

 - Revert rseq support - it wasn't ready

 - Add current_stack_pointer support

 - Typo fixup

* tag 'csky-for-linus-6.2-rc1' of https://github.com/c-sky/csky-linux:
  Revert "csky: Add support for restartable sequence"
  Revert "csky: Fixup CONFIG_DEBUG_RSEQ"
  csky: Kconfig: Fix spelling mistake "Meory" -> "Memory"
  csky: add arch support current_stack_pointer
2022-12-19 07:51:30 -06:00
Linus Torvalds
5f6e430f93 Merge tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:

 - Add powerpc qspinlock implementation optimised for large system
   scalability and paravirt. See the merge message for more details

 - Enable objtool to be built on powerpc to generate mcount locations

 - Use a temporary mm for code patching with the Radix MMU, so the
   writable mapping is restricted to the patching CPU

 - Add an option to build the 64-bit big-endian kernel with the ELFv2
   ABI

 - Sanitise user registers on interrupt entry on 64-bit Book3S

 - Many other small features and fixes

Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn
Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET,
Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang,
Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A.
R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol
Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin,
Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika
Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng,
XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu,
and Wolfram Sang.

* tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits)
  powerpc/code-patching: Fix oops with DEBUG_VM enabled
  powerpc/qspinlock: Fix 32-bit build
  powerpc/prom: Fix 32-bit build
  powerpc/rtas: mandate RTAS syscall filtering
  powerpc/rtas: define pr_fmt and convert printk call sites
  powerpc/rtas: clean up includes
  powerpc/rtas: clean up rtas_error_log_max initialization
  powerpc/pseries/eeh: use correct API for error log size
  powerpc/rtas: avoid scheduling in rtas_os_term()
  powerpc/rtas: avoid device tree lookups in rtas_os_term()
  powerpc/rtasd: use correct OF API for event scan rate
  powerpc/rtas: document rtas_call()
  powerpc/pseries: unregister VPA when hot unplugging a CPU
  powerpc/pseries: reset the RCU watchdogs after a LPM
  powerpc: Take in account addition CPU node when building kexec FDT
  powerpc: export the CPU node count
  powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state
  powerpc/dts/fsl: Fix pca954x i2c-mux node names
  cxl: Remove unnecessary cxl_pci_window_alignment()
  selftests/powerpc: Fix resource leaks
  ...
2022-12-19 07:13:33 -06:00
Helge Deller
9086e60179 parisc: Add missing FORCE prerequisites in Makefile
Fix those make warnings:
    arch/parisc/kernel/vdso32/Makefile:30: FORCE prerequisite is missing
    arch/parisc/kernel/vdso64/Makefile:30: FORCE prerequisite is missing

Add the missing FORCE prerequisites for all build targets identified by
"make help".

Fixes: e1f86d7b4b ("kbuild: warn if FORCE is missing for if_changed(_dep,_rule) and filechk")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # 5.18+
2022-12-18 22:18:49 +01:00