* for-next/kselftest:
kselftest/arm64: Log the PIDs of the parent and child in sve-ptrace
kselftest/arm64: signal: Allow tests to be incompatible with features
kselftest/arm64: mte: user_mem: test a wider range of values
kselftest/arm64: mte: user_mem: add more test types
kselftest/arm64: mte: user_mem: add test type enum
kselftest/arm64: mte: user_mem: check different offsets and sizes
kselftest/arm64: mte: user_mem: rework error handling
kselftest/arm64: mte: user_mem: introduce tag_offset and tag_len
kselftest/arm64: Remove local definitions of MTE prctls
kselftest/arm64: Remove local ARRAY_SIZE() definitions
* for-next/coredump:
arm64: Change elfcore for_each_mte_vma() to use VMA iterator
arm64: mte: Document the core dump file format
arm64: mte: Dump the MTE tags in the core file
arm64: mte: Define the number of bytes for storing the tags in a page
elf: Introduce the ARM MTE ELF segment type
elfcore: Replace CONFIG_{IA64, UML} checks with a new option
If the test triggers a problem it may well result in a log message from
the kernel such as a WARN() or BUG(). If these include a PID it can help
with debugging to know if it was the parent or child process that triggered
the issue, since the test is just creating a new thread the process name
will be the same either way. Print the PIDs of the parent and child on
startup so users have this information to hand should it be needed.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220303192817.2732509-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
When a IAR register read races with a GIC interrupt RELEASE event,
GIC-CPU interface could wrongly return a valid INTID to the CPU
for an interrupt that is already released(non activated) instead of 0x3ff.
As a side effect, an interrupt handler could run twice, once with
interrupt priority and then with idle priority.
As a workaround, gic_read_iar is updated so that it will return a
valid interrupt ID only if there is a change in the active priority list
after the IAR read on all the affected Silicons.
Since there are silicon variants where both 23154 and 38545 are applicable,
workaround for erratum 23154 has been extended to address both of them.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220307143014.22758-1-lcherian@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
It is a preparation patch for eBPF atomic supports under arm64. eBPF
needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
the same with the implementations in linux kernel.
Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
helper is added. atomic_fetch_add() and other atomic ops needs support for
STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220217072232.1186625-3-houtao1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
If CONFIG_ARM64_LSE_ATOMICS is off, encoders for LSE-related instructions
can return AARCH64_BREAK_FAULT directly in insn.h. In order to access
AARCH64_BREAK_FAULT in insn.h, we can not include debug-monitors.h in
insn.h, because debug-monitors.h has already depends on insn.h, so just
move AARCH64_BREAK_FAULT into insn-def.h.
It will be used by the following patch to eliminate unnecessary LSE-related
encoders when CONFIG_ARM64_LSE_ATOMICS is off.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220217072232.1186625-2-houtao1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
For each vma mapped with PROT_MTE (the VM_MTE flag set), generate a
PT_ARM_MEMTAG_MTE segment in the core file and dump the corresponding
tags. The in-file size for such segments is 128 bytes per page.
For pages in a VM_MTE vma which are not present in the user page tables
or don't have the PG_mte_tagged flag set (e.g. execute-only), just write
zeros in the core file.
An example of program headers for two vmas, one 2-page, the other 4-page
long:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
...
LOAD 0x030000 0x0000ffff80034000 0x0000000000000000 0x000000 0x002000 RW 0x1000
LOAD 0x030000 0x0000ffff80036000 0x0000000000000000 0x004000 0x004000 RW 0x1000
...
LOPROC+0x1 0x05b000 0x0000ffff80034000 0x0000000000000000 0x000100 0x002000 0
LOPROC+0x1 0x05b100 0x0000ffff80036000 0x0000000000000000 0x000200 0x004000 0
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Luis Machado <luis.machado@linaro.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220131165456.2160675-5-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Memory tags will be dumped in the core file as segments with their own
type. Discussions with the binutils and the generic ABI community
settled on using new definitions in the PT_*PROC space (and to be
documented in the processor-specific ABIs).
Introduce PT_ARM_MEMTAG_MTE as (PT_LOPROC + 0x1). Not included in this
patch since there is no upstream support but the CHERI/BSD community
will also reserve:
#define PT_ARM_MEMTAG_CHERI (PT_LOPROC + 0x2)
#define PT_RISCV_MEMTAG_CHERI (PT_LOPROC + 0x3)
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Luis Machado <luis.machado@linaro.org>
Link: https://lore.kernel.org/r/20220131165456.2160675-3-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
With the current wording readers might infer that PR_GET_TAGGED_ADDR_CTRL
will report the mode currently active in the thread however this is not the
actual behaviour, instead all modes currently selected by the process will
be reported with the mode used depending on the combination of the
requested modes and the default set for the current CPU. This has been the
case since 433c38f40f ("arm64: mte: change ASYNC and SYNC TCF settings
into bitfields"), before that we did not allow more than one mode to be
requested simultaneously.
Update the documentation to more clearly reflect current behaviour.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220127190324.660405-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The GCR EL1 test unconditionally includes local definitions of the prctls
it tests. Since not only will the kselftest build infrastructure ensure
that the in tree uapi headers are available but the toolchain being used to
build kselftest may ensure that system uapi headers with MTE support are
available this causes the compiler to warn about duplicate definitions.
Remove these duplicate definitions.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220126174421.1712795-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
When the insn framework is used to encode an AND/ORR/EOR instruction,
aarch64_encode_immediate() is used to pick the immr imms values.
If the immediate is a 64bit mask, with bit 63 set, and zeros in any
of the upper 32 bits, the immr value is incorrectly calculated meaning
the wrong mask is generated.
For example, 0x8000000000000001 should have an immr of 1, but 32 is used,
meaning the resulting mask is 0x0000000300000000.
It would appear eBPF is unable to hit these cases, as build_insn()'s
imm value is a s32, so when used with BPF_ALU64, the sign-extended
u64 immediate would always have all-1s or all-0s in the upper 32 bits.
KVM does not generate a va_mask with any of the top bits set as these
VA wouldn't be usable with TTBR0_EL2.
This happens because the rotation is calculated from fls(~imm), which
takes an unsigned int, but the immediate may be 64bit.
Use fls64() so the 64bit mask doesn't get truncated to a u32.
Signed-off-by: James Morse <james.morse@arm.com>
Brown-paper-bag-for: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127162127.2391947-4-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Pull ext4 fixes from Ted Ts'o:
"Various bug fixes for ext4 fast commit and inline data handling.
Also fix regression introduced as part of moving to the new mount API"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
fs/ext4: fix comments mentioning i_mutex
ext4: fix incorrect type issue during replay_del_range
jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}()
ext4: fix potential NULL pointer dereference in ext4_fill_super()
jbd2: refactor wait logic for transaction updates into a common function
jbd2: cleanup unused functions declarations from jbd2.h
ext4: fix error handling in ext4_fc_record_modified_inode()
ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()
ext4: fix error handling in ext4_restore_inline_data()
ext4: fast commit may miss file actions
ext4: fast commit may not fallback for ineligible commit
ext4: modify the logic of ext4_mb_new_blocks_simple
ext4: prevent used blocks from being allocated during fast commit replay
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix display of grouped aliased events in 'perf stat'.
- Add missing branch_sample_type to perf_event_attr__fprintf().
- Apply correct label to user/kernel symbols in branch mode.
- Fix 'perf ftrace' system_wide tracing, it has to be set before
creating the maps.
- Return error if procfs isn't mounted for PID namespaces when
synthesizing records for pre-existing processes.
- Set error stream of objdump process for 'perf annotate' TUI, to avoid
garbling the screen.
- Add missing arm64 support to perf_mmap__read_self(), the kernel part
got into 5.17.
- Check for NULL pointer before dereference writing debug info about a
sample.
- Update UAPI copies for asound, perf_event, prctl and kvm headers.
- Fix a typo in bpf_counter_cgroup.c.
* tag 'perf-tools-fixes-for-v5.17-2022-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf ftrace: system_wide collection is not effective by default
libperf: Add arm64 support to perf_mmap__read_self()
tools include UAPI: Sync sound/asound.h copy with the kernel sources
perf stat: Fix display of grouped aliased events
perf tools: Apply correct label to user/kernel symbols in branch mode
perf bpf: Fix a typo in bpf_counter_cgroup.c
perf synthetic-events: Return error if procfs isn't mounted for PID namespaces
perf session: Check for NULL pointer before dereference
perf annotate: Set error stream of objdump process for TUI
perf tools: Add missing branch_sample_type to perf_event_attr__fprintf()
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers UAPI: Sync linux/prctl.h with the kernel sources
perf beauty: Make the prctl arg regexp more strict to cope with PR_SET_VMA
tools headers cpufeatures: Sync with the kernel sources
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
tools include UAPI: Sync sound/asound.h copy with the kernel sources
Pull perf fixes from Borislav Petkov:
- Intel/PT: filters could crash the kernel
- Intel: default disable the PMU for SMM, some new-ish EFI firmware has
started using CPL3 and the PMU CPL filters don't discriminate against
SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
cycles.
- Fixup for perf_event_attr::sig_data
* tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/pt: Fix crash with stop filters in single-range mode
perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
selftests/perf_events: Test modification of perf_event_attr::sig_data
perf: Copy perf_event_attr::sig_data on modification
x86/perf: Default set FREEZE_ON_SMI for all
Pull objtool fix from Borislav Petkov:
"Fix a potential truncated string warning triggered by gcc12"
* tag 'objtool_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix truncated string warning
Pull irq fix from Borislav Petkov:
"Remove a bogus warning introduced by the recent PCI MSI irq affinity
overhaul"
* tag 'irq_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Remove bogus warning in pci_irq_get_affinity()
Pull EDAC fixes from Borislav Petkov:
"Fix altera and xgene EDAC drivers to propagate the correct error code
from platform_get_irq() so that deferred probing still works"
* tag 'edac_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/xgene: Fix deferred probing
EDAC/altera: Fix deferred probing
Picking the changes from:
06feec6005 ("ASoC: hdmi-codec: Fix OOB memory accesses")
Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.
To silence this perf tools build warning:
Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/lkml/Yf+6OT+2eMrYDEeX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For perf recording, it retrieves process info by iterating nodes in proc
fs. If we run perf in a non-root PID namespace with command:
# unshare --fork --pid perf record -e cycles -a -- test_program
... in this case, unshare command creates a child PID namespace and
launches perf tool in it, but the issue is the proc fs is not mounted
for the non-root PID namespace, this leads to the perf tool gathering
process info from its parent PID namespace.
We can use below command to observe the process nodes under proc fs:
# unshare --pid --fork ls /proc
1 137 1968 2128 3 342 48 62 78 crypto kcore net uptime
10 138 2 2142 30 35 49 63 8 devices keys pagetypeinfo version
11 139 20 2143 304 36 50 64 82 device-tree key-users partitions vmallocinfo
12 14 2011 22 305 37 51 65 83 diskstats kmsg self vmstat
128 140 2038 23 307 39 52 656 84 driver kpagecgroup slabinfo zoneinfo
129 15 2074 24 309 4 53 67 9 execdomains kpagecount softirqs
13 16 2094 241 31 40 54 68 asound fb kpageflags stat
130 164 2096 242 310 41 55 69 buddyinfo filesystems loadavg swaps
131 17 2098 25 317 42 56 70 bus fs locks sys
132 175 21 26 32 43 57 71 cgroups interrupts meminfo sysrq-trigger
133 179 2102 263 329 44 58 75 cmdline iomem misc sysvipc
134 1875 2103 27 330 45 59 76 config.gz ioports modules thread-self
135 19 2117 29 333 46 6 77 consoles irq mounts timer_list
136 1941 2121 298 34 47 60 773 cpuinfo kallsyms mtd tty
So it shows many existed tasks, since unshared command has not mounted
the proc fs for the new created PID namespace, it still accesses the
proc fs of the root PID namespace. This leads to two prominent issues:
- Firstly, PID values are mismatched between thread info and samples.
The gathered thread info are coming from the proc fs of the root PID
namespace, but samples record its PID from the child PID namespace.
- The second issue is profiled program 'test_program' returns its forked
PID number from the child PID namespace, perf tool wrongly uses this
PID number to retrieve the process info via the proc fs of the root
PID namespace.
To avoid issues, we need to mount proc fs for the child PID namespace
with the option '--mount-proc' when use unshare command:
# unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
Conversely, when the proc fs of the root PID namespace is used by child
namespace, perf tool can detect the multiple PID levels and
nsinfo__is_in_root_namespace() returns false, this patch reports error
for this case:
# unshare --fork --pid perf record -e cycles -a -- test_program
Couldn't synthesize bpf events.
Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20211224124014.2492751-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in:
f6c6804c43 ("kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h")
That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.
This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/Yf+4k5Fs5Q3HdSG9@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To check if more kernel API sync is needed and also to see if the perf
build tests continue to pass.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull xen fixes from Juergen Gross:
- documentation fixes related to Xen
- enable x2apic mode when available when running as hardware
virtualized guest under Xen
- cleanup and fix a corner case of vcpu enumeration when running a
paravirtualized Xen guest
* tag 'for-linus-5.17a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/Xen: streamline (and fix) PV CPU enumeration
xen: update missing ioctl magic numers documentation
Improve docs for IOCTL_GNTDEV_MAP_GRANT_REF
xen: xenbus_dev.h: delete incorrect file name
xen/x2apic: enable x2apic mode when supported for HVM
Pull kvm fixes from Paolo Bonzini:
"ARM:
- A couple of fixes when handling an exception while a SError has
been delivered
- Workaround for Cortex-A510's single-step erratum
RISC-V:
- Make CY, TM, and IR counters accessible in VU mode
- Fix SBI implementation version
x86:
- Report deprecation of x87 features in supported CPUID
- Preparation for fixing an interrupt delivery race on AMD hardware
- Sparse fix
All except POWER and s390:
- Rework guest entry code to correctly mark noinstr areas and fix
vtime' accounting (for x86, this was already mostly correct but not
entirely; for ARM, MIPS and RISC-V it wasn't)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
KVM: x86: Report deprecated x87 features in supported CPUID
KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
KVM: arm64: Avoid consuming a stale esr value when SError occur
RISC-V: KVM: Fix SBI implementation version
RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
kvm/riscv: rework guest entry logic
kvm/arm64: rework guest entry logic
kvm/x86: rework guest entry logic
kvm/mips: rework guest entry logic
kvm: add guest_state_{enter,exit}_irqoff()
KVM: x86: Move delivery of non-APICv interrupt into vendor code
kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
Pull xfs fixes from Darrick Wong:
"I was auditing operations in XFS that clear file privileges, and
realized that XFS' fallocate implementation drops suid/sgid but
doesn't clear file capabilities the same way that file writes and
reflink do.
There are VFS helpers that do it correctly, so refactor XFS to use
them. I also noticed that we weren't flushing the log at the correct
point in the fallocate operation, so that's fixed too.
Summary:
- Fix fallocate so that it drops all file privileges when files are
modified instead of open-coding that incompletely.
- Fix fallocate to flush the log if the caller wanted synchronous
file updates"
* tag 'xfs-5.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: ensure log flush at the end of a synchronous fallocate call
xfs: move xfs_update_prealloc_flags() to xfs_pnfs.c
xfs: set prealloc flag in xfs_alloc_file_space()
xfs: fallocate() should call file_modified()
xfs: remove XFS_PREALLOC_SYNC
xfs: reject crazy array sizes being fed to XFS_IOC_GETBMAP*