* for-next/misc:
arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition
arm64: clean up tools Makefile
arm64: drop unused includes of <linux/personality.h>
arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones
arm64: prevent instrumentation of bp hardening callbacks
arm64: cpufeature: Remove cpu_has_fwb() check
arm64: atomics: remove redundant static branch
arm64: entry: Save some nops when CONFIG_ARM64_PSEUDO_NMI is not set
* for-next/linkage:
arm64: module: remove (NOLOAD) from linker script
linkage: remove SYM_FUNC_{START,END}_ALIAS()
x86: clean up symbol aliasing
arm64: clean up symbol aliasing
linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}()
* for-next/kselftest:
kselftest/arm64: Log the PIDs of the parent and child in sve-ptrace
kselftest/arm64: signal: Allow tests to be incompatible with features
kselftest/arm64: mte: user_mem: test a wider range of values
kselftest/arm64: mte: user_mem: add more test types
kselftest/arm64: mte: user_mem: add test type enum
kselftest/arm64: mte: user_mem: check different offsets and sizes
kselftest/arm64: mte: user_mem: rework error handling
kselftest/arm64: mte: user_mem: introduce tag_offset and tag_len
kselftest/arm64: Remove local definitions of MTE prctls
kselftest/arm64: Remove local ARRAY_SIZE() definitions
* for-next/coredump:
arm64: Change elfcore for_each_mte_vma() to use VMA iterator
arm64: mte: Document the core dump file format
arm64: mte: Dump the MTE tags in the core file
arm64: mte: Define the number of bytes for storing the tags in a page
elf: Introduce the ARM MTE ELF segment type
elfcore: Replace CONFIG_{IA64, UML} checks with a new option
Commit 031495635b ("arm64: Do not defer reserve_crashkernel() for
platforms with no DMA memory zones") introduced different definitions
for 'arm64_dma_phys_limit' depending on CONFIG_ZONE_DMA{,32} based on
a late suggestion from Pasha. Sadly, this results in a build error when
passing W=1:
| arch/arm64/mm/init.c:90:19: error: conflicting type qualifiers for 'arm64_dma_phys_limit'
Drop the 'const' for now and use '__ro_after_init' consistently.
Link: https://lore.kernel.org/r/202203090241.aj7paWeX-lkp@intel.com
Link: https://lore.kernel.org/r/CA+CK2bDbbx=8R=UthkMesWOST8eJMtOGJdfMRTFSwVmo0Vn0EA@mail.gmail.com
Fixes: 031495635b ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
Signed-off-by: Will Deacon <will@kernel.org>
Since commit 2369f171d5 ("arm64: crash_core: Export MODULES, VMALLOC,
and VMEMMAP ranges"), Stephen reports a warning when building htmldocs:
| Documentation/admin-guide/kdump/vmcoreinfo.rst:498: WARNING: Title underline too short.
Extend the underline to squash the warning.
Fixes: 2369f171d5 ("arm64: crash_core: Export MODULES, VMALLOC, and VMEMMAP ranges")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Will Deacon <will@kernel.org>
Remove unused gen-y.
Remove redundant $(shell ...) because 'mkdir' is done in cmd_gen_cpucaps.
Replace $(filter-out $(PHONY), $^) with the $(real-prereqs) shorthand.
The '&&' in cmd_gen_cpucaps should be replaced with ';' because it is
run under 'set -e' environment.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220227085232.206529-1-masahiroy@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The following patches resulted in deferring crash kernel reservation to
mem_init(), mainly aimed at platforms with DMA memory zones (no IOMMU),
in particular Raspberry Pi 4.
commit 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32")
commit 8424ecdde7 ("arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges")
commit 0a30c53573 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
commit 2687275a58 ("arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required")
Above changes introduced boot slowdown due to linear map creation for
all the memory banks with NO_BLOCK_MAPPINGS, see discussion[1]. The proposed
changes restore crash kernel reservation to earlier behavior thus avoids
slow boot, particularly for platforms with IOMMU (no DMA memory zones).
Tested changes to confirm no ~150ms boot slowdown on our SoC with IOMMU
and 8GB memory. Also tested with ZONE_DMA and/or ZONE_DMA32 configs to confirm
no regression to deferring scheme of crash kernel memory reservation.
In both cases successfully collected kernel crash dump.
[1] https://lore.kernel.org/all/9436d033-579b-55fa-9b00-6f4b661c2dd7@linux.microsoft.com/
Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Cc: stable@vger.kernel.org
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/r/1646242689-20744-1-git-send-email-vijayb@linux.microsoft.com
[will: Add #ifdef CONFIG_KEXEC_CORE guards to fix 'crashk_res' references in allnoconfig build]
Signed-off-by: Will Deacon <will@kernel.org>
If the test triggers a problem it may well result in a log message from
the kernel such as a WARN() or BUG(). If these include a PID it can help
with debugging to know if it was the parent or child process that triggered
the issue, since the test is just creating a new thread the process name
will be the same either way. Print the PIDs of the parent and child on
startup so users have this information to hand should it be needed.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220303192817.2732509-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
When a IAR register read races with a GIC interrupt RELEASE event,
GIC-CPU interface could wrongly return a valid INTID to the CPU
for an interrupt that is already released(non activated) instead of 0x3ff.
As a side effect, an interrupt handler could run twice, once with
interrupt priority and then with idle priority.
As a workaround, gic_read_iar is updated so that it will return a
valid interrupt ID only if there is a change in the active priority list
after the IAR read on all the affected Silicons.
Since there are silicon variants where both 23154 and 38545 are applicable,
workaround for erratum 23154 has been extended to address both of them.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220307143014.22758-1-lcherian@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
When a contiguous HugeTLB page is mapped, set_pte_at() will be called
CONT_PTES/CONT_PMDS times. Therefore, __sync_icache_dcache() will
flush cache multiple times if the page is executable (to ensure
the I-D cache coherency). However, the first flushing cache already
covers subsequent cache flush operations. So only flusing cache
for the head page if it is a HugeTLB page to avoid redundant cache
flushing. In the next patch, it is also depends on this change
since the tail vmemmap pages of HugeTLB is mapped with read-only
meanning only head page struct can be modified.
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220302084624.33340-1-songmuchun@bytedance.com
Signed-off-by: Will Deacon <will@kernel.org>
We may call arm64_apply_bp_hardening() early during entry (e.g. in
el0_ia()) before it is safe to run instrumented code. Unfortunately this
may result in running instrumented code in two cases:
* The hardening callbacks called by arm64_apply_bp_hardening() are not
marked as `noinstr`, and have been observed to be instrumented when
compiled with either GCC or LLVM.
* Since arm64_apply_bp_hardening() itself is only marked as `inline`
rather than `__always_inline`, it is possible that the compiler
decides to place it out-of-line, whereupon it may be instrumented.
For example, with defconfig built with clang 13.0.0,
call_hvc_arch_workaround_1() is compiled as:
| <call_hvc_arch_workaround_1>:
| d503233f paciasp
| f81f0ffe str x30, [sp, #-16]!
| 320183e0 mov w0, #0x80008000
| d503201f nop
| d4000002 hvc #0x0
| f84107fe ldr x30, [sp], #16
| d50323bf autiasp
| d65f03c0 ret
... but when CONFIG_FTRACE=y and CONFIG_KCOV=y this is compiled as:
| <call_hvc_arch_workaround_1>:
| d503245f bti c
| d503201f nop
| d503201f nop
| d503233f paciasp
| a9bf7bfd stp x29, x30, [sp, #-16]!
| 910003fd mov x29, sp
| 94000000 bl 0 <__sanitizer_cov_trace_pc>
| 320183e0 mov w0, #0x80008000
| d503201f nop
| d4000002 hvc #0x0
| a8c17bfd ldp x29, x30, [sp], #16
| d50323bf autiasp
| d65f03c0 ret
... with a patchable function entry registered with ftrace, and a direct
call to __sanitizer_cov_trace_pc(). Neither of these are safe early
during entry sequences.
This patch avoids the unsafe instrumentation by marking
arm64_apply_bp_hardening() as `__always_inline` and by marking the
hardening functions as `noinstr`. This avoids the potential for
instrumentation, and causes clang to consistently generate the function
as with the defconfig sample.
Note: in the defconfig compilation, when CONFIG_SVE=y, x30 is spilled to
the stack without being placed in a frame record, which will result in a
missing entry if call_hvc_arch_workaround_1() is backtraced. Similar is
true of qcom_link_stack_sanitisation(), where inline asm spills the LR
to a GPR prior to corrupting it. This is not a significant issue
presently as we will only backtrace here if an exception is taken, and
in such cases we may omit entries for other reasons today.
The relevant hardening functions were introduced in commits:
ec82b567a7 ("arm64: Implement branch predictor hardening for Falkor")
b092201e00 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support")
... and these were subsequently moved in commit:
d4647f0a2a ("arm64: Rewrite Spectre-v2 mitigation code")
The arm64_apply_bp_hardening() function was introduced in commit:
0f15adbb28 ("arm64: Add skeleton to harden the branch predictor against aliasing attacks")
... and was subsequently moved and reworked in commit:
6279017e80 ("KVM: arm64: Move BP hardening helpers into spectre.h")
Fixes: ec82b567a7 ("arm64: Implement branch predictor hardening for Falkor")
Fixes: b092201e00 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support")
Fixes: d4647f0a2a ("arm64: Rewrite Spectre-v2 mitigation code")
Fixes: 0f15adbb28 ("arm64: Add skeleton to harden the branch predictor against aliasing attacks")
Fixes: 6279017e80 ("KVM: arm64: Move BP hardening helpers into spectre.h")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220224181028.512873-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
cpu_has_fwb() is supposed to warn user is following architectural
requirement is not valid:
LoUU, bits [29:27] - Level of Unification Uniprocessor for the cache
hierarchy.
Note
When FEAT_S2FWB is implemented, the architecture requires that
this field is zero so that no levels of data cache need to be
cleaned in order to manage coherency with instruction fetches.
LoUIS, bits [23:21] - Level of Unification Inner Shareable for the
cache hierarchy.
Note
When FEAT_S2FWB is implemented, the architecture requires that
this field is zero so that no levels of data cache need to be
cleaned in order to manage coherency with instruction fetches.
It is not really clear what user have to do if assertion fires. Having
assertions about the CPU design like this inspire even more assertions
to be added and the kernel definitely is not the right place for that,
so let's remove cpu_has_fwb() altogether.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Link: https://lore.kernel.org/r/20220224164739.119168-1-vladimir.murzin@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
It is a preparation patch for eBPF atomic supports under arm64. eBPF
needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
the same with the implementations in linux kernel.
Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
helper is added. atomic_fetch_add() and other atomic ops needs support for
STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220217072232.1186625-3-houtao1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
If CONFIG_ARM64_LSE_ATOMICS is off, encoders for LSE-related instructions
can return AARCH64_BREAK_FAULT directly in insn.h. In order to access
AARCH64_BREAK_FAULT in insn.h, we can not include debug-monitors.h in
insn.h, because debug-monitors.h has already depends on insn.h, so just
move AARCH64_BREAK_FAULT into insn-def.h.
It will be used by the following patch to eliminate unnecessary LSE-related
encoders when CONFIG_ARM64_LSE_ATOMICS is off.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220217072232.1186625-2-houtao1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those
to simplify the definition of function aliases across arch/x86.
For clarity, where there are multiple annotations such as
EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For
example, where a function has a name and an alias which are both
exported, this is organised as:
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
EXPORT_SYMBOL(func)
SYM_FUNC_ALIAS(alias, func)
EXPORT_SYMBOL(alias)
Where there are only aliases and no exports or other annotations, I have
not bothered with line spacing, e.g.
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
SYM_FUNC_ALIAS(alias, func)
The tools/perf/ copies of memset_64.S and memset_64.S are updated
likewise to avoid the build system complaining these are mismatched:
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
| diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
| diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220216162229.1076788-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those
to simplify and more consistently define function aliases across
arch/arm64.
Aliases are now defined in terms of a canonical function name. For
position-independent functions I've made the __pi_<func> name the
canonical name, and defined other alises in terms of this.
The SYM_FUNC_{START,END}_PI(func) macros obscure the __pi_<func> name,
and make this hard to seatch for. The SYM_FUNC_START_WEAK_PI() macro
also obscures the fact that the __pi_<func> fymbol is global and the
<func> symbol is weak. For clarity, I have removed these macros and used
SYM_FUNC_{START,END}() directly with the __pi_<func> name.
For example:
SYM_FUNC_START_WEAK_PI(func)
... asm insns ...
SYM_FUNC_END_PI(func)
EXPORT_SYMBOL(func)
... becomes:
SYM_FUNC_START(__pi_func)
... asm insns ...
SYM_FUNC_END(__pi_func)
SYM_FUNC_ALIAS_WEAK(func, __pi_func)
EXPORT_SYMBOL(func)
For clarity, where there are multiple annotations such as
EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For
example, where a function has a name and an alias which are both
exported, this is organised as:
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
EXPORT_SYMBOL(func)
SYM_FUNC_ALIAS(alias, func)
EXPORT_SYMBOL(alias)
For consistency with the other string functions, I've defined strrchr as
a position-independent function, as it can safely be used as such even
though we have no users today.
As we no longer use SYM_FUNC_{START,END}_ALIAS(), our local copies are
removed. The common versions will be removed by a subsequent patch.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220216162229.1076788-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently aliasing an asm function requires adding START and END
annotations for each name, as per Documentation/asm-annotations.rst:
SYM_FUNC_START_ALIAS(__memset)
SYM_FUNC_START(memset)
... asm insns ...
SYM_FUNC_END(memset)
SYM_FUNC_END_ALIAS(__memset)
This is more painful than necessary to maintain, especially where a
function has many aliases, some of which we may wish to define
conditionally. For example, arm64's memcpy/memmove implementation (which
uses some arch-specific SYM_*() helpers) has:
SYM_FUNC_START_ALIAS(__memmove)
SYM_FUNC_START_ALIAS_WEAK_PI(memmove)
SYM_FUNC_START_ALIAS(__memcpy)
SYM_FUNC_START_WEAK_PI(memcpy)
... asm insns ...
SYM_FUNC_END_PI(memcpy)
EXPORT_SYMBOL(memcpy)
SYM_FUNC_END_ALIAS(__memcpy)
EXPORT_SYMBOL(__memcpy)
SYM_FUNC_END_ALIAS_PI(memmove)
EXPORT_SYMBOL(memmove)
SYM_FUNC_END_ALIAS(__memmove)
EXPORT_SYMBOL(__memmove)
SYM_FUNC_START(name)
It would be much nicer if we could define the aliases *after* the
standard function definition. This would avoid the need to specify each
symbol name twice, and would make it easier to spot the canonical
function definition.
This patch adds new macros to allow us to do so, which allows the above
example to be rewritten more succinctly as:
SYM_FUNC_START(__pi_memcpy)
... asm insns ...
SYM_FUNC_END(__pi_memcpy)
SYM_FUNC_ALIAS(__memcpy, __pi_memcpy)
EXPORT_SYMBOL(__memcpy)
SYM_FUNC_ALIAS_WEAK(memcpy, __memcpy)
EXPORT_SYMBOL(memcpy)
SYM_FUNC_ALIAS(__pi_memmove, __pi_memcpy)
SYM_FUNC_ALIAS(__memmove, __pi_memmove)
EXPORT_SYMBOL(__memmove)
SYM_FUNC_ALIAS_WEAK(memmove, __memmove)
EXPORT_SYMBOL(memmove)
The reduction in duplication will also make it possible to replace some
uses of WEAK with more accurate Kconfig guards, e.g.
#ifndef CONFIG_KASAN
SYM_FUNC_ALIAS(memmove, __memmove)
EXPORT_SYMBOL(memmove)
#endif
... which should make it easier to ensure that symbols are neither used
nor overidden unexpectedly.
The existing SYM_FUNC_START_ALIAS() and SYM_FUNC_START_LOCAL_ALIAS() are
marked as deprecated, and will be removed once existing users are moved
over to the new scheme.
The tools/perf/ copy of linkage.h is updated to match. A subsequent
patch will depend upon this when updating the x86 asm annotations.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220216162229.1076788-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
For each vma mapped with PROT_MTE (the VM_MTE flag set), generate a
PT_ARM_MEMTAG_MTE segment in the core file and dump the corresponding
tags. The in-file size for such segments is 128 bytes per page.
For pages in a VM_MTE vma which are not present in the user page tables
or don't have the PG_mte_tagged flag set (e.g. execute-only), just write
zeros in the core file.
An example of program headers for two vmas, one 2-page, the other 4-page
long:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
...
LOAD 0x030000 0x0000ffff80034000 0x0000000000000000 0x000000 0x002000 RW 0x1000
LOAD 0x030000 0x0000ffff80036000 0x0000000000000000 0x004000 0x004000 RW 0x1000
...
LOPROC+0x1 0x05b000 0x0000ffff80034000 0x0000000000000000 0x000100 0x002000 0
LOPROC+0x1 0x05b100 0x0000ffff80036000 0x0000000000000000 0x000200 0x004000 0
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Luis Machado <luis.machado@linaro.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220131165456.2160675-5-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Memory tags will be dumped in the core file as segments with their own
type. Discussions with the binutils and the generic ABI community
settled on using new definitions in the PT_*PROC space (and to be
documented in the processor-specific ABIs).
Introduce PT_ARM_MEMTAG_MTE as (PT_LOPROC + 0x1). Not included in this
patch since there is no upstream support but the CHERI/BSD community
will also reserve:
#define PT_ARM_MEMTAG_CHERI (PT_LOPROC + 0x2)
#define PT_RISCV_MEMTAG_CHERI (PT_LOPROC + 0x3)
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Luis Machado <luis.machado@linaro.org>
Link: https://lore.kernel.org/r/20220131165456.2160675-3-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Due to a historical oversight, we emit a redundant static branch for
each atomic/atomic64 operation when CONFIG_ARM64_LSE_ATOMICS is
selected. We can safely remove this, making the kernel Image reasonably
smaller.
When CONFIG_ARM64_LSE_ATOMICS is selected, every LSE atomic operation
has two preceding static branches with the same target, e.g.
b f7c <kernel_init_freeable+0xa4>
b f7c <kernel_init_freeable+0xa4>
mov w0, #0x1 // #1
ldadd w0, w0, [x19]
This is because the __lse_ll_sc_body() wrapper uses
system_uses_lse_atomics(), which checks both `arm64_const_caps_ready`
and `cpu_hwcap_keys[ARM64_HAS_LSE_ATOMICS]`, each of which emits a
static branch. This has been the case since commit:
addfc38672 ("arm64: atomics: avoid out-of-line ll/sc atomics")
However, there was never a need to check `arm64_const_caps_ready`, which
was itself introduced in commit:
63a1e1c95e ("arm64/cpufeature: don't use mutex in bringup path")
... so that cpus_have_const_cap() could fall back to checking the
`cpu_hwcaps` bitmap prior to the static keys for individual caps
becoming enabled. As system_uses_lse_atomics() doesn't check
`cpu_hwcaps`, and doesn't need to as we can safely use the LL/SC atomics
prior to enabling the `ARM64_HAS_LSE_ATOMICS` static key, it doesn't
need to check `arm64_const_caps_ready`.
This patch removes the `arm64_const_caps_ready` check from
system_uses_lse_atomics(). As the arch_atomic_* routines are meant to be
safely usable in noinstr code, I've also marked
system_uses_lse_atomics() as __always_inline.
This results in one fewer static branch per atomic operation, with the
prior example becoming:
b f78 <kernel_init_freeable+0xa0>
mov w0, #0x1 // #1
ldadd w0, w0, [x19]
Each static branch consists of the branch itself and an associated
__jump_table entry. Removing these has a reasonable impact on the Image
size, with a GCC 11.1.0 defconfig v5.17-rc2 Image being reduced by
128KiB:
| [mark@lakrids:~/src/linux]% ls -al Image*
| -rw-r--r-- 1 mark mark 34619904 Feb 3 18:24 Image.baseline
| -rw-r--r-- 1 mark mark 34488832 Feb 3 18:33 Image.onebranch
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220204104439.270567-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
With the current wording readers might infer that PR_GET_TAGGED_ADDR_CTRL
will report the mode currently active in the thread however this is not the
actual behaviour, instead all modes currently selected by the process will
be reported with the mode used depending on the combination of the
requested modes and the default set for the current CPU. This has been the
case since 433c38f40f ("arm64: mte: change ASYNC and SYNC TCF settings
into bitfields"), before that we did not allow more than one mode to be
requested simultaneously.
Update the documentation to more clearly reflect current behaviour.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220127190324.660405-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The GCR EL1 test unconditionally includes local definitions of the prctls
it tests. Since not only will the kselftest build infrastructure ensure
that the in tree uapi headers are available but the toolchain being used to
build kselftest may ensure that system uapi headers with MTE support are
available this causes the compiler to warn about duplicate definitions.
Remove these duplicate definitions.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220126174421.1712795-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
When the insn framework is used to encode an AND/ORR/EOR instruction,
aarch64_encode_immediate() is used to pick the immr imms values.
If the immediate is a 64bit mask, with bit 63 set, and zeros in any
of the upper 32 bits, the immr value is incorrectly calculated meaning
the wrong mask is generated.
For example, 0x8000000000000001 should have an immr of 1, but 32 is used,
meaning the resulting mask is 0x0000000300000000.
It would appear eBPF is unable to hit these cases, as build_insn()'s
imm value is a s32, so when used with BPF_ALU64, the sign-extended
u64 immediate would always have all-1s or all-0s in the upper 32 bits.
KVM does not generate a va_mask with any of the top bits set as these
VA wouldn't be usable with TTBR0_EL2.
This happens because the rotation is calculated from fls(~imm), which
takes an unsigned int, but the immediate may be 64bit.
Use fls64() so the 64bit mask doesn't get truncated to a u32.
Signed-off-by: James Morse <james.morse@arm.com>
Brown-paper-bag-for: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127162127.2391947-4-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The 'fixmap' is a global resource and is used recursively by
create pud mapping(), leading to a potential race condition in the
presence of a concurrent call to alloc_init_pud():
kernel_init thread virtio-mem workqueue thread
================== ===========================
alloc_init_pud(...) alloc_init_pud(...)
pudp = pud_set_fixmap_offset(...) pudp = pud_set_fixmap_offset(...)
READ_ONCE(*pudp)
pud_clear_fixmap(...)
READ_ONCE(*pudp) // CRASH!
As kernel may sleep during creating pud mapping, introduce a mutex lock to
serialise use of the fixmap entries by alloc_init_pud(). However, there is
no need for locking in early boot stage and it doesn't work well with
KASLR enabled when early boot. So, enable lock when system_state doesn't
equal to "SYSTEM_BOOTING".
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: f471044545 ("arm64: mm: use fixmap when creating page tables")
Link: https://lore.kernel.org/r/20220201114400.56885-1-jianyong.wu@arm.com
Signed-off-by: Will Deacon <will@kernel.org>