The literal pool entry for identifying the vectors base is the only piece
of information in the trampoline page that identifies the true location
of the kernel.
This patch moves it into a page-aligned region of the .rodata section
and maps this adjacent to the trampoline text via an additional fixmap
entry, which protects against any accidental leakage of the trampoline
contents.
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Laura Abbott <labbott@redhat.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
commit 6c27c4082f)
Change-Id: Iffe72dc5e7ee171d83a7b916a16146e35ddf904e
[ghackmann@google.com:
- adjust context
- replace ARM64_WORKAROUND_QCOM_FALKOR_E1003 alternative with
compile-time CONFIG_ARCH_MSM8996 check]
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
To allow unmapping of the kernel whilst running at EL0, we need to
point the exception vectors at an entry trampoline that can map/unmap
the kernel on entry/exit respectively.
This patch adds the trampoline page, although it is not yet plugged
into the vector table and is therefore unused.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
commit c7b9adaf85)
Change-Id: Idd27ab26f1ec1db2ff756fc33ebb782201806f7c
[ghackmann@google.com: adjust context]
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.
Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>. Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from be7635e728)
Change-Id: Ib321eb9c2b76ef4785cf3fd522169f524348bd9a
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Resume from hibernate needs to clean any text executed by the kernel with
the MMU off to the PoC. Collect these functions together into the
.idmap.text section as all this code is tightly coupled and also needs
the same cleaning after resume.
Data is more complicated, secondary_holding_pen_release is written with
the MMU on, clean and invalidated, then read with the MMU off. In contrast
__boot_cpu_mode is written with the MMU off, the corresponding cache line
is invalidated, so when we read it with the MMU on we don't get stale data.
These cache maintenance operations conflict with each other if the values
are within a Cache Writeback Granule (CWG) of each other.
Collect the data into two sections .mmuoff.data.read and .mmuoff.data.write,
the linker script ensures mmuoff.data.write section is aligned to the
architectural maximum CWG of 2KB.
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit b611303811)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
This patch adds the uaccess macros/functions to disable access to user
space by setting TTBR0_EL1 to a reserved zeroed page. Since the value
written to TTBR0_EL1 must be a physical address, for simplicity this
patch introduces a reserved_ttbr0 page at a constant offset from
swapper_pg_dir. The uaccess_disable code uses the ttbr1_el1 value
adjusted by the reserved_ttbr0 offset.
Enabling access to user is done by restoring TTBR0_EL1 with the value
from the struct thread_info ttbr0 variable. Interrupts must be disabled
during the uaccess_ttbr0_enable code to ensure the atomicity of the
thread_info.ttbr0 read and TTBR0_EL1 write. This patch also moves the
get_thread_info asm macro from entry.S to assembler.h for reuse in the
uaccess_ttbr0_* macros.
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 31432001
Change-Id: I54ada623160cb47f5762e0e39a5e84a75252dbfd
(cherry picked from commit 4b65a5db36)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Conflicts:
conflicts are almost come from mm-kaslr, focus on mm
arch/arm64/include/asm/cpufeature.h
arch/arm64/include/asm/pgtable.h
arch/arm64/kernel/Makefile
arch/arm64/kernel/cpufeature.c
arch/arm64/kernel/head.S
arch/arm64/kernel/suspend.c
arch/arm64/kernel/vmlinux.lds.S
arch/arm64/kvm/hyp.S
arch/arm64/mm/init.c
arch/arm64/mm/mmu.c
arch/arm64/mm/proc-macros.S
Currently we have separate ALIGN_DEBUG_RO{,_MIN} directives to align
_etext and __init_begin. While we ensure that __init_begin is
page-aligned, we do not provide the same guarantee for _etext. This is
not problematic currently as the alignment of __init_begin is sufficient
to prevent issues when we modify permissions.
Subsequent patches will assume page alignment of segments of the kernel
we wish to map with different permissions. To ensure this, move _etext
after the ALIGN_DEBUG_RO_MIN for the init section. This renders the
prior ALIGN_DEBUG_RO irrelevant, and hence it is removed. Likewise,
upgrade to ALIGN_DEBUG_RO_MIN(PAGE_SIZE) for _stext.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit fca082bfb5)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Add support for hibernate/suspend-to-disk.
Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
before calling swsusp_save() to save the memory image.
Restore creates a set of temporary page tables, covering only the
linear map, copies the restore code to a 'safe' page, then uses the copy to
restore the memory image. The copied code executes in the lower half of the
address space, and once complete, restores the original kernel's page
tables. It then calls into cpu_resume(), and follows the normal
cpu_suspend() path back into the suspend code.
To restore a kernel using KASLR, the address of the page tables, and
cpu_resume() are stored in the hibernate arch-header and the el2
vectors are pivotted via the 'safe' page in low memory.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Kevin Hilman <khilman@baylibre.com> # Tested on Juno R2
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 82869ac57b)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/kernel/Makefile
Currently we treat the alternatives separately from other data that's
only used during initialisation, using separate .altinstructions and
.altinstr_replacement linker sections. These are freed for general
allocation separately from .init*. This is problematic as:
* We do not remove execute permissions, as we do for .init, leaving the
memory executable.
* We pad between them, making the kernel Image bianry up to PAGE_SIZE
bytes larger than necessary.
This patch moves the two sections into the contiguous region used for
.init*. This saves some memory, ensures that we remove execute
permissions, and allows us to remove some code made redundant by this
reorganisation.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 9aa4ec1571)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
commit 888b3c8720 upstream.
Entry symbols are not kprobe safe. So blacklist them for kprobing.
[dave.long@linaro.org: Remove check for hypervisor text]
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: David A. Long <dave.long@linaro.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
[catalin.marinas@arm.com: Do not include syscall wrappers in .entry.text]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
commit 2dd0e8d2d2 upstream.
Add support for basic kernel probes(kprobes) and jump probes
(jprobes) for ARM64.
Kprobes utilizes software breakpoint and single step debug
exceptions supported on ARM v8.
A software breakpoint is placed at the probe address to trap the
kernel execution into the kprobe handler.
ARM v8 supports enabling single stepping before the break exception
return (ERET), with next PC in exception return address (ELR_EL1). The
kprobe handler prepares an executable memory slot for out-of-line
execution with a copy of the original instruction being probed, and
enables single stepping. The PC is set to the out-of-line slot address
before the ERET. With this scheme, the instruction is executed with the
exact same register context except for the PC (and DAIF) registers.
Debug mask (PSTATE.D) is enabled only when single stepping a recursive
kprobe, e.g.: during kprobes reenter so that probed instruction can be
single stepped within the kprobe handler -exception- context.
The recursion depth of kprobe is always 2, i.e. upon probe re-entry,
any further re-entry is prevented by not calling handlers and the case
counted as a missed kprobe).
Single stepping from the x-o-l slot has a drawback for PC-relative accesses
like branching and symbolic literals access as the offset from the new PC
(slot address) may not be ensured to fit in the immediate value of
the opcode. Such instructions need simulation, so reject
probing them.
Instructions generating exceptions or cpu mode change are rejected
for probing.
Exclusive load/store instructions are rejected too. Additionally, the
code is checked to see if it is inside an exclusive load/store sequence
(code from Pratyush).
System instructions are mostly enabled for stepping, except MSR/MRS
accesses to "DAIF" flags in PSTATE, which are not safe for
probing.
[<dave.long@linaro.org>: changed to remove irq_stack references]
This also changes arch/arm64/include/asm/ptrace.h to use
include/asm-generic/ptrace.h.
Thanks to Steve Capper and Pratyush Anand for several suggested
Changes.
Signed-off-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds the uaccess macros/functions to disable access to user
space by setting TTBR0_EL1 to a reserved zeroed page. Since the value
written to TTBR0_EL1 must be a physical address, for simplicity this
patch introduces a reserved_ttbr0 page at a constant offset from
swapper_pg_dir. The uaccess_disable code uses the ttbr1_el1 value
adjusted by the reserved_ttbr0 offset.
Enabling access to user is done by restoring TTBR0_EL1 with the value
from the struct thread_info ttbr0 variable. Interrupts must be disabled
during the uaccess_ttbr0_enable code to ensure the atomicity of the
thread_info.ttbr0 read and TTBR0_EL1 write. This patch also moves the
get_thread_info asm macro from entry.S to assembler.h for reuse in the
uaccess_ttbr0_* macros.
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Change-Id: Idf09a870b8612dce23215bce90d88781f0c0c3aa
(cherry picked from commit 940d37234182d2675ab8ab46084840212d735018)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
As Kees Cook notes in the ARM counterpart of this patch [0]:
The _etext position is defined to be the end of the kernel text code,
and should not include any part of the data segments. This interferes
with things that might check memory ranges and expect executable code
up to _etext.
In particular, Kees is referring to the HARDENED_USERCOPY patch set [1],
which rejects attempts to call copy_to_user() on kernel ranges containing
executable code, but does allow access to the .rodata segment. Regardless
of whether one may or may not agree with the distinction, it makes sense
for _etext to have the same meaning across architectures.
So let's put _etext where it belongs, between .text and .rodata, and fix
up existing references to use __init_begin instead, which unlike _end_rodata
includes the exception and notes sections as well.
The _etext references in kaslr.c are left untouched, since its references
to [_stext, _etext) are meant to capture potential jump instruction targets,
and so disregarding .rodata is actually an improvement here.
[0] http://article.gmane.org/gmane.linux.kernel/2245084
[1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502
Reported-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 9fdc14c55c)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
The linker routines that we rely on to produce a relocatable PIE binary
treat it as a shared ELF object in some ways, i.e., it emits symbol based
R_AARCH64_ABS64 relocations into the final binary since doing so would be
appropriate when linking a shared library that is subject to symbol
preemption. (This means that an executable can override certain symbols
that are exported by a shared library it is linked with, and that the
shared library *must* update all its internal references as well, and point
them to the version provided by the executable.)
Symbol preemption does not occur for OS hosted PIE executables, let alone
for vmlinux, and so we would prefer to get rid of these symbol based
relocations. This would allow us to simplify the relocation routines, and
to strip the .dynsym, .dynstr and .hash sections from the binary. (Note
that these are tiny, and are placed in the .init segment, but they clutter
up the vmlinux binary.)
Note that these R_AARCH64_ABS64 relocations are only emitted for absolute
references to symbols defined in the linker script, all other relocatable
quantities are covered by anonymous R_AARCH64_RELATIVE relocations that
simply list the offsets to all 64-bit values in the binary that need to be
fixed up based on the offset between the link time and run time addresses.
Fortunately, GNU ld has a -Bsymbolic option, which is intended for shared
libraries to allow them to ignore symbol preemption, and unconditionally
bind all internal symbol references to its own definitions. So set it for
our PIE binary as well, and get rid of the asoociated sections and the
relocation code that processes them.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: fixed conflict with __dynsym_offset linker script entry]
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 08cc55b2af)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Due to the untyped KIMAGE_VADDR constant, the linker may not notice
that the __rela_offset and __dynsym_offset expressions are absolute
values (i.e., are not subject to relocation). This does not matter for
KASLR, but it does confuse kallsyms in relative mode, since it uses
the lowest non-absolute symbol address as the anchor point, and expects
all other symbol addresses to be within 4 GB of it.
Fix this by qualifying these expressions as ABSOLUTE() explicitly.
Fixes: 0cd3defe0a ("arm64: kernel: perform relocation processing from ID map")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit d6732fc402)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Refactor the relocation processing so that the code executes from the
ID map while accessing the relocation tables via the virtual mapping.
This way, we can use literals containing virtual addresses as before,
instead of having to use convoluted absolute expressions.
For symmetry with the secondary code path, the relocation code and the
subsequent jump to the virtual entry point are implemented in a function
called __primary_switch(), and __mmap_switched() is renamed to
__primary_switched(). Also, the call sequence in stext() is aligned with
the one in secondary_startup(), by replacing the awkward 'adr_l lr' and
'b cpu_setup' sequence with a simple branch and link.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 0cd3defe0a)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
The mapping of the kernel consist of four segments, each of which is mapped
with different permission attributes and/or lifetimes. To optimize the TLB
and translation table footprint, we define various opaque constants in the
linker script that resolve to different aligment values depending on the
page size and whether CONFIG_DEBUG_ALIGN_RODATA is set.
Considering that
- a 4 KB granule kernel benefits from a 64 KB segment alignment (due to
the fact that it allows the use of the contiguous bit),
- the minimum alignment of the .data segment is THREAD_SIZE already, not
PAGE_SIZE (i.e., we already have padding between _data and the start of
the .data payload in many cases),
- 2 MB is a suitable alignment value on all granule sizes, either for
mapping directly (level 2 on 4 KB), or via the contiguous bit (level 3 on
16 KB and 64 KB),
- anything beyond 2 MB exceeds the minimum alignment mandated by the boot
protocol, and can only be mapped efficiently if the physical alignment
happens to be the same,
we can simplify this by standardizing on 64 KB (or 2 MB) explicitly, i.e.,
regardless of granule size, all segments are aligned either to 64 KB, or to
2 MB if CONFIG_DEBUG_ALIGN_RODATA=y. This also means we can drop the Kconfig
dependency of CONFIG_DEBUG_ALIGN_RODATA on CONFIG_ARM64_4K_PAGES.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 97740051dd)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Keeping .head.text out of the .text mapping buys us very little: its actual
payload is only 4 KB, most of which is padding, but the page alignment may
add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional
padding to the uncompressed kernel Image.
Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to
map the adjacent 56 KB of code without the PTE_CONT attribute, and since
this region contains things like the vector table and the GIC interrupt
handling entry point, this region is likely to benefit from the reduced TLB
pressure that results from PTE_CONT mappings.
So remove the alignment between the .head.text and .text sections, and use
the [_text, _etext) rather than the [_stext, _etext) interval for mapping
the .text segment.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 7eb90f2ff7)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Currently the .rodata section is actually still executable when DEBUG_RODATA
is enabled. This changes that so the .rodata is actually read only, no execute.
It also adds the .rodata section to the mem_init banner.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: added vm_struct vmlinux_rodata in map_kernel()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 2f39b5f91e)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
This implements CONFIG_RELOCATABLE, which links the final vmlinux
image with a dynamic relocation section, allowing the early boot code
to perform a relocation to a different virtual address at runtime.
This is a prerequisite for KASLR (CONFIG_RANDOMIZE_BASE).
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 1e48ef7fcc)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
the symbolic virtual base of the kernel region, i.e., the kernel's virtual
offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
equal to PAGE_OFFSET, but in the future, it will be moved below it once
we move the kernel virtual mapping out of the linear mapping.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ab893fb9f1)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Currently we have separate ALIGN_DEBUG_RO{,_MIN} directives to align
_etext and __init_begin. While we ensure that __init_begin is
page-aligned, we do not provide the same guarantee for _etext. This is
not problematic currently as the alignment of __init_begin is sufficient
to prevent issues when we modify permissions.
Subsequent patches will assume page alignment of segments of the kernel
we wish to map with different permissions. To ensure this, move _etext
after the ALIGN_DEBUG_RO_MIN for the init section. This renders the
prior ALIGN_DEBUG_RO irrelevant, and hence it is removed. Likewise,
upgrade to ALIGN_DEBUG_RO_MIN(PAGE_SIZE) for _stext.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit fca082bfb5)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Currently we treat the alternatives separately from other data that's
only used during initialisation, using separate .altinstructions and
.altinstr_replacement linker sections. These are freed for general
allocation separately from .init*. This is problematic as:
* We do not remove execute permissions, as we do for .init, leaving the
memory executable.
* We pad between them, making the kernel Image bianry up to PAGE_SIZE
bytes larger than necessary.
This patch moves the two sections into the contiguous region used for
.init*. This saves some memory, ensures that we remove execute
permissions, and allows us to remove some code made redundant by this
reorganisation.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 9aa4ec1571)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Currently we place an ALIGN_DEBUG_RO between text and data for the .text
and .init sections, and depending on configuration each of these may
result in up to SECTION_SIZE bytes worth of padding (for
DEBUG_RODATA_ALIGN).
We make no distinction between the text and data in each of these
sections at any point when creating the initial page tables in head.S.
We also make no distinction when modifying the tables; __map_memblock,
fixup_executable, mark_rodata_ro, and fixup_init only work at section
granularity. Thus this padding is unnecessary.
For the spit between init text and data we impose a minimum alignment of
16 bytes, but this is also unnecessary. The init data is output
immediately after the padding before any symbols are defined, so this is
not required to keep a symbol for linker a section array correctly
associated with the data. Any objects within the section will be given
at least their usual alignment regardless.
This patch removes the redundant padding.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 5b28cd9d08)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Bring the linker script in line with the recent increase of
L1_CACHE_BYTES to 128. Replace the hardcoded value of 64 with the
symbolic constant.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: fix up RW_DATA_SECTION as well]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
A kernel built with DEBUG_RO_DATA && !CONFIG_DEBUG_ALIGN_RODATA doesn't
have .text aligned to a page boundary, though fixup_executable works at
page-granularity thanks to its use of create_mapping. If .text is not
page-aligned, the first page it exists in may be marked non-executable,
leading to failures when an attempt is made to execute code in said
page.
This patch upgrades ALIGN_DEBUG_RO and ALIGN_DEBUG_RO_MIN to force page
alignment for DEBUG_RO_DATA && !CONFIG_DEBUG_ALIGN_RODATA kernels,
ensuring that all sections with specific RWX permission requirements are
mapped with the correct permissions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Laura Abbott <laura@labbott.name>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: da141706ae ("arm64: add better page protections to arm64")
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Move the kernel pagetable (both swapper and idmap) definitions
from the generic asm/page.h to a new file, asm/kernel-pgtable.h.
This is mostly a cosmetic change, to clean up the asm/page.h to
get rid of the arch specific details which are not needed by the
generic code.
Also renames the symbols to prevent conflicts. e.g,
BLOCK_SHIFT => SWAPPER_BLOCK_SHIFT
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit ea8c2e1124 ("arm64: Extend the idmap to the whole kernel
image") changed the early page table code so that the entire kernel
Image is covered by the identity map. This allows functions that
need to enable or disable the MMU to reside anywhere in the kernel
Image.
However, this change has the unfortunate side effect that the Image
cannot cross a physical 512 MB alignment boundary anymore, since the
early page table code cannot deal with the Image crossing a /virtual/
512 MB alignment boundary.
So instead, reduce the ID map to a single page, that is populated by
the contents of the .idmap.text section. Only three functions reside
there at the moment: __enable_mmu(), cpu_resume_mmu() and cpu_reset().
If new code is introduced that needs to manipulate the MMU state, it
should be added to this section as well.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The HYP init bounce page is a runtime construct that ensures that the
HYP init code does not cross a page boundary. However, this is something
we can do perfectly well at build time, by aligning the code appropriately.
For arm64, we just align to 4 KB, and enforce that the code size is less
than 4 KB, regardless of the chosen page size.
For ARM, the whole code is less than 256 bytes, so we tweak the linker
script to align at a power of 2 upper bound of the code size
Note that this also fixes a benign off-by-one error in the original bounce
page code, where a bounce page would be allocated unnecessarily if the code
was exactly 1 page in size.
On ARM, it also fixes an issue with very large kernels reported by Arnd
Bergmann, where stub sections with linker emitted veneers could erroneously
trigger the size/alignment ASSERT() in the linker script.
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add page protections for arm64 similar to those in arm.
This is for security reasons to prevent certain classes
of exploits. The current method:
- Map all memory as either RWX or RW. We round to the nearest
section to avoid creating page tables before everything is mapped
- Once everything is mapped, if either end of the RWX section should
not be X, we split the PMD and remap as necessary
- When initmem is to be freed, we change the permissions back to
RW (using stop machine if necessary to flush the TLB)
- If CONFIG_DEBUG_RODATA is set, the read only sections are set
read only.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
.exit.* sections may be subject to patching by the new alternatives
framework and so shouldn't be discarded at link-time. Without this patch,
such a section will result in the following linker error:
`.exit.text' referenced in section `.altinstructions' of
drivers/built-in.o: defined in discarded section `.exit.text' of
drivers/built-in.o
Signed-off-by: Will Deacon <will.deacon@arm.com>
With a blatant copy of some x86 bits we introduce the alternative
runtime patching "framework" to arm64.
This is quite basic for now and we only provide the functions we need
at this time.
This is connected to the newly introduced feature bits.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Change our PE/COFF header to use the minimum file alignment of
512 bytes (0x200), as mandated by the PE/COFF spec v8.3
Also update the linker script so that the Image file itself is also a
round multiple of FileAlignment.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Roy Franz <roy.franz@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
This patch changes the __init_end address to a
page align address, so that free_initmem() can
free the whole .init section, because if the end
address is not page aligned, it will round down to
a page align address, then the tail unligned page
will not be freed.
Signed-off-by: wang <yalin.wang2010@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The arm64 Image header contains a text_offset field which bootloaders
are supposed to read to determine the offset (from a 2MB aligned "start
of memory" per booting.txt) at which to load the kernel. The offset is
not well respected by bootloaders at present, and due to the lack of
variation there is little incentive to support it. This is unfortunate
for the sake of future kernels where we may wish to vary the text offset
(even zeroing it).
This patch adds options to arm64 to enable fuzz-testing of text_offset.
CONFIG_ARM64_RANDOMIZE_TEXT_OFFSET forces the text offset to a random
16-byte aligned value value in the range [0..2MB) upon a build of the
kernel. It is recommended that distribution kernels enable randomization
to test bootloaders such that any compliance issues can be fixed early.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Tom Rini <trini@ti.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently the kernel Image is stripped of everything past the initial
stack, and at runtime the memory is initialised and used by the kernel.
This makes the effective minimum memory footprint of the kernel larger
than the size of the loaded binary, though bootloaders have no mechanism
to identify how large this minimum memory footprint is. This makes it
difficult to choose safe locations to place both the kernel and other
binaries required at boot (DTB, initrd, etc), such that the kernel won't
clobber said binaries or other reserved memory during initialisation.
Additionally when big endian support was added the image load offset was
overlooked, and is currently of an arbitrary endianness, which makes it
difficult for bootloaders to make use of it. It seems that bootloaders
aren't respecting the image load offset at present anyway, and are
assuming that offset 0x80000 will always be correct.
This patch adds an effective image size to the kernel header which
describes the amount of memory from the start of the kernel Image binary
which the kernel expects to use before detecting memory and handling any
memory reservations. This can be used by bootloaders to choose suitable
locations to load the kernel and/or other binaries such that the kernel
will not clobber any memory unexpectedly. As before, memory reservations
are required to prevent the kernel from clobbering these locations
later.
Both the image load offset and the effective image size are forced to be
little-endian regardless of the native endianness of the kernel to
enable bootloaders to load a kernel of arbitrary endianness. Bootloaders
which wish to make use of the load offset can inspect the effective
image size field for a non-zero value to determine if the offset is of a
known endianness. To enable software to determine the endinanness of the
kernel as may be required for certain use-cases, a new flags field (also
little-endian) is added to the kernel header to export this information.
The documentation is updated to clarify these details. To discourage
future assumptions regarding the value of text_offset, the value at this
point in time is removed from the main flow of the documentation (though
kept as a compatibility note). Some minor formatting issues in the
documentation are also corrected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Tom Rini <trini@ti.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Kevin Hilman <kevin.hilman@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we place swapper_pg_dir and idmap_pg_dir below the kernel
image, between PHYS_OFFSET and (PHYS_OFFSET + TEXT_OFFSET). However,
bootloaders may use portions of this memory below the kernel and we do
not parse the memory reservation list until after the MMU has been
enabled. As such we may clobber some memory a bootloader wishes to have
preserved.
To enable the use of all of this memory by bootloaders (when the
required memory reservations are communicated to the kernel) it is
necessary to move our initial page tables elsewhere. As we currently
have an effectively unbound requirement for memory at the end of the
kernel image for .bss, we can place the page tables here.
This patch moves the initial page table to the end of the kernel image,
after the BSS. As they do not consist of any initialised data they will
be stripped from the kernel Image as with the BSS. The BSS clearing
routine is updated to stop at __bss_stop rather than _end so as to not
clobber the page tables, and memory reservations made redundant by the
new organisation are removed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Change the arm64 linker script ENTRY() command to define _text as the
kernel entry point.
The arm64 boot protocol specifies that the kernel must be entered at the
beginning of the kernel image. The existing ENTRY() command defined the
symbol stext as the entry point, which emitted an incorrect entry point,
but would not cause a runtime error because the existing entry code
immediately jumps to stext.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The __data_loc variable is an unused left over from the 32 bit arm implementation.
Remove that variable and adjust the __mmap_switched startup routine accordingly.
Signed-off-by: Geoff Levand <geoff@infradead.org> for Huawei, Linaro
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We currently try to emit .comment twice, once in STABS_DEBUG, and once
in the line immediately following it. As the two section definitions are
identical, the latter is redundant and can be dropped.
This patch drops the redundant .comment section definition.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The .data section in the arm64 linker script currently lacks a
definition for page-aligned data. This leads to a .page_aligned
section being placed between the end of data and start of bss.
This patch corrects that by using the generic RW_DATA_SECTION
macro which includes support for page-aligned data.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The arm64 kernel has an internal holding pen, which is necessary for
some systems where we can't bring CPUs online individually and must hold
multiple CPUs in a safe area until the kernel is able to handle them.
The current SMP infrastructure for arm64 is closely coupled to this
holding pen, and alternative boot methods must launch CPUs into the pen,
where they sit before they are launched into the kernel proper.
With PSCI (and possibly other future boot methods), we can bring CPUs
online individually, and need not perform the secondary_holding_pen
dance. Instead, this patch factors the holding pen management code out
to the spin-table boot method code, as it is the only boot method
requiring the pen.
A new entry point for secondaries, secondary_entry is added for other
boot methods to use, which bypasses the holding pen and its associated
overhead when bringing CPUs online. The smp.pen.text section is also
removed, as the pen can live in head.text without problem.
The cpu_operations structure is extended with two new functions,
cpu_boot and cpu_postboot, for bringing a cpu into the kernel and
performing any post-boot cleanup required by a bootmethod (e.g.
resetting the secondary_holding_pen_release to INVALID_HWID).
Documentation is added for cpu_operations.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The current vmlinux.lds.S places the notes sections between the
end of rw data and start of bss. This means that _edata doesn't
really point to the end of data. Since notes are read-only, this
patch moves them to the read-only segment so that _edata does
point to the end of initialized rw data.
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As is done for other architectures, sort the exception table at
build-time rather than during boot.
Since sortextable appears to be a standalone C program relying on the
host elf.h to provide EM_AARCH64, I've had to add a conditional check in
order to allow cross-compilation on machines that aren't running a
bleeding-edge libc-dev.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add the necessary infrastructure for identity-mapped HYP page
tables. Idmap-ed code must be in the ".hyp.idmap.text" linker
section.
The rest of the HYP ends up in ".hyp.text".
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>