Merge tag 'lsk-v4.4-17.07-android' of git://git.linaro.org/kernel/linux-linaro-stable.git

LSK 17.07 v4.4-android

* tag 'lsk-v4.4-17.07-android': (402 commits)
  dt/vendor-prefixes: remove redundant vendor
  Linux 4.4.77
  saa7134: fix warm Medion 7134 EEPROM read
  x86/mm/pat: Don't report PAT on CPUs that don't support it
  ext4: check return value of kstrtoull correctly in reserved_clusters_store
  staging: comedi: fix clean-up of comedi_class in comedi_init()
  staging: vt6556: vnt_start Fix missing call to vnt_key_init_table.
  tcp: fix tcp_mark_head_lost to check skb len before fragmenting
  md: fix super_offset endianness in super_1_rdev_size_change
  md: fix incorrect use of lexx_to_cpu in does_sb_need_changing
  perf tools: Use readdir() instead of deprecated readdir_r() again
  perf tests: Remove wrong semicolon in while loop in CQM test
  perf trace: Do not process PERF_RECORD_LOST twice
  perf dwarf: Guard !x86_64 definitions under #ifdef else clause
  perf pmu: Fix misleadingly indented assignment (whitespace)
  perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
  perf tools: Remove duplicate const qualifier
  perf script: Use readdir() instead of deprecated readdir_r()
  perf thread_map: Use readdir() instead of deprecated readdir_r()
  perf tools: Use readdir() instead of deprecated readdir_r()
  ...

Conflicts:
	Makefile
	drivers/Kconfig
	drivers/Makefile
	drivers/usb/dwc3/gadget.c

Change-Id: Ib4aae2e34ebbf0d7953c748a33f673acb3e744fc
This commit is contained in:
Huang, Tao
2017-07-26 19:32:04 +08:00
462 changed files with 9493 additions and 1662 deletions

View File

@@ -435,6 +435,8 @@ sysrq.txt
- info on the magic SysRq key.
target/
- directory with info on generating TCM v4 fabric .ko modules
tee.txt
- info on the TEE subsystem and drivers
this_cpu_ops.txt
- List rationale behind and the way to use this_cpu operations.
thermal/

View File

@@ -0,0 +1,31 @@
OP-TEE Device Tree Bindings
OP-TEE is a piece of software using hardware features to provide a Trusted
Execution Environment. The security can be provided with ARM TrustZone, but
also by virtualization or a separate chip.
We're using "linaro" as the first part of the compatible property for
the reference implementation maintained by Linaro.
* OP-TEE based on ARM TrustZone required properties:
- compatible : should contain "linaro,optee-tz"
- method : The method of calling the OP-TEE Trusted OS. Permitted
values are:
"smc" : SMC #0, with the register assignments specified
in drivers/tee/optee/optee_smc.h
"hvc" : HVC #0, with the register assignments specified
in drivers/tee/optee/optee_smc.h
Example:
firmware {
optee {
compatible = "linaro,optee-tz";
method = "smc";
};
};

View File

@@ -52,3 +52,48 @@ This property is set (currently only on PowerPC, and only needed on
book3e) by some versions of kexec-tools to tell the new kernel that it
is being booted by kexec, as the booting environment may differ (e.g.
a different secondary CPU release mechanism)
linux,usable-memory-range
-------------------------
This property (arm64 only) holds a base address and size, describing a
limited region in which memory may be considered available for use by
the kernel. Memory outside of this range is not available for use.
This property describes a limitation: memory within this range is only
valid when also described through another mechanism that the kernel
would otherwise use to determine available memory (e.g. memory nodes
or the EFI memory map). Valid memory may be sparse within the range.
e.g.
/ {
chosen {
linux,usable-memory-range = <0x9 0xf0000000 0x0 0x10000000>;
};
};
The main usage is for crash dump kernel to identify its own usable
memory and exclude, at its boot time, any other memory areas that are
part of the panicked kernel's memory.
While this property does not represent a real hardware, the address
and the size are expressed in #address-cells and #size-cells,
respectively, of the root node.
linux,elfcorehdr
----------------
This property (currently used only on arm64) holds the memory range,
the address and the size, of the elf core header which mainly describes
the panicked kernel's memory layout as PT_LOAD segments of elf format.
e.g.
/ {
chosen {
linux,elfcorehdr = <0x9 0xfffff000 0x0 0x800>;
};
};
While this property does not represent a real hardware, the address
and the size are expressed in #address-cells and #size-cells,
respectively, of the root node.

View File

@@ -128,6 +128,7 @@ lacie LaCie
lantiq Lantiq Semiconductor
lenovo Lenovo Group Ltd.
lg LG Corporation
linaro Linaro Limited
linux Linux-specific binding
lsi LSI Corp. (LSI Logic)
lltc Linear Technology Corporation

View File

@@ -307,6 +307,7 @@ Code Seq#(hex) Include File Comments
0xA3 80-8F Port ACL in development:
<mailto:tlewis@mindspring.com>
0xA3 90-9F linux/dtlk.h
0xA4 00-1F uapi/linux/tee.h Generic TEE subsystem
0xAA 00-3F linux/uapi/linux/userfaultfd.h
0xAB 00-1F linux/nbd.h
0xAC 00-1F linux/raw.h

View File

@@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
a remote system.
Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
s390x and arm architectures.
s390x, arm and arm64 architectures.
When the system kernel boots, it reserves a small section of memory for
the dump-capture kernel. This ensures that ongoing Direct Memory Access
@@ -249,6 +249,13 @@ Dump-capture kernel config options (Arch Dependent, arm)
AUTO_ZRELADDR=y
Dump-capture kernel config options (Arch Dependent, arm64)
----------------------------------------------------------
- Please note that kvm of the dump-capture kernel will not be enabled
on non-VHE systems even if it is configured. This is because the CPU
will not be reset to EL2 on panic.
Extended crashkernel syntax
===========================
@@ -312,6 +319,8 @@ Boot into System Kernel
any space below the alignment point may be overwritten by the dump-capture kernel,
which means it is possible that the vmcore is not that precise as expected.
On arm64, use "crashkernel=Y[@X]". Note that the start address of
the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
Load the Dump-capture Kernel
============================
@@ -334,6 +343,8 @@ For s390x:
- Use image or bzImage
For arm:
- Use zImage
For arm64:
- Use vmlinux or Image
If you are using a uncompressed vmlinux image then use following command
to load dump-capture kernel.
@@ -377,6 +388,9 @@ For s390x:
For arm:
"1 maxcpus=1 reset_devices"
For arm64:
"1 maxcpus=1 reset_devices"
Notes on loading the dump-capture kernel:
* By default, the ELF headers are stored in ELF64 format to support

View File

@@ -3590,6 +3590,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
spia_pedr=
spia_peddr=
stack_guard_gap= [MM]
override the default stack gap protection. The value
is in page units and it defines how many pages prior
to (for stacks growing down) resp. after (for stacks
growing up) the main stack are reserved for no other
mapping. Default value is 256 pages.
stacktrace [FTRACE]
Enabled the stack tracer on boot up.

View File

@@ -825,14 +825,13 @@ via the /proc/sys interface:
Each write syscall must fully contain the sysctl value to be
written, and multiple writes on the same sysctl file descriptor
will rewrite the sysctl value, regardless of file position.
0 - (default) Same behavior as above, but warn about processes that
perform writes to a sysctl file descriptor when the file position
is not 0.
1 - Respect file position when writing sysctl strings. Multiple writes
will append to the sysctl value buffer. Anything past the max length
of the sysctl value buffer will be ignored. Writes to numeric sysctl
entries must always be at file position 0 and the value must be
fully contained in the buffer sent in the write syscall.
0 - Same behavior as above, but warn about processes that perform writes
to a sysctl file descriptor when the file position is not 0.
1 - (default) Respect file position when writing sysctl strings. Multiple
writes will append to the sysctl value buffer. Anything past the max
length of the sysctl value buffer will be ignored. Writes to numeric
sysctl entries must always be at file position 0 and the value must
be fully contained in the buffer sent in the write syscall.
==============================================================

118
Documentation/tee.txt Normal file
View File

@@ -0,0 +1,118 @@
TEE subsystem
This document describes the TEE subsystem in Linux.
A TEE (Trusted Execution Environment) is a trusted OS running in some
secure environment, for example, TrustZone on ARM CPUs, or a separate
secure co-processor etc. A TEE driver handles the details needed to
communicate with the TEE.
This subsystem deals with:
- Registration of TEE drivers
- Managing shared memory between Linux and the TEE
- Providing a generic API to the TEE
The TEE interface
=================
include/uapi/linux/tee.h defines the generic interface to a TEE.
User space (the client) connects to the driver by opening /dev/tee[0-9]* or
/dev/teepriv[0-9]*.
- TEE_IOC_SHM_ALLOC allocates shared memory and returns a file descriptor
which user space can mmap. When user space doesn't need the file
descriptor any more, it should be closed. When shared memory isn't needed
any longer it should be unmapped with munmap() to allow the reuse of
memory.
- TEE_IOC_VERSION lets user space know which TEE this driver handles and
the its capabilities.
- TEE_IOC_OPEN_SESSION opens a new session to a Trusted Application.
- TEE_IOC_INVOKE invokes a function in a Trusted Application.
- TEE_IOC_CANCEL may cancel an ongoing TEE_IOC_OPEN_SESSION or TEE_IOC_INVOKE.
- TEE_IOC_CLOSE_SESSION closes a session to a Trusted Application.
There are two classes of clients, normal clients and supplicants. The latter is
a helper process for the TEE to access resources in Linux, for example file
system access. A normal client opens /dev/tee[0-9]* and a supplicant opens
/dev/teepriv[0-9].
Much of the communication between clients and the TEE is opaque to the
driver. The main job for the driver is to receive requests from the
clients, forward them to the TEE and send back the results. In the case of
supplicants the communication goes in the other direction, the TEE sends
requests to the supplicant which then sends back the result.
OP-TEE driver
=============
The OP-TEE driver handles OP-TEE [1] based TEEs. Currently it is only the ARM
TrustZone based OP-TEE solution that is supported.
Lowest level of communication with OP-TEE builds on ARM SMC Calling
Convention (SMCCC) [2], which is the foundation for OP-TEE's SMC interface
[3] used internally by the driver. Stacked on top of that is OP-TEE Message
Protocol [4].
OP-TEE SMC interface provides the basic functions required by SMCCC and some
additional functions specific for OP-TEE. The most interesting functions are:
- OPTEE_SMC_FUNCID_CALLS_UID (part of SMCCC) returns the version information
which is then returned by TEE_IOC_VERSION
- OPTEE_SMC_CALL_GET_OS_UUID returns the particular OP-TEE implementation, used
to tell, for instance, a TrustZone OP-TEE apart from an OP-TEE running on a
separate secure co-processor.
- OPTEE_SMC_CALL_WITH_ARG drives the OP-TEE message protocol
- OPTEE_SMC_GET_SHM_CONFIG lets the driver and OP-TEE agree on which memory
range to used for shared memory between Linux and OP-TEE.
The GlobalPlatform TEE Client API [5] is implemented on top of the generic
TEE API.
Picture of the relationship between the different components in the
OP-TEE architecture.
User space Kernel Secure world
~~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~~
+--------+ +-------------+
| Client | | Trusted |
+--------+ | Application |
/\ +-------------+
|| +----------+ /\
|| |tee- | ||
|| |supplicant| \/
|| +----------+ +-------------+
\/ /\ | TEE Internal|
+-------+ || | API |
+ TEE | || +--------+--------+ +-------------+
| Client| || | TEE | OP-TEE | | OP-TEE |
| API | \/ | subsys | driver | | Trusted OS |
+-------+----------------+----+-------+----+-----------+-------------+
| Generic TEE API | | OP-TEE MSG |
| IOCTL (TEE_IOC_*) | | SMCCC (OPTEE_SMC_CALL_*) |
+-----------------------------+ +------------------------------+
RPC (Remote Procedure Call) are requests from secure world to kernel driver
or tee-supplicant. An RPC is identified by a special range of SMCCC return
values from OPTEE_SMC_CALL_WITH_ARG. RPC messages which are intended for the
kernel are handled by the kernel driver. Other RPC messages will be forwarded to
tee-supplicant without further involvement of the driver, except switching
shared memory buffer representation.
References:
[1] https://github.com/OP-TEE/optee_os
[2] http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html
[3] drivers/tee/optee/optee_smc.h
[4] drivers/tee/optee/optee_msg.h
[5] http://www.globalplatform.org/specificationsdevice.asp look for
"TEE Client API Specification v1.0" and click download.

View File

@@ -7939,6 +7939,11 @@ F: arch/*/oprofile/
F: drivers/oprofile/
F: include/linux/oprofile.h
OP-TEE DRIVER
M: Jens Wiklander <jens.wiklander@linaro.org>
S: Maintained
F: drivers/tee/optee/
ORACLE CLUSTER FILESYSTEM 2 (OCFS2)
M: Mark Fasheh <mfasheh@suse.com>
M: Joel Becker <jlbec@evilplan.org>
@@ -9366,6 +9371,14 @@ F: drivers/hwtracing/stm/
F: include/linux/stm.h
F: include/uapi/linux/stm.h
TEE SUBSYSTEM
M: Jens Wiklander <jens.wiklander@linaro.org>
S: Maintained
F: include/linux/tee_drv.h
F: include/uapi/linux/tee.h
F: drivers/tee/
F: Documentation/tee.txt
THUNDERBOLT DRIVER
M: Andreas Noever <andreas.noever@gmail.com>
S: Maintained

View File

@@ -1,6 +1,6 @@
VERSION = 4
PATCHLEVEL = 4
SUBLEVEL = 71
SUBLEVEL = 77
EXTRAVERSION =
NAME = Blurry Fish Butt
@@ -650,6 +650,17 @@ endif
# Tell gcc to never replace conditional load with a non-conditional one
KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
# check for 'asm goto'
ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
else ifneq ($(findstring aarch64-linux-android, $(CROSS_COMPILE)),)
# It seems than android gcc can't pass gcc-goto.sh check, but asm goto work.
# So let's active it.
KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
endif
ifdef CONFIG_READABLE_ASM
# Disable optimizations that make assembler listings hard to read.
# reorder blocks reorders the control in the function
@@ -805,17 +816,6 @@ KBUILD_CFLAGS += $(call cc-option,-Werror=date-time)
# use the deterministic mode of AR if available
KBUILD_ARFLAGS := $(call ar-option,D)
# check for 'asm goto'
ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC)), y)
KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
else ifneq ($(findstring aarch64-linux-android, $(CROSS_COMPILE)),)
# It seems than android gcc can't pass gcc-goto.sh check, but asm goto work.
# So let's active it.
KBUILD_CFLAGS += -DCC_HAVE_ASM_GOTO
KBUILD_AFLAGS += -DCC_HAVE_ASM_GOTO
endif
include scripts/Makefile.kasan
include scripts/Makefile.extrawarn

View File

@@ -64,7 +64,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -54,14 +54,14 @@
timer@0200 {
compatible = "arm,cortex-a9-global-timer";
reg = <0x0200 0x100>;
interrupts = <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH>;
interrupts = <GIC_PPI 11 IRQ_TYPE_EDGE_RISING>;
clocks = <&clk_periph>;
};
local-timer@0600 {
compatible = "arm,cortex-a9-twd-timer";
reg = <0x0600 0x100>;
interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH>;
interrupts = <GIC_PPI 13 IRQ_TYPE_EDGE_RISING>;
clocks = <&clk_periph>;
};

View File

@@ -30,7 +30,7 @@
/* kHz uV */
996000 1250000
792000 1175000
396000 1075000
396000 1150000
>;
fsl,soc-operating-points = <
/* ARM kHz SOC-PU uV */

View File

@@ -110,7 +110,6 @@ __do_hyp_init:
@ - Write permission implies XN: disabled
@ - Instruction cache: enabled
@ - Data/Unified cache: enabled
@ - Memory alignment checks: enabled
@ - MMU: enabled (this code must be run from an identity mapping)
mrc p15, 4, r0, c1, c0, 0 @ HSCR
ldr r2, =HSCTLR_MASK
@@ -118,8 +117,8 @@ __do_hyp_init:
mrc p15, 0, r1, c1, c0, 0 @ SCTLR
ldr r2, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
and r1, r1, r2
ARM( ldr r2, =(HSCTLR_M | HSCTLR_A) )
THUMB( ldr r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE) )
ARM( ldr r2, =(HSCTLR_M) )
THUMB( ldr r2, =(HSCTLR_M | HSCTLR_TE) )
orr r1, r1, r2
orr r0, r0, r1
isb

View File

@@ -876,6 +876,9 @@ static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
pmd_t *pmd;
pud = stage2_get_pud(kvm, cache, addr);
if (!pud)
return NULL;
if (pud_none(*pud)) {
if (!cache)
return NULL;

View File

@@ -89,7 +89,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
@@ -140,7 +140,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -1184,15 +1184,15 @@ void __init sanity_check_meminfo(void)
high_memory = __va(arm_lowmem_limit - 1) + 1;
if (!memblock_limit)
memblock_limit = arm_lowmem_limit;
/*
* Round the memblock limit down to a pmd size. This
* helps to ensure that we will allocate memory from the
* last full pmd, which should be mapped.
*/
if (memblock_limit)
memblock_limit = round_down(memblock_limit, PMD_SIZE);
if (!memblock_limit)
memblock_limit = arm_lowmem_limit;
memblock_limit = round_down(memblock_limit, PMD_SIZE);
memblock_set_current_limit(memblock_limit);
}

View File

@@ -631,6 +631,27 @@ config SECCOMP
and the task is only allowed to execute a few safe syscalls
defined by each seccomp mode.
config KEXEC
depends on PM_SLEEP_SMP
select KEXEC_CORE
bool "kexec system call"
---help---
kexec is a system call that implements the ability to shutdown your
current kernel, and to start another kernel. It is like a reboot
but it is independent of the system firmware. And like a reboot
you can start any kernel with it, not just Linux.
config CRASH_DUMP
bool "Build kdump crash kernel"
help
Generate crash dump after being started by kexec. This should
be normally only set in special crash dump kernels which are
loaded in the main kernel with kexec-tools into a specially
reserved region and then later executed after a crash by
kdump/kexec.
For more details see Documentation/kdump/kdump.txt
config XEN_DOM0
def_bool y
depends on XEN

View File

@@ -58,6 +58,8 @@ CONFIG_PREEMPT=y
CONFIG_KSM=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_CMA=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_CMDLINE="console=ttyAMA0"
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_COMPAT=y

View File

@@ -22,9 +22,9 @@
#define ACPI_MADT_GICC_LENGTH \
(acpi_gbl_FADT.header.revision < 6 ? 76 : 80)
#define BAD_MADT_GICC_ENTRY(entry, end) \
(!(entry) || (unsigned long)(entry) + sizeof(*(entry)) > (end) || \
(entry)->header.length != ACPI_MADT_GICC_LENGTH)
#define BAD_MADT_GICC_ENTRY(entry, end) \
(!(entry) || (entry)->header.length != ACPI_MADT_GICC_LENGTH || \
(unsigned long)(entry) + ACPI_MADT_GICC_LENGTH > (end))
/* Basic configuration for ACPI */
#ifdef CONFIG_ACPI

View File

@@ -0,0 +1,13 @@
#ifndef __ASM_ASM_UACCESS_H
#define __ASM_ASM_UACCESS_H
/*
* Remove the address tag from a virtual address, if present.
*/
.macro clear_address_tag, dst, addr
tst \addr, #(1 << 55)
bic \dst, \addr, #(0xff << 56)
csel \dst, \dst, \addr, eq
.endm
#endif

View File

@@ -44,23 +44,33 @@
#define smp_store_release(p, v) \
do { \
union { typeof(*p) __val; char __c[1]; } __u = \
{ .__val = (__force typeof(*p)) (v) }; \
compiletime_assert_atomic_type(*p); \
switch (sizeof(*p)) { \
case 1: \
asm volatile ("stlrb %w1, %0" \
: "=Q" (*p) : "r" (v) : "memory"); \
: "=Q" (*p) \
: "r" (*(__u8 *)__u.__c) \
: "memory"); \
break; \
case 2: \
asm volatile ("stlrh %w1, %0" \
: "=Q" (*p) : "r" (v) : "memory"); \
: "=Q" (*p) \
: "r" (*(__u16 *)__u.__c) \
: "memory"); \
break; \
case 4: \
asm volatile ("stlr %w1, %0" \
: "=Q" (*p) : "r" (v) : "memory"); \
: "=Q" (*p) \
: "r" (*(__u32 *)__u.__c) \
: "memory"); \
break; \
case 8: \
asm volatile ("stlr %1, %0" \
: "=Q" (*p) : "r" (v) : "memory"); \
: "=Q" (*p) \
: "r" (*(__u64 *)__u.__c) \
: "memory"); \
break; \
} \
} while (0)

View File

@@ -155,5 +155,6 @@ int set_memory_ro(unsigned long addr, int numpages);
int set_memory_rw(unsigned long addr, int numpages);
int set_memory_x(unsigned long addr, int numpages);
int set_memory_nx(unsigned long addr, int numpages);
int set_memory_valid(unsigned long addr, unsigned long size, int enable);
#endif

View File

@@ -20,7 +20,7 @@
#include <linux/threads.h>
#include <asm/irq.h>
#define NR_IPI 6
#define NR_IPI 7
typedef struct {
unsigned int __softirq_pending;

View File

@@ -0,0 +1,98 @@
/*
* kexec for arm64
*
* Copyright (C) Linaro.
* Copyright (C) Huawei Futurewei Technologies.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef _ARM64_KEXEC_H
#define _ARM64_KEXEC_H
/* Maximum physical address we can use pages from */
#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
/* Maximum address we can reach in physical address mode */
#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
/* Maximum address we can use for the control code buffer */
#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
#define KEXEC_CONTROL_PAGE_SIZE 4096
#define KEXEC_ARCH KEXEC_ARCH_AARCH64
#ifndef __ASSEMBLY__
/**
* crash_setup_regs() - save registers for the panic kernel
*
* @newregs: registers are saved here
* @oldregs: registers to be saved (may be %NULL)
*/
static inline void crash_setup_regs(struct pt_regs *newregs,
struct pt_regs *oldregs)
{
if (oldregs) {
memcpy(newregs, oldregs, sizeof(*newregs));
} else {
u64 tmp1, tmp2;
__asm__ __volatile__ (
"stp x0, x1, [%2, #16 * 0]\n"
"stp x2, x3, [%2, #16 * 1]\n"
"stp x4, x5, [%2, #16 * 2]\n"
"stp x6, x7, [%2, #16 * 3]\n"
"stp x8, x9, [%2, #16 * 4]\n"
"stp x10, x11, [%2, #16 * 5]\n"
"stp x12, x13, [%2, #16 * 6]\n"
"stp x14, x15, [%2, #16 * 7]\n"
"stp x16, x17, [%2, #16 * 8]\n"
"stp x18, x19, [%2, #16 * 9]\n"
"stp x20, x21, [%2, #16 * 10]\n"
"stp x22, x23, [%2, #16 * 11]\n"
"stp x24, x25, [%2, #16 * 12]\n"
"stp x26, x27, [%2, #16 * 13]\n"
"stp x28, x29, [%2, #16 * 14]\n"
"mov %0, sp\n"
"stp x30, %0, [%2, #16 * 15]\n"
"/* faked current PSTATE */\n"
"mrs %0, CurrentEL\n"
"mrs %1, SPSEL\n"
"orr %0, %0, %1\n"
"mrs %1, DAIF\n"
"orr %0, %0, %1\n"
"mrs %1, NZCV\n"
"orr %0, %0, %1\n"
/* pc */
"adr %1, 1f\n"
"1:\n"
"stp %1, %0, [%2, #16 * 16]\n"
: "=&r" (tmp1), "=&r" (tmp2)
: "r" (newregs)
: "memory"
);
}
}
#if defined(CONFIG_KEXEC_CORE) && defined(CONFIG_HIBERNATION)
extern bool crash_is_nosave(unsigned long pfn);
extern void crash_prepare_suspend(void);
extern void crash_post_resume(void);
#else
static inline bool crash_is_nosave(unsigned long pfn) {return false; }
static inline void crash_prepare_suspend(void) {}
static inline void crash_post_resume(void) {}
#endif
#endif /* __ASSEMBLY__ */
#endif

View File

@@ -33,7 +33,7 @@ extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
extern void init_mem_pgprot(void);
extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
unsigned long virt, phys_addr_t size,
pgprot_t prot);
pgprot_t prot, bool allow_block_mappings);
extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
#endif

View File

@@ -16,6 +16,19 @@
#ifndef __ASM_SMP_H
#define __ASM_SMP_H
/* Values for secondary_data.status */
#define CPU_MMU_OFF (-1)
#define CPU_BOOT_SUCCESS (0)
/* The cpu invoked ops->cpu_die, synchronise it with cpu_kill */
#define CPU_KILL_ME (1)
/* The cpu couldn't die gracefully and is looping in the kernel */
#define CPU_STUCK_IN_KERNEL (2)
/* Fatal system error detected by secondary CPU, crash the system */
#define CPU_PANIC_KERNEL (3)
#ifndef __ASSEMBLY__
#include <linux/threads.h>
#include <linux/cpumask.h>
#include <linux/thread_info.h>
@@ -54,11 +67,17 @@ asmlinkage void secondary_start_kernel(void);
/*
* Initial data for bringing up a secondary CPU.
* @stack - sp for the secondary CPU
* @status - Result passed back from the secondary CPU to
* indicate failure.
*/
struct secondary_data {
void *stack;
long status;
};
extern struct secondary_data secondary_data;
extern long __early_cpu_boot_status;
extern void secondary_entry(void);
extern void arch_send_call_function_single_ipi(int cpu);
@@ -77,5 +96,38 @@ extern int __cpu_disable(void);
extern void __cpu_die(unsigned int cpu);
extern void cpu_die(void);
extern void cpu_die_early(void);
static inline void cpu_park_loop(void)
{
for (;;) {
wfe();
wfi();
}
}
static inline void update_cpu_boot_status(int val)
{
WRITE_ONCE(secondary_data.status, val);
/* Ensure the visibility of the status update */
dsb(ishst);
}
/*
* If a secondary CPU enters the kernel but fails to come online,
* (e.g. due to mismatched features), and cannot exit the kernel,
* we increment cpus_stuck_in_kernel and leave the CPU in a
* quiesecent loop within the kernel text. The memory containing
* this loop must not be re-used for anything else as the 'stuck'
* core is executing it.
*
* This function is used to inhibit features like kexec and hibernate.
*/
bool cpus_are_stuck_in_kernel(void);
extern void smp_send_crash_stop(void);
extern bool smp_crash_stop_failed(void);
#endif /* ifndef __ASSEMBLY__ */
#endif /* ifndef __ASM_SMP_H */

View File

@@ -27,6 +27,7 @@
/*
* User space memory access functions
*/
#include <linux/bitops.h>
#include <linux/string.h>
#include <linux/thread_info.h>
@@ -119,6 +120,13 @@ static inline void set_fs(mm_segment_t fs)
flag; \
})
/*
* When dealing with data aborts, watchpoints, or instruction traps we may end
* up with a tagged userland pointer. Clear the tag to get a sane pointer to
* pass on to access_ok(), for instance.
*/
#define untagged_addr(addr) sign_extend64(addr, 55)
#define access_ok(type, addr, size) __range_ok(addr, size)
#define user_addr_max get_fs

View File

@@ -34,6 +34,11 @@
*/
#define HVC_SET_VECTORS 1
/*
* HVC_SOFT_RESTART - CPU soft reset, used by the cpu_soft_restart routine.
*/
#define HVC_SOFT_RESTART 2
#define BOOT_CPU_MODE_EL1 (0xe11)
#define BOOT_CPU_MODE_EL2 (0xe12)

View File

@@ -44,7 +44,9 @@ arm64-obj-$(CONFIG_ACPI) += acpi.o
arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
arm64-obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o
arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o
arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o
arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \
cpu-reset.o
arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-y += $(arm64-obj-y) vdso/ probes/
obj-m += $(arm64-obj-m)

View File

@@ -299,7 +299,8 @@ do { \
_ASM_EXTABLE(0b, 4b) \
_ASM_EXTABLE(1b, 4b) \
: "=&r" (res), "+r" (data), "=&r" (temp) \
: "r" (addr), "i" (-EAGAIN), "i" (-EFAULT) \
: "r" ((unsigned long)addr), "i" (-EAGAIN), \
"i" (-EFAULT) \
: "memory"); \
uaccess_disable(); \
} while (0)

View File

@@ -124,6 +124,8 @@ int main(void)
DEFINE(TZ_MINWEST, offsetof(struct timezone, tz_minuteswest));
DEFINE(TZ_DSTTIME, offsetof(struct timezone, tz_dsttime));
BLANK();
DEFINE(CPU_BOOT_STACK, offsetof(struct secondary_data, stack));
BLANK();
#ifdef CONFIG_KVM_ARM_HOST
DEFINE(VCPU_CONTEXT, offsetof(struct kvm_vcpu, arch.ctxt));
DEFINE(CPU_GP_REGS, offsetof(struct kvm_cpu_context, gp_regs));

View File

@@ -0,0 +1,54 @@
/*
* CPU reset routines
*
* Copyright (C) 2001 Deep Blue Solutions Ltd.
* Copyright (C) 2012 ARM Ltd.
* Copyright (C) 2015 Huawei Futurewei Technologies.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
#include <asm/sysreg.h>
#include <asm/virt.h>
.text
.pushsection .idmap.text, "ax"
/*
* __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
* cpu_soft_restart.
*
* @el2_switch: Flag to indicate a swich to EL2 is needed.
* @entry: Location to jump to for soft reset.
* arg0: First argument passed to @entry.
* arg1: Second argument passed to @entry.
* arg2: Third argument passed to @entry.
*
* Put the CPU into the same state as it would be if it had been reset, and
* branch to what would be the reset vector. It must be executed with the
* flat identity mapping.
*/
ENTRY(__cpu_soft_restart)
/* Clear sctlr_el1 flags. */
mrs x12, sctlr_el1
ldr x13, =SCTLR_ELx_FLAGS
bic x12, x12, x13
msr sctlr_el1, x12
isb
cbz x0, 1f // el2_switch?
mov x0, #HVC_SOFT_RESTART
hvc #0 // no return
1: mov x18, x1 // entry
mov x0, x2 // arg0
mov x1, x3 // arg1
mov x2, x4 // arg2
br x18
ENDPROC(__cpu_soft_restart)
.popsection

View File

@@ -0,0 +1,34 @@
/*
* CPU reset routines
*
* Copyright (C) 2015 Huawei Futurewei Technologies.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef _ARM64_CPU_RESET_H
#define _ARM64_CPU_RESET_H
#include <asm/virt.h>
void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
unsigned long arg0, unsigned long arg1, unsigned long arg2);
static inline void __noreturn cpu_soft_restart(unsigned long el2_switch,
unsigned long entry, unsigned long arg0, unsigned long arg1,
unsigned long arg2)
{
typeof(__cpu_soft_restart) *restart;
el2_switch = el2_switch && !is_kernel_in_hyp_mode() &&
is_hyp_mode_available();
restart = (void *)virt_to_phys(__cpu_soft_restart);
cpu_install_idmap();
restart(el2_switch, entry, arg0, arg1, arg2);
unreachable();
}
#endif

View File

@@ -894,28 +894,6 @@ static u64 __raw_read_system_reg(u32 sys_id)
}
}
/*
* Park the CPU which doesn't have the capability as advertised
* by the system.
*/
static void fail_incapable_cpu(char *cap_type,
const struct arm64_cpu_capabilities *cap)
{
int cpu = smp_processor_id();
pr_crit("CPU%d: missing %s : %s\n", cpu, cap_type, cap->desc);
/* Mark this CPU absent */
set_cpu_present(cpu, 0);
/* Check if we can park ourselves */
if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_die)
cpu_ops[cpu]->cpu_die(cpu);
asm(
"1: wfe\n"
" wfi\n"
" b 1b");
}
/*
* Run through the enabled system capabilities and enable() it on this CPU.
* The capabilities were decided based on the available CPUs at the boot time.
@@ -944,8 +922,11 @@ void verify_local_cpu_capabilities(void)
* If the new CPU misses an advertised feature, we cannot proceed
* further, park the cpu.
*/
if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i]))
fail_incapable_cpu("arm64_features", &caps[i]);
if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i])) {
pr_crit("CPU%d: missing feature: %s\n",
smp_processor_id(), caps[i].desc);
cpu_die_early();
}
if (caps[i].enable)
caps[i].enable(NULL);
}
@@ -953,8 +934,11 @@ void verify_local_cpu_capabilities(void)
for (i = 0, caps = arm64_hwcaps; caps[i].matches; i++) {
if (!cpus_have_hwcap(&caps[i]))
continue;
if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i]))
fail_incapable_cpu("arm64_hwcaps", &caps[i]);
if (!feature_matches(__raw_read_system_reg(caps[i].sys_reg), &caps[i])) {
pr_crit("CPU%d: missing HWCAP: %s\n",
smp_processor_id(), caps[i].desc);
cpu_die_early();
}
}
}

View File

@@ -0,0 +1,71 @@
/*
* Routines for doing kexec-based kdump
*
* Copyright (C) 2017 Linaro Limited
* Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/crash_dump.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/memblock.h>
#include <linux/uaccess.h>
#include <asm/memory.h>
/**
* copy_oldmem_page() - copy one page from old kernel memory
* @pfn: page frame number to be copied
* @buf: buffer where the copied page is placed
* @csize: number of bytes to copy
* @offset: offset in bytes into the page
* @userbuf: if set, @buf is in a user address space
*
* This function copies one page from old kernel memory into buffer pointed by
* @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
* copied or negative error in case of failure.
*/
ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
size_t csize, unsigned long offset,
int userbuf)
{
void *vaddr;
if (!csize)
return 0;
vaddr = memremap(__pfn_to_phys(pfn), PAGE_SIZE, MEMREMAP_WB);
if (!vaddr)
return -ENOMEM;
if (userbuf) {
if (copy_to_user((char __user *)buf, vaddr + offset, csize)) {
memunmap(vaddr);
return -EFAULT;
}
} else {
memcpy(buf, vaddr + offset, csize);
}
memunmap(vaddr);
return csize;
}
/**
* elfcorehdr_read - read from ELF core header
* @buf: buffer where the data is placed
* @csize: number of bytes to read
* @ppos: address in the memory
*
* This function reads @count bytes from elf core header which exists
* on crash dump kernel's memory.
*/
ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
{
memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
return count;
}

View File

@@ -36,7 +36,7 @@ int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
create_pgd_mapping(mm, md->phys_addr, md->virt_addr,
md->num_pages << EFI_PAGE_SHIFT,
__pgprot(prot_val | PTE_NG));
__pgprot(prot_val | PTE_NG), true);
return 0;
}

View File

@@ -32,6 +32,7 @@
#include <asm/ptrace.h>
#include <asm/thread_info.h>
#include <asm/uaccess.h>
#include <asm/asm-uaccess.h>
#include <asm/unistd.h>
/*
@@ -431,12 +432,13 @@ el1_da:
/*
* Data abort handling
*/
mrs x0, far_el1
mrs x3, far_el1
enable_dbg
// re-enable interrupts if they were enabled in the aborted context
tbnz x23, #7, 1f // PSR_I_BIT
enable_irq
1:
clear_address_tag x0, x3
mov x2, sp // struct pt_regs
bl do_mem_abort
@@ -598,7 +600,7 @@ el0_da:
// enable interrupts before calling the main handler
enable_dbg_and_irq
ct_user_exit
bic x0, x26, #(0xff << 56)
clear_address_tag x0, x26
mov x1, x25
mov x2, sp
bl do_mem_abort

View File

@@ -36,6 +36,7 @@
#include <asm/pgtable-hwdef.h>
#include <asm/pgtable.h>
#include <asm/page.h>
#include <asm/smp.h>
#include <asm/sysreg.h>
#include <asm/thread_info.h>
#include <asm/virt.h>
@@ -601,12 +602,6 @@ ENDPROC(set_cpu_boot_mode_flag)
ENTRY(__boot_cpu_mode)
.long BOOT_CPU_MODE_EL2
.long BOOT_CPU_MODE_EL1
/*
* The booting CPU updates the failed status @__early_cpu_boot_status,
* with MMU turned off.
*/
ENTRY(__early_cpu_boot_status)
.long 0
.popsection
@@ -655,7 +650,8 @@ __secondary_switched:
msr vbar_el1, x5
isb
ldr_l x0, secondary_data // get secondary_data.stack
adr_l x0, secondary_data
ldr x0, [x0, #CPU_BOOT_STACK] // get secondary_data.stack
mov sp, x0
and x0, x0, #~(THREAD_SIZE - 1)
msr sp_el0, x0 // save thread_info
@@ -663,6 +659,29 @@ __secondary_switched:
b secondary_start_kernel
ENDPROC(__secondary_switched)
/*
* The booting CPU updates the failed status @__early_cpu_boot_status,
* with MMU turned off.
*
* update_early_cpu_boot_status tmp, status
* - Corrupts tmp1, tmp2
* - Writes 'status' to __early_cpu_boot_status and makes sure
* it is committed to memory.
*/
.macro update_early_cpu_boot_status status, tmp1, tmp2
mov \tmp2, #\status
str_l \tmp2, __early_cpu_boot_status, \tmp1
dmb sy
dc ivac, \tmp1 // Invalidate potentially stale cache line
.endm
.pushsection .data..cacheline_aligned
.align L1_CACHE_SHIFT
ENTRY(__early_cpu_boot_status)
.long 0
.popsection
/*
* Enable the MMU.
*
@@ -680,6 +699,7 @@ ENTRY(__enable_mmu)
ubfx x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
b.ne __no_granule_support
update_early_cpu_boot_status 0, x1, x2
msr ttbr0_el1, x25 // load TTBR0
msr ttbr1_el1, x26 // load TTBR1
isb
@@ -719,8 +739,12 @@ ENTRY(__enable_mmu)
ENDPROC(__enable_mmu)
__no_granule_support:
/* Indicate that this CPU can't boot and is stuck in the kernel */
update_early_cpu_boot_status CPU_STUCK_IN_KERNEL, x1, x2
1:
wfe
b __no_granule_support
wfi
b 1b
ENDPROC(__no_granule_support)
__primary_switch:

View File

@@ -27,6 +27,7 @@
#include <asm/barrier.h>
#include <asm/cacheflush.h>
#include <asm/irqflags.h>
#include <asm/kexec.h>
#include <asm/memory.h>
#include <asm/mmu_context.h>
#include <asm/pgalloc.h>
@@ -94,7 +95,8 @@ int pfn_is_nosave(unsigned long pfn)
unsigned long nosave_begin_pfn = virt_to_pfn(&__nosave_begin);
unsigned long nosave_end_pfn = virt_to_pfn(&__nosave_end - 1);
return (pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn);
return ((pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn)) ||
crash_is_nosave(pfn);
}
void notrace save_processor_state(void)
@@ -245,6 +247,9 @@ int swsusp_arch_suspend(void)
local_dbg_save(flags);
if (__cpu_suspend_enter(&state)) {
/* make the crash dump kernel image visible/saveable */
crash_prepare_suspend();
ret = swsusp_save();
} else {
/* Clean kernel core startup/idle code to PoC*/
@@ -255,6 +260,9 @@ int swsusp_arch_suspend(void)
if (el2_reset_needed())
dcache_clean_range(__hyp_idmap_text_start, __hyp_idmap_text_end);
/* make the crash dump kernel image protected again */
crash_post_resume();
/*
* Tell the hibernation core that we've just restored
* the memory

View File

@@ -36,6 +36,7 @@
#include <asm/traps.h>
#include <asm/cputype.h>
#include <asm/system_misc.h>
#include <asm/uaccess.h>
/* Breakpoint currently in use for each BRP. */
static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[ARM_MAX_BRP]);

View File

@@ -71,8 +71,16 @@ el1_sync:
msr vbar_el2, x1
b 9f
2: cmp x0, #HVC_SOFT_RESTART
b.ne 3f
mov x0, x2
mov x2, x4
mov x4, x1
mov x1, x3
br x4 // no return
/* Someone called kvm_call_hyp() against the hyp-stub... */
2: mov x0, #ARM_EXCEPTION_HYP_GONE
3: mov x0, #ARM_EXCEPTION_HYP_GONE
9: eret
ENDPROC(el1_sync)

View File

@@ -0,0 +1,364 @@
/*
* kexec for arm64
*
* Copyright (C) Linaro.
* Copyright (C) Huawei Futurewei Technologies.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/kernel.h>
#include <linux/kexec.h>
#include <linux/page-flags.h>
#include <linux/smp.h>
#include <asm/cacheflush.h>
#include <asm/cpu_ops.h>
#include <asm/memory.h>
#include <asm/mmu.h>
#include <asm/mmu_context.h>
#include <asm/page.h>
#include "cpu-reset.h"
/* Global variables for the arm64_relocate_new_kernel routine. */
extern const unsigned char arm64_relocate_new_kernel[];
extern const unsigned long arm64_relocate_new_kernel_size;
/**
* kexec_image_info - For debugging output.
*/
#define kexec_image_info(_i) _kexec_image_info(__func__, __LINE__, _i)
static void _kexec_image_info(const char *func, int line,
const struct kimage *kimage)
{
unsigned long i;
pr_debug("%s:%d:\n", func, line);
pr_debug(" kexec kimage info:\n");
pr_debug(" type: %d\n", kimage->type);
pr_debug(" start: %lx\n", kimage->start);
pr_debug(" head: %lx\n", kimage->head);
pr_debug(" nr_segments: %lu\n", kimage->nr_segments);
for (i = 0; i < kimage->nr_segments; i++) {
pr_debug(" segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu pages\n",
i,
kimage->segment[i].mem,
kimage->segment[i].mem + kimage->segment[i].memsz,
kimage->segment[i].memsz,
kimage->segment[i].memsz / PAGE_SIZE);
}
}
void machine_kexec_cleanup(struct kimage *kimage)
{
/* Empty routine needed to avoid build errors. */
}
/**
* machine_kexec_prepare - Prepare for a kexec reboot.
*
* Called from the core kexec code when a kernel image is loaded.
* Forbid loading a kexec kernel if we have no way of hotplugging cpus or cpus
* are stuck in the kernel. This avoids a panic once we hit machine_kexec().
*/
int machine_kexec_prepare(struct kimage *kimage)
{
kexec_image_info(kimage);
if (kimage->type != KEXEC_TYPE_CRASH && cpus_are_stuck_in_kernel()) {
pr_err("Can't kexec: CPUs are stuck in the kernel.\n");
return -EBUSY;
}
return 0;
}
/**
* kexec_list_flush - Helper to flush the kimage list and source pages to PoC.
*/
static void kexec_list_flush(struct kimage *kimage)
{
kimage_entry_t *entry;
for (entry = &kimage->head; ; entry++) {
unsigned int flag;
void *addr;
/* flush the list entries. */
__flush_dcache_area(entry, sizeof(kimage_entry_t));
flag = *entry & IND_FLAGS;
if (flag == IND_DONE)
break;
addr = phys_to_virt(*entry & PAGE_MASK);
switch (flag) {
case IND_INDIRECTION:
/* Set entry point just before the new list page. */
entry = (kimage_entry_t *)addr - 1;
break;
case IND_SOURCE:
/* flush the source pages. */
__flush_dcache_area(addr, PAGE_SIZE);
break;
case IND_DESTINATION:
break;
default:
BUG();
}
}
}
/**
* kexec_segment_flush - Helper to flush the kimage segments to PoC.
*/
static void kexec_segment_flush(const struct kimage *kimage)
{
unsigned long i;
pr_debug("%s:\n", __func__);
for (i = 0; i < kimage->nr_segments; i++) {
pr_debug(" segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu pages\n",
i,
kimage->segment[i].mem,
kimage->segment[i].mem + kimage->segment[i].memsz,
kimage->segment[i].memsz,
kimage->segment[i].memsz / PAGE_SIZE);
__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
kimage->segment[i].memsz);
}
}
/**
* machine_kexec - Do the kexec reboot.
*
* Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
*/
void machine_kexec(struct kimage *kimage)
{
phys_addr_t reboot_code_buffer_phys;
void *reboot_code_buffer;
bool in_kexec_crash = (kimage == kexec_crash_image);
bool stuck_cpus = cpus_are_stuck_in_kernel();
/*
* New cpus may have become stuck_in_kernel after we loaded the image.
*/
BUG_ON(!in_kexec_crash && (stuck_cpus || (num_online_cpus() > 1)));
WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()),
"Some CPUs may be stale, kdump will be unreliable.\n");
reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
kexec_image_info(kimage);
pr_debug("%s:%d: control_code_page: %p\n", __func__, __LINE__,
kimage->control_code_page);
pr_debug("%s:%d: reboot_code_buffer_phys: %pa\n", __func__, __LINE__,
&reboot_code_buffer_phys);
pr_debug("%s:%d: reboot_code_buffer: %p\n", __func__, __LINE__,
reboot_code_buffer);
pr_debug("%s:%d: relocate_new_kernel: %p\n", __func__, __LINE__,
arm64_relocate_new_kernel);
pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
__func__, __LINE__, arm64_relocate_new_kernel_size,
arm64_relocate_new_kernel_size);
/*
* Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
* after the kernel is shut down.
*/
memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
arm64_relocate_new_kernel_size);
/* Flush the reboot_code_buffer in preparation for its execution. */
__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
flush_icache_range((uintptr_t)reboot_code_buffer,
arm64_relocate_new_kernel_size);
/* Flush the kimage list and its buffers. */
kexec_list_flush(kimage);
/* Flush the new image if already in place. */
if ((kimage != kexec_crash_image) && (kimage->head & IND_DONE))
kexec_segment_flush(kimage);
pr_info("Bye!\n");
/* Disable all DAIF exceptions. */
asm volatile ("msr daifset, #0xf" : : : "memory");
/*
* cpu_soft_restart will shutdown the MMU, disable data caches, then
* transfer control to the reboot_code_buffer which contains a copy of
* the arm64_relocate_new_kernel routine. arm64_relocate_new_kernel
* uses physical addressing to relocate the new image to its final
* position and transfers control to the image entry point when the
* relocation is complete.
*/
cpu_soft_restart(kimage != kexec_crash_image,
reboot_code_buffer_phys, kimage->head, kimage->start, 0);
BUG(); /* Should never get here. */
}
static void machine_kexec_mask_interrupts(void)
{
unsigned int i;
struct irq_desc *desc;
for_each_irq_desc(i, desc) {
struct irq_chip *chip;
int ret;
chip = irq_desc_get_chip(desc);
if (!chip)
continue;
/*
* First try to remove the active state. If this
* fails, try to EOI the interrupt.
*/
ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);
if (ret && irqd_irq_inprogress(&desc->irq_data) &&
chip->irq_eoi)
chip->irq_eoi(&desc->irq_data);
if (chip->irq_mask)
chip->irq_mask(&desc->irq_data);
if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
chip->irq_disable(&desc->irq_data);
}
}
/**
* machine_crash_shutdown - shutdown non-crashing cpus and save registers
*/
void machine_crash_shutdown(struct pt_regs *regs)
{
local_irq_disable();
/* shutdown non-crashing cpus */
smp_send_crash_stop();
/* for crashing cpu */
crash_save_cpu(regs, smp_processor_id());
machine_kexec_mask_interrupts();
pr_info("Starting crashdump kernel...\n");
}
void arch_kexec_protect_crashkres(void)
{
int i;
kexec_segment_flush(kexec_crash_image);
for (i = 0; i < kexec_crash_image->nr_segments; i++)
set_memory_valid(
__phys_to_virt(kexec_crash_image->segment[i].mem),
kexec_crash_image->segment[i].memsz >> PAGE_SHIFT, 0);
}
void arch_kexec_unprotect_crashkres(void)
{
int i;
for (i = 0; i < kexec_crash_image->nr_segments; i++)
set_memory_valid(
__phys_to_virt(kexec_crash_image->segment[i].mem),
kexec_crash_image->segment[i].memsz >> PAGE_SHIFT, 1);
}
#ifdef CONFIG_HIBERNATION
/*
* To preserve the crash dump kernel image, the relevant memory segments
* should be mapped again around the hibernation.
*/
void crash_prepare_suspend(void)
{
if (kexec_crash_image)
arch_kexec_unprotect_crashkres();
}
void crash_post_resume(void)
{
if (kexec_crash_image)
arch_kexec_protect_crashkres();
}
/*
* crash_is_nosave
*
* Return true only if a page is part of reserved memory for crash dump kernel,
* but does not hold any data of loaded kernel image.
*
* Note that all the pages in crash dump kernel memory have been initially
* marked as Reserved in kexec_reserve_crashkres_pages().
*
* In hibernation, the pages which are Reserved and yet "nosave" are excluded
* from the hibernation iamge. crash_is_nosave() does thich check for crash
* dump kernel and will reduce the total size of hibernation image.
*/
bool crash_is_nosave(unsigned long pfn)
{
int i;
phys_addr_t addr;
if (!crashk_res.end)
return false;
/* in reserved memory? */
addr = __pfn_to_phys(pfn);
if ((addr < crashk_res.start) || (crashk_res.end < addr))
return false;
if (!kexec_crash_image)
return true;
/* not part of loaded kernel image? */
for (i = 0; i < kexec_crash_image->nr_segments; i++)
if (addr >= kexec_crash_image->segment[i].mem &&
addr < (kexec_crash_image->segment[i].mem +
kexec_crash_image->segment[i].memsz))
return false;
return true;
}
void crash_free_reserved_phys_range(unsigned long begin, unsigned long end)
{
unsigned long addr;
struct page *page;
for (addr = begin; addr < end; addr += PAGE_SIZE) {
page = phys_to_page(addr);
ClearPageReserved(page);
free_reserved_page(page);
}
}
#endif /* CONFIG_HIBERNATION */
void arch_crash_save_vmcoreinfo(void)
{
VMCOREINFO_NUMBER(VA_BITS);
/* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
kimage_voffset);
vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n",
PHYS_OFFSET);
}

View File

@@ -0,0 +1,130 @@
/*
* kexec for arm64
*
* Copyright (C) Linaro.
* Copyright (C) Huawei Futurewei Technologies.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/kexec.h>
#include <linux/linkage.h>
#include <asm/assembler.h>
#include <asm/kexec.h>
#include <asm/page.h>
#include <asm/sysreg.h>
/*
* arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
*
* The memory that the old kernel occupies may be overwritten when coping the
* new image to its final location. To assure that the
* arm64_relocate_new_kernel routine which does that copy is not overwritten,
* all code and data needed by arm64_relocate_new_kernel must be between the
* symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end. The
* machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
* control_code_page, a special page which has been set up to be preserved
* during the copy operation.
*/
ENTRY(arm64_relocate_new_kernel)
/* Setup the list loop variables. */
mov x17, x1 /* x17 = kimage_start */
mov x16, x0 /* x16 = kimage_head */
dcache_line_size x15, x0 /* x15 = dcache line size */
mov x14, xzr /* x14 = entry ptr */
mov x13, xzr /* x13 = copy dest */
/* Clear the sctlr_el2 flags. */
mrs x0, CurrentEL
cmp x0, #CurrentEL_EL2
b.ne 1f
mrs x0, sctlr_el2
ldr x1, =SCTLR_ELx_FLAGS
bic x0, x0, x1
msr sctlr_el2, x0
isb
1:
/* Check if the new image needs relocation. */
tbnz x16, IND_DONE_BIT, .Ldone
.Lloop:
and x12, x16, PAGE_MASK /* x12 = addr */
/* Test the entry flags. */
.Ltest_source:
tbz x16, IND_SOURCE_BIT, .Ltest_indirection
/* Invalidate dest page to PoC. */
mov x0, x13
add x20, x0, #PAGE_SIZE
sub x1, x15, #1
bic x0, x0, x1
2: dc ivac, x0
add x0, x0, x15
cmp x0, x20
b.lo 2b
dsb sy
mov x20, x13
mov x21, x12
copy_page x20, x21, x0, x1, x2, x3, x4, x5, x6, x7
/* dest += PAGE_SIZE */
add x13, x13, PAGE_SIZE
b .Lnext
.Ltest_indirection:
tbz x16, IND_INDIRECTION_BIT, .Ltest_destination
/* ptr = addr */
mov x14, x12
b .Lnext
.Ltest_destination:
tbz x16, IND_DESTINATION_BIT, .Lnext
/* dest = addr */
mov x13, x12
.Lnext:
/* entry = *ptr++ */
ldr x16, [x14], #8
/* while (!(entry & DONE)) */
tbz x16, IND_DONE_BIT, .Lloop
.Ldone:
/* wait for writes from copy_page to finish */
dsb nsh
ic iallu
dsb nsh
isb
/* Start new image. */
mov x0, xzr
mov x1, xzr
mov x2, xzr
mov x3, xzr
br x17
ENDPROC(arm64_relocate_new_kernel)
.ltorg
.align 3 /* To keep the 64-bit values below naturally aligned. */
.Lcopy_end:
.org KEXEC_CONTROL_PAGE_SIZE
/*
* arm64_relocate_new_kernel_size - Number of bytes to copy to the
* control_code_page.
*/
.globl arm64_relocate_new_kernel_size
arm64_relocate_new_kernel_size:
.quad .Lcopy_end - arm64_relocate_new_kernel

View File

@@ -31,7 +31,6 @@
#include <linux/screen_info.h>
#include <linux/init.h>
#include <linux/kexec.h>
#include <linux/crash_dump.h>
#include <linux/root_dev.h>
#include <linux/cpu.h>
#include <linux/interrupt.h>
@@ -220,6 +219,12 @@ static void __init request_standard_resources(void)
if (kernel_data.start >= res->start &&
kernel_data.end <= res->end)
request_resource(res, &kernel_data);
#ifdef CONFIG_KEXEC_CORE
/* Userspace will find "Crash kernel" region in /proc/iomem. */
if (crashk_res.end && crashk_res.start >= res->start &&
crashk_res.end <= res->end)
request_resource(res, &crashk_res);
#endif
}
}

View File

@@ -37,6 +37,7 @@
#include <linux/completion.h>
#include <linux/of.h>
#include <linux/irq_work.h>
#include <linux/kexec.h>
#include <asm/alternative.h>
#include <asm/atomic.h>
@@ -63,16 +64,29 @@
* where to place its SVC stack
*/
struct secondary_data secondary_data;
/* Number of CPUs which aren't online, but looping in kernel text. */
int cpus_stuck_in_kernel;
enum ipi_msg_type {
IPI_RESCHEDULE,
IPI_CALL_FUNC,
IPI_CPU_STOP,
IPI_CPU_CRASH_STOP,
IPI_TIMER,
IPI_IRQ_WORK,
IPI_WAKEUP
};
#ifdef CONFIG_HOTPLUG_CPU
static int op_cpu_kill(unsigned int cpu);
#else
static inline int op_cpu_kill(unsigned int cpu)
{
return -ENOSYS;
}
#endif
/*
* Boot a secondary CPU, and assign it the specified idle task.
* This also gives us the initial stack to use for this CPU.
@@ -90,12 +104,14 @@ static DECLARE_COMPLETION(cpu_running);
int __cpu_up(unsigned int cpu, struct task_struct *idle)
{
int ret;
long status;
/*
* We need to tell the secondary core where to find its stack and the
* page tables.
*/
secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
update_cpu_boot_status(CPU_MMU_OFF);
__flush_dcache_area(&secondary_data, sizeof(secondary_data));
/*
@@ -119,6 +135,32 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
}
secondary_data.stack = NULL;
status = READ_ONCE(secondary_data.status);
if (ret && status) {
if (status == CPU_MMU_OFF)
status = READ_ONCE(__early_cpu_boot_status);
switch (status) {
default:
pr_err("CPU%u: failed in unknown state : 0x%lx\n",
cpu, status);
break;
case CPU_KILL_ME:
if (!op_cpu_kill(cpu)) {
pr_crit("CPU%u: died during early boot\n", cpu);
break;
}
/* Fall through */
pr_crit("CPU%u: may not have shut down cleanly\n", cpu);
case CPU_STUCK_IN_KERNEL:
pr_crit("CPU%u: is stuck in kernel\n", cpu);
cpus_stuck_in_kernel++;
break;
case CPU_PANIC_KERNEL:
panic("CPU%u detected unsupported configuration\n", cpu);
}
}
return ret;
}
@@ -184,6 +226,9 @@ asmlinkage void secondary_start_kernel(void)
*/
pr_info("CPU%u: Booted secondary processor [%08x]\n",
cpu, read_cpuid_id());
update_cpu_boot_status(CPU_BOOT_SUCCESS);
/* Make sure the status update is visible before we complete */
smp_wmb();
set_cpu_online(cpu, true);
complete(&cpu_running);
@@ -311,6 +356,30 @@ void cpu_die(void)
}
#endif
/*
* Kill the calling secondary CPU, early in bringup before it is turned
* online.
*/
void cpu_die_early(void)
{
int cpu = smp_processor_id();
pr_crit("CPU%d: will not boot\n", cpu);
/* Mark this CPU absent */
set_cpu_present(cpu, 0);
#ifdef CONFIG_HOTPLUG_CPU
update_cpu_boot_status(CPU_KILL_ME);
/* Check if we can park ourselves */
if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_die)
cpu_ops[cpu]->cpu_die(cpu);
#endif
update_cpu_boot_status(CPU_STUCK_IN_KERNEL);
cpu_park_loop();
}
static void __init hyp_mode_check(void)
{
if (is_hyp_mode_available())
@@ -634,6 +703,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
S(IPI_RESCHEDULE, "Rescheduling interrupts"),
S(IPI_CALL_FUNC, "Function call interrupts"),
S(IPI_CPU_STOP, "CPU stop interrupts"),
S(IPI_CPU_CRASH_STOP, "CPU stop (for crash dump) interrupts"),
S(IPI_TIMER, "Timer broadcast interrupts"),
S(IPI_IRQ_WORK, "IRQ work interrupts"),
S(IPI_WAKEUP, "CPU wake-up interrupts"),
@@ -718,6 +788,29 @@ static void ipi_cpu_stop(unsigned int cpu)
cpu_relax();
}
#ifdef CONFIG_KEXEC_CORE
static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0);
#endif
static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
{
#ifdef CONFIG_KEXEC_CORE
crash_save_cpu(regs, cpu);
atomic_dec(&waiting_for_crash_ipi);
local_irq_disable();
#ifdef CONFIG_HOTPLUG_CPU
if (cpu_ops[cpu]->cpu_die)
cpu_ops[cpu]->cpu_die(cpu);
#endif
/* just in case */
cpu_park_loop();
#endif
}
/*
* Main handler for inter-processor interrupts
*/
@@ -748,6 +841,15 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
irq_exit();
break;
case IPI_CPU_CRASH_STOP:
if (IS_ENABLED(CONFIG_KEXEC_CORE)) {
irq_enter();
ipi_cpu_crash_stop(cpu, regs);
unreachable();
}
break;
#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
case IPI_TIMER:
irq_enter();
@@ -816,6 +918,39 @@ void smp_send_stop(void)
pr_warning("SMP: failed to stop secondary CPUs\n");
}
#ifdef CONFIG_KEXEC_CORE
void smp_send_crash_stop(void)
{
cpumask_t mask;
unsigned long timeout;
if (num_online_cpus() == 1)
return;
cpumask_copy(&mask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &mask);
atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
pr_crit("SMP: stopping secondary CPUs\n");
smp_cross_call(&mask, IPI_CPU_CRASH_STOP);
/* Wait up to one second for other CPUs to stop */
timeout = USEC_PER_SEC;
while ((atomic_read(&waiting_for_crash_ipi) > 0) && timeout--)
udelay(1);
if (atomic_read(&waiting_for_crash_ipi) > 0)
pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
cpumask_pr_args(&mask));
}
bool smp_crash_stop_failed(void)
{
return (atomic_read(&waiting_for_crash_ipi) > 0);
}
#endif
/*
* not supported here
*/
@@ -823,3 +958,21 @@ int setup_profiling_timer(unsigned int multiplier)
{
return -EINVAL;
}
static bool have_cpu_die(void)
{
#ifdef CONFIG_HOTPLUG_CPU
int any_cpu = raw_smp_processor_id();
if (cpu_ops[any_cpu]->cpu_die)
return true;
#endif
return false;
}
bool cpus_are_stuck_in_kernel(void)
{
bool smp_spin_tables = (num_possible_cpus() > 1 && !have_cpu_die());
return !!cpus_stuck_in_kernel || smp_spin_tables;
}

View File

@@ -29,11 +29,14 @@
#include <linux/gfp.h>
#include <linux/memblock.h>
#include <linux/sort.h>
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/dma-mapping.h>
#include <linux/dma-contiguous.h>
#include <linux/efi.h>
#include <linux/swiotlb.h>
#include <linux/kexec.h>
#include <linux/crash_dump.h>
#include <asm/boot.h>
#include <asm/fixmap.h>
@@ -75,6 +78,142 @@ static int __init early_initrd(char *p)
early_param("initrd", early_initrd);
#endif
#ifdef CONFIG_KEXEC_CORE
/*
* reserve_crashkernel() - reserves memory for crash kernel
*
* This function reserves memory area given in "crashkernel=" kernel command
* line parameter. The memory reserved is used by dump capture kernel when
* primary kernel is crashing.
*/
static void __init reserve_crashkernel(void)
{
unsigned long long crash_base, crash_size;
int ret;
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
&crash_size, &crash_base);
/* no crashkernel= or invalid value specified */
if (ret || !crash_size)
return;
crash_size = PAGE_ALIGN(crash_size);
if (crash_base == 0) {
/* Current arm64 boot protocol requires 2MB alignment */
crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
crash_size, SZ_2M);
if (crash_base == 0) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
crash_size);
return;
}
} else {
/* User specifies base address explicitly. */
if (!memblock_is_region_memory(crash_base, crash_size)) {
pr_warn("cannot reserve crashkernel: region is not memory\n");
return;
}
if (memblock_is_region_reserved(crash_base, crash_size)) {
pr_warn("cannot reserve crashkernel: region overlaps reserved memory\n");
return;
}
if (!IS_ALIGNED(crash_base, SZ_2M)) {
pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
return;
}
}
memblock_reserve(crash_base, crash_size);
pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
crash_base, crash_base + crash_size, crash_size >> 20);
crashk_res.start = crash_base;
crashk_res.end = crash_base + crash_size - 1;
}
static void __init kexec_reserve_crashkres_pages(void)
{
#ifdef CONFIG_HIBERNATION
phys_addr_t addr;
struct page *page;
if (!crashk_res.end)
return;
/*
* To reduce the size of hibernation image, all the pages are
* marked as Reserved initially.
*/
for (addr = crashk_res.start; addr < (crashk_res.end + 1);
addr += PAGE_SIZE) {
page = phys_to_page(addr);
SetPageReserved(page);
}
#endif
}
#else
static void __init reserve_crashkernel(void)
{
}
static void __init kexec_reserve_crashkres_pages(void)
{
}
#endif /* CONFIG_KEXEC_CORE */
#ifdef CONFIG_CRASH_DUMP
static int __init early_init_dt_scan_elfcorehdr(unsigned long node,
const char *uname, int depth, void *data)
{
const __be32 *reg;
int len;
if (depth != 1 || strcmp(uname, "chosen") != 0)
return 0;
reg = of_get_flat_dt_prop(node, "linux,elfcorehdr", &len);
if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
return 1;
elfcorehdr_addr = dt_mem_next_cell(dt_root_addr_cells, &reg);
elfcorehdr_size = dt_mem_next_cell(dt_root_size_cells, &reg);
return 1;
}
/*
* reserve_elfcorehdr() - reserves memory for elf core header
*
* This function reserves the memory occupied by an elf core header
* described in the device tree. This region contains all the
* information about primary kernel's core image and is used by a dump
* capture kernel to access the system memory on primary kernel.
*/
static void __init reserve_elfcorehdr(void)
{
of_scan_flat_dt(early_init_dt_scan_elfcorehdr, NULL);
if (!elfcorehdr_size)
return;
if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
pr_warn("elfcorehdr is overlapped\n");
return;
}
memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
pr_info("Reserving %lldKB of memory at 0x%llx for elfcorehdr\n",
elfcorehdr_size >> 10, elfcorehdr_addr);
}
#else
static void __init reserve_elfcorehdr(void)
{
}
#endif /* CONFIG_CRASH_DUMP */
/*
* Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
* currently assumes that for memory starting above 4G, 32-bit devices will
@@ -168,10 +307,45 @@ static int __init early_mem(char *p)
}
early_param("mem", early_mem);
static int __init early_init_dt_scan_usablemem(unsigned long node,
const char *uname, int depth, void *data)
{
struct memblock_region *usablemem = data;
const __be32 *reg;
int len;
if (depth != 1 || strcmp(uname, "chosen") != 0)
return 0;
reg = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
return 1;
usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
return 1;
}
static void __init fdt_enforce_memory_region(void)
{
struct memblock_region reg = {
.size = 0,
};
of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
if (reg.size)
memblock_cap_memory_range(reg.base, reg.size);
}
void __init arm64_memblock_init(void)
{
const s64 linear_region_size = -(s64)PAGE_OFFSET;
/* Handle linux,usable-memory-range property */
fdt_enforce_memory_region();
/*
* Ensure that the linear region takes up exactly half of the kernel
* virtual address space. This way, we can distinguish a linear address
@@ -248,6 +422,11 @@ void __init arm64_memblock_init(void)
arm64_dma_phys_limit = max_zone_dma_phys();
else
arm64_dma_phys_limit = PHYS_MASK + 1;
reserve_crashkernel();
reserve_elfcorehdr();
dma_contiguous_reserve(arm64_dma_phys_limit);
memblock_allow_resize();
@@ -361,6 +540,8 @@ void __init mem_init(void)
/* this will put all unused low memory onto the freelists */
free_all_bootmem();
kexec_reserve_crashkres_pages();
mem_init_print_info(NULL);
#define MLK(b, t) b, t, ((t) - (b)) >> 10

View File

@@ -21,6 +21,8 @@
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/ioport.h>
#include <linux/kexec.h>
#include <linux/libfdt.h>
#include <linux/mman.h>
#include <linux/nodemask.h>
@@ -156,29 +158,10 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
} while (pmd++, i++, i < PTRS_PER_PMD);
}
#ifdef CONFIG_DEBUG_PAGEALLOC
static bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void))
{
/*
* If debug_page_alloc is enabled we must map the linear map
* using pages. However, other mappings created by
* create_mapping_noalloc must use sections in some cases. Allow
* sections to be used in those cases, where no pgtable_alloc
* function is provided.
*/
return !pgtable_alloc || !debug_pagealloc_enabled();
}
#else
static bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void))
{
return true;
}
#endif
static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void))
phys_addr_t (*pgtable_alloc)(void),
bool allow_block_mappings)
{
pmd_t *pmd;
unsigned long next;
@@ -209,7 +192,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
next = pmd_addr_end(addr, end);
/* try section mapping first */
if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
block_mappings_allowed(pgtable_alloc)) {
(!pgtable_alloc || allow_block_mappings)) {
pmd_t old_pmd =*pmd;
pmd_set_huge(pmd, phys, prot);
/*
@@ -248,7 +231,8 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void))
phys_addr_t (*pgtable_alloc)(void),
bool allow_block_mappings)
{
pud_t *pud;
unsigned long next;
@@ -269,7 +253,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
* For 4K granule only, attempt to put down a 1GB block
*/
if (use_1G_block(addr, next, phys) &&
block_mappings_allowed(pgtable_alloc)) {
(!pgtable_alloc || allow_block_mappings)) {
pud_t old_pud = *pud;
pud_set_huge(pud, phys, prot);
@@ -290,7 +274,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
}
} else {
alloc_init_pmd(pud, addr, next, phys, prot,
pgtable_alloc);
pgtable_alloc, allow_block_mappings);
}
phys += next - addr;
} while (pud++, addr = next, addr != end);
@@ -304,7 +288,8 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
*/
static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(void))
phys_addr_t (*pgtable_alloc)(void),
bool allow_block_mappings)
{
unsigned long addr, length, end, next;
@@ -322,7 +307,8 @@ static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt,
end = addr + length;
do {
next = pgd_addr_end(addr, end);
alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc);
alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc,
(!pgtable_alloc || allow_block_mappings));
phys += next - addr;
} while (pgd++, addr = next, addr != end);
}
@@ -340,9 +326,11 @@ static phys_addr_t late_pgtable_alloc(void)
static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
unsigned long virt, phys_addr_t size,
pgprot_t prot,
phys_addr_t (*alloc)(void))
phys_addr_t (*alloc)(void),
bool allow_block_mappings)
{
init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc);
init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc,
allow_block_mappings);
}
/*
@@ -358,16 +346,15 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
&phys, virt);
return;
}
__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
NULL);
__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, true);
}
void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
unsigned long virt, phys_addr_t size,
pgprot_t prot)
pgprot_t prot, bool allow_block_mappings)
{
__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
late_pgtable_alloc);
late_pgtable_alloc, allow_block_mappings);
}
static void create_mapping_late(phys_addr_t phys, unsigned long virt,
@@ -380,57 +367,36 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
}
__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
late_pgtable_alloc);
late_pgtable_alloc, !debug_pagealloc_enabled());
}
static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
static void __init __map_memblock(pgd_t *pgd, phys_addr_t start,
phys_addr_t end, pgprot_t prot,
bool allow_block_mappings)
{
unsigned long kernel_start = __pa(_text);
unsigned long kernel_end = __pa(__init_begin);
/*
* Take care not to create a writable alias for the
* read-only text and rodata sections of the kernel image.
*/
/* No overlap with the kernel text/rodata */
if (end < kernel_start || start >= kernel_end) {
__create_pgd_mapping(pgd, start, __phys_to_virt(start),
end - start, PAGE_KERNEL,
early_pgtable_alloc);
return;
}
/*
* This block overlaps the kernel text/rodata mappings.
* Map the portion(s) which don't overlap.
*/
if (start < kernel_start)
__create_pgd_mapping(pgd, start,
__phys_to_virt(start),
kernel_start - start, PAGE_KERNEL,
early_pgtable_alloc);
if (kernel_end < end)
__create_pgd_mapping(pgd, kernel_end,
__phys_to_virt(kernel_end),
end - kernel_end, PAGE_KERNEL,
early_pgtable_alloc);
/*
* Map the linear alias of the [_text, __init_begin) interval as
* read-only/non-executable. This makes the contents of the
* region accessible to subsystems such as hibernate, but
* protects it from inadvertent modification or execution.
*/
__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
kernel_end - kernel_start, PAGE_KERNEL_RO,
early_pgtable_alloc);
__create_pgd_mapping(pgd, start, __phys_to_virt(start), end - start,
prot, early_pgtable_alloc, allow_block_mappings);
}
static void __init map_mem(pgd_t *pgd)
{
unsigned long kernel_start = __pa(_text);
unsigned long kernel_end = __pa(__init_begin);
struct memblock_region *reg;
/*
* Take care not to create a writable alias for the
* read-only text and rodata sections of the kernel image.
* So temporarily mark them as NOMAP to skip mappings in
* the following for-loop
*/
memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
#ifdef CONFIG_KEXEC_CORE
if (crashk_res.end)
memblock_mark_nomap(crashk_res.start,
resource_size(&crashk_res));
#endif
/* map all the memory banks */
for_each_memblock(memory, reg) {
phys_addr_t start = reg->base;
@@ -441,8 +407,33 @@ static void __init map_mem(pgd_t *pgd)
if (memblock_is_nomap(reg))
continue;
__map_memblock(pgd, start, end);
__map_memblock(pgd, start, end,
PAGE_KERNEL, !debug_pagealloc_enabled());
}
/*
* Map the linear alias of the [_text, __init_begin) interval as
* read-only/non-executable. This makes the contents of the
* region accessible to subsystems such as hibernate, but
* protects it from inadvertent modification or execution.
*/
__map_memblock(pgd, kernel_start, kernel_end,
PAGE_KERNEL_RO, !debug_pagealloc_enabled());
memblock_clear_nomap(kernel_start, kernel_end - kernel_start);
#ifdef CONFIG_KEXEC_CORE
/*
* Use page-level mappings here so that we can shrink the region
* in page granularity and put back unused memory to buddy system
* through /sys/kernel/kexec_crash_size interface.
*/
if (crashk_res.end) {
__map_memblock(pgd, crashk_res.start, crashk_res.end + 1,
PAGE_KERNEL, false);
memblock_clear_nomap(crashk_res.start,
resource_size(&crashk_res));
}
#endif
}
void mark_rodata_ro(void)
@@ -481,7 +472,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end,
BUG_ON(!PAGE_ALIGNED(size));
__create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot,
early_pgtable_alloc);
early_pgtable_alloc, !debug_pagealloc_enabled());
vma->addr = va_start;
vma->phys_addr = pa_start;

View File

@@ -125,6 +125,19 @@ int set_memory_x(unsigned long addr, int numpages)
}
EXPORT_SYMBOL_GPL(set_memory_x);
int set_memory_valid(unsigned long addr, int numpages, int enable)
{
if (enable)
return __change_memory_common(addr, PAGE_SIZE * numpages,
__pgprot(PTE_VALID),
__pgprot(0));
else
return __change_memory_common(addr, PAGE_SIZE * numpages,
__pgprot(0),
__pgprot(PTE_VALID));
}
#ifdef CONFIG_DEBUG_PAGEALLOC
void __kernel_map_pages(struct page *page, int numpages, int enable)
{

View File

@@ -74,7 +74,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
addr = PAGE_ALIGN(addr);
vma = find_vma(current->mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
goto success;
}

View File

@@ -76,14 +76,14 @@ void ath79_ddr_set_pci_windows(void)
{
BUG_ON(!ath79_ddr_pci_win_base);
__raw_writel(AR71XX_PCI_WIN0_OFFS, ath79_ddr_pci_win_base + 0);
__raw_writel(AR71XX_PCI_WIN1_OFFS, ath79_ddr_pci_win_base + 1);
__raw_writel(AR71XX_PCI_WIN2_OFFS, ath79_ddr_pci_win_base + 2);
__raw_writel(AR71XX_PCI_WIN3_OFFS, ath79_ddr_pci_win_base + 3);
__raw_writel(AR71XX_PCI_WIN4_OFFS, ath79_ddr_pci_win_base + 4);
__raw_writel(AR71XX_PCI_WIN5_OFFS, ath79_ddr_pci_win_base + 5);
__raw_writel(AR71XX_PCI_WIN6_OFFS, ath79_ddr_pci_win_base + 6);
__raw_writel(AR71XX_PCI_WIN7_OFFS, ath79_ddr_pci_win_base + 7);
__raw_writel(AR71XX_PCI_WIN0_OFFS, ath79_ddr_pci_win_base + 0x0);
__raw_writel(AR71XX_PCI_WIN1_OFFS, ath79_ddr_pci_win_base + 0x4);
__raw_writel(AR71XX_PCI_WIN2_OFFS, ath79_ddr_pci_win_base + 0x8);
__raw_writel(AR71XX_PCI_WIN3_OFFS, ath79_ddr_pci_win_base + 0xc);
__raw_writel(AR71XX_PCI_WIN4_OFFS, ath79_ddr_pci_win_base + 0x10);
__raw_writel(AR71XX_PCI_WIN5_OFFS, ath79_ddr_pci_win_base + 0x14);
__raw_writel(AR71XX_PCI_WIN6_OFFS, ath79_ddr_pci_win_base + 0x18);
__raw_writel(AR71XX_PCI_WIN7_OFFS, ath79_ddr_pci_win_base + 0x1c);
}
EXPORT_SYMBOL_GPL(ath79_ddr_set_pci_windows);

View File

@@ -816,8 +816,10 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
break;
}
/* Compact branch: BNEZC || JIALC */
if (insn.i_format.rs)
if (!insn.i_format.rs) {
/* JIALC: set $31/ra */
regs->regs[31] = epc + 4;
}
regs->cp0_epc += 8;
break;
#endif

View File

@@ -11,6 +11,7 @@
#include <asm/asm.h>
#include <asm/asmmacro.h>
#include <asm/compiler.h>
#include <asm/irqflags.h>
#include <asm/regdef.h>
#include <asm/mipsregs.h>
#include <asm/stackframe.h>
@@ -137,6 +138,7 @@ work_pending:
andi t0, a2, _TIF_NEED_RESCHED # a2 is preloaded with TI_FLAGS
beqz t0, work_notifysig
work_resched:
TRACE_IRQS_OFF
jal schedule
local_irq_disable # make sure need_resched and
@@ -173,6 +175,7 @@ syscall_exit_work:
beqz t0, work_pending # trace bit set?
local_irq_enable # could let syscall_trace_leave()
# call schedule() instead
TRACE_IRQS_ON
move a0, sp
jal syscall_trace_leave
b resume_userspace

View File

@@ -55,7 +55,6 @@ DECLARE_BITMAP(state_support, CPS_PM_STATE_COUNT);
* state. Actually per-core rather than per-CPU.
*/
static DEFINE_PER_CPU_ALIGNED(u32*, ready_count);
static DEFINE_PER_CPU_ALIGNED(void*, ready_count_alloc);
/* Indicates online CPUs coupled with the current CPU */
static DEFINE_PER_CPU_ALIGNED(cpumask_t, online_coupled);
@@ -625,7 +624,6 @@ static int __init cps_gen_core_entries(unsigned cpu)
{
enum cps_pm_state state;
unsigned core = cpu_data[cpu].core;
unsigned dlinesz = cpu_data[cpu].dcache.linesz;
void *entry_fn, *core_rc;
for (state = CPS_PM_NC_WAIT; state < CPS_PM_STATE_COUNT; state++) {
@@ -645,16 +643,11 @@ static int __init cps_gen_core_entries(unsigned cpu)
}
if (!per_cpu(ready_count, core)) {
core_rc = kmalloc(dlinesz * 2, GFP_KERNEL);
core_rc = kmalloc(sizeof(u32), GFP_KERNEL);
if (!core_rc) {
pr_err("Failed allocate core %u ready_count\n", core);
return -ENOMEM;
}
per_cpu(ready_count_alloc, core) = core_rc;
/* Ensure ready_count is aligned to a cacheline boundary */
core_rc += dlinesz - 1;
core_rc = (void *)((unsigned long)core_rc & ~(dlinesz - 1));
per_cpu(ready_count, core) = core_rc;
}

View File

@@ -194,6 +194,8 @@ void show_stack(struct task_struct *task, unsigned long *sp)
{
struct pt_regs regs;
mm_segment_t old_fs = get_fs();
regs.cp0_status = KSU_KERNEL;
if (sp) {
regs.regs[29] = (unsigned long)sp;
regs.regs[31] = 0;

View File

@@ -92,7 +92,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -107,31 +107,31 @@ static struct rt2880_pmx_group mt7620a_pinmux_data[] = {
};
static struct rt2880_pmx_func pwm1_grp_mt7628[] = {
FUNC("sdcx", 3, 19, 1),
FUNC("sdxc d6", 3, 19, 1),
FUNC("utif", 2, 19, 1),
FUNC("gpio", 1, 19, 1),
FUNC("pwm", 0, 19, 1),
FUNC("pwm1", 0, 19, 1),
};
static struct rt2880_pmx_func pwm0_grp_mt7628[] = {
FUNC("sdcx", 3, 18, 1),
FUNC("sdxc d7", 3, 18, 1),
FUNC("utif", 2, 18, 1),
FUNC("gpio", 1, 18, 1),
FUNC("pwm", 0, 18, 1),
FUNC("pwm0", 0, 18, 1),
};
static struct rt2880_pmx_func uart2_grp_mt7628[] = {
FUNC("sdcx", 3, 20, 2),
FUNC("sdxc d5 d4", 3, 20, 2),
FUNC("pwm", 2, 20, 2),
FUNC("gpio", 1, 20, 2),
FUNC("uart", 0, 20, 2),
FUNC("uart2", 0, 20, 2),
};
static struct rt2880_pmx_func uart1_grp_mt7628[] = {
FUNC("sdcx", 3, 45, 2),
FUNC("sw_r", 3, 45, 2),
FUNC("pwm", 2, 45, 2),
FUNC("gpio", 1, 45, 2),
FUNC("uart", 0, 45, 2),
FUNC("uart1", 0, 45, 2),
};
static struct rt2880_pmx_func i2c_grp_mt7628[] = {
@@ -143,21 +143,21 @@ static struct rt2880_pmx_func i2c_grp_mt7628[] = {
static struct rt2880_pmx_func refclk_grp_mt7628[] = { FUNC("reclk", 0, 36, 1) };
static struct rt2880_pmx_func perst_grp_mt7628[] = { FUNC("perst", 0, 37, 1) };
static struct rt2880_pmx_func wdt_grp_mt7628[] = { FUNC("wdt", 0, 15, 38) };
static struct rt2880_pmx_func wdt_grp_mt7628[] = { FUNC("wdt", 0, 38, 1) };
static struct rt2880_pmx_func spi_grp_mt7628[] = { FUNC("spi", 0, 7, 4) };
static struct rt2880_pmx_func sd_mode_grp_mt7628[] = {
FUNC("jtag", 3, 22, 8),
FUNC("utif", 2, 22, 8),
FUNC("gpio", 1, 22, 8),
FUNC("sdcx", 0, 22, 8),
FUNC("sdxc", 0, 22, 8),
};
static struct rt2880_pmx_func uart0_grp_mt7628[] = {
FUNC("-", 3, 12, 2),
FUNC("-", 2, 12, 2),
FUNC("gpio", 1, 12, 2),
FUNC("uart", 0, 12, 2),
FUNC("uart0", 0, 12, 2),
};
static struct rt2880_pmx_func i2s_grp_mt7628[] = {
@@ -171,7 +171,7 @@ static struct rt2880_pmx_func spi_cs1_grp_mt7628[] = {
FUNC("-", 3, 6, 1),
FUNC("refclk", 2, 6, 1),
FUNC("gpio", 1, 6, 1),
FUNC("spi", 0, 6, 1),
FUNC("spi cs1", 0, 6, 1),
};
static struct rt2880_pmx_func spis_grp_mt7628[] = {
@@ -188,28 +188,44 @@ static struct rt2880_pmx_func gpio_grp_mt7628[] = {
FUNC("gpio", 0, 11, 1),
};
#define MT7628_GPIO_MODE_MASK 0x3
static struct rt2880_pmx_func wled_kn_grp_mt7628[] = {
FUNC("rsvd", 3, 35, 1),
FUNC("rsvd", 2, 35, 1),
FUNC("gpio", 1, 35, 1),
FUNC("wled_kn", 0, 35, 1),
};
#define MT7628_GPIO_MODE_PWM1 30
#define MT7628_GPIO_MODE_PWM0 28
#define MT7628_GPIO_MODE_UART2 26
#define MT7628_GPIO_MODE_UART1 24
#define MT7628_GPIO_MODE_I2C 20
#define MT7628_GPIO_MODE_REFCLK 18
#define MT7628_GPIO_MODE_PERST 16
#define MT7628_GPIO_MODE_WDT 14
#define MT7628_GPIO_MODE_SPI 12
#define MT7628_GPIO_MODE_SDMODE 10
#define MT7628_GPIO_MODE_UART0 8
#define MT7628_GPIO_MODE_I2S 6
#define MT7628_GPIO_MODE_CS1 4
#define MT7628_GPIO_MODE_SPIS 2
#define MT7628_GPIO_MODE_GPIO 0
static struct rt2880_pmx_func wled_an_grp_mt7628[] = {
FUNC("rsvd", 3, 44, 1),
FUNC("rsvd", 2, 44, 1),
FUNC("gpio", 1, 44, 1),
FUNC("wled_an", 0, 44, 1),
};
#define MT7628_GPIO_MODE_MASK 0x3
#define MT7628_GPIO_MODE_WLED_KN 48
#define MT7628_GPIO_MODE_WLED_AN 32
#define MT7628_GPIO_MODE_PWM1 30
#define MT7628_GPIO_MODE_PWM0 28
#define MT7628_GPIO_MODE_UART2 26
#define MT7628_GPIO_MODE_UART1 24
#define MT7628_GPIO_MODE_I2C 20
#define MT7628_GPIO_MODE_REFCLK 18
#define MT7628_GPIO_MODE_PERST 16
#define MT7628_GPIO_MODE_WDT 14
#define MT7628_GPIO_MODE_SPI 12
#define MT7628_GPIO_MODE_SDMODE 10
#define MT7628_GPIO_MODE_UART0 8
#define MT7628_GPIO_MODE_I2S 6
#define MT7628_GPIO_MODE_CS1 4
#define MT7628_GPIO_MODE_SPIS 2
#define MT7628_GPIO_MODE_GPIO 0
static struct rt2880_pmx_group mt7628an_pinmux_data[] = {
GRP_G("pmw1", pwm1_grp_mt7628, MT7628_GPIO_MODE_MASK,
GRP_G("pwm1", pwm1_grp_mt7628, MT7628_GPIO_MODE_MASK,
1, MT7628_GPIO_MODE_PWM1),
GRP_G("pmw1", pwm0_grp_mt7628, MT7628_GPIO_MODE_MASK,
GRP_G("pwm0", pwm0_grp_mt7628, MT7628_GPIO_MODE_MASK,
1, MT7628_GPIO_MODE_PWM0),
GRP_G("uart2", uart2_grp_mt7628, MT7628_GPIO_MODE_MASK,
1, MT7628_GPIO_MODE_UART2),
@@ -233,6 +249,10 @@ static struct rt2880_pmx_group mt7628an_pinmux_data[] = {
1, MT7628_GPIO_MODE_SPIS),
GRP_G("gpio", gpio_grp_mt7628, MT7628_GPIO_MODE_MASK,
1, MT7628_GPIO_MODE_GPIO),
GRP_G("wled_an", wled_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
1, MT7628_GPIO_MODE_WLED_AN),
GRP_G("wled_kn", wled_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
1, MT7628_GPIO_MODE_WLED_KN),
{ 0 }
};
@@ -439,7 +459,7 @@ void __init ralink_clk_init(void)
ralink_clk_add("10000c00.uartlite", periph_rate);
ralink_clk_add("10180000.wmac", xtal_rate);
if (IS_ENABLED(CONFIG_USB) && is_mt76x8()) {
if (IS_ENABLED(CONFIG_USB) && !is_mt76x8()) {
/*
* When the CPU goes into sleep mode, the BUS clock will be
* too low for USB to function properly. Adjust the busses

View File

@@ -109,5 +109,5 @@ void prom_soc_init(struct ralink_soc_info *soc_info)
soc_info->mem_size_max = RT2880_MEM_SIZE_MAX;
rt2880_pinmux_data = rt2880_pinmux_data_act;
ralink_soc == RT2880_SOC;
ralink_soc = RT2880_SOC;
}

View File

@@ -88,7 +88,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
struct vm_area_struct *vma, *prev;
unsigned long task_size = TASK_SIZE;
int do_color_align, last_mmap;
struct vm_unmapped_area_info info;
@@ -115,9 +115,10 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
else
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
vma = find_vma_prev(mm, addr, &prev);
if (task_size - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)) &&
(!prev || addr >= vm_end_gap(prev)))
goto found_addr;
}
@@ -141,7 +142,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
const unsigned long len, const unsigned long pgoff,
const unsigned long flags)
{
struct vm_area_struct *vma;
struct vm_area_struct *vma, *prev;
struct mm_struct *mm = current->mm;
unsigned long addr = addr0;
int do_color_align, last_mmap;
@@ -175,9 +176,11 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
addr = COLOR_ALIGN(addr, last_mmap, pgoff);
else
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
vma = find_vma_prev(mm, addr, &prev);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)) &&
(!prev || addr >= vm_end_gap(prev)))
goto found_addr;
}

View File

@@ -44,8 +44,22 @@ extern void __init dump_numa_cpu_topology(void);
extern int sysfs_add_device_to_node(struct device *dev, int nid);
extern void sysfs_remove_device_from_node(struct device *dev, int nid);
static inline int early_cpu_to_node(int cpu)
{
int nid;
nid = numa_cpu_lookup_table[cpu];
/*
* Fall back to node 0 if nid is unset (it should be, except bugs).
* This allows callers to safely do NODE_DATA(early_cpu_to_node(cpu)).
*/
return (nid < 0) ? 0 : nid;
}
#else
static inline int early_cpu_to_node(int cpu) { return 0; }
static inline void dump_numa_cpu_topology(void) {}
static inline int sysfs_add_device_to_node(struct device *dev, int nid)

View File

@@ -304,9 +304,17 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
*
* For pHyp, we have to enable IO for log retrieval. Otherwise,
* 0xFF's is always returned from PCI config space.
*
* When the @severity is EEH_LOG_PERM, the PE is going to be
* removed. Prior to that, the drivers for devices included in
* the PE will be closed. The drivers rely on working IO path
* to bring the devices to quiet state. Otherwise, PCI traffic
* from those devices after they are removed is like to cause
* another unexpected EEH error.
*/
if (!(pe->type & EEH_PE_PHB)) {
if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG))
if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG) ||
severity == EEH_LOG_PERM)
eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
/*

View File

@@ -655,7 +655,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
*/
#define MAX_WAIT_FOR_RECOVERY 300
static void eeh_handle_normal_event(struct eeh_pe *pe)
static bool eeh_handle_normal_event(struct eeh_pe *pe)
{
struct pci_bus *frozen_bus;
int rc = 0;
@@ -665,7 +665,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
if (!frozen_bus) {
pr_err("%s: Cannot find PCI bus for PHB#%d-PE#%x\n",
__func__, pe->phb->global_number, pe->addr);
return;
return false;
}
eeh_pe_update_time_stamp(pe);
@@ -790,7 +790,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)
pr_info("EEH: Notify device driver to resume\n");
eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
return;
return false;
excess_failures:
/*
@@ -831,7 +831,11 @@ perm_error:
pci_lock_rescan_remove();
pcibios_remove_pci_devices(frozen_bus);
pci_unlock_rescan_remove();
/* The passed PE should no longer be used */
return true;
}
return false;
}
static void eeh_handle_special_event(void)
@@ -897,7 +901,14 @@ static void eeh_handle_special_event(void)
*/
if (rc == EEH_NEXT_ERR_FROZEN_PE ||
rc == EEH_NEXT_ERR_FENCED_PHB) {
eeh_handle_normal_event(pe);
/*
* eeh_handle_normal_event() can make the PE stale if it
* determines that the PE cannot possibly be recovered.
* Don't modify the PE state if that's the case.
*/
if (eeh_handle_normal_event(pe))
continue;
eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
} else {
pci_lock_rescan_remove();

View File

@@ -514,6 +514,15 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
#endif
#endif
/*
* jprobes use jprobe_return() which skips the normal return
* path of the function, and this messes up the accounting of the
* function graph tracer.
*
* Pause function graph tracing while performing the jprobe function.
*/
pause_graph_tracing();
return 1;
}
@@ -536,6 +545,8 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
* saved regs...
*/
memcpy(regs, &kcb->jprobe_saved_regs, sizeof(struct pt_regs));
/* It's OK to start function graph tracing again */
unpause_graph_tracing();
preempt_enable_no_resched();
return 1;
}

View File

@@ -751,7 +751,7 @@ void __init setup_arch(char **cmdline_p)
static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)
{
return __alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu)), size, align,
return __alloc_bootmem_node(NODE_DATA(early_cpu_to_node(cpu)), size, align,
__pa(MAX_DMA_ADDRESS));
}
@@ -762,7 +762,7 @@ static void __init pcpu_fc_free(void *ptr, size_t size)
static int pcpu_cpu_distance(unsigned int from, unsigned int to)
{
if (cpu_to_node(from) == cpu_to_node(to))
if (early_cpu_to_node(from) == early_cpu_to_node(to))
return LOCAL_DISTANCE;
else
return REMOTE_DISTANCE;

View File

@@ -2693,6 +2693,27 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu)
return -EINVAL;
}
/*
* Don't allow entry with a suspended transaction, because
* the guest entry/exit code will lose it.
* If the guest has TM enabled, save away their TM-related SPRs
* (they will get restored by the TM unavailable interrupt).
*/
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (cpu_has_feature(CPU_FTR_TM) && current->thread.regs &&
(current->thread.regs->msr & MSR_TM)) {
if (MSR_TM_ACTIVE(current->thread.regs->msr)) {
run->exit_reason = KVM_EXIT_FAIL_ENTRY;
run->fail_entry.hardware_entry_failure_reason = 0;
return -EINVAL;
}
current->thread.tm_tfhar = mfspr(SPRN_TFHAR);
current->thread.tm_tfiar = mfspr(SPRN_TFIAR);
current->thread.tm_texasr = mfspr(SPRN_TEXASR);
current->thread.regs->msr &= ~MSR_TM;
}
#endif
kvmppc_core_prepare_to_enter(vcpu);
/* No need to go into the guest when all we'll do is come back out */

View File

@@ -179,6 +179,16 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
b slb_finish_load
8: /* invalid EA */
/*
* It's possible the bad EA is too large to fit in the SLB cache, which
* would mean we'd fail to invalidate it on context switch. So mark the
* SLB cache as full so we force a full flush. We also set cr7+eq to
* mark the address as a kernel address, so slb_finish_load() skips
* trying to insert it into the SLB cache.
*/
li r9,SLB_CACHE_ENTRIES + 1
sth r9,PACASLBCACHEPTR(r13)
crset 4*cr7+eq
li r10,0 /* BAD_VSID */
li r9,0 /* BAD_VSID */
li r11,SLB_VSID_USER /* flags don't much matter */

View File

@@ -105,7 +105,7 @@ static int slice_area_is_free(struct mm_struct *mm, unsigned long addr,
if ((mm->task_size - len) < addr)
return 0;
vma = find_vma(mm, addr);
return (!vma || (addr + len) <= vma->vm_start);
return (!vma || (addr + len) <= vm_start_gap(vma));
}
static int slice_low_has_vma(struct mm_struct *mm, unsigned long slice)

View File

@@ -110,6 +110,7 @@ static struct property *dlpar_clone_drconf_property(struct device_node *dn)
for (i = 0; i < num_lmbs; i++) {
lmbs[i].base_addr = be64_to_cpu(lmbs[i].base_addr);
lmbs[i].drc_index = be32_to_cpu(lmbs[i].drc_index);
lmbs[i].aa_index = be32_to_cpu(lmbs[i].aa_index);
lmbs[i].flags = be32_to_cpu(lmbs[i].flags);
}
@@ -553,6 +554,7 @@ static void dlpar_update_drconf_property(struct device_node *dn,
for (i = 0; i < num_lmbs; i++) {
lmbs[i].base_addr = cpu_to_be64(lmbs[i].base_addr);
lmbs[i].drc_index = cpu_to_be32(lmbs[i].drc_index);
lmbs[i].aa_index = cpu_to_be32(lmbs[i].aa_index);
lmbs[i].flags = cpu_to_be32(lmbs[i].flags);
}

View File

@@ -15,7 +15,9 @@
BUILD_BUG_ON(sizeof(addrtype) != (high - low + 1) * sizeof(long));\
asm volatile( \
" lctlg %1,%2,%0\n" \
: : "Q" (*(addrtype *)(&array)), "i" (low), "i" (high));\
: \
: "Q" (*(addrtype *)(&array)), "i" (low), "i" (high) \
: "memory"); \
}
#define __ctl_store(array, low, high) { \

View File

@@ -229,12 +229,17 @@ ENTRY(sie64a)
lctlg %c1,%c1,__LC_USER_ASCE # load primary asce
.Lsie_done:
# some program checks are suppressing. C code (e.g. do_protection_exception)
# will rewind the PSW by the ILC, which is 4 bytes in case of SIE. Other
# instructions between sie64a and .Lsie_done should not cause program
# interrupts. So lets use a nop (47 00 00 00) as a landing pad.
# will rewind the PSW by the ILC, which is often 4 bytes in case of SIE. There
# are some corner cases (e.g. runtime instrumentation) where ILC is unpredictable.
# Other instructions between sie64a and .Lsie_done should not cause program
# interrupts. So lets use 3 nops as a landing pad for all possible rewinds.
# See also .Lcleanup_sie
.Lrewind_pad:
nop 0
.Lrewind_pad6:
nopr 7
.Lrewind_pad4:
nopr 7
.Lrewind_pad2:
nopr 7
.globl sie_exit
sie_exit:
lg %r14,__SF_EMPTY+8(%r15) # load guest register save area
@@ -247,7 +252,9 @@ sie_exit:
stg %r14,__SF_EMPTY+16(%r15) # set exit reason code
j sie_exit
EX_TABLE(.Lrewind_pad,.Lsie_fault)
EX_TABLE(.Lrewind_pad6,.Lsie_fault)
EX_TABLE(.Lrewind_pad4,.Lsie_fault)
EX_TABLE(.Lrewind_pad2,.Lsie_fault)
EX_TABLE(sie_exit,.Lsie_fault)
#endif

View File

@@ -97,7 +97,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
@@ -135,7 +135,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -372,7 +372,7 @@ void __init vmem_map_init(void)
ro_end = (unsigned long)&_eshared & PAGE_MASK;
for_each_memblock(memory, reg) {
start = reg->base;
end = reg->base + reg->size - 1;
end = reg->base + reg->size;
if (start >= ro_end || end <= ro_start)
vmem_add_mem(start, end - start, 0);
else if (start >= ro_start && end <= ro_end)

View File

@@ -63,7 +63,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
@@ -113,7 +113,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -183,9 +183,9 @@ config NR_CPUS
int "Maximum number of CPUs"
depends on SMP
range 2 32 if SPARC32
range 2 1024 if SPARC64
range 2 4096 if SPARC64
default 32 if SPARC32
default 64 if SPARC64
default 4096 if SPARC64
source kernel/Kconfig.hz

View File

@@ -52,7 +52,7 @@
#define CTX_NR_MASK TAG_CONTEXT_BITS
#define CTX_HW_MASK (CTX_NR_MASK | CTX_PGSZ_MASK)
#define CTX_FIRST_VERSION ((_AC(1,UL) << CTX_VERSION_SHIFT) + _AC(1,UL))
#define CTX_FIRST_VERSION BIT(CTX_VERSION_SHIFT)
#define CTX_VALID(__ctx) \
(!(((__ctx.sparc64_ctx_val) ^ tlb_context_cache) & CTX_VERSION_MASK))
#define CTX_HWBITS(__ctx) ((__ctx.sparc64_ctx_val) & CTX_HW_MASK)

View File

@@ -17,13 +17,8 @@ extern spinlock_t ctx_alloc_lock;
extern unsigned long tlb_context_cache;
extern unsigned long mmu_context_bmap[];
DECLARE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm);
void get_new_mmu_context(struct mm_struct *mm);
#ifdef CONFIG_SMP
void smp_new_mmu_context_version(void);
#else
#define smp_new_mmu_context_version() do { } while (0)
#endif
int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
void destroy_context(struct mm_struct *mm);
@@ -74,8 +69,9 @@ void __flush_tlb_mm(unsigned long, unsigned long);
static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, struct task_struct *tsk)
{
unsigned long ctx_valid, flags;
int cpu;
int cpu = smp_processor_id();
per_cpu(per_cpu_secondary_mm, cpu) = mm;
if (unlikely(mm == &init_mm))
return;
@@ -121,7 +117,6 @@ static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, str
* for the first time, we must flush that context out of the
* local TLB.
*/
cpu = smp_processor_id();
if (!ctx_valid || !cpumask_test_cpu(cpu, mm_cpumask(mm))) {
cpumask_set_cpu(cpu, mm_cpumask(mm));
__flush_tlb_mm(CTX_HWBITS(mm->context),
@@ -131,26 +126,7 @@ static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, str
}
#define deactivate_mm(tsk,mm) do { } while (0)
/* Activate a new MM instance for the current task. */
static inline void activate_mm(struct mm_struct *active_mm, struct mm_struct *mm)
{
unsigned long flags;
int cpu;
spin_lock_irqsave(&mm->context.lock, flags);
if (!CTX_VALID(mm->context))
get_new_mmu_context(mm);
cpu = smp_processor_id();
if (!cpumask_test_cpu(cpu, mm_cpumask(mm)))
cpumask_set_cpu(cpu, mm_cpumask(mm));
load_secondary_context(mm);
__flush_tlb_mm(CTX_HWBITS(mm->context), SECONDARY_CONTEXT);
tsb_context_switch(mm);
spin_unlock_irqrestore(&mm->context.lock, flags);
}
#define activate_mm(active_mm, mm) switch_mm(active_mm, mm, NULL)
#endif /* !(__ASSEMBLY__) */
#endif /* !(__SPARC64_MMU_CONTEXT_H) */

View File

@@ -20,7 +20,6 @@
#define PIL_SMP_CALL_FUNC 1
#define PIL_SMP_RECEIVE_SIGNAL 2
#define PIL_SMP_CAPTURE 3
#define PIL_SMP_CTX_NEW_VERSION 4
#define PIL_DEVICE_IRQ 5
#define PIL_SMP_CALL_FUNC_SNGL 6
#define PIL_DEFERRED_PCR_WORK 7

View File

@@ -327,6 +327,7 @@ struct vio_dev {
int compat_len;
u64 dev_no;
u64 id;
unsigned long channel_id;

View File

@@ -1034,17 +1034,26 @@ static void __init init_cpu_send_mondo_info(struct trap_per_cpu *tb)
{
#ifdef CONFIG_SMP
unsigned long page;
void *mondo, *p;
BUILD_BUG_ON((NR_CPUS * sizeof(u16)) > (PAGE_SIZE - 64));
BUILD_BUG_ON((NR_CPUS * sizeof(u16)) > PAGE_SIZE);
/* Make sure mondo block is 64byte aligned */
p = kzalloc(127, GFP_KERNEL);
if (!p) {
prom_printf("SUN4V: Error, cannot allocate mondo block.\n");
prom_halt();
}
mondo = (void *)(((unsigned long)p + 63) & ~0x3f);
tb->cpu_mondo_block_pa = __pa(mondo);
page = get_zeroed_page(GFP_KERNEL);
if (!page) {
prom_printf("SUN4V: Error, cannot allocate cpu mondo page.\n");
prom_printf("SUN4V: Error, cannot allocate cpu list page.\n");
prom_halt();
}
tb->cpu_mondo_block_pa = __pa(page);
tb->cpu_list_pa = __pa(page + 64);
tb->cpu_list_pa = __pa(page);
#endif
}

View File

@@ -37,7 +37,6 @@ void handle_stdfmna(struct pt_regs *regs, unsigned long sfar, unsigned long sfsr
/* smp_64.c */
void __irq_entry smp_call_function_client(int irq, struct pt_regs *regs);
void __irq_entry smp_call_function_single_client(int irq, struct pt_regs *regs);
void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *regs);
void __irq_entry smp_penguin_jailcell(int irq, struct pt_regs *regs);
void __irq_entry smp_receive_signal_client(int irq, struct pt_regs *regs);

View File

@@ -959,37 +959,6 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
preempt_enable();
}
void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *regs)
{
struct mm_struct *mm;
unsigned long flags;
clear_softint(1 << irq);
/* See if we need to allocate a new TLB context because
* the version of the one we are using is now out of date.
*/
mm = current->active_mm;
if (unlikely(!mm || (mm == &init_mm)))
return;
spin_lock_irqsave(&mm->context.lock, flags);
if (unlikely(!CTX_VALID(mm->context)))
get_new_mmu_context(mm);
spin_unlock_irqrestore(&mm->context.lock, flags);
load_secondary_context(mm);
__flush_tlb_mm(CTX_HWBITS(mm->context),
SECONDARY_CONTEXT);
}
void smp_new_mmu_context_version(void)
{
smp_cross_call(&xcall_new_mmu_context_version, 0, 0, 0);
}
#ifdef CONFIG_KGDB
void kgdb_roundup_cpus(unsigned long flags)
{

View File

@@ -118,7 +118,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
vma = find_vma(mm, addr);
if (task_size - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
@@ -181,7 +181,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
vma = find_vma(mm, addr);
if (task_size - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -85,7 +85,7 @@ static void dump_tl1_traplog(struct tl1_traplog *p)
void bad_trap(struct pt_regs *regs, long lvl)
{
char buffer[32];
char buffer[36];
siginfo_t info;
if (notify_die(DIE_TRAP, "bad trap", regs,
@@ -116,7 +116,7 @@ void bad_trap(struct pt_regs *regs, long lvl)
void bad_trap_tl1(struct pt_regs *regs, long lvl)
{
char buffer[32];
char buffer[36];
if (notify_die(DIE_TRAP_TL1, "bad trap tl1", regs,
0, lvl, SIGTRAP) == NOTIFY_STOP)

View File

@@ -470,13 +470,16 @@ __tsb_context_switch:
.type copy_tsb,#function
copy_tsb: /* %o0=old_tsb_base, %o1=old_tsb_size
* %o2=new_tsb_base, %o3=new_tsb_size
* %o4=page_size_shift
*/
sethi %uhi(TSB_PASS_BITS), %g7
srlx %o3, 4, %o3
add %o0, %o1, %g1 /* end of old tsb */
add %o0, %o1, %o1 /* end of old tsb */
sllx %g7, 32, %g7
sub %o3, 1, %o3 /* %o3 == new tsb hash mask */
mov %o4, %g1 /* page_size_shift */
661: prefetcha [%o0] ASI_N, #one_read
.section .tsb_phys_patch, "ax"
.word 661b
@@ -501,9 +504,9 @@ copy_tsb: /* %o0=old_tsb_base, %o1=old_tsb_size
/* This can definitely be computed faster... */
srlx %o0, 4, %o5 /* Build index */
and %o5, 511, %o5 /* Mask index */
sllx %o5, PAGE_SHIFT, %o5 /* Put into vaddr position */
sllx %o5, %g1, %o5 /* Put into vaddr position */
or %o4, %o5, %o4 /* Full VADDR. */
srlx %o4, PAGE_SHIFT, %o4 /* Shift down to create index */
srlx %o4, %g1, %o4 /* Shift down to create index */
and %o4, %o3, %o4 /* Mask with new_tsb_nents-1 */
sllx %o4, 4, %o4 /* Shift back up into tsb ent offset */
TSB_STORE(%o2 + %o4, %g2) /* Store TAG */
@@ -511,7 +514,7 @@ copy_tsb: /* %o0=old_tsb_base, %o1=old_tsb_size
TSB_STORE(%o2 + %o4, %g3) /* Store TTE */
80: add %o0, 16, %o0
cmp %o0, %g1
cmp %o0, %o1
bne,pt %xcc, 90b
nop

View File

@@ -50,7 +50,7 @@ tl0_resv03e: BTRAP(0x3e) BTRAP(0x3f) BTRAP(0x40)
tl0_irq1: TRAP_IRQ(smp_call_function_client, 1)
tl0_irq2: TRAP_IRQ(smp_receive_signal_client, 2)
tl0_irq3: TRAP_IRQ(smp_penguin_jailcell, 3)
tl0_irq4: TRAP_IRQ(smp_new_mmu_context_version_client, 4)
tl0_irq4: BTRAP(0x44)
#else
tl0_irq1: BTRAP(0x41)
tl0_irq2: BTRAP(0x42)

View File

@@ -284,13 +284,16 @@ static struct vio_dev *vio_create_one(struct mdesc_handle *hp, u64 mp,
if (!id) {
dev_set_name(&vdev->dev, "%s", bus_id_name);
vdev->dev_no = ~(u64)0;
vdev->id = ~(u64)0;
} else if (!cfg_handle) {
dev_set_name(&vdev->dev, "%s-%llu", bus_id_name, *id);
vdev->dev_no = *id;
vdev->id = ~(u64)0;
} else {
dev_set_name(&vdev->dev, "%s-%llu-%llu", bus_id_name,
*cfg_handle, *id);
vdev->dev_no = *cfg_handle;
vdev->id = *id;
}
vdev->dev.parent = parent;
@@ -333,27 +336,84 @@ static void vio_add(struct mdesc_handle *hp, u64 node)
(void) vio_create_one(hp, node, &root_vdev->dev);
}
struct vio_md_node_query {
const char *type;
u64 dev_no;
u64 id;
};
static int vio_md_node_match(struct device *dev, void *arg)
{
struct vio_md_node_query *query = (struct vio_md_node_query *) arg;
struct vio_dev *vdev = to_vio_dev(dev);
if (vdev->mp == (u64) arg)
return 1;
if (vdev->dev_no != query->dev_no)
return 0;
if (vdev->id != query->id)
return 0;
if (strcmp(vdev->type, query->type))
return 0;
return 0;
return 1;
}
static void vio_remove(struct mdesc_handle *hp, u64 node)
{
const char *type;
const u64 *id, *cfg_handle;
u64 a;
struct vio_md_node_query query;
struct device *dev;
dev = device_find_child(&root_vdev->dev, (void *) node,
type = mdesc_get_property(hp, node, "device-type", NULL);
if (!type) {
type = mdesc_get_property(hp, node, "name", NULL);
if (!type)
type = mdesc_node_name(hp, node);
}
query.type = type;
id = mdesc_get_property(hp, node, "id", NULL);
cfg_handle = NULL;
mdesc_for_each_arc(a, hp, node, MDESC_ARC_TYPE_BACK) {
u64 target;
target = mdesc_arc_target(hp, a);
cfg_handle = mdesc_get_property(hp, target,
"cfg-handle", NULL);
if (cfg_handle)
break;
}
if (!id) {
query.dev_no = ~(u64)0;
query.id = ~(u64)0;
} else if (!cfg_handle) {
query.dev_no = *id;
query.id = ~(u64)0;
} else {
query.dev_no = *cfg_handle;
query.id = *id;
}
dev = device_find_child(&root_vdev->dev, &query,
vio_md_node_match);
if (dev) {
printk(KERN_INFO "VIO: Removing device %s\n", dev_name(dev));
device_unregister(dev);
put_device(dev);
} else {
if (!id)
printk(KERN_ERR "VIO: Removed unknown %s node.\n",
type);
else if (!cfg_handle)
printk(KERN_ERR "VIO: Removed unknown %s node %llu.\n",
type, *id);
else
printk(KERN_ERR "VIO: Removed unknown %s node %llu-%llu.\n",
type, *cfg_handle, *id);
}
}

View File

@@ -115,7 +115,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);
if (task_size - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
if (mm->get_unmapped_area == arch_get_unmapped_area)

View File

@@ -656,10 +656,58 @@ EXPORT_SYMBOL(__flush_dcache_range);
/* get_new_mmu_context() uses "cache + 1". */
DEFINE_SPINLOCK(ctx_alloc_lock);
unsigned long tlb_context_cache = CTX_FIRST_VERSION - 1;
unsigned long tlb_context_cache = CTX_FIRST_VERSION;
#define MAX_CTX_NR (1UL << CTX_NR_BITS)
#define CTX_BMAP_SLOTS BITS_TO_LONGS(MAX_CTX_NR)
DECLARE_BITMAP(mmu_context_bmap, MAX_CTX_NR);
DEFINE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm) = {0};
static void mmu_context_wrap(void)
{
unsigned long old_ver = tlb_context_cache & CTX_VERSION_MASK;
unsigned long new_ver, new_ctx, old_ctx;
struct mm_struct *mm;
int cpu;
bitmap_zero(mmu_context_bmap, 1 << CTX_NR_BITS);
/* Reserve kernel context */
set_bit(0, mmu_context_bmap);
new_ver = (tlb_context_cache & CTX_VERSION_MASK) + CTX_FIRST_VERSION;
if (unlikely(new_ver == 0))
new_ver = CTX_FIRST_VERSION;
tlb_context_cache = new_ver;
/*
* Make sure that any new mm that are added into per_cpu_secondary_mm,
* are going to go through get_new_mmu_context() path.
*/
mb();
/*
* Updated versions to current on those CPUs that had valid secondary
* contexts
*/
for_each_online_cpu(cpu) {
/*
* If a new mm is stored after we took this mm from the array,
* it will go into get_new_mmu_context() path, because we
* already bumped the version in tlb_context_cache.
*/
mm = per_cpu(per_cpu_secondary_mm, cpu);
if (unlikely(!mm || mm == &init_mm))
continue;
old_ctx = mm->context.sparc64_ctx_val;
if (likely((old_ctx & CTX_VERSION_MASK) == old_ver)) {
new_ctx = (old_ctx & ~CTX_VERSION_MASK) | new_ver;
set_bit(new_ctx & CTX_NR_MASK, mmu_context_bmap);
mm->context.sparc64_ctx_val = new_ctx;
}
}
}
/* Caller does TLB context flushing on local CPU if necessary.
* The caller also ensures that CTX_VALID(mm->context) is false.
@@ -675,48 +723,30 @@ void get_new_mmu_context(struct mm_struct *mm)
{
unsigned long ctx, new_ctx;
unsigned long orig_pgsz_bits;
int new_version;
spin_lock(&ctx_alloc_lock);
retry:
/* wrap might have happened, test again if our context became valid */
if (unlikely(CTX_VALID(mm->context)))
goto out;
orig_pgsz_bits = (mm->context.sparc64_ctx_val & CTX_PGSZ_MASK);
ctx = (tlb_context_cache + 1) & CTX_NR_MASK;
new_ctx = find_next_zero_bit(mmu_context_bmap, 1 << CTX_NR_BITS, ctx);
new_version = 0;
if (new_ctx >= (1 << CTX_NR_BITS)) {
new_ctx = find_next_zero_bit(mmu_context_bmap, ctx, 1);
if (new_ctx >= ctx) {
int i;
new_ctx = (tlb_context_cache & CTX_VERSION_MASK) +
CTX_FIRST_VERSION;
if (new_ctx == 1)
new_ctx = CTX_FIRST_VERSION;
/* Don't call memset, for 16 entries that's just
* plain silly...
*/
mmu_context_bmap[0] = 3;
mmu_context_bmap[1] = 0;
mmu_context_bmap[2] = 0;
mmu_context_bmap[3] = 0;
for (i = 4; i < CTX_BMAP_SLOTS; i += 4) {
mmu_context_bmap[i + 0] = 0;
mmu_context_bmap[i + 1] = 0;
mmu_context_bmap[i + 2] = 0;
mmu_context_bmap[i + 3] = 0;
}
new_version = 1;
goto out;
mmu_context_wrap();
goto retry;
}
}
if (mm->context.sparc64_ctx_val)
cpumask_clear(mm_cpumask(mm));
mmu_context_bmap[new_ctx>>6] |= (1UL << (new_ctx & 63));
new_ctx |= (tlb_context_cache & CTX_VERSION_MASK);
out:
tlb_context_cache = new_ctx;
mm->context.sparc64_ctx_val = new_ctx | orig_pgsz_bits;
out:
spin_unlock(&ctx_alloc_lock);
if (unlikely(new_version))
smp_new_mmu_context_version();
}
static int numa_enabled = 1;

View File

@@ -451,7 +451,8 @@ retry_tsb_alloc:
extern void copy_tsb(unsigned long old_tsb_base,
unsigned long old_tsb_size,
unsigned long new_tsb_base,
unsigned long new_tsb_size);
unsigned long new_tsb_size,
unsigned long page_size_shift);
unsigned long old_tsb_base = (unsigned long) old_tsb;
unsigned long new_tsb_base = (unsigned long) new_tsb;
@@ -459,7 +460,9 @@ retry_tsb_alloc:
old_tsb_base = __pa(old_tsb_base);
new_tsb_base = __pa(new_tsb_base);
}
copy_tsb(old_tsb_base, old_size, new_tsb_base, new_size);
copy_tsb(old_tsb_base, old_size, new_tsb_base, new_size,
tsb_index == MM_TSB_BASE ?
PAGE_SHIFT : REAL_HPAGE_SHIFT);
}
mm->context.tsb_block[tsb_index].tsb = new_tsb;

View File

@@ -971,11 +971,6 @@ xcall_capture:
wr %g0, (1 << PIL_SMP_CAPTURE), %set_softint
retry
.globl xcall_new_mmu_context_version
xcall_new_mmu_context_version:
wr %g0, (1 << PIL_SMP_CTX_NEW_VERSION), %set_softint
retry
#ifdef CONFIG_KGDB
.globl xcall_kgdb_capture
xcall_kgdb_capture:

View File

@@ -232,7 +232,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
addr = ALIGN(addr, huge_page_size(h));
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
if (current->mm->get_unmapped_area == arch_get_unmapped_area)

View File

@@ -221,6 +221,9 @@ struct x86_emulate_ops {
void (*get_cpuid)(struct x86_emulate_ctxt *ctxt,
u32 *eax, u32 *ebx, u32 *ecx, u32 *edx);
void (*set_nmi_mask)(struct x86_emulate_ctxt *ctxt, bool masked);
unsigned (*get_hflags)(struct x86_emulate_ctxt *ctxt);
void (*set_hflags)(struct x86_emulate_ctxt *ctxt, unsigned hflags);
};
typedef u32 __attribute__((vector_size(16))) sse128_t;
@@ -290,7 +293,6 @@ struct x86_emulate_ctxt {
/* interruptibility state, as a result of execution of STI or MOV SS */
int interruptibility;
int emul_flags;
bool perm_ok; /* do not check permissions if true */
bool ud; /* inject an #UD if host doesn't support insn */

View File

@@ -7,6 +7,7 @@
bool pat_enabled(void);
void pat_disable(const char *reason);
extern void pat_init(void);
extern void init_cache_modes(void);
extern int reserve_memtype(u64 start, u64 end,
enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);

View File

@@ -161,8 +161,8 @@ void kvm_async_pf_task_wait(u32 token)
*/
rcu_irq_exit();
native_safe_halt();
rcu_irq_enter();
local_irq_disable();
rcu_irq_enter();
}
}
if (!n.halted)

View File

@@ -1048,6 +1048,13 @@ void __init setup_arch(char **cmdline_p)
if (mtrr_trim_uncached_memory(max_pfn))
max_pfn = e820_end_of_ram_pfn();
/*
* This call is required when the CPU does not support PAT. If
* mtrr_bp_init() invoked it already via pat_init() the call has no
* effect.
*/
init_cache_modes();
#ifdef CONFIG_X86_32
/* max_low_pfn get updated here */
find_low_pfn_range();

View File

@@ -143,7 +143,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
if (end - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
@@ -186,7 +186,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
(!vma || addr + len <= vma->vm_start))
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}

View File

@@ -737,18 +737,20 @@ out:
static int move_to_next_stateful_cpuid_entry(struct kvm_vcpu *vcpu, int i)
{
struct kvm_cpuid_entry2 *e = &vcpu->arch.cpuid_entries[i];
int j, nent = vcpu->arch.cpuid_nent;
struct kvm_cpuid_entry2 *ej;
int j = i;
int nent = vcpu->arch.cpuid_nent;
e->flags &= ~KVM_CPUID_FLAG_STATE_READ_NEXT;
/* when no next entry is found, the current entry[i] is reselected */
for (j = i + 1; ; j = (j + 1) % nent) {
struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j];
if (ej->function == e->function) {
ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
return j;
}
}
return 0; /* silence gcc, even though control never reaches here */
do {
j = (j + 1) % nent;
ej = &vcpu->arch.cpuid_entries[j];
} while (ej->function != e->function);
ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
return j;
}
/* find an entry with matching function, matching index (if needed), and that

View File

@@ -2531,7 +2531,7 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
u64 smbase;
int ret;
if ((ctxt->emul_flags & X86EMUL_SMM_MASK) == 0)
if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_MASK) == 0)
return emulate_ud(ctxt);
/*
@@ -2580,11 +2580,11 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
return X86EMUL_UNHANDLEABLE;
}
if ((ctxt->emul_flags & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
ctxt->ops->set_nmi_mask(ctxt, false);
ctxt->emul_flags &= ~X86EMUL_SMM_INSIDE_NMI_MASK;
ctxt->emul_flags &= ~X86EMUL_SMM_MASK;
ctxt->ops->set_hflags(ctxt, ctxt->ops->get_hflags(ctxt) &
~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK));
return X86EMUL_CONTINUE;
}
@@ -5296,6 +5296,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
const struct x86_emulate_ops *ops = ctxt->ops;
int rc = X86EMUL_CONTINUE;
int saved_dst_type = ctxt->dst.type;
unsigned emul_flags;
ctxt->mem_read.pos = 0;
@@ -5310,6 +5311,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
goto done;
}
emul_flags = ctxt->ops->get_hflags(ctxt);
if (unlikely(ctxt->d &
(No64|Undefined|Sse|Mmx|Intercept|CheckPerm|Priv|Prot|String))) {
if ((ctxt->mode == X86EMUL_MODE_PROT64 && (ctxt->d & No64)) ||
@@ -5343,7 +5345,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
fetch_possible_mmx_operand(ctxt, &ctxt->dst);
}
if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && ctxt->intercept) {
if (unlikely(emul_flags & X86EMUL_GUEST_MASK) && ctxt->intercept) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_PRE_EXCEPT);
if (rc != X86EMUL_CONTINUE)
@@ -5372,7 +5374,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
goto done;
}
if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept)) {
if (unlikely(emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_EXCEPT);
if (rc != X86EMUL_CONTINUE)
@@ -5426,7 +5428,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
special_insn:
if (unlikely(ctxt->emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept)) {
if (unlikely(emul_flags & X86EMUL_GUEST_MASK) && (ctxt->d & Intercept)) {
rc = emulator_check_intercept(ctxt, ctxt->intercept,
X86_ICPT_POST_MEMACCESS);
if (rc != X86EMUL_CONTINUE)

View File

@@ -3433,12 +3433,15 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn)
return kvm_setup_async_pf(vcpu, gva, kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
}
static bool can_do_async_pf(struct kvm_vcpu *vcpu)
bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
{
if (unlikely(!lapic_in_kernel(vcpu) ||
kvm_event_needs_reinjection(vcpu)))
return false;
if (is_guest_mode(vcpu))
return false;
return kvm_x86_ops->interrupt_allowed(vcpu);
}
@@ -3454,7 +3457,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
if (!async)
return false; /* *pfn has correct page already */
if (!prefault && can_do_async_pf(vcpu)) {
if (!prefault && kvm_can_do_async_pf(vcpu)) {
trace_kvm_try_async_get_page(gva, gfn);
if (kvm_find_async_pf_gfn(vcpu, gfn)) {
trace_kvm_async_pf_doublefault(gva, gfn);

Some files were not shown because too many files have changed in this diff Show More