Commit Graph

1556 Commits

Author SHA1 Message Date
Dan Williams
4a01255ba5 init: initialize jump labels before command line option parsing
[ Upstream commit 6041186a32 ]

When a module option, or core kernel argument, toggles a static-key it
requires jump labels to be initialized early.  While x86, PowerPC, and
ARM64 arrange for jump_label_init() to be called before parse_args(),
ARM does not.

  Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 console=ttyAMA0,115200 page_alloc.shuffle=1
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
  page_alloc_shuffle+0x12c/0x1ac
  static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
  before call to jump_label_init()
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted
  5.1.0-rc4-next-20190410-00003-g3367c36ce744 #1
  Hardware name: ARM Integrator/CP (Device Tree)
  [<c0011c68>] (unwind_backtrace) from [<c000ec48>] (show_stack+0x10/0x18)
  [<c000ec48>] (show_stack) from [<c07e9710>] (dump_stack+0x18/0x24)
  [<c07e9710>] (dump_stack) from [<c001bb1c>] (__warn+0xe0/0x108)
  [<c001bb1c>] (__warn) from [<c001bb88>] (warn_slowpath_fmt+0x44/0x6c)
  [<c001bb88>] (warn_slowpath_fmt) from [<c0b0c4a8>]
  (page_alloc_shuffle+0x12c/0x1ac)
  [<c0b0c4a8>] (page_alloc_shuffle) from [<c0b0c550>] (shuffle_store+0x28/0x48)
  [<c0b0c550>] (shuffle_store) from [<c003e6a0>] (parse_args+0x1f4/0x350)
  [<c003e6a0>] (parse_args) from [<c0ac3c00>] (start_kernel+0x1c0/0x488)

Move the fallback call to jump_label_init() to occur before
parse_args().

The redundant calls to jump_label_init() in other archs are left intact
in case they have static key toggling use cases that are even earlier
than option parsing.

Link: http://lkml.kernel.org/r/155544804466.1032396.13418949511615676665.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Guenter Roeck <groeck@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-15 12:49:51 +09:00
Linus Torvalds
1c2c0dd807 init: rename and re-order boot_cpu_state_init()
commit b5b1404d08 upstream.

This is purely a preparatory patch for upcoming changes during the 4.19
merge window.

We have a function called "boot_cpu_state_init()" that isn't really
about the bootup cpu state: that is done much earlier by the similarly
named "boot_cpu_init()" (note lack of "state" in name).

This function initializes some hotplug CPU state, and needs to run after
the percpu data has been properly initialized.  It even has a comment to
that effect.

Except it _doesn't_ actually run after the percpu data has been properly
initialized.  On x86 it happens to do that, but on at least arm and
arm64, the percpu base pointers are initialized by the arch-specific
'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().

This had some unexpected results, and in particular we have a patch
pending for the merge window that did the obvious cleanup of using
'this_cpu_write()' in the cpu hotplug init code:

  -       per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
  +       this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);

which is obviously the right thing to do.  Except because of the
ordering issue, it actually failed miserably and unexpectedly on arm64.

So this just fixes the ordering, and changes the name of the function to
be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
hotplug state, because the core CPU state was supposed to have already
been done earlier.

Marked for stable, since the (not yet merged) patch that will show this
problem is marked for stable.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-12 16:46:48 +09:00
Linus Torvalds
592aa0be00 Stop the ad-hoc games with -Wno-maybe-initialized
commit 78a5255ffb upstream.

We have some rather random rules about when we accept the
"maybe-initialized" warnings, and when we don't.

For example, we consider it unreliable for gcc versions < 4.9, but also
if -O3 is enabled, or if optimizing for size.  And then various kernel
config options disabled it, because they know that they trigger that
warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).

And now gcc-10 seems to be introducing a lot of those warnings too, so
it falls under the same heading as 4.9 did.

At the same time, we have a very straightforward way to _enable_ that
warning when wanted: use "W=2" to enable more warnings.

So stop playing these ad-hoc games, and just disable that warning by
default, with the known and straight-forward "if you want to work on the
extra compiler warnings, use W=123".

Would it be great to have code that is always so obvious that it never
confuses the compiler whether a variable is used initialized or not?
Yes, it would.  In a perfect world, the compilers would be smarter, and
our source code would be simpler.

That's currently not the world we live in, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-12 16:26:08 +09:00
Masahiro Yamada
ba3c68463a kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig
commit b303c6df80 upstream.

Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched
various false positives:

 - commit e74fc973b6 ("Turn off -Wmaybe-uninitialized when building
   with -Os") turned off this option for -Os.

 - commit 815eb71e71 ("Kbuild: disable 'maybe-uninitialized' warning
   for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for
   CONFIG_PROFILE_ALL_BRANCHES

 - commit a76bcf557e ("Kbuild: enable -Wmaybe-uninitialized warning
   for "make W=1"") turned off this option for GCC < 4.9
   Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903

I think this looks better by shifting the logic from Makefile to Kconfig.

Link: https://github.com/ClangBuiltLinux/linux/issues/350
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-12 16:26:06 +09:00
Arnd Bergmann
e6a1386f32 module: fix DEBUG_SET_MODULE_RONX typo
PD#SWPL-31258

[ Upstream commit 4d217a5adc ]

The newly added 'rodata_enabled' global variable is protected by
the wrong #ifdef, leading to a link error when CONFIG_DEBUG_SET_MODULE_RONX
is turned on:

kernel/module.o: In function `disable_ro_nx':
module.c:(.text.unlikely.disable_ro_nx+0x88): undefined reference to `rodata_enabled'
kernel/module.o: In function `module_disable_ro':
module.c:(.text.module_disable_ro+0x8c): undefined reference to `rodata_enabled'
kernel/module.o: In function `module_enable_ro':
module.c:(.text.module_enable_ro+0xb0): undefined reference to `rodata_enabled'

CONFIG_SET_MODULE_RONX does not exist, so use the correct one instead.

Fixes: 39290b389e ("module: extend 'rodata=off' boot cmdline parameter to module mappings")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Change-Id: I3ed114cc957edca46684dc14bc47e133cd1725de
2023-04-21 13:52:34 +09:00
Yonghui Yu
9f010c5760 init: wait emmc ready before root
PD#171658: init: wait emmc ready before root

A temporary ugly patch

Change-Id: I9a75fabf7460f92b0b9ae51992ac0d3c964798d2
Signed-off-by: Yonghui Yu <yonghui.yu@amlogic.com>
2018-08-19 19:12:34 -07:00
Victor Wan
810c6dd972 Merge branch 'android-4.9' into amlogic-4.9-dev
Signed-off-by: Victor Wan <victor.wan@amlogic.com>

Conflicts:
	arch/arm/configs/bcm2835_defconfig
	arch/arm/configs/sunxi_defconfig
	include/linux/cpufreq.h
	init/main.c
2018-04-24 17:43:19 +08:00
Greg Hackmann
05baf14727 Merge tag 'v4.9.93' into android-4.9
This is the 4.9.93 stable release

Change-Id: I4293d83f45982c6fd479bddbf9b0f811248ddc30
Signed-off-by: Greg Hackmann <ghackmann@google.com>
2018-04-09 11:39:17 -07:00
AKASHI Takahiro
7f9a93b30b module: extend 'rodata=off' boot cmdline parameter to module mappings
commit 39290b389e upstream.

The current "rodata=off" parameter disables read-only kernel mappings
under CONFIG_DEBUG_RODATA:
    commit d2aa1acad2 ("mm/init: Add 'rodata=off' boot cmdline parameter
    to disable read-only kernel mappings")

This patch is a logical extension to module mappings ie. read-only mappings
at module loading can be disabled even if CONFIG_DEBUG_SET_MODULE_RONX
(mainly for debug use). Please note, however, that it only affects RO/RW
permissions, keeping NX set.

This is the first step to make CONFIG_DEBUG_SET_MODULE_RONX mandatory
(always-on) in the future as CONFIG_DEBUG_RODATA on x86 and arm64.

Suggested-by: and Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/20161114061505.15238-1-takahiro.akashi@linaro.org
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Alex Shi <alex.shi@linaro.org> [v4.9 backport]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Will Deacon <will.deacon@arm.com>
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08 12:12:52 +02:00
Sami Tolvanen
00a195e7c0 add support for clang Control Flow Integrity (CFI)
This change adds the CONFIG_CFI_CLANG option, CFI error handling,
and a faster look-up table for cross module CFI checks.

Bug: 67506682
Change-Id: Ic009f0a629b552a0eb16e6d89808c7029e91447d
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2018-02-28 15:09:58 -08:00
Greg Kroah-Hartman
71f1469722 Merge 4.9.79 into android-4.9
Changes in 4.9.79
	x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
	orangefs: use list_for_each_entry_safe in purge_waiting_ops
	orangefs: initialize op on loop restart in orangefs_devreq_read
	usbip: prevent vhci_hcd driver from leaking a socket pointer address
	usbip: Fix implicit fallthrough warning
	usbip: Fix potential format overflow in userspace tools
	can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
	can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
	KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
	Prevent timer value 0 for MWAITX
	drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled
	drivers: base: cacheinfo: fix boot error message when acpi is enabled
	mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
	hwpoison, memcg: forcibly uncharge LRU pages
	cma: fix calculation of aligned offset
	mm, page_alloc: fix potential false positive in __zone_watermark_ok
	ipc: msg, make msgrcv work with LONG_MIN
	ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
	ACPICA: Namespace: fix operand cache leak
	netfilter: nfnetlink_cthelper: Add missing permission checks
	netfilter: xt_osf: Add missing permission checks
	reiserfs: fix race in prealloc discard
	reiserfs: don't preallocate blocks for extended attributes
	fs/fcntl: f_setown, avoid undefined behaviour
	scsi: libiscsi: fix shifting of DID_REQUEUE host byte
	Revert "module: Add retpoline tag to VERMAGIC"
	mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
	Input: trackpoint - force 3 buttons if 0 button is reported
	orangefs: fix deadlock; do not write i_size in read_iter
	um: link vmlinux with -no-pie
	vsyscall: Fix permissions for emulate mode with KAISER/PTI
	eventpoll.h: add missing epoll event masks
	dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
	ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
	ipv6: fix udpv6 sendmsg crash caused by too small MTU
	ipv6: ip6_make_skb() needs to clear cork.base.dst
	lan78xx: Fix failure in USB Full Speed
	net: igmp: fix source address check for IGMPv3 reports
	net: qdisc_pkt_len_init() should be more robust
	net: tcp: close sock if net namespace is exiting
	pppoe: take ->needed_headroom of lower device into account on xmit
	r8169: fix memory corruption on retrieval of hardware statistics.
	sctp: do not allow the v4 socket to bind a v4mapped v6 address
	sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
	tipc: fix a memory leak in tipc_nl_node_get_link()
	vmxnet3: repair memory leak
	net: Allow neigh contructor functions ability to modify the primary_key
	ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
	ppp: unlock all_ppp_mutex before registering device
	be2net: restore properly promisc mode after queues reconfiguration
	ip6_gre: init dev->mtu and dev->hard_header_len correctly
	gso: validate gso_type in GSO handlers
	mlxsw: spectrum_router: Don't log an error on missing neighbor
	tun: fix a memory leak for tfile->tx_array
	flow_dissector: properly cap thoff field
	perf/x86/amd/power: Do not load AMD power module on !AMD platforms
	x86/microcode/intel: Extend BDW late-loading further with LLC size check
	hrtimer: Reset hrtimer cpu base proper on CPU hotplug
	x86: bpf_jit: small optimization in emit_bpf_tail_call()
	bpf: fix bpf_tail_call() x64 JIT
	bpf: introduce BPF_JIT_ALWAYS_ON config
	bpf: arsh is not supported in 32 bit alu thus reject it
	bpf: avoid false sharing of map refcount with max_entries
	bpf: fix divides by zero
	bpf: fix 32-bit divide by zero
	bpf: reject stores into ctx via st and xadd
	nfsd: auth: Fix gid sorting when rootsquash enabled
	Linux 4.9.79

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-01-31 14:13:00 +01:00
Alexei Starovoitov
a3d6dd6a66 bpf: introduce BPF_JIT_ALWAYS_ON config
[ upstream commit 290af86629 ]

The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.

A quote from goolge project zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."

To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
option that removes interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64

The start of JITed program is randomized and code page is marked as read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_harden

v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)

v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
  It will be sent when the trees are merged back to net-next

Considered doing:
  int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:55:56 +01:00
Victor Wan
20946741c8 Merge branch 'android-4.9' into amlogic-4.9-dev
Conflicts:
	Makefile
	init/main.c
2018-01-22 20:17:25 +08:00
Victor Wan
2c95ea743b Merge branch 'android-4.9' into amlogic-4.9-dev
Conflicts:
	arch/arm/configs/omap2plus_defconfig
	drivers/Makefile
	drivers/android/binder.c
2018-01-08 18:44:19 +08:00
Greg Kroah-Hartman
bc7ff9b998 Merge 4.9.75 into android-4.9
Changes in 4.9.75
	tcp_bbr: reset full pipe detection on loss recovery undo
	tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
	x86/boot: Add early cmdline parsing for options with arguments
	KAISER: Kernel Address Isolation
	kaiser: merged update
	kaiser: do not set _PAGE_NX on pgd_none
	kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
	kaiser: fix build and FIXME in alloc_ldt_struct()
	kaiser: KAISER depends on SMP
	kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER
	kaiser: fix perf crashes
	kaiser: ENOMEM if kaiser_pagetable_walk() NULL
	kaiser: tidied up asm/kaiser.h somewhat
	kaiser: tidied up kaiser_add/remove_mapping slightly
	kaiser: align addition to x86/mm/Makefile
	kaiser: cleanups while trying for gold link
	kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
	kaiser: delete KAISER_REAL_SWITCH option
	kaiser: vmstat show NR_KAISERTABLE as nr_overhead
	kaiser: enhanced by kernel and user PCIDs
	kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
	kaiser: PCID 0 for kernel and 128 for user
	kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
	kaiser: paranoid_entry pass cr3 need to paranoid_exit
	kaiser: kaiser_remove_mapping() move along the pgd
	kaiser: fix unlikely error in alloc_ldt_struct()
	kaiser: add "nokaiser" boot option, using ALTERNATIVE
	x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
	x86/kaiser: Check boottime cmdline params
	kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
	kaiser: drop is_atomic arg to kaiser_pagetable_walk()
	kaiser: asm/tlbflush.h handle noPGE at lower level
	kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
	x86/paravirt: Dont patch flush_tlb_single
	x86/kaiser: Reenable PARAVIRT
	kaiser: disabled on Xen PV
	x86/kaiser: Move feature detection up
	KPTI: Rename to PAGE_TABLE_ISOLATION
	KPTI: Report when enabled
	kaiser: Set _PAGE_NX only if supported
	Linux 4.9.75

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-01-05 22:28:25 +01:00
Hugh Dickins
0994a2cf8f kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
Kaiser only needs to map one page of the stack; and
kernel/fork.c did not build on powerpc (no __PAGE_KERNEL).
It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack()
and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's
kaiser_add_mapping() and kaiser_remove_mapping().  And use
linux/kaiser.h in init/main.c to avoid the #ifdefs there.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05 15:46:32 +01:00
Richard Fellner
13be4483bb KAISER: Kernel Address Isolation
This patch introduces our implementation of KAISER (Kernel Address Isolation to
have Side-channels Efficiently Removed), a kernel isolation technique to close
hardware side channels on kernel address information.

More information about the patch can be found on:

        https://github.com/IAIK/KAISER

From: Richard Fellner <richard.fellner@student.tugraz.at>
From: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
Subject: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode
Date: Thu, 4 May 2017 14:26:50 +0200
Link: http://marc.info/?l=linux-kernel&m=149390087310405&w=2
Kaiser-4.10-SHA1: c4b1831d44c6144d3762ccc72f0c4e71a0c713e5

To: <linux-kernel@vger.kernel.org>
To: <kernel-hardening@lists.openwall.com>
Cc: <clementine.maurice@iaik.tugraz.at>
Cc: <moritz.lipp@iaik.tugraz.at>
Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Cc: Richard Fellner <richard.fellner@student.tugraz.at>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <kirill.shutemov@linux.intel.com>
Cc: <anders.fogh@gdata-adan.de>

After several recent works [1,2,3] KASLR on x86_64 was basically
considered dead by many researchers. We have been working on an
efficient but effective fix for this problem and found that not mapping
the kernel space when running in user mode is the solution to this
problem [4] (the corresponding paper [5] will be presented at ESSoS17).

With this RFC patch we allow anybody to configure their kernel with the
flag CONFIG_KAISER to add our defense mechanism.

If there are any questions we would love to answer them.
We also appreciate any comments!

Cheers,
Daniel (+ the KAISER team from Graz University of Technology)

[1] http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[2] https://www.blackhat.com/docs/us-16/materials/us-16-Fogh-Using-Undocumented-CPU-Behaviour-To-See-Into-Kernel-Mode-And-Break-KASLR-In-The-Process.pdf
[3] https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Kernel-Address-Space-Layout-Randomization-KASLR-With-Intel-TSX.pdf
[4] https://github.com/IAIK/KAISER
[5] https://gruss.cc/files/kaiser.pdf

[patch based also on
https://raw.githubusercontent.com/IAIK/KAISER/master/KAISER/0001-KAISER-Kernel-Address-Isolation.patch]

Signed-off-by: Richard Fellner <richard.fellner@student.tugraz.at>
Signed-off-by: Moritz Lipp <moritz.lipp@iaik.tugraz.at>
Signed-off-by: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
Signed-off-by: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05 15:46:32 +01:00
Nick Bray
f26d3c76d3 ANDROID: initramfs: call free_initrd() when skipping init
Memory allocated for initrd would not be reclaimed if initializing ramfs
was skipped.

Bug: 69901741
Test: "grep MemTotal /proc/meminfo" increases by a few MB on an Android
device with a/b boot.

Change-Id: Ifbe094d303ed12cfd6de6aa004a8a19137a2f58a
Signed-off-by: Nick Bray <ncbray@google.com>
2017-12-05 18:05:48 +00:00
Victor Wan
ee46236755 Merge branch 'android-4.9' into amlogic-4.9-dev 2017-11-14 17:18:44 +08:00
Greg Kroah-Hartman
f108c7d9b5 Merge 4.9.58 into android-4.9
Changes in 4.9.58
	MIPS: Fix minimum alignment requirement of IRQ stack
	Revert "bsg-lib: don't free job in bsg_prepare_job"
	xen-netback: Use GFP_ATOMIC to allocate hash
	locking/lockdep: Add nest_lock integrity test
	watchdog: kempld: fix gcc-4.3 build
	irqchip/crossbar: Fix incorrect type of local variables
	initramfs: finish fput() before accessing any binary from initramfs
	mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length
	ALSA: hda: Add Geminilake HDMI codec ID
	qed: Don't use attention PTT for configuring BW
	mac80211: fix power saving clients handling in iwlwifi
	net/mlx4_en: fix overflow in mlx4_en_init_timestamp()
	staging: vchiq_2835_arm: Make cache-line-size a required DT property
	netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.
	iio: adc: xilinx: Fix error handling
	f2fs: do SSR for data when there is enough free space
	sched/fair: Update rq clock before changing a task's CPU affinity
	Btrfs: send, fix failure to rename top level inode due to name collision
	f2fs: do not wait for writeback in write_begin
	md/linear: shutup lockdep warnning
	sparc64: Migrate hvcons irq to panicked cpu
	net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs
	crypto: xts - Add ECB dependency
	mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next
	ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock
	slub: do not merge cache if slub_debug contains a never-merge flag
	scsi: scsi_dh_emc: return success in clariion_std_inquiry()
	ASoC: mediatek: add I2C dependency for CS42XX8
	drm/amdgpu: refuse to reserve io mem for split VRAM buffers
	net: mvpp2: release reference to txq_cpu[] entry after unmapping
	qede: Prevent index problems in loopback test
	qed: Reserve doorbell BAR space for present CPUs
	qed: Read queue state before releasing buffer
	i2c: at91: ensure state is restored after suspending
	ceph: don't update_dentry_lease unless we actually got one
	ceph: fix bogus endianness change in ceph_ioctl_set_layout
	ceph: clean up unsafe d_parent accesses in build_dentry_path
	uapi: fix linux/rds.h userspace compilation errors
	uapi: fix linux/mroute6.h userspace compilation errors
	IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs
	IB/hfi1: Allocate context data on memory node
	target/iscsi: Fix unsolicited data seq_end_offset calculation
	hrtimer: Catch invalid clockids again
	nfsd/callback: Cleanup callback cred on shutdown
	powerpc/perf: Add restrictions to PMC5 in power9 DD1
	drm/nouveau/gr/gf100-: fix ccache error logging
	regulator: core: Resolve supplies before disabling unused regulators
	btmrvl: avoid double-disable_irq() race
	EDAC, mce_amd: Print IPID and Syndrome on a separate line
	cpufreq: CPPC: add ACPI_PROCESSOR dependency
	usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets
	Linux 4.9.58

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-10-23 09:35:27 +02:00
tao zeng
a998ca2c01 mm: add pagetrace function
PD#151104: mm: add pagetrace function

1. implement pagetrace as a driver; you can get information
  of how many pages allocated by each function by read:

  cat /proc/pagetrace

2. fix wrong statistics of free memory of each migrate type.

Change-Id: Ib2dff4bb5b3dd288ee188007352fc7b353eda100
Signed-off-by: tao zeng <tao.zeng@amlogic.com>
2017-10-23 00:11:33 -07:00
Lokesh Vutla
aaf54d40b8 initramfs: finish fput() before accessing any binary from initramfs
[ Upstream commit 0886551480 ]

Commit 4a9d4b024a ("switch fput to task_work_add") implements a
schedule_work() for completing fput(), but did not guarantee calling
__fput() after unpacking initramfs.  Because of this, there is a
possibility that during boot a driver can see ETXTBSY when it tries to
load a binary from initramfs as fput() is still pending on that binary.

This patch makes sure that fput() is completed after unpacking initramfs
and removes the call to flush_delayed_fput() in kernel_init() which
happens very late after unpacking initramfs.

Link: http://lkml.kernel.org/r/20170201140540.22051-1-lokeshvutla@ti.com
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Reported-by: Murali Karicheri <m-karicheri2@ti.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-21 17:21:33 +02:00
Victor Wan
63e4f22a8b Merge branch 'android-4.9' into amlogic-4.9-dev 2017-07-04 10:50:35 +08:00
Victor Wan
012fa222d9 Merge branch 'android-4.9' into amlogic-4.9-dev 2017-06-08 10:59:07 +08:00
David Zeuthen
8f3ac4c1ba ANDROID: Update init/do_mounts_dm.c to the latest ChromiumOS version.
This is needed for AVB.

Bug: None
Test: Compiles.
Change-Id: I45b5d435652ab66ec07420ab17f2c7889f7e4d95
Signed-off-by: David Zeuthen <zeuthen@google.com>
2017-06-05 11:26:17 -04:00
Arnd Bergmann
a2adc7c523 UPSTREAM: cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig
We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
not right when CONFIG_NET is disabled and there is no socket interface:

warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

I don't know what the correct solution for this is, but simply removing
the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
'if NET' section avoids the warning and does not produce other build
errors.

Fixes: 483c4933ea ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Fixes: Change-Id: Ib41ef78fba02eb9e592558ddbf06f9ec0aa337b6
       ("UPSTREAM: cgroup: Fix CGROUP_BPF config")
(cherry picked from commit 73b3514735)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-05-30 17:27:27 -07:00
Yixun Lan
e95ed5b571 Makefile: lock kernel version without localversion appended
PD#144946: Makefile: lock kernel version without localversion appended

after this change, we will lock kernel version to 4.x.y
but we can still get out the exact commit version from dmesg.

1) uname -r : 4.9.27
2) dmesg |grep "Linux version"
  [    0.000000@0] Linux version 4.9.27-638106-g5de5d7a  ....

Change-Id: I35072b399af403d3a1737300a8786cf36063c668
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
2017-05-27 02:12:13 -07:00
Andy Lutomirski
cde30d1a2b UPSTREAM: cgroup: Fix CGROUP_BPF config
Cherry-pick from commit 483c4933ea

CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
enabled, making it rather challenging to turn CGROUP_BPF on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 30950746
Change-Id: Ib41ef78fba02eb9e592558ddbf06f9ec0aa337b6
2017-05-22 15:30:59 -07:00
Daniel Mack
f791c42b63 UPSTREAM: cgroup: add support for eBPF programs
Cherry-pick from commit 3007098494

This patch adds two sets of eBPF program pointers to struct cgroup.
One for such that are directly pinned to a cgroup, and one for such
that are effective for it.

To illustrate the logic behind that, assume the following example
cgroup hierarchy.

  A - B - C
        \ D - E

If only B has a program attached, it will be effective for B, C, D
and E. If D then attaches a program itself, that will be effective for
both D and E, and the program in B will only affect B and C. Only one
program of a given type is effective for a cgroup.

Attaching and detaching programs will be done through the bpf(2)
syscall. For now, ingress and egress inet socket filtering are the
only supported use-cases.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 30950746
Change-Id: I3df35d8d3b1261503f9b5bcd90b18c9358f1ac28
2017-05-22 15:30:56 -07:00
Greg Kroah-Hartman
44e63be68c Merge 4.9.22 into android-4.9
Changes in 4.9.22:
	ppdev: check before attaching port
	ppdev: fix registering same device name
	drm/vmwgfx: Type-check lookups of fence objects
	drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl()
	drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl()
	drm/ttm, drm/vmwgfx: Relax permission checking when opening surfaces
	drm/vmwgfx: Remove getparam error message
	drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl()
	sysfs: be careful of error returns from ops->show()
	staging: android: ashmem: lseek failed due to no FMODE_LSEEK.
	arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm
	arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region
	kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
	iio: bmg160: reset chip when probing
	arm64: mm: unaligned access by user-land should be received as SIGBUS
	cfg80211: check rdev resume callback only for registered wiphy
	Reset TreeId to zero on SMB2 TREE_CONNECT
	mm/page_alloc.c: fix print order in show_free_areas()
	ptrace: fix PTRACE_LISTEN race corrupting task->state
	dm verity fec: limit error correction recursion
	dm verity fec: fix bufio leaks
	ACPI / gpio: do not fall back to parsing _CRS when we get a deferral
	Kbuild: use cc-disable-warning consistently for maybe-uninitialized
	orangefs: move features validation to fix filesystem hang
	xfs: Honor FALLOC_FL_KEEP_SIZE when punching ends of files
	ring-buffer: Fix return value check in test_ringbuffer()
	mac80211: unconditionally start new netdev queues with iTXQ support
	brcmfmac: use local iftype avoiding use-after-free of virtual interface
	metag/usercopy: Drop unused macros
	metag/usercopy: Fix alignment error checking
	metag/usercopy: Add early abort to copy_to_user
	metag/usercopy: Zero rest of buffer from copy_from_user
	metag/usercopy: Set flags before ADDZ
	metag/usercopy: Fix src fixup in from user rapf loops
	metag/usercopy: Add missing fixups
	powerpc: Disable HFSCR[TM] if TM is not supported
	powerpc/mm: Add missing global TLB invalidate if cxl is active
	powerpc/64: Fix flush_(d|i)cache_range() called from modules
	powerpc: Don't try to fix up misaligned load-with-reservation instructions
	powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable()
	dm raid: fix NULL pointer dereference for raid1 without bitmap
	nios2: reserve boot memory for device tree
	xtensa: make __pa work with uncached KSEG addresses
	s390/decompressor: fix initrd corruption caused by bss clear
	s390/uaccess: get_user() should zero on failure (again)
	MIPS: Force o32 fp64 support on 32bit MIPS64r6 kernels
	MIPS: ralink: Fix typos in rt3883 pinctrl
	MIPS: End spinlocks with .insn
	MIPS: Lantiq: fix missing xbar kernel panic
	MIPS: Check TLB before handle_ri_rdhwr() for Loongson-3
	MIPS: Add MIPS_CPU_FTLB for Loongson-3A R2
	MIPS: Flush wrong invalid FTLB entry for huge page
	MIPS: c-r4k: Fix Loongson-3's vcache/scache waysize calculation
	Documentation: stable-kernel-rules: fix stable-tag format
	mm/mempolicy.c: fix error handling in set_mempolicy and mbind.
	random: use chacha20 for get_random_int/long
	drm/sun4i: tcon: Move SoC specific quirks to a DT matched data structure
	drm/sun4i: Add compatible strings for A31/A31s display pipelines
	drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)
	clk: lpc32xx: add a quirk for PWM and MS clock dividers
	HID: usbhid: Add quirks for Mayflash/Dragonrise GameCube and PS3 adapters
	HID: i2c-hid: add a simple quirk to fix device defects
	usb: dwc3: gadget: delay unmap of bounced requests
	ASoC: Intel: bytct_rt5640: change default capture settings
	arm64: dts: hisi: fix hip06 sas am-max-trans quirk
	net/mlx4_core: Use device ID defines
	clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend
	scsi: ufs: introduce UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk
	HID: sensor-hub add quirk for Microsoft Surface 3
	HID: sensor-hub: add quirk for Microchip MM7150
	HID: multitouch: enable the Surface 3 Type Cover to report multitouch data
	HID: multitouch: do not retrieve all reports for all devices
	mmc: sdhci-msm: Enable few quirks
	scsi: ufs: ensure that host pa_tactivate is higher than device
	svcauth_gss: Close connection when dropping an incoming message
	x86/intel_idle: Add CPU model 0x4a (Atom Z34xx series)
	arm64: PCI: Manage controller-specific data on per-controller basis
	arm64: PCI: Add local struct device pointers
	PCI: thunder-pem: Factor out resource lookup
	scsi: ufs: add quirk to increase host PA_SaveConfigTime
	ALSA: usb-audio: add implicit fb quirk for Axe-Fx II
	PCI: Expand "VPD access disabled" quirk message
	ALSA: usb-audio: Add native DSD support for TEAC 501/503 DAC
	platform/x86: acer-wmi: Only supports AMW0_GUID1 on acer family
	nvme: simplify stripe quirk
	ACPI / sysfs: Provide quirk mechanism to prevent GPE flooding
	HID: usbhid: Add quirk for the Futaba TOSD-5711BB VFD
	HID: usbhid: Add quirk for Mayflash/Dragonrise DolphinBar.
	drm/edid: constify edid quirk list
	drm/i915: fix INTEL_BDW_IDS definition
	drm/i915: more .is_mobile cleanups for BDW
	drm/i915: actually drive the BDW reserved IDs
	ASoC: Intel: bytcr_rt5640: quirks for Insyde devices
	scsi: ufs: introduce a new ufshcd_statea UFSHCD_STATE_EH_SCHEDULED
	scsi: ufs: issue link starup 2 times if device isn't active
	usb: chipidea: msm: Rely on core to override AHBBURST
	serial: 8250_omap: Add OMAP_DMA_TX_KICK quirk for AM437x
	Input: gpio_keys - add support for GPIO descriptors
	ARM: davinci: PM: support da8xx DT platforms
	usb: xhci: add quirk flag for broken PED bits
	usb: host: xhci-plat: enable BROKEN_PED quirk if platform requested
	usb: dwc3: host: pass quirk-broken-port-ped property for known broken revisions
	drm/mga: remove device_is_agp callback
	ARM: dts: STiH407-family: set snps,dis_u3_susphy_quirk
	PCI: Add ACS quirk for Intel Union Point
	sata: ahci-da850: implement a workaround for the softreset quirk
	ACPI / button: Change default behavior to lid_init_state=open
	ASoC: rt5670: Add missing 10EC5072 ACPI ID
	ASoC: codecs: rt5670: add quirk for Lenovo Thinkpad 10
	ASoC: Intel: Baytrail: add quirk for Lenovo Thinkpad 10
	ASoC: Intel: cht_bsw_rt5645: harden ACPI device detection
	ASoC: Intel: cht_bsw_rt5645: add Baytrail MCLK support
	ACPI: save NVS memory for Lenovo G50-45
	ASoC: sun4i-i2s: Add quirks to handle a31 compatible
	HID: wacom: don't apply generic settings to old devices
	arm: kernel: Add SMC structure parameter
	firmware: qcom: scm: Fix interrupted SCM calls
	drm/msm/adreno: move function declarations to header file
	ARM: smccc: Update HVC comment to describe new quirk parameter
	PCI: Add Broadcom Northstar2 PAXC quirk for device class and MPSS
	PCI: Disable MSI for HiSilicon Hip06/Hip07 Root Ports
	mmc: sdhci-of-esdhc: remove default broken-cd for ARM
	PCI: Sort the list of devices with D3 delay quirk by ID
	PCI: Add ACS quirk for Qualcomm QDF2400 and QDF2432
	watchdog: s3c2410: Fix infinite interrupt in soft mode
	platform/x86: asus-wmi: Set specified XUSB2PR value for X550LB
	platform/x86: asus-wmi: Detect quirk_no_rfkill from the DSDT
	x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk
	x86/reboot/quirks: Add ASUS EeeBook X205TA/W reboot quirk
	usb-storage: Add ignore-residue quirk for Initio INIC-3619
	x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk
	Linux 4.9.22

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-04-12 12:57:51 +02:00
Jason A. Donenfeld
7c03613344 random: use chacha20 for get_random_int/long
commit f5b98461cb upstream.

Now that our crng uses chacha20, we can rely on its speedy
characteristics for replacing MD5, while simultaneously achieving a
higher security guarantee. Before the idea was to use these functions if
you wanted random integers that aren't stupidly insecure but aren't
necessarily secure either, a vague gray zone, that hopefully was "good
enough" for its users. With chacha20, we can strengthen this claim,
since either we're using an rdrand-like instruction, or we're using the
same crng as /dev/urandom. And it's faster than what was before.

We could have chosen to replace this with a SipHash-derived function,
which might be slightly faster, but at the cost of having yet another
RNG construction in the kernel. By moving to chacha20, we have a single
RNG to analyze and verify, and we also already get good performance
improvements on all platforms.

Implementation-wise, rather than use a generic buffer for both
get_random_int/long and memcpy based on the size needs, we use a
specific buffer for 32-bit reads and for 64-bit reads. This way, we're
guaranteed to always have aligned accesses on all platforms. While
slightly more verbose in C, the assembly this generates is a lot
simpler than otherwise.

Finally, on 32-bit platforms where longs and ints are the same size,
we simply alias get_random_int to get_random_long.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12 12:41:15 +02:00
John Stultz
ac82d16f33 ANDROID: sched: Add Kconfig option DEFAULT_USE_ENERGY_AWARE to set ENERGY_AWARE feature flag
The ENERGY_AWARE sched feature flag cannot be set unless
CONFIG_SCHED_DEBUG is enabled.

So this patch allows the flag to default to true at build time
if the config is set.

Change-Id: I8835a571fdb7a8f8ee6a54af1e11a69f3b5ce8e6
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-02-08 13:46:01 -08:00
Srivatsa Vaddagiri
26c2154816 ANDROID: sched: Introduce Window Assisted Load Tracking (WALT)
use a window based view of time in order to track task
demand and CPU utilization in the scheduler.

Window Assisted Load Tracking (WALT) implementation credits:
 Srivatsa Vaddagiri, Steve Muckle, Syed Rameez Mustafa, Joonwoo Park,
 Pavan Kumar Kondeti, Olav Haugan

2016-03-06: Integration with EAS/refactoring by Vikram Mulukutla
            and Todd Kjos

Change-Id: I21408236836625d4e7d7de1843d20ed5ff36c708

Includes fixes for issues:

eas/walt: Use walt_ktime_clock() instead of ktime_get_ns() to avoid a
race resulting in watchdog resets
BUG: 29353986
Change-Id: Ic1820e22a136f7c7ebd6f42e15f14d470f6bbbdb

Handle walt accounting anomoly during resume

During resume, there is a corner case where on wakeup, a task's
prev_runnable_sum can go negative. This is a workaround that
fixes the condition and warns (instead of crashing).

BUG: 29464099
Change-Id: I173e7874324b31a3584435530281708145773508

Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:58 -08:00
Patrick Bellasi
2178e84423 ANDROID: FIXUP: sched: fix build for non-SMP target
Currently the build for a single-core (e.g. user-mode) Linux is broken
and this configuration is required (at least) to run some network tests.

The main issues for the current code support on single-core systems are:
1. {se,rq}::sched_avg is not available nor maintained for !SMP systems
   This means that load and utilisation signals are NOT available in single
   core systems. All the EAS code depends on these signals.
2. sched_group_energy is also SMP dependant. Again this means that all the
   EAS setup and preparation code (energyn model initialization) has to be
   properly guarded/disabled for !SMP systems.
3. SchedFreq depends on utilization signal, which is not available on
   !SMP systems.
4. SchedTune is useless on unicore systems if SchedFreq is not available.
5. WALT machinery is not required on single-core systems.

This patch addresses all these issues by enforcing some constraints for
single-core systems:
a) WALT, SchedTune and SchedTune are now dependant on SMP
b) The default governor for !SMP systems is INTERACTIVE
c) The energy model initialisation/build functions are
d) Other minor code re-arrangements and CONFIG_SMP guarding to enable
   single core builds.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:49 -08:00
Patrick Bellasi
ae71030fd5 ANDROID: sched/tune: add initial support for CGroups based boosting
To support task performance boosting, the usage of a single knob has the
advantage to be a simple solution, both from the implementation and the
usability standpoint.  However, on a real system it can be difficult to
identify a single value for the knob which fits the needs of multiple
different tasks. For example, some kernel threads and/or user-space
background services should be better managed the "standard" way while we
still want to be able to boost the performance of specific workloads.

In order to improve the flexibility of the task boosting mechanism this
patch is the first of a small series which extends the previous
implementation to introduce a "per task group" support.
This first patch introduces just the basic CGroups support, a new
"schedtune" CGroups controller is added which allows to configure
different boost value for different groups of tasks.
To keep the implementation simple but still effective for a boosting
strategy, the new controller:
  1. allows only a two layer hierarchy
  2. supports only a limited number of boost groups

A two layer hierarchy allows to place each task either:
  a) in the root control group
     thus being subject to a system-wide boosting value
  b) in a child of the root group
     thus being subject to the specific boost value defined by that
     "boost group"

The limited number of "boost groups" supported is mainly motivated by
the observation that in a real system it could be useful to have only
few classes of tasks which deserve different treatment.
For example, background vs foreground or interactive vs low-priority.
As an additional benefit, a limited number of boost groups allows also
to have a simpler implementation especially for the code required to
compute the boost value for CPUs which have runnable tasks belonging to
different boost groups.

cc: Tejun Heo <tj@kernel.org>
cc: Li Zefan <lizefan@huawei.com>
cc: Johannes Weiner <hannes@cmpxchg.org>
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:37 -08:00
Patrick Bellasi
69fa4c768a ANDROID: sched/tune: add sysctl interface to define a boost value
The current (CFS) scheduler implementation does not allow "to boost"
tasks performance by running them at a higher OPP compared to the
minimum required to meet their workload demands.

To support tasks performance boosting the scheduler should provide a
"knob" which allows to tune how much the system is going to be optimised
for energy efficiency vs performance.

This patch is the first of a series which provides a simple interface to
define a tuning knob. One system-wide "boost" tunable is exposed via:
  /proc/sys/kernel/sched_cfs_boost
which can be configured in the range [0..100], to define a percentage
where:
  - 0%   boost requires to operate in "standard" mode by scheduling
         tasks at the minimum capacities required by the workload demand
  - 100% boost requires to push at maximum the task performances,
         "regardless" of the incurred energy consumption

A boost value in between these two boundaries is used to bias the
power/performance trade-off, the higher the boost value the more the
scheduler is biased toward performance boosting instead of energy
efficiency.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
2017-01-31 10:46:35 -08:00
Jeremy Compostella
ba2055cde1 ANDROID: dm: fix dm_substitute_devices()
When candidate is the last parameter, candidate_end points to the '\0'
character and not the DM_FIELD_SEP character.  In such a situation, we
should not move the candidate_end pointer one character backward.

Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
2017-01-27 13:55:16 -08:00
Badhri Jagan Sridharan
ae8b490311 ANDROID: dm: Rebase on top of 4.9
1. "dm: optimize use SRCU and RCU" removes the use of dm_table_put.
2. "dm: remove request-based logic from make_request_fn wrapper" necessitates
    calling dm_setup_md_queue or else the request_queue's make_request_fn
    pointer ends being unset.

[    7.711600] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[    7.717519] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.15-02273-gb057d16-dirty #33
[    7.726559] Hardware name: HiKey Development Board (DT)
[    7.731779] task: ffffffc005f8acc0 ti: ffffffc005f8c000 task.ti: ffffffc005f8c000
[    7.739257] PC is at 0x0
[    7.741787] LR is at generic_make_request+0x8c/0x108
....
[    9.082931] Call trace:
[    9.085372] [<          (null)>]           (null)
[    9.090074] [<ffffffc0003f4ac0>] submit_bio+0x98/0x1e0
[    9.095212] [<ffffffc0001e2618>] _submit_bh+0x120/0x1f0
[    9.096165] cfg80211: Calling CRDA to update world regulatory domain
[    9.106781] [<ffffffc0001e5450>] __bread_gfp+0x94/0x114
[    9.112004] [<ffffffc00024a748>] ext4_fill_super+0x18c/0x2d64
[    9.117750] [<ffffffc0001b275c>] mount_bdev+0x194/0x1c0
[    9.122973] [<ffffffc0002450dc>] ext4_mount+0x14/0x1c
[    9.128021] [<ffffffc0001b29a0>] mount_fs+0x3c/0x194
[    9.132985] [<ffffffc0001d059c>] vfs_kern_mount+0x4c/0x134
[    9.138467] [<ffffffc0001d2168>] do_mount+0x204/0xbbc
[    9.143514] [<ffffffc0001d2ec4>] SyS_mount+0x94/0xe8
[    9.148479] [<ffffffc000c54074>] mount_block_root+0x120/0x24c
[    9.154222] [<ffffffc000c543e8>] mount_root+0x110/0x12c
[    9.159443] [<ffffffc000c54574>] prepare_namespace+0x170/0x1b8
[    9.165273] [<ffffffc000c53d98>] kernel_init_freeable+0x23c/0x260
[    9.171365] [<ffffffc0009b1748>] kernel_init+0x10/0x118
[    9.176589] Code: bad PC value
[    9.179807] ---[ end trace 75e1bc52ba364d13 ]---

Bug: 27175947

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I952d86fd1475f0825f9be1386e3497b36127abd0
2017-01-27 13:55:16 -08:00
Will Drewry
a058da8372 CHROMIUM: dm: boot time specification of dm=
This is a wrap-up of three patches pending upstream approval.
I'm bundling them because they are interdependent, and it'll be
easier to drop it on rebase later.

1. dm: allow a dm-fs-style device to be shared via dm-ioctl

Integrates feedback from Alisdair, Mike, and Kiyoshi.

Two main changes occur here:

- One function is added which allows for a programmatically created
mapped device to be inserted into the dm-ioctl hash table.  This binds
the device to a name and, optional, uuid which is needed by udev and
allows for userspace management of the mapped device.

- dm_table_complete() was extended to handle all of the final
functional changes required for the table to be operational once
called.

2. init: boot to device-mapper targets without an initr*

Add a dm= kernel parameter modeled after the md= parameter from
do_mounts_md.  It allows for device-mapper targets to be configured at
boot time for use early in the boot process (as the root device or
otherwise).  It also replaces /dev/XXX calls with major:minor opportunistically.

The format is dm="name uuid ro,table line 1,table line 2,...".  The
parser expects the comma to be safe to use as a newline substitute but,
otherwise, uses the normal separator of space.  Some attempt has been
made to make it forgiving of additional spaces (using skip_spaces()).

A mapped device created during boot will be assigned a minor of 0 and
may be access via /dev/dm-0.

An example dm-linear root with no uuid may look like:

root=/dev/dm-0  dm="lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dv/ubdc 0"

Once udev is started, /dev/dm-0 will become /dev/mapper/lroot.

Older upstream threads:
http://marc.info/?l=dm-devel&m=127429492521964&w=2
http://marc.info/?l=dm-devel&m=127429499422096&w=2
http://marc.info/?l=dm-devel&m=127429493922000&w=2

Latest upstream threads:
https://patchwork.kernel.org/patch/104859/
https://patchwork.kernel.org/patch/104860/
https://patchwork.kernel.org/patch/104861/

Bug: 27175947

Signed-off-by: Will Drewry <wad@chromium.org>

Review URL: http://codereview.chromium.org/2020011

Change-Id: I92bd53432a11241228d2e5ac89a3b20d19b05a31
2017-01-27 13:55:15 -08:00
Rom Lemarchand
c3c2e99fcc ANDROID: initramfs: Add skip_initramfs command line option
Add a skip_initramfs option to allow choosing whether to boot using
the initramfs or not at runtime.

Change-Id: If30428fa748c1d4d3d7b9d97c1f781de5e4558c3
Signed-off-by: Rom Lemarchand <romlem@google.com>
2017-01-27 13:52:19 -08:00
Linus Torvalds
faaae2a581 Re-enable CONFIG_MODVERSIONS in a slightly weaker form
This enables CONFIG_MODVERSIONS again, but allows for missing symbol CRC
information in order to work around the issue that newer binutils
versions seem to occasionally drop the CRC on the floor.  binutils 2.26
seems to work fine, while binutils 2.27 seems to break MODVERSIONS of
symbols that have been defined in assembler files.

[ We've had random missing CRC's before - it may be an old problem that
  just is now reliably triggered with the weak asm symbols and a new
  version of binutils ]

Some day I really do want to remove MODVERSIONS entirely.  Sadly, today
does not appear to be that day: Debian people apparently do want the
option to enable MODVERSIONS to make it easier to have external modules
across kernel versions, and this seems to be a fairly minimal fix for
the annoying problem.

Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-29 16:01:30 -08:00
Linus Torvalds
cd3caefb46 Fix subtle CONFIG_MODVERSIONS problems
CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series,
and quite frankly, nobody has cared very deeply.  We absolutely know how
to fix it, and it's not _complicated_, but it's not exactly pretty
either.

This oneliner fixes it without the ugliness, and allows for further
future cleanups.

  "We've secretly replaced their regular MODVERSIONS with nothing at
   all, let's see if they notice"

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-25 15:44:47 -08:00
Nicolas Schichan
18594e9bc4 init: use pr_cont() when displaying rotator during ramdisk loading.
Otherwise each individual rotator char would be printed in a new line:

(...)
[    0.642350] -
[    0.644374] |
[    0.646367] -
(...)

Signed-off-by: Nicolas Schichan <nicolas.schichan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-24 09:32:20 -08:00
Linus Torvalds
9ffc66941d Merge tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc plugins update from Kees Cook:
 "This adds a new gcc plugin named "latent_entropy". It is designed to
  extract as much possible uncertainty from a running system at boot
  time as possible, hoping to capitalize on any possible variation in
  CPU operation (due to runtime data differences, hardware differences,
  SMP ordering, thermal timing variation, cache behavior, etc).

  At the very least, this plugin is a much more comprehensive example
  for how to manipulate kernel code using the gcc plugin internals"

* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  latent_entropy: Mark functions with __latent_entropy
  gcc-plugins: Add latent_entropy plugin
2016-10-15 10:03:15 -07:00
Linus Torvalds
84d69848c9 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro.

   This does bring a regression, because genksyms no longer generates
   checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
   working on a patch to fix this.

   Plus, we are talking about functions like strcpy(), which rarely
   change prototypes.

 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick
   Piggin

 - fixdep speedup by Alexey Dobriyan.

 - preparatory work by Nick Piggin to allow architectures to build with
   -ffunction-sections, -fdata-sections and --gc-sections

 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell

 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-10-14 14:26:58 -07:00
Peter Zijlstra
26b5679e43 relay: Use irq_work instead of plain timer for deferred wakeup
Relay avoids calling wake_up_interruptible() for doing the wakeup of
readers/consumers, waiting for the generation of new data, from the
context of a process which produced the data.  This is apparently done to
prevent the possibility of a deadlock in case Scheduler itself is is
generating data for the relay, after acquiring rq->lock.

The following patch used a timer (to be scheduled at next jiffy), for
delegating the wakeup to another context.
	commit 7c9cb38302
	Author: Tom Zanussi <zanussi@comcast.net>
	Date:   Wed May 9 02:34:01 2007 -0700

	relay: use plain timer instead of delayed work

	relay doesn't need to use schedule_delayed_work() for waking readers
	when a simple timer will do.

Scheduling a plain timer, at next jiffies boundary, to do the wakeup
causes a significant wakeup latency for the Userspace client, which makes
relay less suitable for the high-frequency low-payload use cases where the
data gets generated at a very high rate, like multiple sub buffers getting
filled within a milli second.  Moreover the timer is re-scheduled on every
newly produced sub buffer so the timer keeps getting pushed out if sub
buffers are filled in a very quick succession (less than a jiffy gap
between filling of 2 sub buffers).  As a result relay runs out of sub
buffers to store the new data.

By using irq_work it is ensured that wakeup of userspace client, blocked
in the poll call, is done at earliest (through self IPI or next timer
tick) enabling it to always consume the data in time.  Also this makes
relay consistent with printk & ring buffers (trace), as they too use
irq_work for deferred wake up of readers.

[arnd@arndb.de: select CONFIG_IRQ_WORK]
 Link: http://lkml.kernel.org/r/20160912154035.3222156-1-arnd@arndb.de
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1472906487-1559-1-git-send-email-akash.goel@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:32 -07:00
Emese Revfy
38addce8b6 gcc-plugins: Add latent_entropy plugin
This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot time as
possible, hoping to capitalize on any possible variation in CPU operation
(due to runtime data differences, hardware differences, SMP ordering,
thermal timing variation, cache behavior, etc).

At the very least, this plugin is a much more comprehensive example for
how to manipulate kernel code using the gcc plugin internals.

The need for very-early boot entropy tends to be very architecture or
system design specific, so this plugin is more suited for those sorts
of special cases. The existing kernel RNG already attempts to extract
entropy from reliable runtime variation, but this plugin takes the idea to
a logical extreme by permuting a global variable based on any variation
in code execution (e.g. a different value (and permutation function)
is used to permute the global based on loop count, case statement,
if/then/else branching, etc).

To do this, the plugin starts by inserting a local variable in every
marked function. The plugin then adds logic so that the value of this
variable is modified by randomly chosen operations (add, xor and rol) and
random values (gcc generates separate static values for each location at
compile time and also injects the stack pointer at runtime). The resulting
value depends on the control flow path (e.g., loops and branches taken).

Before the function returns, the plugin mixes this local variable into
the latent_entropy global variable. The value of this global variable
is added to the kernel entropy pool in do_one_initcall() and _do_fork(),
though it does not credit any bytes of entropy to the pool; the contents
of the global are just used to mix the pool.

Additionally, the plugin can pre-initialize arrays with build-time
random contents, so that two different kernel builds running on identical
hardware will not have the same starting values.

Signed-off-by: Emese Revfy <re.emese@gmail.com>
[kees: expanded commit message and code comments]
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-10-10 14:51:44 -07:00
Linus Torvalds
997b611baf Merge branch 'parisc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "Changes include:

   - Fix boot of 32bit SMP kernel (initial kernel mapping was too small)

   - Added hardened usercopy checks

   - Drop bootmem and switch to memblock and NO_BOOTMEM implementation

   - Drop the BROKEN_RODATA config option (and thus remove the relevant
     code from the generic headers and files because parisc was the last
     architecture which used this config option)

   - Improve segfault reporting by printing human readable error strings

   - Various smaller changes, e.g. dwarf debug support for assembly
     code, update comments regarding copy_user_page_asm, switch to
     kmalloc_array()"

* 'parisc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
  parisc: Drop bootmem and switch to memblock
  parisc: Add hardened usercopy feature
  parisc: Add cfi_startproc and cfi_endproc to assembly code
  parisc: Move hpmc stack into page aligned bss section
  parisc: Fix self-detected CPU stall warnings on Mako machines
  parisc: Report trap type as human readable string
  parisc: Update comment regarding implementation of copy_user_page_asm
  parisc: Use kmalloc_array() in add_system_map_addresses()
  parisc: Check return value of smp_boot_one_cpu()
  parisc: Drop BROKEN_RODATA config option
2016-10-07 20:50:37 -07:00
Helge Deller
b5d5cf2b8a parisc: Drop BROKEN_RODATA config option
PARISC was the only architecture which selected the BROKEN_RODATA config
option. Drop it and remove the special handling from init.h as well.

Signed-off-by: Helge Deller <deller@gmx.de>
2016-09-20 18:02:35 +02:00
Andy Lutomirski
c6c314a613 sched/core: Add try_get_task_stack() and put_task_stack()
There are a few places in the kernel that access stack memory
belonging to a different task.  Before we can start freeing task
stacks before the task_struct is freed, we need a way for those code
paths to pin the stack.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/17a434f50ad3d77000104f21666575e10a9c1fbd.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00