[ Upstream commit 6041186a32 ]
When a module option, or core kernel argument, toggles a static-key it
requires jump labels to be initialized early. While x86, PowerPC, and
ARM64 arrange for jump_label_init() to be called before parse_args(),
ARM does not.
Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 console=ttyAMA0,115200 page_alloc.shuffle=1
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
page_alloc_shuffle+0x12c/0x1ac
static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
before call to jump_label_init()
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted
5.1.0-rc4-next-20190410-00003-g3367c36ce744 #1
Hardware name: ARM Integrator/CP (Device Tree)
[<c0011c68>] (unwind_backtrace) from [<c000ec48>] (show_stack+0x10/0x18)
[<c000ec48>] (show_stack) from [<c07e9710>] (dump_stack+0x18/0x24)
[<c07e9710>] (dump_stack) from [<c001bb1c>] (__warn+0xe0/0x108)
[<c001bb1c>] (__warn) from [<c001bb88>] (warn_slowpath_fmt+0x44/0x6c)
[<c001bb88>] (warn_slowpath_fmt) from [<c0b0c4a8>]
(page_alloc_shuffle+0x12c/0x1ac)
[<c0b0c4a8>] (page_alloc_shuffle) from [<c0b0c550>] (shuffle_store+0x28/0x48)
[<c0b0c550>] (shuffle_store) from [<c003e6a0>] (parse_args+0x1f4/0x350)
[<c003e6a0>] (parse_args) from [<c0ac3c00>] (start_kernel+0x1c0/0x488)
Move the fallback call to jump_label_init() to occur before
parse_args().
The redundant calls to jump_label_init() in other archs are left intact
in case they have static key toggling use cases that are even earlier
than option parsing.
Link: http://lkml.kernel.org/r/155544804466.1032396.13418949511615676665.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Guenter Roeck <groeck@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b5b1404d08 upstream.
This is purely a preparatory patch for upcoming changes during the 4.19
merge window.
We have a function called "boot_cpu_state_init()" that isn't really
about the bootup cpu state: that is done much earlier by the similarly
named "boot_cpu_init()" (note lack of "state" in name).
This function initializes some hotplug CPU state, and needs to run after
the percpu data has been properly initialized. It even has a comment to
that effect.
Except it _doesn't_ actually run after the percpu data has been properly
initialized. On x86 it happens to do that, but on at least arm and
arm64, the percpu base pointers are initialized by the arch-specific
'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().
This had some unexpected results, and in particular we have a patch
pending for the merge window that did the obvious cleanup of using
'this_cpu_write()' in the cpu hotplug init code:
- per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
+ this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
which is obviously the right thing to do. Except because of the
ordering issue, it actually failed miserably and unexpectedly on arm64.
So this just fixes the ordering, and changes the name of the function to
be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
hotplug state, because the core CPU state was supposed to have already
been done earlier.
Marked for stable, since the (not yet merged) patch that will show this
problem is marked for stable.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78a5255ffb upstream.
We have some rather random rules about when we accept the
"maybe-initialized" warnings, and when we don't.
For example, we consider it unreliable for gcc versions < 4.9, but also
if -O3 is enabled, or if optimizing for size. And then various kernel
config options disabled it, because they know that they trigger that
warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).
And now gcc-10 seems to be introducing a lot of those warnings too, so
it falls under the same heading as 4.9 did.
At the same time, we have a very straightforward way to _enable_ that
warning when wanted: use "W=2" to enable more warnings.
So stop playing these ad-hoc games, and just disable that warning by
default, with the known and straight-forward "if you want to work on the
extra compiler warnings, use W=123".
Would it be great to have code that is always so obvious that it never
confuses the compiler whether a variable is used initialized or not?
Yes, it would. In a perfect world, the compilers would be smarter, and
our source code would be simpler.
That's currently not the world we live in, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b303c6df80 upstream.
Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched
various false positives:
- commit e74fc973b6 ("Turn off -Wmaybe-uninitialized when building
with -Os") turned off this option for -Os.
- commit 815eb71e71 ("Kbuild: disable 'maybe-uninitialized' warning
for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for
CONFIG_PROFILE_ALL_BRANCHES
- commit a76bcf557e ("Kbuild: enable -Wmaybe-uninitialized warning
for "make W=1"") turned off this option for GCC < 4.9
Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903
I think this looks better by shifting the logic from Makefile to Kconfig.
Link: https://github.com/ClangBuiltLinux/linux/issues/350
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
PD#SWPL-31258
[ Upstream commit 4d217a5adc ]
The newly added 'rodata_enabled' global variable is protected by
the wrong #ifdef, leading to a link error when CONFIG_DEBUG_SET_MODULE_RONX
is turned on:
kernel/module.o: In function `disable_ro_nx':
module.c:(.text.unlikely.disable_ro_nx+0x88): undefined reference to `rodata_enabled'
kernel/module.o: In function `module_disable_ro':
module.c:(.text.module_disable_ro+0x8c): undefined reference to `rodata_enabled'
kernel/module.o: In function `module_enable_ro':
module.c:(.text.module_enable_ro+0xb0): undefined reference to `rodata_enabled'
CONFIG_SET_MODULE_RONX does not exist, so use the correct one instead.
Fixes: 39290b389e ("module: extend 'rodata=off' boot cmdline parameter to module mappings")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Change-Id: I3ed114cc957edca46684dc14bc47e133cd1725de
Signed-off-by: Victor Wan <victor.wan@amlogic.com>
Conflicts:
arch/arm/configs/bcm2835_defconfig
arch/arm/configs/sunxi_defconfig
include/linux/cpufreq.h
init/main.c
commit 39290b389e upstream.
The current "rodata=off" parameter disables read-only kernel mappings
under CONFIG_DEBUG_RODATA:
commit d2aa1acad2 ("mm/init: Add 'rodata=off' boot cmdline parameter
to disable read-only kernel mappings")
This patch is a logical extension to module mappings ie. read-only mappings
at module loading can be disabled even if CONFIG_DEBUG_SET_MODULE_RONX
(mainly for debug use). Please note, however, that it only affects RO/RW
permissions, keeping NX set.
This is the first step to make CONFIG_DEBUG_SET_MODULE_RONX mandatory
(always-on) in the future as CONFIG_DEBUG_RODATA on x86 and arm64.
Suggested-by: and Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/20161114061505.15238-1-takahiro.akashi@linaro.org
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Alex Shi <alex.shi@linaro.org> [v4.9 backport]
Signed-off-by: Mark Rutland <mark.rutland@arm.com> [v4.9 backport]
Tested-by: Will Deacon <will.deacon@arm.com>
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This change adds the CONFIG_CFI_CLANG option, CFI error handling,
and a faster look-up table for cross module CFI checks.
Bug: 67506682
Change-Id: Ic009f0a629b552a0eb16e6d89808c7029e91447d
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Changes in 4.9.79
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
orangefs: use list_for_each_entry_safe in purge_waiting_ops
orangefs: initialize op on loop restart in orangefs_devreq_read
usbip: prevent vhci_hcd driver from leaking a socket pointer address
usbip: Fix implicit fallthrough warning
usbip: Fix potential format overflow in userspace tools
can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
Prevent timer value 0 for MWAITX
drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled
drivers: base: cacheinfo: fix boot error message when acpi is enabled
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
hwpoison, memcg: forcibly uncharge LRU pages
cma: fix calculation of aligned offset
mm, page_alloc: fix potential false positive in __zone_watermark_ok
ipc: msg, make msgrcv work with LONG_MIN
ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
ACPICA: Namespace: fix operand cache leak
netfilter: nfnetlink_cthelper: Add missing permission checks
netfilter: xt_osf: Add missing permission checks
reiserfs: fix race in prealloc discard
reiserfs: don't preallocate blocks for extended attributes
fs/fcntl: f_setown, avoid undefined behaviour
scsi: libiscsi: fix shifting of DID_REQUEUE host byte
Revert "module: Add retpoline tag to VERMAGIC"
mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
Input: trackpoint - force 3 buttons if 0 button is reported
orangefs: fix deadlock; do not write i_size in read_iter
um: link vmlinux with -no-pie
vsyscall: Fix permissions for emulate mode with KAISER/PTI
eventpoll.h: add missing epoll event masks
dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
ipv6: fix udpv6 sendmsg crash caused by too small MTU
ipv6: ip6_make_skb() needs to clear cork.base.dst
lan78xx: Fix failure in USB Full Speed
net: igmp: fix source address check for IGMPv3 reports
net: qdisc_pkt_len_init() should be more robust
net: tcp: close sock if net namespace is exiting
pppoe: take ->needed_headroom of lower device into account on xmit
r8169: fix memory corruption on retrieval of hardware statistics.
sctp: do not allow the v4 socket to bind a v4mapped v6 address
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
tipc: fix a memory leak in tipc_nl_node_get_link()
vmxnet3: repair memory leak
net: Allow neigh contructor functions ability to modify the primary_key
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
ppp: unlock all_ppp_mutex before registering device
be2net: restore properly promisc mode after queues reconfiguration
ip6_gre: init dev->mtu and dev->hard_header_len correctly
gso: validate gso_type in GSO handlers
mlxsw: spectrum_router: Don't log an error on missing neighbor
tun: fix a memory leak for tfile->tx_array
flow_dissector: properly cap thoff field
perf/x86/amd/power: Do not load AMD power module on !AMD platforms
x86/microcode/intel: Extend BDW late-loading further with LLC size check
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
x86: bpf_jit: small optimization in emit_bpf_tail_call()
bpf: fix bpf_tail_call() x64 JIT
bpf: introduce BPF_JIT_ALWAYS_ON config
bpf: arsh is not supported in 32 bit alu thus reject it
bpf: avoid false sharing of map refcount with max_entries
bpf: fix divides by zero
bpf: fix 32-bit divide by zero
bpf: reject stores into ctx via st and xadd
nfsd: auth: Fix gid sorting when rootsquash enabled
Linux 4.9.79
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ upstream commit 290af86629 ]
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
A quote from goolge project zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."
To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
option that removes interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64
The start of JITed program is randomized and code page is marked as read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)
v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
It will be sent when the trees are merged back to net-next
Considered doing:
int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Changes in 4.9.75
tcp_bbr: reset full pipe detection on loss recovery undo
tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
x86/boot: Add early cmdline parsing for options with arguments
KAISER: Kernel Address Isolation
kaiser: merged update
kaiser: do not set _PAGE_NX on pgd_none
kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
kaiser: fix build and FIXME in alloc_ldt_struct()
kaiser: KAISER depends on SMP
kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER
kaiser: fix perf crashes
kaiser: ENOMEM if kaiser_pagetable_walk() NULL
kaiser: tidied up asm/kaiser.h somewhat
kaiser: tidied up kaiser_add/remove_mapping slightly
kaiser: align addition to x86/mm/Makefile
kaiser: cleanups while trying for gold link
kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
kaiser: delete KAISER_REAL_SWITCH option
kaiser: vmstat show NR_KAISERTABLE as nr_overhead
kaiser: enhanced by kernel and user PCIDs
kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
kaiser: PCID 0 for kernel and 128 for user
kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
kaiser: paranoid_entry pass cr3 need to paranoid_exit
kaiser: kaiser_remove_mapping() move along the pgd
kaiser: fix unlikely error in alloc_ldt_struct()
kaiser: add "nokaiser" boot option, using ALTERNATIVE
x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
x86/kaiser: Check boottime cmdline params
kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
kaiser: drop is_atomic arg to kaiser_pagetable_walk()
kaiser: asm/tlbflush.h handle noPGE at lower level
kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
x86/paravirt: Dont patch flush_tlb_single
x86/kaiser: Reenable PARAVIRT
kaiser: disabled on Xen PV
x86/kaiser: Move feature detection up
KPTI: Rename to PAGE_TABLE_ISOLATION
KPTI: Report when enabled
kaiser: Set _PAGE_NX only if supported
Linux 4.9.75
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Kaiser only needs to map one page of the stack; and
kernel/fork.c did not build on powerpc (no __PAGE_KERNEL).
It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack()
and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's
kaiser_add_mapping() and kaiser_remove_mapping(). And use
linux/kaiser.h in init/main.c to avoid the #ifdefs there.
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Memory allocated for initrd would not be reclaimed if initializing ramfs
was skipped.
Bug: 69901741
Test: "grep MemTotal /proc/meminfo" increases by a few MB on an Android
device with a/b boot.
Change-Id: Ifbe094d303ed12cfd6de6aa004a8a19137a2f58a
Signed-off-by: Nick Bray <ncbray@google.com>
Changes in 4.9.58
MIPS: Fix minimum alignment requirement of IRQ stack
Revert "bsg-lib: don't free job in bsg_prepare_job"
xen-netback: Use GFP_ATOMIC to allocate hash
locking/lockdep: Add nest_lock integrity test
watchdog: kempld: fix gcc-4.3 build
irqchip/crossbar: Fix incorrect type of local variables
initramfs: finish fput() before accessing any binary from initramfs
mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length
ALSA: hda: Add Geminilake HDMI codec ID
qed: Don't use attention PTT for configuring BW
mac80211: fix power saving clients handling in iwlwifi
net/mlx4_en: fix overflow in mlx4_en_init_timestamp()
staging: vchiq_2835_arm: Make cache-line-size a required DT property
netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.
iio: adc: xilinx: Fix error handling
f2fs: do SSR for data when there is enough free space
sched/fair: Update rq clock before changing a task's CPU affinity
Btrfs: send, fix failure to rename top level inode due to name collision
f2fs: do not wait for writeback in write_begin
md/linear: shutup lockdep warnning
sparc64: Migrate hvcons irq to panicked cpu
net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs
crypto: xts - Add ECB dependency
mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next
ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock
slub: do not merge cache if slub_debug contains a never-merge flag
scsi: scsi_dh_emc: return success in clariion_std_inquiry()
ASoC: mediatek: add I2C dependency for CS42XX8
drm/amdgpu: refuse to reserve io mem for split VRAM buffers
net: mvpp2: release reference to txq_cpu[] entry after unmapping
qede: Prevent index problems in loopback test
qed: Reserve doorbell BAR space for present CPUs
qed: Read queue state before releasing buffer
i2c: at91: ensure state is restored after suspending
ceph: don't update_dentry_lease unless we actually got one
ceph: fix bogus endianness change in ceph_ioctl_set_layout
ceph: clean up unsafe d_parent accesses in build_dentry_path
uapi: fix linux/rds.h userspace compilation errors
uapi: fix linux/mroute6.h userspace compilation errors
IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs
IB/hfi1: Allocate context data on memory node
target/iscsi: Fix unsolicited data seq_end_offset calculation
hrtimer: Catch invalid clockids again
nfsd/callback: Cleanup callback cred on shutdown
powerpc/perf: Add restrictions to PMC5 in power9 DD1
drm/nouveau/gr/gf100-: fix ccache error logging
regulator: core: Resolve supplies before disabling unused regulators
btmrvl: avoid double-disable_irq() race
EDAC, mce_amd: Print IPID and Syndrome on a separate line
cpufreq: CPPC: add ACPI_PROCESSOR dependency
usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets
Linux 4.9.58
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
PD#151104: mm: add pagetrace function
1. implement pagetrace as a driver; you can get information
of how many pages allocated by each function by read:
cat /proc/pagetrace
2. fix wrong statistics of free memory of each migrate type.
Change-Id: Ib2dff4bb5b3dd288ee188007352fc7b353eda100
Signed-off-by: tao zeng <tao.zeng@amlogic.com>
This is needed for AVB.
Bug: None
Test: Compiles.
Change-Id: I45b5d435652ab66ec07420ab17f2c7889f7e4d95
Signed-off-by: David Zeuthen <zeuthen@google.com>
We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
not right when CONFIG_NET is disabled and there is no socket interface:
warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)
I don't know what the correct solution for this is, but simply removing
the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
'if NET' section avoids the warning and does not produce other build
errors.
Fixes: 483c4933ea ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: Change-Id: Ib41ef78fba02eb9e592558ddbf06f9ec0aa337b6
("UPSTREAM: cgroup: Fix CGROUP_BPF config")
(cherry picked from commit 73b3514735)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
PD#144946: Makefile: lock kernel version without localversion appended
after this change, we will lock kernel version to 4.x.y
but we can still get out the exact commit version from dmesg.
1) uname -r : 4.9.27
2) dmesg |grep "Linux version"
[ 0.000000@0] Linux version 4.9.27-638106-g5de5d7a ....
Change-Id: I35072b399af403d3a1737300a8786cf36063c668
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
Cherry-pick from commit 483c4933ea
CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
enabled, making it rather challenging to turn CGROUP_BPF on.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 30950746
Change-Id: Ib41ef78fba02eb9e592558ddbf06f9ec0aa337b6
Cherry-pick from commit 3007098494
This patch adds two sets of eBPF program pointers to struct cgroup.
One for such that are directly pinned to a cgroup, and one for such
that are effective for it.
To illustrate the logic behind that, assume the following example
cgroup hierarchy.
A - B - C
\ D - E
If only B has a program attached, it will be effective for B, C, D
and E. If D then attaches a program itself, that will be effective for
both D and E, and the program in B will only affect B and C. Only one
program of a given type is effective for a cgroup.
Attaching and detaching programs will be done through the bpf(2)
syscall. For now, ingress and egress inet socket filtering are the
only supported use-cases.
Signed-off-by: Daniel Mack <daniel@zonque.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 30950746
Change-Id: I3df35d8d3b1261503f9b5bcd90b18c9358f1ac28
Changes in 4.9.22:
ppdev: check before attaching port
ppdev: fix registering same device name
drm/vmwgfx: Type-check lookups of fence objects
drm/vmwgfx: NULL pointer dereference in vmw_surface_define_ioctl()
drm/vmwgfx: avoid calling vzalloc with a 0 size in vmw_get_cap_3d_ioctl()
drm/ttm, drm/vmwgfx: Relax permission checking when opening surfaces
drm/vmwgfx: Remove getparam error message
drm/vmwgfx: fix integer overflow in vmw_surface_define_ioctl()
sysfs: be careful of error returns from ops->show()
staging: android: ashmem: lseek failed due to no FMODE_LSEEK.
arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm
arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region
kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
iio: bmg160: reset chip when probing
arm64: mm: unaligned access by user-land should be received as SIGBUS
cfg80211: check rdev resume callback only for registered wiphy
Reset TreeId to zero on SMB2 TREE_CONNECT
mm/page_alloc.c: fix print order in show_free_areas()
ptrace: fix PTRACE_LISTEN race corrupting task->state
dm verity fec: limit error correction recursion
dm verity fec: fix bufio leaks
ACPI / gpio: do not fall back to parsing _CRS when we get a deferral
Kbuild: use cc-disable-warning consistently for maybe-uninitialized
orangefs: move features validation to fix filesystem hang
xfs: Honor FALLOC_FL_KEEP_SIZE when punching ends of files
ring-buffer: Fix return value check in test_ringbuffer()
mac80211: unconditionally start new netdev queues with iTXQ support
brcmfmac: use local iftype avoiding use-after-free of virtual interface
metag/usercopy: Drop unused macros
metag/usercopy: Fix alignment error checking
metag/usercopy: Add early abort to copy_to_user
metag/usercopy: Zero rest of buffer from copy_from_user
metag/usercopy: Set flags before ADDZ
metag/usercopy: Fix src fixup in from user rapf loops
metag/usercopy: Add missing fixups
powerpc: Disable HFSCR[TM] if TM is not supported
powerpc/mm: Add missing global TLB invalidate if cxl is active
powerpc/64: Fix flush_(d|i)cache_range() called from modules
powerpc: Don't try to fix up misaligned load-with-reservation instructions
powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable()
dm raid: fix NULL pointer dereference for raid1 without bitmap
nios2: reserve boot memory for device tree
xtensa: make __pa work with uncached KSEG addresses
s390/decompressor: fix initrd corruption caused by bss clear
s390/uaccess: get_user() should zero on failure (again)
MIPS: Force o32 fp64 support on 32bit MIPS64r6 kernels
MIPS: ralink: Fix typos in rt3883 pinctrl
MIPS: End spinlocks with .insn
MIPS: Lantiq: fix missing xbar kernel panic
MIPS: Check TLB before handle_ri_rdhwr() for Loongson-3
MIPS: Add MIPS_CPU_FTLB for Loongson-3A R2
MIPS: Flush wrong invalid FTLB entry for huge page
MIPS: c-r4k: Fix Loongson-3's vcache/scache waysize calculation
Documentation: stable-kernel-rules: fix stable-tag format
mm/mempolicy.c: fix error handling in set_mempolicy and mbind.
random: use chacha20 for get_random_int/long
drm/sun4i: tcon: Move SoC specific quirks to a DT matched data structure
drm/sun4i: Add compatible strings for A31/A31s display pipelines
drm/sun4i: Add compatible string for A31/A31s TCON (timing controller)
clk: lpc32xx: add a quirk for PWM and MS clock dividers
HID: usbhid: Add quirks for Mayflash/Dragonrise GameCube and PS3 adapters
HID: i2c-hid: add a simple quirk to fix device defects
usb: dwc3: gadget: delay unmap of bounced requests
ASoC: Intel: bytct_rt5640: change default capture settings
arm64: dts: hisi: fix hip06 sas am-max-trans quirk
net/mlx4_core: Use device ID defines
clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend
scsi: ufs: introduce UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk
HID: sensor-hub add quirk for Microsoft Surface 3
HID: sensor-hub: add quirk for Microchip MM7150
HID: multitouch: enable the Surface 3 Type Cover to report multitouch data
HID: multitouch: do not retrieve all reports for all devices
mmc: sdhci-msm: Enable few quirks
scsi: ufs: ensure that host pa_tactivate is higher than device
svcauth_gss: Close connection when dropping an incoming message
x86/intel_idle: Add CPU model 0x4a (Atom Z34xx series)
arm64: PCI: Manage controller-specific data on per-controller basis
arm64: PCI: Add local struct device pointers
PCI: thunder-pem: Factor out resource lookup
scsi: ufs: add quirk to increase host PA_SaveConfigTime
ALSA: usb-audio: add implicit fb quirk for Axe-Fx II
PCI: Expand "VPD access disabled" quirk message
ALSA: usb-audio: Add native DSD support for TEAC 501/503 DAC
platform/x86: acer-wmi: Only supports AMW0_GUID1 on acer family
nvme: simplify stripe quirk
ACPI / sysfs: Provide quirk mechanism to prevent GPE flooding
HID: usbhid: Add quirk for the Futaba TOSD-5711BB VFD
HID: usbhid: Add quirk for Mayflash/Dragonrise DolphinBar.
drm/edid: constify edid quirk list
drm/i915: fix INTEL_BDW_IDS definition
drm/i915: more .is_mobile cleanups for BDW
drm/i915: actually drive the BDW reserved IDs
ASoC: Intel: bytcr_rt5640: quirks for Insyde devices
scsi: ufs: introduce a new ufshcd_statea UFSHCD_STATE_EH_SCHEDULED
scsi: ufs: issue link starup 2 times if device isn't active
usb: chipidea: msm: Rely on core to override AHBBURST
serial: 8250_omap: Add OMAP_DMA_TX_KICK quirk for AM437x
Input: gpio_keys - add support for GPIO descriptors
ARM: davinci: PM: support da8xx DT platforms
usb: xhci: add quirk flag for broken PED bits
usb: host: xhci-plat: enable BROKEN_PED quirk if platform requested
usb: dwc3: host: pass quirk-broken-port-ped property for known broken revisions
drm/mga: remove device_is_agp callback
ARM: dts: STiH407-family: set snps,dis_u3_susphy_quirk
PCI: Add ACS quirk for Intel Union Point
sata: ahci-da850: implement a workaround for the softreset quirk
ACPI / button: Change default behavior to lid_init_state=open
ASoC: rt5670: Add missing 10EC5072 ACPI ID
ASoC: codecs: rt5670: add quirk for Lenovo Thinkpad 10
ASoC: Intel: Baytrail: add quirk for Lenovo Thinkpad 10
ASoC: Intel: cht_bsw_rt5645: harden ACPI device detection
ASoC: Intel: cht_bsw_rt5645: add Baytrail MCLK support
ACPI: save NVS memory for Lenovo G50-45
ASoC: sun4i-i2s: Add quirks to handle a31 compatible
HID: wacom: don't apply generic settings to old devices
arm: kernel: Add SMC structure parameter
firmware: qcom: scm: Fix interrupted SCM calls
drm/msm/adreno: move function declarations to header file
ARM: smccc: Update HVC comment to describe new quirk parameter
PCI: Add Broadcom Northstar2 PAXC quirk for device class and MPSS
PCI: Disable MSI for HiSilicon Hip06/Hip07 Root Ports
mmc: sdhci-of-esdhc: remove default broken-cd for ARM
PCI: Sort the list of devices with D3 delay quirk by ID
PCI: Add ACS quirk for Qualcomm QDF2400 and QDF2432
watchdog: s3c2410: Fix infinite interrupt in soft mode
platform/x86: asus-wmi: Set specified XUSB2PR value for X550LB
platform/x86: asus-wmi: Detect quirk_no_rfkill from the DSDT
x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk
x86/reboot/quirks: Add ASUS EeeBook X205TA/W reboot quirk
usb-storage: Add ignore-residue quirk for Initio INIC-3619
x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk
Linux 4.9.22
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit f5b98461cb upstream.
Now that our crng uses chacha20, we can rely on its speedy
characteristics for replacing MD5, while simultaneously achieving a
higher security guarantee. Before the idea was to use these functions if
you wanted random integers that aren't stupidly insecure but aren't
necessarily secure either, a vague gray zone, that hopefully was "good
enough" for its users. With chacha20, we can strengthen this claim,
since either we're using an rdrand-like instruction, or we're using the
same crng as /dev/urandom. And it's faster than what was before.
We could have chosen to replace this with a SipHash-derived function,
which might be slightly faster, but at the cost of having yet another
RNG construction in the kernel. By moving to chacha20, we have a single
RNG to analyze and verify, and we also already get good performance
improvements on all platforms.
Implementation-wise, rather than use a generic buffer for both
get_random_int/long and memcpy based on the size needs, we use a
specific buffer for 32-bit reads and for 64-bit reads. This way, we're
guaranteed to always have aligned accesses on all platforms. While
slightly more verbose in C, the assembly this generates is a lot
simpler than otherwise.
Finally, on 32-bit platforms where longs and ints are the same size,
we simply alias get_random_int to get_random_long.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ENERGY_AWARE sched feature flag cannot be set unless
CONFIG_SCHED_DEBUG is enabled.
So this patch allows the flag to default to true at build time
if the config is set.
Change-Id: I8835a571fdb7a8f8ee6a54af1e11a69f3b5ce8e6
Signed-off-by: John Stultz <john.stultz@linaro.org>
use a window based view of time in order to track task
demand and CPU utilization in the scheduler.
Window Assisted Load Tracking (WALT) implementation credits:
Srivatsa Vaddagiri, Steve Muckle, Syed Rameez Mustafa, Joonwoo Park,
Pavan Kumar Kondeti, Olav Haugan
2016-03-06: Integration with EAS/refactoring by Vikram Mulukutla
and Todd Kjos
Change-Id: I21408236836625d4e7d7de1843d20ed5ff36c708
Includes fixes for issues:
eas/walt: Use walt_ktime_clock() instead of ktime_get_ns() to avoid a
race resulting in watchdog resets
BUG: 29353986
Change-Id: Ic1820e22a136f7c7ebd6f42e15f14d470f6bbbdb
Handle walt accounting anomoly during resume
During resume, there is a corner case where on wakeup, a task's
prev_runnable_sum can go negative. This is a workaround that
fixes the condition and warns (instead of crashing).
BUG: 29464099
Change-Id: I173e7874324b31a3584435530281708145773508
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andres Oportus <andresoportus@google.com>
Currently the build for a single-core (e.g. user-mode) Linux is broken
and this configuration is required (at least) to run some network tests.
The main issues for the current code support on single-core systems are:
1. {se,rq}::sched_avg is not available nor maintained for !SMP systems
This means that load and utilisation signals are NOT available in single
core systems. All the EAS code depends on these signals.
2. sched_group_energy is also SMP dependant. Again this means that all the
EAS setup and preparation code (energyn model initialization) has to be
properly guarded/disabled for !SMP systems.
3. SchedFreq depends on utilization signal, which is not available on
!SMP systems.
4. SchedTune is useless on unicore systems if SchedFreq is not available.
5. WALT machinery is not required on single-core systems.
This patch addresses all these issues by enforcing some constraints for
single-core systems:
a) WALT, SchedTune and SchedTune are now dependant on SMP
b) The default governor for !SMP systems is INTERACTIVE
c) The energy model initialisation/build functions are
d) Other minor code re-arrangements and CONFIG_SMP guarding to enable
single core builds.
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
To support task performance boosting, the usage of a single knob has the
advantage to be a simple solution, both from the implementation and the
usability standpoint. However, on a real system it can be difficult to
identify a single value for the knob which fits the needs of multiple
different tasks. For example, some kernel threads and/or user-space
background services should be better managed the "standard" way while we
still want to be able to boost the performance of specific workloads.
In order to improve the flexibility of the task boosting mechanism this
patch is the first of a small series which extends the previous
implementation to introduce a "per task group" support.
This first patch introduces just the basic CGroups support, a new
"schedtune" CGroups controller is added which allows to configure
different boost value for different groups of tasks.
To keep the implementation simple but still effective for a boosting
strategy, the new controller:
1. allows only a two layer hierarchy
2. supports only a limited number of boost groups
A two layer hierarchy allows to place each task either:
a) in the root control group
thus being subject to a system-wide boosting value
b) in a child of the root group
thus being subject to the specific boost value defined by that
"boost group"
The limited number of "boost groups" supported is mainly motivated by
the observation that in a real system it could be useful to have only
few classes of tasks which deserve different treatment.
For example, background vs foreground or interactive vs low-priority.
As an additional benefit, a limited number of boost groups allows also
to have a simpler implementation especially for the code required to
compute the boost value for CPUs which have runnable tasks belonging to
different boost groups.
cc: Tejun Heo <tj@kernel.org>
cc: Li Zefan <lizefan@huawei.com>
cc: Johannes Weiner <hannes@cmpxchg.org>
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
The current (CFS) scheduler implementation does not allow "to boost"
tasks performance by running them at a higher OPP compared to the
minimum required to meet their workload demands.
To support tasks performance boosting the scheduler should provide a
"knob" which allows to tune how much the system is going to be optimised
for energy efficiency vs performance.
This patch is the first of a series which provides a simple interface to
define a tuning knob. One system-wide "boost" tunable is exposed via:
/proc/sys/kernel/sched_cfs_boost
which can be configured in the range [0..100], to define a percentage
where:
- 0% boost requires to operate in "standard" mode by scheduling
tasks at the minimum capacities required by the workload demand
- 100% boost requires to push at maximum the task performances,
"regardless" of the incurred energy consumption
A boost value in between these two boundaries is used to bias the
power/performance trade-off, the higher the boost value the more the
scheduler is biased toward performance boosting instead of energy
efficiency.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Andres Oportus <andresoportus@google.com>
When candidate is the last parameter, candidate_end points to the '\0'
character and not the DM_FIELD_SEP character. In such a situation, we
should not move the candidate_end pointer one character backward.
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
This is a wrap-up of three patches pending upstream approval.
I'm bundling them because they are interdependent, and it'll be
easier to drop it on rebase later.
1. dm: allow a dm-fs-style device to be shared via dm-ioctl
Integrates feedback from Alisdair, Mike, and Kiyoshi.
Two main changes occur here:
- One function is added which allows for a programmatically created
mapped device to be inserted into the dm-ioctl hash table. This binds
the device to a name and, optional, uuid which is needed by udev and
allows for userspace management of the mapped device.
- dm_table_complete() was extended to handle all of the final
functional changes required for the table to be operational once
called.
2. init: boot to device-mapper targets without an initr*
Add a dm= kernel parameter modeled after the md= parameter from
do_mounts_md. It allows for device-mapper targets to be configured at
boot time for use early in the boot process (as the root device or
otherwise). It also replaces /dev/XXX calls with major:minor opportunistically.
The format is dm="name uuid ro,table line 1,table line 2,...". The
parser expects the comma to be safe to use as a newline substitute but,
otherwise, uses the normal separator of space. Some attempt has been
made to make it forgiving of additional spaces (using skip_spaces()).
A mapped device created during boot will be assigned a minor of 0 and
may be access via /dev/dm-0.
An example dm-linear root with no uuid may look like:
root=/dev/dm-0 dm="lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dv/ubdc 0"
Once udev is started, /dev/dm-0 will become /dev/mapper/lroot.
Older upstream threads:
http://marc.info/?l=dm-devel&m=127429492521964&w=2http://marc.info/?l=dm-devel&m=127429499422096&w=2http://marc.info/?l=dm-devel&m=127429493922000&w=2
Latest upstream threads:
https://patchwork.kernel.org/patch/104859/https://patchwork.kernel.org/patch/104860/https://patchwork.kernel.org/patch/104861/
Bug: 27175947
Signed-off-by: Will Drewry <wad@chromium.org>
Review URL: http://codereview.chromium.org/2020011
Change-Id: I92bd53432a11241228d2e5ac89a3b20d19b05a31
Add a skip_initramfs option to allow choosing whether to boot using
the initramfs or not at runtime.
Change-Id: If30428fa748c1d4d3d7b9d97c1f781de5e4558c3
Signed-off-by: Rom Lemarchand <romlem@google.com>
This enables CONFIG_MODVERSIONS again, but allows for missing symbol CRC
information in order to work around the issue that newer binutils
versions seem to occasionally drop the CRC on the floor. binutils 2.26
seems to work fine, while binutils 2.27 seems to break MODVERSIONS of
symbols that have been defined in assembler files.
[ We've had random missing CRC's before - it may be an old problem that
just is now reliably triggered with the weak asm symbols and a new
version of binutils ]
Some day I really do want to remove MODVERSIONS entirely. Sadly, today
does not appear to be that day: Debian people apparently do want the
option to enable MODVERSIONS to make it easier to have external modules
across kernel versions, and this seems to be a fairly minimal fix for
the annoying problem.
Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series,
and quite frankly, nobody has cared very deeply. We absolutely know how
to fix it, and it's not _complicated_, but it's not exactly pretty
either.
This oneliner fixes it without the ugliness, and allows for further
future cleanups.
"We've secretly replaced their regular MODVERSIONS with nothing at
all, let's see if they notice"
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Otherwise each individual rotator char would be printed in a new line:
(...)
[ 0.642350] -
[ 0.644374] |
[ 0.646367] -
(...)
Signed-off-by: Nicolas Schichan <nicolas.schichan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull gcc plugins update from Kees Cook:
"This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot
time as possible, hoping to capitalize on any possible variation in
CPU operation (due to runtime data differences, hardware differences,
SMP ordering, thermal timing variation, cache behavior, etc).
At the very least, this plugin is a much more comprehensive example
for how to manipulate kernel code using the gcc plugin internals"
* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Mark functions with __latent_entropy
gcc-plugins: Add latent_entropy plugin
Pull kbuild updates from Michal Marek:
- EXPORT_SYMBOL for asm source by Al Viro.
This does bring a regression, because genksyms no longer generates
checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
working on a patch to fix this.
Plus, we are talking about functions like strcpy(), which rarely
change prototypes.
- Fixes for PPC fallout of the above by Stephen Rothwell and Nick
Piggin
- fixdep speedup by Alexey Dobriyan.
- preparatory work by Nick Piggin to allow architectures to build with
-ffunction-sections, -fdata-sections and --gc-sections
- CONFIG_THIN_ARCHIVES support by Stephen Rothwell
- fix for filenames with colons in the initramfs source by me.
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
initramfs: Escape colons in depfile
ppc: there is no clear_pages to export
powerpc/64: whitelist unresolved modversions CRCs
kbuild: -ffunction-sections fix for archs with conflicting sections
kbuild: add arch specific post-link Makefile
kbuild: allow archs to select link dead code/data elimination
kbuild: allow architectures to use thin archives instead of ld -r
kbuild: Regenerate genksyms lexer
kbuild: genksyms fix for typeof handling
fixdep: faster CONFIG_ search
ia64: move exports to definitions
sparc32: debride memcpy.S a bit
[sparc] unify 32bit and 64bit string.h
sparc: move exports to definitions
ppc: move exports to definitions
arm: move exports to definitions
s390: move exports to definitions
m68k: move exports to definitions
alpha: move exports to actual definitions
x86: move exports to actual definitions
...
Relay avoids calling wake_up_interruptible() for doing the wakeup of
readers/consumers, waiting for the generation of new data, from the
context of a process which produced the data. This is apparently done to
prevent the possibility of a deadlock in case Scheduler itself is is
generating data for the relay, after acquiring rq->lock.
The following patch used a timer (to be scheduled at next jiffy), for
delegating the wakeup to another context.
commit 7c9cb38302
Author: Tom Zanussi <zanussi@comcast.net>
Date: Wed May 9 02:34:01 2007 -0700
relay: use plain timer instead of delayed work
relay doesn't need to use schedule_delayed_work() for waking readers
when a simple timer will do.
Scheduling a plain timer, at next jiffies boundary, to do the wakeup
causes a significant wakeup latency for the Userspace client, which makes
relay less suitable for the high-frequency low-payload use cases where the
data gets generated at a very high rate, like multiple sub buffers getting
filled within a milli second. Moreover the timer is re-scheduled on every
newly produced sub buffer so the timer keeps getting pushed out if sub
buffers are filled in a very quick succession (less than a jiffy gap
between filling of 2 sub buffers). As a result relay runs out of sub
buffers to store the new data.
By using irq_work it is ensured that wakeup of userspace client, blocked
in the poll call, is done at earliest (through self IPI or next timer
tick) enabling it to always consume the data in time. Also this makes
relay consistent with printk & ring buffers (trace), as they too use
irq_work for deferred wake up of readers.
[arnd@arndb.de: select CONFIG_IRQ_WORK]
Link: http://lkml.kernel.org/r/20160912154035.3222156-1-arnd@arndb.de
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1472906487-1559-1-git-send-email-akash.goel@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot time as
possible, hoping to capitalize on any possible variation in CPU operation
(due to runtime data differences, hardware differences, SMP ordering,
thermal timing variation, cache behavior, etc).
At the very least, this plugin is a much more comprehensive example for
how to manipulate kernel code using the gcc plugin internals.
The need for very-early boot entropy tends to be very architecture or
system design specific, so this plugin is more suited for those sorts
of special cases. The existing kernel RNG already attempts to extract
entropy from reliable runtime variation, but this plugin takes the idea to
a logical extreme by permuting a global variable based on any variation
in code execution (e.g. a different value (and permutation function)
is used to permute the global based on loop count, case statement,
if/then/else branching, etc).
To do this, the plugin starts by inserting a local variable in every
marked function. The plugin then adds logic so that the value of this
variable is modified by randomly chosen operations (add, xor and rol) and
random values (gcc generates separate static values for each location at
compile time and also injects the stack pointer at runtime). The resulting
value depends on the control flow path (e.g., loops and branches taken).
Before the function returns, the plugin mixes this local variable into
the latent_entropy global variable. The value of this global variable
is added to the kernel entropy pool in do_one_initcall() and _do_fork(),
though it does not credit any bytes of entropy to the pool; the contents
of the global are just used to mix the pool.
Additionally, the plugin can pre-initialize arrays with build-time
random contents, so that two different kernel builds running on identical
hardware will not have the same starting values.
Signed-off-by: Emese Revfy <re.emese@gmail.com>
[kees: expanded commit message and code comments]
Signed-off-by: Kees Cook <keescook@chromium.org>
Pull parisc updates from Helge Deller:
"Changes include:
- Fix boot of 32bit SMP kernel (initial kernel mapping was too small)
- Added hardened usercopy checks
- Drop bootmem and switch to memblock and NO_BOOTMEM implementation
- Drop the BROKEN_RODATA config option (and thus remove the relevant
code from the generic headers and files because parisc was the last
architecture which used this config option)
- Improve segfault reporting by printing human readable error strings
- Various smaller changes, e.g. dwarf debug support for assembly
code, update comments regarding copy_user_page_asm, switch to
kmalloc_array()"
* 'parisc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
parisc: Drop bootmem and switch to memblock
parisc: Add hardened usercopy feature
parisc: Add cfi_startproc and cfi_endproc to assembly code
parisc: Move hpmc stack into page aligned bss section
parisc: Fix self-detected CPU stall warnings on Mako machines
parisc: Report trap type as human readable string
parisc: Update comment regarding implementation of copy_user_page_asm
parisc: Use kmalloc_array() in add_system_map_addresses()
parisc: Check return value of smp_boot_one_cpu()
parisc: Drop BROKEN_RODATA config option
PARISC was the only architecture which selected the BROKEN_RODATA config
option. Drop it and remove the special handling from init.h as well.
Signed-off-by: Helge Deller <deller@gmx.de>