linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-10 23:18:10 +09:00

Author	SHA1	Message	Date
Tao Huang	8d2d0b6a51	Merge tag 'lsk-v4.4-18.02-android' of git://git.linaro.org/kernel/linux-linaro-stable.git LSK 18.02 v4.4-android * tag 'lsk-v4.4-18.02-android': (131 commits) Linux 4.4.114 nfsd: auth: Fix gid sorting when rootsquash enabled net: tcp: close sock if net namespace is exiting flow_dissector: properly cap thoff field ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY net: Allow neigh contructor functions ability to modify the primary_key vmxnet3: repair memory leak sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf sctp: do not allow the v4 socket to bind a v4mapped v6 address r8169: fix memory corruption on retrieval of hardware statistics. pppoe: take ->needed_headroom of lower device into account on xmit net: qdisc_pkt_len_init() should be more robust tcp: __tcp_hdrlen() helper net: igmp: fix source address check for IGMPv3 reports lan78xx: Fix failure in USB Full Speed ipv6: ip6_make_skb() needs to clear cork.base.dst ipv6: fix udpv6 sendmsg crash caused by too small MTU ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state hrtimer: Reset hrtimer cpu base proper on CPU hotplug ...	2018-02-07 20:59:20 +08:00
Tao Huang	e7f55d159a	printk: add support show process information on printks Change-Id: I34cf76388ceb2e1f6b6417638c82bf774641ebac Signed-off-by: Tao Huang <huangtao@rock-chips.com>	2018-02-02 19:26:07 +08:00
Alex Shi	59e35359ec	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android	2018-02-01 12:02:38 +08:00
Alex Shi	a40f2a595a	Merge tag 'v4.4.114' into linux-linaro-lsk-v4.4 This is the 4.4.114 stable release	2018-02-01 12:02:34 +08:00
Thomas Gleixner	360496cab5	hrtimer: Reset hrtimer cpu base proper on CPU hotplug commit `d5421ea43d` upstream. The hrtimer interrupt code contains a hang detection and mitigation mechanism, which prevents that a long delayed hrtimer interrupt causes a continous retriggering of interrupts which prevent the system from making progress. If a hang is detected then the timer hardware is programmed with a certain delay into the future and a flag is set in the hrtimer cpu base which prevents newly enqueued timers from reprogramming the timer hardware prior to the chosen delay. The subsequent hrtimer interrupt after the delay clears the flag and resumes normal operation. If such a hang happens in the last hrtimer interrupt before a CPU is unplugged then the hang_detected flag is set and stays that way when the CPU is plugged in again. At that point the timer hardware is not armed and it cannot be armed because the hang_detected flag is still active, so nothing clears that flag. As a consequence the CPU does not receive hrtimer interrupts and no timers expire on that CPU which results in RCU stalls and other malfunctions. Clear the flag along with some other less critical members of the hrtimer cpu base to ensure starting from a clean state when a CPU is plugged in. Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the root cause of that hard to reproduce heisenbug. Once understood it's trivial and certainly justifies a brown paperbag. Fixes: `41d2e49493` ("hrtimer: Tune hrtimer_interrupt hang logic") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:06:12 +01:00
Thomas Gleixner	839003061a	timers: Plug locking race vs. timer migration commit `b831275a35` upstream. Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the timer flags. As a consequence the compiler is allowed to reload the flags between the initial check for TIMER_MIGRATION and the following timer base computation and the spin lock of the base. While this has not been observed (yet), we need to make sure that it never happens. Fixes: `0eeda71bc3` ("timer: Replace timer base by a cpu index") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:06:08 +01:00
Vegard Nossum	08b1cf4964	time: Avoid undefined behaviour in ktime_add_safe() commit `979515c564` upstream. I ran into this: ================================================================================ UBSAN: Undefined behaviour in kernel/time/hrtimer.c:310:16 signed integer overflow: 9223372036854775807 + 50000 cannot be represented in type 'long long int' CPU: 2 PID: 4798 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #91 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 0000000000000000 ffff88010ce6fb88 ffffffff82344740 0000000041b58ab3 ffffffff84f97a20 ffffffff82344694 ffff88010ce6fbb0 ffff88010ce6fb60 000000000000c350 ffff88010ce6f968 dffffc0000000000 ffffffff857bc320 Call Trace: [<ffffffff82344740>] dump_stack+0xac/0xfc [<ffffffff82344694>] ? _atomic_dec_and_lock+0xc4/0xc4 [<ffffffff8242df78>] ubsan_epilogue+0xd/0x8a [<ffffffff8242e6b4>] handle_overflow+0x202/0x23d [<ffffffff8242e4b2>] ? val_to_string.constprop.6+0x11e/0x11e [<ffffffff8236df71>] ? timerqueue_add+0x151/0x410 [<ffffffff81485c48>] ? hrtimer_start_range_ns+0x3b8/0x1380 [<ffffffff81795631>] ? memset+0x31/0x40 [<ffffffff8242e6fd>] __ubsan_handle_add_overflow+0xe/0x10 [<ffffffff81488ac9>] hrtimer_nanosleep+0x5d9/0x790 [<ffffffff814884f0>] ? hrtimer_init_sleeper+0x80/0x80 [<ffffffff813a9ffb>] ? __might_sleep+0x5b/0x260 [<ffffffff8148be10>] common_nsleep+0x20/0x30 [<ffffffff814906c7>] SyS_clock_nanosleep+0x197/0x210 [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150 [<ffffffff823c7113>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8162ef60>] ? __context_tracking_exit.part.3+0x30/0x1b0 [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150 [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0 [<ffffffff845f85aa>] entry_SYSCALL64_slow_path+0x25/0x25 ================================================================================ Add a new ktime_add_unsafe() helper which doesn't check for overflow, but doesn't throw a UBSAN warning when it does overflow either. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:06:08 +01:00
Daniel Bristot de Oliveira	1d00e3d9b7	sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks commit `3effcb4247` upstream. We have been facing some problems with self-suspending constrained deadline tasks. The main reason is that the original CBS was not designed for such sort of tasks. One problem reported by Xunlei Pang takes place when a task suspends, and then is awakened before the deadline, but so close to the deadline that its remaining runtime can cause the task to have an absolute density higher than allowed. In such situation, the original CBS assumes that the task is facing an early activation, and so it replenishes the task and set another deadline, one deadline in the future. This rule works fine for implicit deadline tasks. Moreover, it allows the system to adapt the period of a task in which the external event source suffered from a clock drift. However, this opens the window for bandwidth leakage for constrained deadline tasks. For instance, a task with the following parameters: runtime = 5 ms deadline = 7 ms [density] = 5 / 7 = 0.71 period = 1000 ms If the task runs for 1 ms, and then suspends for another 1ms, it will be awakened with the following parameters: remaining runtime = 4 laxity = 5 presenting a absolute density of 4 / 5 = 0.80. In this case, the original CBS would assume the task had an early wakeup. Then, CBS will reset the runtime, and the absolute deadline will be postponed by one relative deadline, allowing the task to run. The problem is that, if the task runs this pattern forever, it will keep receiving bandwidth, being able to run 1ms every 2ms. Following this behavior, the task would be able to run 500 ms in 1 sec. Thus running more than the 5 ms / 1 sec the admission control allowed it to run. Trying to address the self-suspending case, Luca Abeni, Giuseppe Lipari, and Juri Lelli [1] revisited the CBS in order to deal with self-suspending tasks. In the new approach, rather than replenishing/postponing the absolute deadline, the revised wakeup rule adjusts the remaining runtime, reducing it to fit into the allowed density. A revised version of the idea is: At a given time t, the maximum absolute density of a task cannot be higher than its relative density, that is: runtime / (deadline - t) <= dl_runtime / dl_deadline Knowing the laxity of a task (deadline - t), it is possible to move it to the other side of the equality, thus enabling to define max remaining runtime a task can use within the absolute deadline, without over-running the allowed density: runtime = (dl_runtime / dl_deadline) * (deadline - t) For instance, in our previous example, the task could still run: runtime = ( 5 / 7 ) * 5 runtime = 3.57 ms Without causing damage for other deadline tasks. It is note worthy that the laxity cannot be negative because that would cause a negative runtime. Thus, this patch depends on the patch: `df8eac8caf` ("sched/deadline: Throttle a constrained deadline task activated after the deadline") Which throttles a constrained deadline task activated after the deadline. Finally, it is also possible to use the revised wakeup rule for all other tasks, but that would require some more discussions about pros and cons. [The main difference from the original commit is that the BW_SHIFT define was not present yet. As BW_SHIFT was introduced in a new feature, I just used the value (20), likewise we used to use before the #define. Other changes were required because of comments. - bistrot] Reported-by: Xunlei Pang <xpang@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> [peterz: replaced dl_is_constrained with dl_is_implicit] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Romulo Silva de Oliveira <romulo.deoliveira@ufsc.br> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Link: http://lkml.kernel.org/r/5c800ab3a74a168a84ee5f3f84d12a02e11383be.1495803804.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:06:07 +01:00
Alex Shi	293c379504	Merge remote-tracking branch 'lts/linux-4.4.y' into linux-linaro-lsk-v4.4 Conflicts: keep lsk used current_stack_pointer and arch_within_stack_frames in arch/x86/include/asm/thread_info.h	2018-01-31 13:33:20 +08:00
Tao Huang	640193f76b	Merge branch 'linux-linaro-lsk-v4.4-android' of git://git.linaro.org/kernel/linux-linaro-stable.git * linux-linaro-lsk-v4.4-android: (733 commits) LSK-ANDROID: memcg: Remove wrong ->attach callback LSK-ANDROID: arm64: mm: Fix __create_pgd_mapping() call ANDROID: sdcardfs: Move default_normal to superblock blkdev: Refactoring block io latency histogram codes FROMLIST: arm64: kpti: Fix the interaction between ASID switching and software PAN FROMLIST: arm64: Move post_ttbr_update_workaround to C code FROMLIST: arm64: mm: Rename post_ttbr0_update_workaround sched: EAS: Initialize push_task as NULL to avoid direct reference on out_unlock path fscrypt: updates on 4.15-rc4 ANDROID: uid_sys_stats: fix the comment BACKPORT: tee: indicate privileged dev in gen_caps BACKPORT: tee: optee: sync with new naming of interrupts BACKPORT: tee: tee_shm: Constify dma_buf_ops structures. BACKPORT: tee: optee: interruptible RPC sleep BACKPORT: tee: optee: add const to tee_driver_ops and tee_desc structures BACKPORT: tee.txt: standardize document format BACKPORT: tee: add forward declaration for struct device BACKPORT: tee: optee: fix uninitialized symbol 'parg' BACKPORT: tee: add ARM_SMCCC dependency BACKPORT: selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables ... Conflicts: arch/arm64/kernel/vdso.c drivers/usb/host/xhci-plat.c include/drm/drmP.h include/linux/kasan.h kernel/time/timekeeping.c mm/kasan/kasan.c security/selinux/nlmsgtab.c Also add this commit: `0bcdc0987c` ("time: Fix ktime_get_raw() incorrect base accumulation")	2018-01-26 19:26:47 +08:00
Steven Rostedt (VMware)	cf3625004e	tracing: Fix converting enum's from the map in trace_event_eval_update() commit `1ebe1eaf2f` upstream. Since enums do not get converted by the TRACE_EVENT macro into their values, the event format displaces the enum name and not the value. This breaks tools like perf and trace-cmd that need to interpret the raw binary data. To solve this, an enum map was created to convert these enums into their actual numbers on boot up. This is done by TRACE_EVENTS() adding a TRACE_DEFINE_ENUM() macro. Some enums were not being converted. This was caused by an optization that had a bug in it. All calls get checked against this enum map to see if it should be converted or not, and it compares the call's system to the system that the enum map was created under. If they match, then they call is processed. To cut down on the number of iterations needed to find the maps with a matching system, since calls and maps are grouped by system, when a match is made, the index into the map array is saved, so that the next call, if it belongs to the same system as the previous call, could start right at that array index and not have to scan all the previous arrays. The problem was, the saved index was used as the variable to know if this is a call in a new system or not. If the index was zero, it was assumed that the call is in a new system and would keep incrementing the saved index until it found a matching system. The issue arises when the first matching system was at index zero. The next map, if it belonged to the same system, would then think it was the first match and increment the index to one. If the next call belong to the same system, it would begin its search of the maps off by one, and miss the first enum that should be converted. This left a single enum not converted properly. Also add a comment to describe exactly what that index was for. It took me a bit too long to figure out what I was thinking when debugging this issue. Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com Fixes: `0c564a538a` ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values") Reported-by: Chuck Lever <chuck.lever@oracle.com> Teste-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:50:16 +01:00
Xunlei Pang	8bd58b61d2	sched/deadline: Zero out positive runtime after throttling constrained tasks commit `ae83b56a56` upstream. When a contrained task is throttled by dl_check_constrained_dl(), it may carry the remaining positive runtime, as a result when dl_task_timer() fires and calls replenish_dl_entity(), it will not be replenished correctly due to the positive dl_se->runtime. This patch assigns its runtime to 0 if positive after throttling. Signed-off-by: Xunlei Pang <xlpang@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `df8eac8caf` ("sched/deadline: Throttle a constrained deadline task activated after the deadline) Link: http://lkml.kernel.org/r/1494421417-27550-1-git-send-email-xlpang@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:50:15 +01:00
Li Jinyue	58c82be944	futex: Prevent overflow by strengthen input validation commit `fbe0e839d1` upstream. UBSAN reports signed integer overflow in kernel/futex.c: UBSAN: Undefined behaviour in kernel/futex.c:2041:18 signed integer overflow: 0 - -2147483648 cannot be represented in type 'int' Add a sanity check to catch negative values of nr_wake and nr_requeue. Signed-off-by: Li Jinyue <lijinyue@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: dvhart@infradead.org Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:50:14 +01:00
Arnd Bergmann	e6c591e7a4	gcov: disable for COMPILE_TEST commit `cc62242079` upstream. Enabling gcov is counterproductive to compile testing: it significantly increases the kernel image size, compile time, and it produces lots of false positive "may be used uninitialized" warnings as the result of missed optimizations. This is in line with how UBSAN_SANITIZE_ALL and PROFILE_ALL_BRANCHES work, both of which have similar problems. With an ARM allmodconfig kernel, I see the build time drop from 283 minutes CPU time to 225 minutes, and the vmlinux size drops from 43MB to 26MB. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:50:10 +01:00
Ke Wang	204bbd5268	sched: EAS: Initialize push_task as NULL to avoid direct reference on out_unlock path After applying up-migrate patches(`dc626b2` sched: avoid pushing tasks to an offline CPU, `2da014c` sched: Extend active balance to accept 'push_task' argument), leaving EAS disabled and doing a stability test which includes some random cpu plugin/plugout. There are two types crashes happened as below: TYPE 1: [ 2072.653091] c1 ------------[ cut here ]------------ [ 2072.653133] c1 WARNING: CPU: 1 PID: 13 at kernel/fork.c:252 __put_task_struct+0x30/0x124() [ 2072.653173] c1 CPU: 1 PID: 13 Comm: migration/1 Tainted: G W O 4.4.83-01066-g04c5403-dirty #17 [ 2072.653215] c1 [<c011141c>] (unwind_backtrace) from [<c010ced8>] (show_stack+0x20/0x24) [ 2072.653235] c1 [<c010ced8>] (show_stack) from [<c043d7f8>] (dump_stack+0xa8/0xe0) [ 2072.653255] c1 [<c043d7f8>] (dump_stack) from [<c012be04>] (warn_slowpath_common+0x98/0xc4) [ 2072.653273] c1 [<c012be04>] (warn_slowpath_common) from [<c012beec>] (warn_slowpath_null+0x2c/0x34) [ 2072.653291] c1 [<c012beec>] (warn_slowpath_null) from [<c01293b4>] (__put_task_struct+0x30/0x124) [ 2072.653310] c1 [<c01293b4>] (__put_task_struct) from [<c0166964>] (active_load_balance_cpu_stop+0x22c/0x314) [ 2072.653331] c1 [<c0166964>] (active_load_balance_cpu_stop) from [<c01c2604>] (cpu_stopper_thread+0x90/0x144) [ 2072.653352] c1 [<c01c2604>] (cpu_stopper_thread) from [<c014d80c>] (smpboot_thread_fn+0x258/0x270) [ 2072.653370] c1 [<c014d80c>] (smpboot_thread_fn) from [<c0149ee4>] (kthread+0x118/0x12c) [ 2072.653388] c1 [<c0149ee4>] (kthread) from [<c0108310>] (ret_from_fork+0x14/0x24) [ 2072.653400] c1 ---[ end trace 49c3d154890763fc ]--- [ 2072.653418] c1 Unable to handle kernel NULL pointer dereference at virtual address 00000000 ... [ 2072.832804] c1 [<c01ba00c>] (put_css_set) from [<c01be870>] (cgroup_free+0x6c/0x78) [ 2072.832823] c1 [<c01be870>] (cgroup_free) from [<c01293f8>] (__put_task_struct+0x74/0x124) [ 2072.832844] c1 [<c01293f8>] (__put_task_struct) from [<c0166964>] (active_load_balance_cpu_stop+0x22c/0x314) [ 2072.832860] c1 [<c0166964>] (active_load_balance_cpu_stop) from [<c01c2604>] (cpu_stopper_thread+0x90/0x144) [ 2072.832879] c1 [<c01c2604>] (cpu_stopper_thread) from [<c014d80c>] (smpboot_thread_fn+0x258/0x270) [ 2072.832896] c1 [<c014d80c>] (smpboot_thread_fn) from [<c0149ee4>] (kthread+0x118/0x12c) [ 2072.832914] c1 [<c0149ee4>] (kthread) from [<c0108310>] (ret_from_fork+0x14/0x24) [ 2072.832930] c1 Code: f57ff05b f590f000 e3e02000 e3a03001 (e1941f9f) [ 2072.839208] c1 ---[ end trace 49c3d154890763fd ]--- TYPE 2: [ 214.742695] c1 ------------[ cut here ]------------ [ 214.742709] c1 kernel BUG at kernel/smpboot.c:136! [ 214.742718] c1 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [ 214.748785] c1 CPU: 1 PID: 18 Comm: migration/2 Tainted: G W O 4.4.83-00912-g370f62c #1 [ 214.748805] c1 task: ef2d9680 task.stack: ee862000 [ 214.748821] c1 PC is at smpboot_thread_fn+0x168/0x270 [ 214.748832] c1 LR is at smpboot_thread_fn+0xe4/0x270 ... [ 214.821339] c1 [<c014d71c>] (smpboot_thread_fn) from [<c0149ee4>] (kthread+0x118/0x12c) [ 214.821363] c1 [<c0149ee4>] (kthread) from [<c0108310>] (ret_from_fork+0x14/0x24) [ 214.821378] c1 Code: e5950000 e5943010 e1500003 0a000000 (e7f001f2) [ 214.827676] c1 ---[ end trace da87539f59bab8de ]--- For the first type crash, the root cause is the push_task pointer will be used without initialization on the out_lock path. And maybe cpu hotplug in/out make this happen more easily. For the second type crash, it hits 'BUG_ON(td->cpu != smp_processor_id());' in smpboot_thread_fn(). It seems that OOPS was caused by migration/2 which actually running on cpu1. And I haven't found what actually happened. However, after this fix, the second type crash seems gone too. Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>	2018-01-22 13:16:20 +05:30
Dmitry Vyukov	123b7a5eb3	UPSTREAM: kcov: fix comparison callback signature Fix a silly copy-paste bug. We truncated u32 args to u16. Link: http://lkml.kernel.org/r/20171207101134.107168-1-dvyukov@google.com Fixes: `ded97d2c2b` ("kcov: support comparison operands collection") Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: syzkaller@googlegroups.com Cc: Alexander Potapenko <glider@google.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Bug: 64145065 (cherry-picked from 8c0431ec452de79ef3fe998c1fbb1e3d3ac13ddd) Change-Id: Ic3872c33d03a456640dd6fdcce3b0795765dc1c0 Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Victor Chibotaru	4b4c3b015e	UPSTREAM: kcov: support comparison operands collection Enables kcov to collect comparison operands from instrumented code. This is done by using Clang's -fsanitize=trace-cmp instrumentation (currently not available for GCC). The comparison operands help a lot in fuzz testing. E.g. they are used in Syzkaller to cover the interiors of conditional statements with way less attempts and thus make previously unreachable code reachable. To allow separate collection of coverage and comparison operands two different work modes are implemented. Mode selection is now done via a KCOV_ENABLE ioctl call with corresponding argument value. Link: http://lkml.kernel.org/r/20171011095459.70721-1-glider@google.com Signed-off-by: Victor Chibotaru <tchibo@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `ded97d2c2b`) Change-Id: Iaba700a3f4786048be14a5e764ccabceae114eb7 Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Andrey Ryabinin	c00b857401	UPSTREAM: kcov: remove pointless current != NULL check __sanitizer_cov_trace_pc() is a hot code, so it's worth to remove pointless '!current' check. Current is never NULL. Link: http://lkml.kernel.org/r/20170929162221.32500-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `fcf4edac04`) Change-Id: Ia76e8c6cc0dc3fb796d8e8b92430fcf659b52eee Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Dmitry Vyukov	78c0349eeb	UPSTREAM: kcov: support compat processes Support compat processes in KCOV by providing compat_ioctl callback. Compat mode uses the same ioctl callback: we have 2 commands that do not use the argument and 1 that already checks that the arg does not overflow INT_MAX. This allows to use KCOV-guided fuzzing in compat processes. Link: http://lkml.kernel.org/r/20170823100553.55812-1-dvyukov@google.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `7483e5d420`) Change-Id:I74b62f01941091649ce6e88b3130e4ca4274a8de Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Dmitry Vyukov	42bc836aac	UPSTREAM: kcov: simplify interrupt check in_interrupt() semantics are confusing and wrong for most users as it also returns true when bh is disabled. Thus we open coded a proper check for interrupts in __sanitizer_cov_trace_pc() with a lengthy explanatory comment. Use the new in_task() predicate instead. Link: http://lkml.kernel.org/r/20170321091026.139655-1-dvyukov@google.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: James Morse <james.morse@arm.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `f61e869d51`) Change-Id: Ice260535314238c8f82ddc578ecaeea6177d28fc Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Alexander Popov	4f2df4ac63	UPSTREAM: kcov: make kcov work properly with KASLR enabled Subtract KASLR offset from the kernel addresses reported by kcov. Tested on x86_64 and AArch64 (Hikey LeMaker). Link: http://lkml.kernel.org/r/1481417456-28826-3-git-send-email-alex.popov@linux.com Signed-off-by: Alexander Popov <alex.popov@linux.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Jon Masters <jcm@redhat.com> Cc: David Daney <david.daney@cavium.com> Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nicolai Stange <nicstange@gmail.com> Cc: James Morse <james.morse@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `4983f0ab7f`) Change-Id: Ib19d7fb559f7db28314cd13a3e33e061d1dfdec9 Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Kefeng Wang	96932cfd9f	UPSTREAM: kcov: add more missing includes It is fragile that some definitions acquired via transitive dependencies, as shown in below: atomic_* (<linux/atomic.h>) ENOMEM/EN* (<linux/errno.h>) EXPORT_SYMBOL (<linux/export.h>) device_initcall (<linux/init.h>) preempt_* (<linux/preempt.h>) Include them to prevent possible issues. Link: http://lkml.kernel.org/r/1481163221-40170-1-git-send-email-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `db862358a4`) Change-Id: Ia529631d2072cc795c46ae0276e51592318cd40f Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Kefeng Wang	36b8316195	UPSTREAM: kcov: add missing #include <linux/sched.h> In __sanitizer_cov_trace_pc we use task_struct and fields within it, but as we haven't included <linux/sched.h>, it is not guaranteed to be defined. While we usually happen to acquire the definition through a transitive include, this is fragile (and hasn't been true in the past, causing issues with backports). Include <linux/sched.h> to avoid any fragility. [mark.rutland@arm.com: rewrote changelog] Link: http://lkml.kernel.org/r/1481007384-27529-1-git-send-email-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `166ad0e1e2`) Change-Id: Id5a06b927687ace8788f623fc91cc8305fde5f2d Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Andrey Konovalov	9779e195be	UPSTREAM: kcov: properly check if we are in an interrupt in_interrupt() returns a nonzero value when we are either in an interrupt or have bh disabled via local_bh_disable(). Since we are interested in only ignoring coverage from actual interrupts, do a proper check instead of just calling in_interrupt(). As a result of this change, kcov will start to collect coverage from within local_bh_disable()/local_bh_enable() sections. Link: http://lkml.kernel.org/r/1476115803-20712-1-git-send-email-andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Nicolai Stange <nicstange@gmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: James Morse <james.morse@arm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `b274c0bb39`) Change-Id: I91364ef699f9af0a57959caf65372a6559d7a6b0 Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Andrey Ryabinin	928e925573	UPSTREAM: kcov: don't profile branches in kcov Profiling 'if' statements in __sanitizer_cov_trace_pc() leads to unbound recursion and crash: __sanitizer_cov_trace_pc() -> ftrace_likely_update -> __sanitizer_cov_trace_pc() ... Define DISABLE_BRANCH_PROFILING to disable this tracer. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `36f05ae8bc`) Change-Id: I53cea027ba86b89016df3944374bdb119a2ca9dd Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
James Morse	318c10553b	UPSTREAM: kcov: don't trace the code coverage code Kcov causes the compiler to add a call to __sanitizer_cov_trace_pc() in every basic block. Ftrace patches in a call to _mcount() to each function it has annotated. Letting these mechanisms annotate each other is a bad thing. Break the loop by adding 'notrace' to __sanitizer_cov_trace_pc() so that ftrace won't try to patch this code. This patch lets arm64 with KCOV and STACK_TRACER boot. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `bdab42dfc9`) Change-Id: If7708ca761f81e0645d709b263e61493fc016e01 Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Dmitry Vyukov	8c8a5ee21a	BACKPORT: kernel: add kcov code coverage kcov provides code coverage collection for coverage-guided fuzzing (randomized testing). Coverage-guided fuzzing is a testing technique that uses coverage feedback to determine new interesting inputs to a system. A notable user-space example is AFL (http://lcamtuf.coredump.cx/afl/). However, this technique is not widely used for kernel testing due to missing compiler and kernel support. kcov does not aim to collect as much coverage as possible. It aims to collect more or less stable coverage that is function of syscall inputs. To achieve this goal it does not collect coverage in soft/hard interrupts and instrumentation of some inherently non-deterministic or non-interesting parts of kernel is disbled (e.g. scheduler, locking). Currently there is a single coverage collection mode (tracing), but the API anticipates additional collection modes. Initially I also implemented a second mode which exposes coverage in a fixed-size hash table of counters (what Quentin used in his original patch). I've dropped the second mode for simplicity. This patch adds the necessary support on kernel side. The complimentary compiler support was added in gcc revision 231296. We've used this support to build syzkaller system call fuzzer, which has found 90 kernel bugs in just 2 months: https://github.com/google/syzkaller/wiki/Found-Bugs We've also found 30+ bugs in our internal systems with syzkaller. Another (yet unexplored) direction where kcov coverage would greatly help is more traditional "blob mutation". For example, mounting a random blob as a filesystem, or receiving a random blob over wire. Why not gcov. Typical fuzzing loop looks as follows: (1) reset coverage, (2) execute a bit of code, (3) collect coverage, repeat. A typical coverage can be just a dozen of basic blocks (e.g. an invalid input). In such context gcov becomes prohibitively expensive as reset/collect coverage steps depend on total number of basic blocks/edges in program (in case of kernel it is about 2M). Cost of kcov depends only on number of executed basic blocks/edges. On top of that, kernel requires per-thread coverage because there are always background threads and unrelated processes that also produce coverage. With inlined gcov instrumentation per-thread coverage is not possible. kcov exposes kernel PCs and control flow to user-space which is insecure. But debugfs should not be mapped as user accessible. Based on a patch by Quentin Casasnovas. [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode'] [akpm@linux-foundation.org: unbreak allmodconfig] [akpm@linux-foundation.org: follow x86 Makefile layout standards] Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: syzkaller <syzkaller@googlegroups.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tavis Ormandy <taviso@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@google.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: David Drysdale <drysdale@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `5c9a8750a6`) Change-Id: I17b5e04f6e89b241924e78ec32ead79c38b860ce Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Alexander Potapenko	cc19018a12	UPSTREAM: arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 64145065 (cherry-picked from `be7635e728`) Change-Id: Ib321eb9c2b76ef4785cf3fd522169f524348bd9a Signed-off-by: Paul Lawrence <paullawrence@google.com>	2018-01-22 13:15:43 +05:30
Ke Wang	2cc4764f3f	sched: EAS/WALT: Don't take into account of running task's util For upmigrating misfit running task case, the currently running task's util has been counted into cpu_util(). Thus currently __cpu_overutilized() which add task's uitl twice is overestimated. Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>	2018-01-22 13:15:43 +05:30
Viresh Kumar	56c4ae908c	BACKPORT: schedutil: Reset cached freq if it is not in sync with next_freq 'cached_raw_freq' is used to get the next frequency quickly but should always be in sync with sg_policy->next_freq. There are cases where it is not and in such cases it should be reset to avoid switching to incorrect frequencies. Consider this case for example: - policy->cur is 1.2 GHz (Max) - New request comes for 780 MHz and we store that in cached_raw_freq. - Based on 780 MHz, we calculate the effective frequency as 800 MHz. - We then decide not to update the frequency as sugov_up_down_rate_limit() return true. - Here cached_raw_freq is 780 MHz and sg_policy->next_freq is 1.2 GHz. - Now if the utilization doesn't change in next request, then the next target frequency will still be 780 MHz and it will match with cached_raw_freq and so we will directly return 1.2 GHz instead of 800 MHz. BACKPORT of upstream commit `07458f6a51` ("cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq"). This also updates sugov_update_commit() for handling up/down tunables, which aren't present in mainline. Change-Id: I70bca2c5dfdb545a0471d1c9e4c5addb30ab5494 Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>	2018-01-22 13:15:43 +05:30
John Stultz	aafcb0f503	BACKPORT: time: Clean up CLOCK_MONOTONIC_RAW time handling (cherry pick from commit `fc6eead7c1`) Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW, remove the duplicitive tk->raw_time.tv_nsec, which can be stored in tk->tkr_raw.xtime_nsec (similarly to how its handled for monotonic time). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Mentz <danielmentz@google.com> Tested-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Bug: 20045882 Bug: 63737556 Change-Id: I243827d21b08703a09d2d2fe738a9258be224582	2018-01-22 13:15:43 +05:30
John Stultz	581a337227	BACKPORT: time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting (cherry pick from commit `3d88d56c58`) Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Daniel Mentz <danielmentz@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: "stable #4 . 8+" <stable@vger.kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Bug: 20045882 Bug: 63737556 Change-Id: I6c55dd7685f6bd212c6af9d09c527528e1dd5fa1	2018-01-22 13:15:43 +05:30
Amit Pundir	395ae9f4e6	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Conflicts: kernel/fork.c Conflict due to Kaiser implementation in LTS 4.4.110. net/ipv4/raw.c Minor conflict due to LTS commit `be27b620a8` ("net: ipv4: fix for a race condition in raw_sendmsg")	2018-01-22 11:50:22 +05:30
Alex Shi	13b48cd3ac	Merge tag 'v4.4.112' into linux-linaro-lsk-v4.4 This is the 4.4.112 stable release	2018-01-18 12:02:00 +08:00
Daniel Borkmann	095b0ba360	bpf, array: fix overflow in max_entries and undefined behavior in index_mask commit `bbeb6e4323` upstream. syzkaller tried to alloc a map with 0xfffffffd entries out of a userns, and thus unprivileged. With the recently added logic in `b2157399cc` ("bpf: prevent out-of-bounds speculation") we round this up to the next power of two value for max_entries for unprivileged such that we can apply proper masking into potentially zeroed out map slots. However, this will generate an index_mask of 0xffffffff, and therefore a + 1 will let this overflow into new max_entries of 0. This will pass allocation, etc, and later on map access we still enforce on the original attr->max_entries value which was 0xfffffffd, therefore triggering GPF all over the place. Thus bail out on overflow in such case. Moreover, on 32 bit archs roundup_pow_of_two() can also not be used, since fls_long(max_entries - 1) can result in 32 and 1UL << 32 in 32 bit space is undefined. Therefore, do this by hand in a 64 bit variable. This fixes all the issues triggered by syzkaller's reproducers. Fixes: `b2157399cc` ("bpf: prevent out-of-bounds speculation") Reported-by: syzbot+b0efb8e572d01bce1ae0@syzkaller.appspotmail.com Reported-by: syzbot+6c15e9744f75f2364773@syzkaller.appspotmail.com Reported-by: syzbot+d2f5524fb46fd3b312ee@syzkaller.appspotmail.com Reported-by: syzbot+61d23c95395cc90dbc2b@syzkaller.appspotmail.com Reported-by: syzbot+0d363c942452cca68c01@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:31 +01:00
Alexei Starovoitov	9a7fad4c0e	bpf: prevent out-of-bounds speculation commit `b2157399cc` upstream. Under speculation, CPUs may mis-predict branches in bounds checks. Thus, memory accesses under a bounds check may be speculated even if the bounds check fails, providing a primitive for building a side channel. To avoid leaking kernel data round up array-based maps and mask the index after bounds check, so speculated load with out of bounds index will load either valid value from the array or zero from the padded area. Unconditionally mask index for all array types even when max_entries are not rounded to power of 2 for root user. When map is created by unpriv user generate a sequence of bpf insns that includes AND operation to make sure that JITed code includes the same 'index & index_mask' operation. If prog_array map is created by unpriv user replace bpf_tail_call(ctx, map, index); with if (index >= max_entries) { index &= map->index_mask; bpf_tail_call(ctx, map, index); } (along with roundup to power 2) to prevent out-of-bounds speculation. There is secondary redundant 'if (index >= max_entries)' in the interpreter and in all JITs, but they can be optimized later if necessary. Other array-like maps (cpumap, devmap, sockmap, perf_event_array, cgroup_array) cannot be used by unpriv, so no changes there. That fixes bpf side of "Variant 1: bounds check bypass (CVE-2017-5753)" on all architectures with and without JIT. v2->v3: Daniel noticed that attack potentially can be crafted via syscall commands without loading the program, so add masking to those paths as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:31 +01:00
Alexei Starovoitov	648064515d	bpf: adjust insn_aux_data when patching insns commit `8041902dae` upstream. convert_ctx_accesses() replaces single bpf instruction with a set of instructions. Adjust corresponding insn_aux_data while patching. It's needed to make sure subsequent 'for(all insn)' loops have matching insn and insn_aux_data. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:31 +01:00
Alexei Starovoitov	19614eee06	bpf: refactor fixup_bpf_calls() commit `79741b3bde` upstream. reduce indent and make it iterate over instructions similar to convert_ctx_accesses(). Also convert hard BUG_ON into soft verifier error. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:30 +01:00
Alexei Starovoitov	14c7c55f45	bpf: move fixup_bpf_calls() function commit `e245c5c6a5` upstream. no functional change. move fixup_bpf_calls() to verifier.c it's being refactored in the next patch Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:30 +01:00
Jakub Kicinski	0748b80e43	bpf: don't (ab)use instructions to store state commit `3df126f35f` upstream. Storing state in reserved fields of instructions makes it impossible to run verifier on programs already marked as read-only. Allocate and use an array of per-instruction state instead. While touching the error path rename and move existing jump target. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:30 +01:00
Daniel Borkmann	087a92287d	bpf: add bpf_patch_insn_single helper commit `c237ee5eb3` upstream. Move the functionality to patch instructions out of the verifier code and into the core as the new bpf_patch_insn_single() helper will be needed later on for blinding as well. No changes in functionality. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:30 +01:00
Davidlohr Bueso	bd44e3f19d	locking/mutex: Allow next waiter lockless wakeup commit `1329ce6fbb` upstream. Make use of wake-queues and enable the wakeup to occur after releasing the wait_lock. This is similar to what we do with rtmutex top waiter, slightly shortening the critical region and allow other waiters to acquire the wait_lock sooner. In low contention cases it can also help the recently woken waiter to find the wait_lock available (fastpath) when it continues execution. Reviewed-by: Waiman Long <Waiman.Long@hpe.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: Jason Low <jason.low2@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Waiman Long <waiman.long@hpe.com> Cc: Will Deacon <Will.Deacon@arm.com> Link: http://lkml.kernel.org/r/20160125022343.GA3322@linux-uzut.site Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:27 +01:00
Jianyu Zhan	1920b8a6a6	futex: Replace barrier() in unqueue_me() with READ_ONCE() commit `29b75eb2d5` upstream. Commit `e91467ecd1` ("bug in futex unqueue_me") introduced a barrier() in unqueue_me() to prevent the compiler from rereading the lock pointer which might change after a check for NULL. Replace the barrier() with a READ_ONCE() for the following reasons: 1) READ_ONCE() is a weaker form of barrier() that affects only the specific load operation, while barrier() is a general compiler level memory barrier. READ_ONCE() was not available at the time when the barrier was added. 2) Aside of that READ_ONCE() is descriptive and self explainatory while a barrier without comment is not clear to the casual reader. No functional change. [ tglx: Massaged changelog ] Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Darren Hart <dvhart@linux.intel.com> Cc: dave@stgolabs.net Cc: peterz@infradead.org Cc: linux@rasmusvillemoes.dk Cc: akpm@linux-foundation.org Cc: fengguang.wu@intel.com Cc: bigeasy@linutronix.de Link: http://lkml.kernel.org/r/1457314344-5685-1-git-send-email-nasa4836@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:35:27 +01:00
Alex Shi	921884d92b	Merge tag 'v4.4.111' into linux-linaro-lsk-v4.4 This is the 4.4.111 stable release	2018-01-11 12:01:16 +08:00
Libor Pechacek	c819a67f7e	module: Issue warnings when tainting kernel commit `3205c36cf7` upstream. While most of the locations where a kernel taint bit is set are accompanied with a warning message, there are two which set their bits silently. If the tainting module gets unloaded later on, it is almost impossible to tell what was the reason for setting the flag. Signed-off-by: Libor Pechacek <lpechacek@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:27:14 +01:00
Miroslav Benes	7e35bc655e	module: keep percpu symbols in module's symtab commit `e022441851` upstream. Currently, percpu symbols from .data..percpu ELF section of a module are not copied over and stored in final symtab array of struct module. Consequently such symbol cannot be returned via kallsyms API (for example kallsyms_lookup_name). This can be especially confusing when the percpu symbol is exported. Only its __ksymtab et al. are present in its symtab. The culprit is in layout_and_allocate() function where SHF_ALLOC flag is dropped for .data..percpu section. There is in fact no need to copy the section to final struct module, because kernel module loader allocates extra percpu section by itself. Unfortunately only symbols from SHF_ALLOC sections are copied due to a check in is_core_symbol(). The patch changes is_core_symbol() function to copy over also percpu symbols (their st_shndx points to .data..percpu ELF section). We do it only if CONFIG_KALLSYMS_ALL is set to be consistent with the rest of the function (ELF section is SHF_ALLOC but !SHF_EXECINSTR). Finally elf_type() returns type 'a' for a percpu symbol because its address is absolute. Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:27:13 +01:00
Oleg Nesterov	5f1aa83c58	kernel/signal.c: remove the no longer needed SIGNAL_UNKILLABLE check in complete_signal() commit `426915796c` upstream. complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy the thread group, today this is wrong in many ways. If nothing else, fatal_signal_pending() should always imply that the whole thread group (except ->group_exit_task if it is not NULL) is killed, this check breaks the rule. After the previous changes we can rely on sig_task_ignored(); sig_fatal(sig) && SIGNAL_UNKILLABLE can only be true if we actually want to kill this task and sig == SIGKILL OR it is traced and debugger can intercept the signal. This should hopefully fix the problem reported by Dmitry. This test-case static int init(void arg) { for (;;) pause(); } int main(void) { char stack[16 1024]; for (;;) { int pid = clone(init, stack + sizeof(stack)/2, CLONE_NEWPID \| SIGCHLD, NULL); assert(pid > 0); assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0); assert(waitpid(-1, NULL, WSTOPPED) == pid); assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0); assert(syscall(__NR_tkill, pid, SIGKILL) == 0); assert(pid == wait(NULL)); } } triggers the WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING)) in task_participate_group_stop(). do_signal_stop()->signal_group_exit() checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending() checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING. And his should fix the minor security problem reported by Kyle, SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the task is the root of a pid namespace. Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Kyle Huey <me@kylehuey.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:27:11 +01:00
Oleg Nesterov	7a7f54f8e3	kernel/signal.c: protect the SIGNAL_UNKILLABLE tasks from !sig_kernel_only() signals commit `ac25385089` upstream. Change sig_task_ignored() to drop the SIG_DFL && !sig_kernel_only() signals even if force == T. This simplifies the next change and this matches the same check in get_signal() which will drop these signals anyway. Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:27:11 +01:00
Oleg Nesterov	be95f1308f	kernel/signal.c: protect the traced SIGNAL_UNKILLABLE tasks from SIGKILL commit `628c1bcba2` upstream. The comment in sig_ignored() says "Tracers may want to know about even ignored signals" but SIGKILL can not be reported to debugger and it is just wrong to return 0 in this case: SIGKILL should only kill the SIGNAL_UNKILLABLE task if it comes from the parent ns. Change sig_ignored() to ignore ->ptrace if sig == SIGKILL and rely on sig_task_ignored(). SISGTOP coming from within the namespace is not really right too but at least debugger can intercept it, and we can't drop it here because this will break "gdb -p 1": ptrace_attach() won't work. Perhaps we will add another ->ptrace check later, we will see. Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:27:10 +01:00
Thiago Rafael Becker	58330ec2fe	kernel: make groups_sort calling a responsibility group_info allocators commit `bdcf0a423e` upstream. In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:27:10 +01:00

1 2 3 4 5 ...

22419 Commits