Commit Graph

22457 Commits

Author SHA1 Message Date
Tao Huang
3430c68a33 Merge branch 'linux-linaro-lsk-v4.4-android' of git://git.linaro.org/kernel/linux-linaro-stable.git
* linux-linaro-lsk-v4.4-android: (660 commits)
  ANDROID: keychord: Check for write data size
  ANDROID: sdcardfs: Set num in extension_details during make_item
  ANDROID: sdcardfs: Hold i_mutex for i_size_write
  BACKPORT, FROMGIT: crypto: speck - add test vectors for Speck64-XTS
  BACKPORT, FROMGIT: crypto: speck - add test vectors for Speck128-XTS
  BACKPORT, FROMGIT: crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
  FROMGIT: crypto: speck - export common helpers
  BACKPORT, FROMGIT: crypto: speck - add support for the Speck block cipher
  UPSTREAM: ANDROID: binder: synchronize_rcu() when using POLLFREE.
  f2fs: updates on v4.16-rc1
  BACKPORT: tee: shm: Potential NULL dereference calling tee_shm_register()
  BACKPORT: tee: shm: don't put_page on null shm->pages
  BACKPORT: tee: shm: make function __tee_shm_alloc static
  BACKPORT: tee: optee: check type of registered shared memory
  BACKPORT: tee: add start argument to shm_register callback
  BACKPORT: tee: optee: fix header dependencies
  BACKPORT: tee: shm: inline tee_shm_get_id()
  BACKPORT: tee: use reference counting for tee_context
  BACKPORT: tee: optee: enable dynamic SHM support
  BACKPORT: tee: optee: add optee-specific shared pool implementation
  ...

Conflicts:
	drivers/irqchip/Kconfig
	drivers/media/i2c/tc35874x.c
	drivers/media/v4l2-core/v4l2-compat-ioctl32.c
	drivers/usb/gadget/function/f_fs.c
	fs/f2fs/node.c

Change-Id: Icecd73a515821b536fa3d81ea91b63d9b3699916
2018-03-09 19:10:14 +08:00
John Stultz
963fcc6a19 time: Fix ktime_get_raw() incorrect base accumulation
In comqit fc6eead7c1 ("time: Clean up CLOCK_MONOTONIC_RAW time
handling"), the following code got mistakenly added to the update of the
raw timekeeper:

 /* Update the monotonic raw base */
 seconds = tk->raw_sec;
 nsec = (u32)(tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift);
 tk->tkr_raw.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);

Which adds the raw_sec value and the shifted down raw xtime_nsec to the
base value.

But the read function adds the shifted down tk->tkr_raw.xtime_nsec value
another time, The result of this is that ktime_get_raw() users (which are
all internal users) see the raw time move faster then it should (the rate
at which can vary with the current size of tkr_raw.xtime_nsec), which has
resulted in at least problems with graphics rendering performance.

The change tried to match the monotonic base update logic:

 seconds = (u64)(tk->xtime_sec + tk->wall_to_monotonic.tv_sec);
 nsec = (u32) tk->wall_to_monotonic.tv_nsec;
 tk->tkr_mono.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);

Which adds the wall_to_monotonic.tv_nsec value, but not the
tk->tkr_mono.xtime_nsec value to the base.

To fix this, simplify the tkr_raw.base accumulation to only accumulate the
raw_sec portion, and do not include the tkr_raw.xtime_nsec portion, which
will be added at read time.

Fixes: fc6eead7c1 ("time: Clean up CLOCK_MONOTONIC_RAW time handling")
Reported-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Daniel Mentz <danielmentz@google.com>
Link: http://lkml.kernel.org/r/1503701824-1645-1-git-send-email-john.stultz@linaro.org

(cherry picked from commit 0bcdc0987c)

Change-Id: I91d552bef42005d954f77963beafdca3cb6eb246
2018-03-05 21:56:13 +05:30
Chris Redpath
daa3eacc9e sched/fair: prevent possible infinite loop in sched_group_energy
There is a race between hotplug and energy_diff which might result
in endless loop in sched_group_energy. When this happens, the end
condition cannot be detected.

We can store how many CPUs we need to visit at the beginning, and
bail out of the energy calculation if we visit more cpus than expected.

Bug: 72311797 72202633
Change-Id: I8dda75468ee1570da4071cd8165ef5131a8205d8
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2018-03-05 21:56:13 +05:30
Roberto Pereira
3a73be5e84 UPSTREAM: config: android-base: disable CONFIG_NFSD and CONFIG_NFS_FS
Disable Network file system support.

Reviewed-at: https://android-review.googlesource.com/#/c/409559/

Signed-off-by: Roberto Pereira <rpere@google.com>
[AmitP: cherry-picked this change from Android common kernel
        and updated commit message]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 9e69dd0179)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Chenbo Feng
a29dcc8500 UPSTREAM: config: android-base: add CGROUP_BPF
Add CONFIG_CGROUP_BPF as a default configuration in android base config
since it is used to replace XT_QTAGUID in future.

Reviewed-at: https://android-review.googlesource.com/#/c/400374/

Signed-off-by: Chenbo Feng <fengc@google.com>
[AmitP: cherry-picked this change from Android common kernel]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 2edfe6be20)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Greg Kroah-Hartman
b07dfc66f0 UPSTREAM: config: android-base: add CONFIG_MODULES option
This adds CONFIG_MODULES, CONFIG_MODULE_UNLOAD, and CONFIG_MODVERSIONS
which are required by the O release.

Reviewed-at: https://android-review.googlesource.com/#/c/364554/

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[AmitP: cherry-picked this change from Android common kernel]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 2096e17063)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Greg Kroah-Hartman
dda4c3c6a9 UPSTREAM: config: android-base: add CONFIG_IKCONFIG option
This adds CONFIG_IKCONFIG and CONFIG_IKCONFIG_PROC options, which are a
requirement for the O release.

Reviewed-at: https://android-review.googlesource.com/#/c/364553/

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[AmitP: cherry-picked this change from Android common kernel]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 5b89db2fa5)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Max Shi
d34990f648 UPSTREAM: config: android-base: disable CONFIG_USELIB and CONFIG_FHANDLE
Turn off the two kernel configs to disable related system ABI.

Reviewed-at: https://android-review.googlesource.com/#/c/264976/

Signed-off-by: Max Shi <meixuanshi@google.com>
[AmitP: cherry-picked this change from Android common kernel]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit c1ebc2febd)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Amit Pundir
8a03c3c6ed UPSTREAM: config: android-base: enable hardened usercopy and kernel ASLR
Enable CONFIG_HARDENED_USERCOPY and CONFIG_RANDOMIZE_BASE in Android
base config fragment.

Reviewed at https://android-review.googlesource.com/#/c/283659/
Reviewed at https://android-review.googlesource.com/#/c/278133/

Link: http://lkml.kernel.org/r/1481113148-29204-2-git-send-email-amit.pundir@linaro.org
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Cc: Rob Herring <rob.herring@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 0d31a194d5)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Rob Herring
1a38086872 UPSTREAM: config: android: enable CONFIG_SECCOMP
As of Android N, SECCOMP is required. Without it, we will get
mediaextractor error:

E /system/bin/mediaextractor: libminijail: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER): Invalid argument

Link: http://lkml.kernel.org/r/20160908185934.18098-3-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 2489a1771a)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Rob Herring
e981413d19 UPSTREAM: config: android: set SELinux as default security mode
Android won't boot without SELinux enabled, so make it the default.

Link: http://lkml.kernel.org/r/20160908185934.18098-2-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit d90ae51a3e)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Rob Herring
014074e2d0 UPSTREAM: config: android: move device mapper options to recommended
CONFIG_MD is in recommended, but other dependent options like DM_CRYPT and
DM_VERITY options are in base.  The result is the options in base don't
get enabled when applying both base and recommended fragments.  Move all
the options to recommended.

Link: http://lkml.kernel.org/r/20160908185934.18098-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f023a3956f)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Borislav Petkov
acef484554 UPSTREAM: config/android: Remove CONFIG_IPV6_PRIVACY
Option is long gone, see commit 5d9efa7ee9 ("ipv6: Remove privacy
config option.")

Link: http://lkml.kernel.org/r/20160811170340.9859-1-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a2c6a235db)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Rob Herring
bbf3d8ad4f UPSTREAM: config: add android config fragments
Copy the config fragments from the AOSP common kernel android-4.4
branch.  It is becoming possible to run mainline kernels with Android,
but the kernel defconfigs don't work as-is and debugging missing config
options is a pain.  Adding the config fragments into the kernel tree,
makes configuring a mainline kernel as simple as:

  make ARCH=arm multi_v7_defconfig android-base.config android-recommended.config

The following non-upstream config options were removed:

  CONFIG_NETFILTER_XT_MATCH_QTAGUID
  CONFIG_NETFILTER_XT_MATCH_QUOTA2
  CONFIG_NETFILTER_XT_MATCH_QUOTA2_LOG
  CONFIG_PPPOLAC
  CONFIG_PPPOPNS
  CONFIG_SECURITY_PERF_EVENTS_RESTRICT
  CONFIG_USB_CONFIGFS_F_MTP
  CONFIG_USB_CONFIGFS_F_PTP
  CONFIG_USB_CONFIGFS_F_ACC
  CONFIG_USB_CONFIGFS_F_AUDIO_SRC
  CONFIG_USB_CONFIGFS_UEVENT
  CONFIG_INPUT_KEYCHORD
  CONFIG_INPUT_KEYRESET

Link: http://lkml.kernel.org/r/1466708235-28593-1-git-send-email-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 27eb6622ab)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Jiri Slaby
6182c40160 BACKPORT: exit_thread: accept a task parameter to be exited
We need to call exit_thread from copy_process in a fail path.  So make it
accept task_struct as a parameter.

[v2]
* s390: exit_thread_runtime_instr doesn't make sense to be called for
  non-current tasks.
* arm: fix the comment in vfp_thread_copy
* change 'me' to 'tsk' for task_struct
* now we can change only archs that actually have exit_thread

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit e64646946e)

Conflicts:
	arch/s390/kernel/process.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-05 21:56:13 +05:30
Joel Fernandes
6c48002863 ANDROID: sched/rt: schedtune: Add boost retention to RT
Boosted RT tasks can be deboosted quickly, this makes boost usless
for RT tasks and causes lots of glitching. Use timers to prevent
de-boost too soon and wait for long enough such that next enqueue
happens after a threshold.

While this can be solved in the governor, there are following
advantages:
- The approach used is governor-independent
- Reduces boost group lock contention for frequently sleepers/wakers

Note:
Fixed build breakage due to schedfreq dependency which isn't used
for RT anymore.

Bug: 30210506

Change-Id: I428a2695cac06cc3458cdde0dea72315e4e66c00
Signed-off-by: Joel Fernandes <joelaf@google.com>
2018-03-05 21:52:26 +05:30
Ke Wang
272395d80b ANDROID: sched: EAS: check energy_aware() before calling select_energy_cpu_brute() in up-migrate path
In up-migrate path, select_energy_cpu_brute() was called directly
without checking energy_aware(). This will make select_energy_cpu_brute()
always worked even disabling energy_aware() on the asymmetric cpu
capacity system.

Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>
2018-03-05 21:52:13 +05:30
Amit Pundir
24740dab5c Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>

Conflicts:
    fs/f2fs/extent_cache.c
        Pick changes from AOSP Change-Id: Icd8a85ac0c19a8aa25cd2591a12b4e9b85bdf1c5
        ("f2fs: catch up to v4.14-rc1")

    fs/f2fs/namei.c
        Pick changes from AOSP F2FS backport commit 7d5c08fd91
        ("f2fs: backport from (4c1fad64 - Merge tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs)")
2018-03-05 20:20:17 +05:30
Alex Shi
a2b5256f1e Merge tag 'v4.4.120' into linux-linaro-lsk-v4.4
This is the 4.4.120 stable release
2018-03-05 12:05:38 +08:00
Anna-Maria Gleixner
b77d5ffcb7 hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers)
commit 48d0c9becc upstream.

The POSIX specification defines that relative CLOCK_REALTIME timers are not
affected by clock modifications. Those timers have to use CLOCK_MONOTONIC
to ensure POSIX compliance.

The introduction of the additional HRTIMER_MODE_PINNED mode broke this
requirement for pinned timers.

There is no user space visible impact because user space timers are not
using pinned mode, but for consistency reasons this needs to be fixed.

Check whether the mode has the HRTIMER_MODE_REL bit set instead of
comparing with HRTIMER_MODE_ABS.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Fixes: 597d027573 ("timers: Framework for identifying pinned timers")
Link: http://lkml.kernel.org/r/20171221104205.7269-7-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-03 10:19:41 +01:00
Alex Shi
5a49e50670 Merge tag 'v4.4.118' into linux-linaro-lsk-v4.4
This is the 4.4.118 stable release
2018-02-26 12:02:27 +08:00
Andi Kleen
6cd5513c81 module/retpoline: Warn about missing retpoline in module
(cherry picked from commit caf7501a1b)

There's a risk that a kernel which has full retpoline mitigations becomes
vulnerable when a module gets loaded that hasn't been compiled with the
right compiler or the right option.

To enable detection of that mismatch at module load time, add a module info
string "retpoline" at build time when the module was compiled with
retpoline support. This only covers compiled C source, but assembler source
or prebuilt object files are not checked.

If a retpoline enabled kernel detects a non retpoline protected module at
load time, print a warning and report it in the sysfs vulnerability file.

[ tglx: Massaged changelog ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: jeyu@kernel.org
Cc: arjan@linux.intel.com
Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
[jwang: port to 4.4]
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:03:52 +01:00
Arnd Bergmann
28ad68ba16 profile: hide unused functions when !CONFIG_PROC_FS
commit ade356b99a upstream.

A couple of functions and variables in the profile implementation are
used only on SMP systems by the procfs code, but are unused if either
procfs is disabled or in uniprocessor kernels.  gcc prints a harmless
warning about the unused symbols:

  kernel/profile.c:243:13: error: 'profile_flip_buffers' defined but not used [-Werror=unused-function]
   static void profile_flip_buffers(void)
               ^
  kernel/profile.c:266:13: error: 'profile_discard_flip_buffers' defined but not used [-Werror=unused-function]
   static void profile_discard_flip_buffers(void)
               ^
  kernel/profile.c:330:12: error: 'profile_cpu_callback' defined but not used [-Werror=unused-function]
   static int profile_cpu_callback(struct notifier_block *info,
              ^

This adds further #ifdef to the file, to annotate exactly in which cases
they are used.  I have done several thousand ARM randconfig kernels with
this patch applied and no longer get any warnings in this file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:03:44 +01:00
Jens Axboe
28de93896a blktrace: fix unlocked registration of tracepoints
commit a6da0024ff upstream.

We need to ensure that tracepoints are registered and unregistered
with the users of them. The existing atomic count isn't enough for
that. Add a lock around the tracepoints, so we serialize access
to them.

This fixes cases where we have multiple users setting up and
tearing down tracepoints, like this:

CPU: 0 PID: 2995 Comm: syzkaller857118 Not tainted
4.14.0-rc5-next-20171018+ #36
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:16 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:52
  panic+0x1e4/0x41c kernel/panic.c:183
  __warn+0x1c4/0x1e0 kernel/panic.c:546
  report_bug+0x211/0x2d0 lib/bug.c:183
  fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
  do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
  do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
  do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
  invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:tracepoint_add_func kernel/tracepoint.c:210 [inline]
RIP: 0010:tracepoint_probe_register_prio+0x397/0x9a0 kernel/tracepoint.c:283
RSP: 0018:ffff8801d1d1f6c0 EFLAGS: 00010293
RAX: ffff8801d22e8540 RBX: 00000000ffffffef RCX: ffffffff81710f07
RDX: 0000000000000000 RSI: ffffffff85b679c0 RDI: ffff8801d5f19818
RBP: ffff8801d1d1f7c8 R08: ffffffff81710c10 R09: 0000000000000004
R10: ffff8801d1d1f6b0 R11: 0000000000000003 R12: ffffffff817597f0
R13: 0000000000000000 R14: 00000000ffffffff R15: ffff8801d1d1f7a0
  tracepoint_probe_register+0x2a/0x40 kernel/tracepoint.c:304
  register_trace_block_rq_insert include/trace/events/block.h:191 [inline]
  blk_register_tracepoints+0x1e/0x2f0 kernel/trace/blktrace.c:1043
  do_blk_trace_setup+0xa10/0xcf0 kernel/trace/blktrace.c:542
  blk_trace_setup+0xbd/0x180 kernel/trace/blktrace.c:564
  sg_ioctl+0xc71/0x2d90 drivers/scsi/sg.c:1089
  vfs_ioctl fs/ioctl.c:45 [inline]
  do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
  SYSC_ioctl fs/ioctl.c:700 [inline]
  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
  entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x444339
RSP: 002b:00007ffe05bb5b18 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000006d66c0 RCX: 0000000000444339
RDX: 000000002084cf90 RSI: 00000000c0481273 RDI: 0000000000000009
RBP: 0000000000000082 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff
R13: 00000000c0481273 R14: 0000000000000000 R15: 0000000000000000

since we can now run these in parallel. Ensure that the exported helpers
for doing this are grabbing the queue trace mutex.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:03:35 +01:00
Alex Shi
cb827992c7 Merge tag 'v4.4.116' into linux-linaro-lsk-v4.4
This is the 4.4.116 stable release
2018-02-20 12:04:39 +08:00
Steven Rostedt (VMware)
a94d7722b3 ftrace: Remove incorrect setting of glob search field
commit 7b65865627 upstream.

__unregister_ftrace_function_probe() will incorrectly parse the glob filter
because it resets the search variable that was setup by filter_parse_regex().

Al Viro reported this:

    After that call of filter_parse_regex() we could have func_g.search not
    equal to glob only if glob started with '!' or '*'.  In the former case
    we would've buggered off with -EINVAL (not = 1).  In the latter we
    would've set func_g.search equal to glob + 1, calculated the length of
    that thing in func_g.len and proceeded to reset func_g.search back to
    glob.

    Suppose the glob is e.g. *foo*.  We end up with
	    func_g.type = MATCH_MIDDLE_ONLY;
	    func_g.len = 3;
	    func_g.search = "*foo";
    Feeding that to ftrace_match_record() will not do anything sane - we
    will be looking for names containing "*foo" (->len is ignored for that
    one).

Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk

Fixes: 3ba0092971 ("ftrace: Introduce ftrace_glob structure")
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 20:09:47 +01:00
Rasmus Villemoes
c832448d07 kernel/async.c: revert "async: simplify lowest_in_progress()"
commit 4f7e988e63 upstream.

This reverts commit 92266d6ef6 ("async: simplify lowest_in_progress()")
which was simply wrong: In the case where domain is NULL, we now use the
wrong offsetof() in the list_first_entry macro, so we don't actually
fetch the ->cookie value, but rather the eight bytes located
sizeof(struct list_head) further into the struct async_entry.

On 64 bit, that's the data member, while on 32 bit, that's a u64 built
from func and data in some order.

I think the bug happens to be harmless in practice: It obviously only
affects callers which pass a NULL domain, and AFAICT the only such
caller is

  async_synchronize_full() ->
  async_synchronize_full_domain(NULL) ->
  async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL)

and the ASYNC_COOKIE_MAX means that in practice we end up waiting for
the async_global_pending list to be empty - but it would break if
somebody happened to pass (void*)-1 as the data element to
async_schedule, and of course also if somebody ever does a
async_synchronize_cookie_domain(, NULL) with a "finite" cookie value.

Maybe the "harmless in practice" means this isn't -stable material.  But
I'm not completely confident my quick git grep'ing is enough, and there
might be affected code in one of the earlier kernels that has since been
removed, so I'll leave the decision to the stable guys.

Link: http://lkml.kernel.org/r/20171128104938.3921-1-linux@rasmusvillemoes.dk
Fixes: 92266d6ef6 "async: simplify lowest_in_progress()"
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Adam Wallis <awallis@codeaurora.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 20:09:45 +01:00
Steven Rostedt (VMware)
911357aed6 sched/rt: Up the root domain ref count when passing it around via IPIs
commit 364f566537 upstream.

When issuing an IPI RT push, where an IPI is sent to each CPU that has more
than one RT task scheduled on it, it references the root domain's rto_mask,
that contains all the CPUs within the root domain that has more than one RT
task in the runable state. The problem is, after the IPIs are initiated, the
rq->lock is released. This means that the root domain that is associated to
the run queue could be freed while the IPIs are going around.

Add a sched_get_rd() and a sched_put_rd() that will increment and decrement
the root domain's ref count respectively. This way when initiating the IPIs,
the scheduler will up the root domain's ref count before releasing the
rq->lock, ensuring that the root domain does not go away until the IPI round
is complete.

Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 4bdced5c9a ("sched/rt: Simplify the IPI based RT balancing logic")
Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 20:09:40 +01:00
Steven Rostedt (VMware)
af9de1a10f sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
commit ad0f1d9d65 upstream.

When the rto_push_irq_work_func() is called, it looks at the RT overloaded
bitmask in the root domain via the runqueue (rq->rd). The problem is that
during CPU up and down, nothing here stops rq->rd from changing between
taking the rq->rd->rto_lock and releasing it. That means the lock that is
released is not the same lock that was taken.

Instead of using this_rq()->rd to get the root domain, as the irq work is
part of the root domain, we can simply get the root domain from the irq work
that is passed to the routine:

 container_of(work, struct root_domain, rto_push_work)

This keeps the root domain consistent.

Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 4bdced5c9a ("sched/rt: Simplify the IPI based RT balancing logic")
Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 20:09:40 +01:00
Thomas Gleixner
77f56f5d39 posix-timer: Properly check sigevent->sigev_notify
commit cef31d9af9 upstream.

timer_create() specifies via sigevent->sigev_notify the signal delivery for
the new timer. The valid modes are SIGEV_NONE, SIGEV_SIGNAL, SIGEV_THREAD
and (SIGEV_SIGNAL | SIGEV_THREAD_ID).

The sanity check in good_sigevent() is only checking the valid combination
for the SIGEV_THREAD_ID bit, i.e. SIGEV_SIGNAL, but if SIGEV_THREAD_ID is
not set it accepts any random value.

This has no real effects on the posix timer and signal delivery code, but
it affects show_timer() which handles the output of /proc/$PID/timers. That
function uses a string array to pretty print sigev_notify. The access to
that array has no bound checks, so random sigev_notify cause access beyond
the array bounds.

Add proper checks for the valid notify modes and remove the SIGEV_THREAD_ID
masking from various code pathes as SIGEV_NONE can never be set in
combination with SIGEV_THREAD_ID.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 20:09:40 +01:00
Alex Shi
d132d2582d Merge remote-tracking branch 'lts/linux-4.4.y' into linux-linaro-lsk-v4.4
Conflicts:
	keep HAVE_ARCH_WITHIN_STACK_FRAMES in arch/x86/Kconfig
2018-02-13 19:50:59 +08:00
Tao Huang
8d2d0b6a51 Merge tag 'lsk-v4.4-18.02-android' of git://git.linaro.org/kernel/linux-linaro-stable.git
LSK 18.02 v4.4-android

* tag 'lsk-v4.4-18.02-android': (131 commits)
  Linux 4.4.114
  nfsd: auth: Fix gid sorting when rootsquash enabled
  net: tcp: close sock if net namespace is exiting
  flow_dissector: properly cap thoff field
  ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
  net: Allow neigh contructor functions ability to modify the primary_key
  vmxnet3: repair memory leak
  sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
  sctp: do not allow the v4 socket to bind a v4mapped v6 address
  r8169: fix memory corruption on retrieval of hardware statistics.
  pppoe: take ->needed_headroom of lower device into account on xmit
  net: qdisc_pkt_len_init() should be more robust
  tcp: __tcp_hdrlen() helper
  net: igmp: fix source address check for IGMPv3 reports
  lan78xx: Fix failure in USB Full Speed
  ipv6: ip6_make_skb() needs to clear cork.base.dst
  ipv6: fix udpv6 sendmsg crash caused by too small MTU
  ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
  dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
  hrtimer: Reset hrtimer cpu base proper on CPU hotplug
  ...
2018-02-07 20:59:20 +08:00
Daniel Borkmann
faa74a862a bpf: reject stores into ctx via st and xadd
[ upstream commit f37a8cb84c ]

Alexei found that verifier does not reject stores into context
via BPF_ST instead of BPF_STX. And while looking at it, we
also should not allow XADD variant of BPF_STX.

The context rewriter is only assuming either BPF_LDX_MEM- or
BPF_STX_MEM-type operations, thus reject anything other than
that so that assumptions in the rewriter properly hold. Add
test cases as well for BPF selftests.

Fixes: d691f9e8d4 ("bpf: allow programs to write to certain skb fields")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:25 +01:00
Alexei Starovoitov
02662601a2 bpf: fix 32-bit divide by zero
[ upstream commit 68fda450a7 ]

due to some JITs doing if (src_reg == 0) check in 64-bit mode
for div/mod operations mask upper 32-bits of src register
before doing the check

Fixes: 622582786c ("net: filter: x86: internal BPF JIT")
Fixes: 7a12b5031c ("sparc64: Add eBPF JIT.")
Reported-by: syzbot+48340bb518e88849e2e3@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:25 +01:00
Eric Dumazet
b72ba2a0d8 bpf: fix divides by zero
[ upstream commit c366287ebd ]

Divides by zero are not nice, lets avoid them if possible.

Also do_div() seems not needed when dealing with 32bit operands,
but this seems a minor detail.

Fixes: bd4cf0ed33 ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:24 +01:00
Daniel Borkmann
7dcda40e52 bpf: arsh is not supported in 32 bit alu thus reject it
[ upstream commit 7891a87efc ]

The following snippet was throwing an 'unknown opcode cc' warning
in BPF interpreter:

  0: (18) r0 = 0x0
  2: (7b) *(u64 *)(r10 -16) = r0
  3: (cc) (u32) r0 s>>= (u32) r0
  4: (95) exit

Although a number of JITs do support BPF_ALU | BPF_ARSH | BPF_{K,X}
generation, not all of them do and interpreter does neither. We can
leave existing ones and implement it later in bpf-next for the
remaining ones, but reject this properly in verifier for the time
being.

Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Reported-by: syzbot+93c4904c5c70348a6890@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:24 +01:00
Alexei Starovoitov
28c486744e bpf: introduce BPF_JIT_ALWAYS_ON config
[ upstream commit 290af86629 ]

The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.

A quote from goolge project zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."

To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
option that removes interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64

The start of JITed program is randomized and code page is marked as read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_harden

v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)

v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
  It will be sent when the trees are merged back to net-next

Considered doing:
  int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:24 +01:00
Alexei Starovoitov
361fb04812 bpf: fix bpf_tail_call() x64 JIT
[ upstream commit 90caccdd8c ]

- bpf prog_array just like all other types of bpf array accepts 32-bit index.
  Clarify that in the comment.
- fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes
- tighten corresponding check in the interpreter to stay consistent

The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag
in commit 96eabe7a40 in 4.14. Before that the map_flags would stay zero and
though JIT code is wrong it will check bounds correctly.
Hence two fixes tags. All other JITs don't have this problem.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 96eabe7a40 ("bpf: Allow selecting numa node during map creation")
Fixes: b52f00e6a7 ("x86: bpf_jit: implement bpf_tail_call() helper")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:24 +01:00
Alexei Starovoitov
1367d854b9 bpf: fix branch pruning logic
[ Upstream commit c131187db2 ]

when the verifier detects that register contains a runtime constant
and it's compared with another constant it will prune exploration
of the branch that is guaranteed not to be taken at runtime.
This is all correct, but malicious program may be constructed
in such a way that it always has a constant comparison and
the other branch is never taken under any conditions.
In this case such path through the program will not be explored
by the verifier. It won't be taken at run-time either, but since
all instructions are JITed the malicious program may cause JITs
to complain about using reserved fields, etc.
To fix the issue we have to track the instructions explored by
the verifier and sanitize instructions that are dead at run time
with NOPs. We cannot reject such dead code, since llvm generates
it for valid C code, since it doesn't do as much data flow
analysis as the verifier does.

Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:24 +01:00
Tao Huang
e7f55d159a printk: add support show process information on printks
Change-Id: I34cf76388ceb2e1f6b6417638c82bf774641ebac
Signed-off-by: Tao Huang <huangtao@rock-chips.com>
2018-02-02 19:26:07 +08:00
Alex Shi
59e35359ec Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2018-02-01 12:02:38 +08:00
Alex Shi
a40f2a595a Merge tag 'v4.4.114' into linux-linaro-lsk-v4.4
This is the 4.4.114 stable release
2018-02-01 12:02:34 +08:00
Thomas Gleixner
360496cab5 hrtimer: Reset hrtimer cpu base proper on CPU hotplug
commit d5421ea43d upstream.

The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.

If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.

Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.

Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.

Fixes: 41d2e49493 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:12 +01:00
Thomas Gleixner
839003061a timers: Plug locking race vs. timer migration
commit b831275a35 upstream.

Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the
timer flags. As a consequence the compiler is allowed to reload the flags
between the initial check for TIMER_MIGRATION and the following timer base
computation and the spin lock of the base.

While this has not been observed (yet), we need to make sure that it never
happens.

Fixes: 0eeda71bc3 ("timer: Replace timer base by a cpu index")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:08 +01:00
Vegard Nossum
08b1cf4964 time: Avoid undefined behaviour in ktime_add_safe()
commit 979515c564 upstream.

I ran into this:

    ================================================================================
    UBSAN: Undefined behaviour in kernel/time/hrtimer.c:310:16
    signed integer overflow:
    9223372036854775807 + 50000 cannot be represented in type 'long long int'
    CPU: 2 PID: 4798 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #91
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
     0000000000000000 ffff88010ce6fb88 ffffffff82344740 0000000041b58ab3
     ffffffff84f97a20 ffffffff82344694 ffff88010ce6fbb0 ffff88010ce6fb60
     000000000000c350 ffff88010ce6f968 dffffc0000000000 ffffffff857bc320
    Call Trace:
     [<ffffffff82344740>] dump_stack+0xac/0xfc
     [<ffffffff82344694>] ? _atomic_dec_and_lock+0xc4/0xc4
     [<ffffffff8242df78>] ubsan_epilogue+0xd/0x8a
     [<ffffffff8242e6b4>] handle_overflow+0x202/0x23d
     [<ffffffff8242e4b2>] ? val_to_string.constprop.6+0x11e/0x11e
     [<ffffffff8236df71>] ? timerqueue_add+0x151/0x410
     [<ffffffff81485c48>] ? hrtimer_start_range_ns+0x3b8/0x1380
     [<ffffffff81795631>] ? memset+0x31/0x40
     [<ffffffff8242e6fd>] __ubsan_handle_add_overflow+0xe/0x10
     [<ffffffff81488ac9>] hrtimer_nanosleep+0x5d9/0x790
     [<ffffffff814884f0>] ? hrtimer_init_sleeper+0x80/0x80
     [<ffffffff813a9ffb>] ? __might_sleep+0x5b/0x260
     [<ffffffff8148be10>] common_nsleep+0x20/0x30
     [<ffffffff814906c7>] SyS_clock_nanosleep+0x197/0x210
     [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150
     [<ffffffff823c7113>] ? __this_cpu_preempt_check+0x13/0x20
     [<ffffffff8162ef60>] ? __context_tracking_exit.part.3+0x30/0x1b0
     [<ffffffff81490530>] ? SyS_clock_getres+0x150/0x150
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f85aa>] entry_SYSCALL64_slow_path+0x25/0x25
    ================================================================================

Add a new ktime_add_unsafe() helper which doesn't check for overflow, but
doesn't throw a UBSAN warning when it does overflow either.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:08 +01:00
Daniel Bristot de Oliveira
1d00e3d9b7 sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks
commit 3effcb4247 upstream.

We have been facing some problems with self-suspending constrained
deadline tasks. The main reason is that the original CBS was not
designed for such sort of tasks.

One problem reported by Xunlei Pang takes place when a task
suspends, and then is awakened before the deadline, but so close
to the deadline that its remaining runtime can cause the task
to have an absolute density higher than allowed. In such situation,
the original CBS assumes that the task is facing an early activation,
and so it replenishes the task and set another deadline, one deadline
in the future. This rule works fine for implicit deadline tasks.
Moreover, it allows the system to adapt the period of a task in which
the external event source suffered from a clock drift.

However, this opens the window for bandwidth leakage for constrained
deadline tasks. For instance, a task with the following parameters:

  runtime   = 5 ms
  deadline  = 7 ms
  [density] = 5 / 7 = 0.71
  period    = 1000 ms

If the task runs for 1 ms, and then suspends for another 1ms,
it will be awakened with the following parameters:

  remaining runtime = 4
  laxity = 5

presenting a absolute density of 4 / 5 = 0.80.

In this case, the original CBS would assume the task had an early
wakeup. Then, CBS will reset the runtime, and the absolute deadline will
be postponed by one relative deadline, allowing the task to run.

The problem is that, if the task runs this pattern forever, it will keep
receiving bandwidth, being able to run 1ms every 2ms. Following this
behavior, the task would be able to run 500 ms in 1 sec. Thus running
more than the 5 ms / 1 sec the admission control allowed it to run.

Trying to address the self-suspending case, Luca Abeni, Giuseppe
Lipari, and Juri Lelli [1] revisited the CBS in order to deal with
self-suspending tasks. In the new approach, rather than
replenishing/postponing the absolute deadline, the revised wakeup rule
adjusts the remaining runtime, reducing it to fit into the allowed
density.

A revised version of the idea is:

At a given time t, the maximum absolute density of a task cannot be
higher than its relative density, that is:

  runtime / (deadline - t) <= dl_runtime / dl_deadline

Knowing the laxity of a task (deadline - t), it is possible to move
it to the other side of the equality, thus enabling to define max
remaining runtime a task can use within the absolute deadline, without
over-running the allowed density:

  runtime = (dl_runtime / dl_deadline) * (deadline - t)

For instance, in our previous example, the task could still run:

  runtime = ( 5 / 7 ) * 5
  runtime = 3.57 ms

Without causing damage for other deadline tasks. It is note worthy
that the laxity cannot be negative because that would cause a negative
runtime. Thus, this patch depends on the patch:

  df8eac8caf ("sched/deadline: Throttle a constrained deadline task activated after the deadline")

Which throttles a constrained deadline task activated after the
deadline.

Finally, it is also possible to use the revised wakeup rule for
all other tasks, but that would require some more discussions
about pros and cons.

[The main difference from the original commit is that
 the BW_SHIFT define was not present yet. As BW_SHIFT was
 introduced in a new feature, I just used the value (20),
 likewise we used to use before the #define.
 Other changes were required because of comments. - bistrot]

Reported-by: Xunlei Pang <xpang@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
[peterz: replaced dl_is_constrained with dl_is_implicit]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Romulo Silva de Oliveira <romulo.deoliveira@ufsc.br>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Link: http://lkml.kernel.org/r/5c800ab3a74a168a84ee5f3f84d12a02e11383be.1495803804.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:07 +01:00
Alex Shi
293c379504 Merge remote-tracking branch 'lts/linux-4.4.y' into linux-linaro-lsk-v4.4
Conflicts:
	keep lsk used current_stack_pointer and arch_within_stack_frames
	in arch/x86/include/asm/thread_info.h
2018-01-31 13:33:20 +08:00
Tao Huang
640193f76b Merge branch 'linux-linaro-lsk-v4.4-android' of git://git.linaro.org/kernel/linux-linaro-stable.git
* linux-linaro-lsk-v4.4-android: (733 commits)
  LSK-ANDROID: memcg: Remove wrong ->attach callback
  LSK-ANDROID: arm64: mm: Fix __create_pgd_mapping() call
  ANDROID: sdcardfs: Move default_normal to superblock
  blkdev: Refactoring block io latency histogram codes
  FROMLIST: arm64: kpti: Fix the interaction between ASID switching and software PAN
  FROMLIST: arm64: Move post_ttbr_update_workaround to C code
  FROMLIST: arm64: mm: Rename post_ttbr0_update_workaround
  sched: EAS: Initialize push_task as NULL to avoid direct reference on out_unlock path
  fscrypt: updates on 4.15-rc4
  ANDROID: uid_sys_stats: fix the comment
  BACKPORT: tee: indicate privileged dev in gen_caps
  BACKPORT: tee: optee: sync with new naming of interrupts
  BACKPORT: tee: tee_shm: Constify dma_buf_ops structures.
  BACKPORT: tee: optee: interruptible RPC sleep
  BACKPORT: tee: optee: add const to tee_driver_ops and tee_desc structures
  BACKPORT: tee.txt: standardize document format
  BACKPORT: tee: add forward declaration for struct device
  BACKPORT: tee: optee: fix uninitialized symbol 'parg'
  BACKPORT: tee: add ARM_SMCCC dependency
  BACKPORT: selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables
  ...

Conflicts:
	arch/arm64/kernel/vdso.c
	drivers/usb/host/xhci-plat.c
	include/drm/drmP.h
	include/linux/kasan.h
	kernel/time/timekeeping.c
	mm/kasan/kasan.c
	security/selinux/nlmsgtab.c

Also add this commit:
0bcdc0987c ("time: Fix ktime_get_raw() incorrect base accumulation")
2018-01-26 19:26:47 +08:00
Steven Rostedt (VMware)
cf3625004e tracing: Fix converting enum's from the map in trace_event_eval_update()
commit 1ebe1eaf2f upstream.

Since enums do not get converted by the TRACE_EVENT macro into their values,
the event format displaces the enum name and not the value. This breaks
tools like perf and trace-cmd that need to interpret the raw binary data. To
solve this, an enum map was created to convert these enums into their actual
numbers on boot up. This is done by TRACE_EVENTS() adding a
TRACE_DEFINE_ENUM() macro.

Some enums were not being converted. This was caused by an optization that
had a bug in it.

All calls get checked against this enum map to see if it should be converted
or not, and it compares the call's system to the system that the enum map
was created under. If they match, then they call is processed.

To cut down on the number of iterations needed to find the maps with a
matching system, since calls and maps are grouped by system, when a match is
made, the index into the map array is saved, so that the next call, if it
belongs to the same system as the previous call, could start right at that
array index and not have to scan all the previous arrays.

The problem was, the saved index was used as the variable to know if this is
a call in a new system or not. If the index was zero, it was assumed that
the call is in a new system and would keep incrementing the saved index
until it found a matching system. The issue arises when the first matching
system was at index zero. The next map, if it belonged to the same system,
would then think it was the first match and increment the index to one. If
the next call belong to the same system, it would begin its search of the
maps off by one, and miss the first enum that should be converted. This left
a single enum not converted properly.

Also add a comment to describe exactly what that index was for. It took me a
bit too long to figure out what I was thinking when debugging this issue.

Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com

Fixes: 0c564a538a ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Teste-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-23 19:50:16 +01:00
Xunlei Pang
8bd58b61d2 sched/deadline: Zero out positive runtime after throttling constrained tasks
commit ae83b56a56 upstream.

When a contrained task is throttled by dl_check_constrained_dl(),
it may carry the remaining positive runtime, as a result when
dl_task_timer() fires and calls replenish_dl_entity(), it will
not be replenished correctly due to the positive dl_se->runtime.

This patch assigns its runtime to 0 if positive after throttling.

Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: df8eac8caf ("sched/deadline: Throttle a constrained deadline task activated after the deadline)
Link: http://lkml.kernel.org/r/1494421417-27550-1-git-send-email-xlpang@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-23 19:50:15 +01:00