Commit Graph

10058 Commits

Author SHA1 Message Date
Amit Pundir
24740dab5c Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>

Conflicts:
    fs/f2fs/extent_cache.c
        Pick changes from AOSP Change-Id: Icd8a85ac0c19a8aa25cd2591a12b4e9b85bdf1c5
        ("f2fs: catch up to v4.14-rc1")

    fs/f2fs/namei.c
        Pick changes from AOSP F2FS backport commit 7d5c08fd91
        ("f2fs: backport from (4c1fad64 - Merge tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs)")
2018-03-05 20:20:17 +05:30
Alex Shi
5a49e50670 Merge tag 'v4.4.118' into linux-linaro-lsk-v4.4
This is the 4.4.118 stable release
2018-02-26 12:02:27 +08:00
Dave Young
aa7dca0a5a mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep
[ Upstream commit 7f6f60a1ba ]

earlyprintk=efi,keep does not work any more with a warning
in mm/early_ioremap.c: WARN_ON(system_state != SYSTEM_BOOTING):
Boot just hangs because of the earlyprintk within the earlyprintk
implementation code itself.

This is caused by a new introduced middle state in:

  69a78ff226 ("init: Introduce SYSTEM_SCHEDULING state")

early_ioremap() is fine in both SYSTEM_BOOTING and SYSTEM_SCHEDULING
states, original condition should be updated accordingly.

Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bp@suse.de
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20171209041610.GA3249@dhcp-128-65.nay.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:03:41 +01:00
David Howells
4dbb8e38a1 Provide a function to create a NUL-terminated string from unterminated data
commit f351574172 upstream.

Provide a function, kmemdup_nul(), that will create a NUL-terminated string
from an unterminated character array where the length is known in advance.

This is better than kstrndup() in situations where we already know the
string length as the strnlen() in kstrndup() is superfluous.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:03:36 +01:00
Tetsuo Handa
0de023ab7d mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed.
commit bb422a738f upstream.

Syzbot caught an oops at unregister_shrinker() because combination of
commit 1d3d4437ea ("vmscan: per-node deferred work") and fault
injection made register_shrinker() fail and the caller of
register_shrinker() did not check for failure.

----------
[  554.881422] FAULT_INJECTION: forcing a failure.
[  554.881422] name failslab, interval 1, probability 0, space 0, times 0
[  554.881438] CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
[  554.881443] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  554.881445] Call Trace:
[  554.881459]  dump_stack+0x194/0x257
[  554.881474]  ? arch_local_irq_restore+0x53/0x53
[  554.881486]  ? find_held_lock+0x35/0x1d0
[  554.881507]  should_fail+0x8c0/0xa40
[  554.881522]  ? fault_create_debugfs_attr+0x1f0/0x1f0
[  554.881537]  ? check_noncircular+0x20/0x20
[  554.881546]  ? find_next_zero_bit+0x2c/0x40
[  554.881560]  ? ida_get_new_above+0x421/0x9d0
[  554.881577]  ? find_held_lock+0x35/0x1d0
[  554.881594]  ? __lock_is_held+0xb6/0x140
[  554.881628]  ? check_same_owner+0x320/0x320
[  554.881634]  ? lock_downgrade+0x990/0x990
[  554.881649]  ? find_held_lock+0x35/0x1d0
[  554.881672]  should_failslab+0xec/0x120
[  554.881684]  __kmalloc+0x63/0x760
[  554.881692]  ? lock_downgrade+0x990/0x990
[  554.881712]  ? register_shrinker+0x10e/0x2d0
[  554.881721]  ? trace_event_raw_event_module_request+0x320/0x320
[  554.881737]  register_shrinker+0x10e/0x2d0
[  554.881747]  ? prepare_kswapd_sleep+0x1f0/0x1f0
[  554.881755]  ? _down_write_nest_lock+0x120/0x120
[  554.881765]  ? memcpy+0x45/0x50
[  554.881785]  sget_userns+0xbcd/0xe20
(...snipped...)
[  554.898693] kasan: CONFIG_KASAN_INLINE enabled
[  554.898724] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  554.898732] general protection fault: 0000 [#1] SMP KASAN
[  554.898737] Dumping ftrace buffer:
[  554.898741]    (ftrace buffer empty)
[  554.898743] Modules linked in:
[  554.898752] CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
[  554.898755] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  554.898760] task: ffff8801d1dbe5c0 task.stack: ffff8801c9e38000
[  554.898772] RIP: 0010:__list_del_entry_valid+0x7e/0x150
[  554.898775] RSP: 0018:ffff8801c9e3f108 EFLAGS: 00010246
[  554.898780] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  554.898784] RDX: 0000000000000000 RSI: ffff8801c53c6f98 RDI: ffff8801c53c6fa0
[  554.898788] RBP: ffff8801c9e3f120 R08: 1ffff100393c7d55 R09: 0000000000000004
[  554.898791] R10: ffff8801c9e3ef70 R11: 0000000000000000 R12: 0000000000000000
[  554.898795] R13: dffffc0000000000 R14: 1ffff100393c7e45 R15: ffff8801c53c6f98
[  554.898800] FS:  0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
[  554.898804] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[  554.898807] CR2: 00000000dbc23000 CR3: 00000001c7269000 CR4: 00000000001406e0
[  554.898813] DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
[  554.898816] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[  554.898818] Call Trace:
[  554.898828]  unregister_shrinker+0x79/0x300
[  554.898837]  ? perf_trace_mm_vmscan_writepage+0x750/0x750
[  554.898844]  ? down_write+0x87/0x120
[  554.898851]  ? deactivate_super+0x139/0x1b0
[  554.898857]  ? down_read+0x150/0x150
[  554.898864]  ? check_same_owner+0x320/0x320
[  554.898875]  deactivate_locked_super+0x64/0xd0
[  554.898883]  deactivate_super+0x141/0x1b0
----------

Since allowing register_shrinker() callers to call unregister_shrinker()
when register_shrinker() failed can simplify error recovery path, this
patch makes unregister_shrinker() no-op when register_shrinker() failed.
Also, reset shrinker->nr_deferred in case unregister_shrinker() was
by error called twice.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Glauber Costa <glauber@scylladb.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25 11:03:35 +01:00
Alex Shi
e1b9ff290b Merge tag 'v4.4.117' into linux-linaro-lsk-v4.4
This is the 4.4.117 stable release
2018-02-23 12:01:32 +08:00
Arnd Bergmann
6d046d9e75 mm: hide a #warning for COMPILE_TEST
commit af27d9403f upstream.

We get a warning about some slow configurations in randconfig kernels:

  mm/memory.c:83:2: error: #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid. [-Werror=cpp]

The warning is reasonable by itself, but gets in the way of randconfig
build testing, so I'm hiding it whenever CONFIG_COMPILE_TEST is set.

The warning was added in 2013 in commit 75980e97da ("mm: fold
page->_last_nid into page->flags where possible").

Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22 15:45:00 +01:00
Alex Shi
d132d2582d Merge remote-tracking branch 'lts/linux-4.4.y' into linux-linaro-lsk-v4.4
Conflicts:
	keep HAVE_ARCH_WITHIN_STACK_FRAMES in arch/x86/Kconfig
2018-02-13 19:50:59 +08:00
Yisheng Xie
51d2967fe5 kmemleak: add scheduling point to kmemleak_scan()
[ Upstream commit bde5f6bc68 ]

kmemleak_scan() will scan struct page for each node and it can be really
large and resulting in a soft lockup.  We have seen a soft lockup when
do scan while compile kernel:

  watchdog: BUG: soft lockup - CPU#53 stuck for 22s! [bash:10287]
 [...]
  Call Trace:
   kmemleak_scan+0x21a/0x4c0
   kmemleak_write+0x312/0x350
   full_proxy_write+0x5a/0xa0
   __vfs_write+0x33/0x150
   vfs_write+0xad/0x1a0
   SyS_write+0x52/0xc0
   do_syscall_64+0x61/0x1a0
   entry_SYSCALL64_slow_path+0x25/0x25

Fix this by adding cond_resched every MAX_SCAN_SIZE.

Link: http://lkml.kernel.org/r/1511439788-20099-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03 17:04:29 +01:00
Alex Shi
59e35359ec Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2018-02-01 12:02:38 +08:00
Alex Shi
a40f2a595a Merge tag 'v4.4.114' into linux-linaro-lsk-v4.4
This is the 4.4.114 stable release
2018-02-01 12:02:34 +08:00
Vlastimil Babka
90ca50c56b mm, page_alloc: fix potential false positive in __zone_watermark_ok
commit b050e3769c upstream.

Since commit 97a16fc82a ("mm, page_alloc: only enforce watermarks for
order-0 allocations"), __zone_watermark_ok() check for high-order
allocations will shortcut per-migratetype free list checks for
ALLOC_HARDER allocations, and return true as long as there's free page
of any migratetype.  The intention is that ALLOC_HARDER can allocate
from MIGRATE_HIGHATOMIC free lists, while normal allocations can't.

However, as a side effect, the watermark check will then also return
true when there are pages only on the MIGRATE_ISOLATE list, or (prior to
CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list.  Since the
allocation cannot actually obtain isolated pages, and might not be able
to obtain CMA pages, this can result in a false positive.

The condition should be rare and perhaps the outcome is not a fatal one.
Still, it's better if the watermark check is correct.  There also
shouldn't be a performance tradeoff here.

Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz
Fixes: 97a16fc82a ("mm, page_alloc: only enforce watermarks for order-0 allocations")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:09 +01:00
Doug Berger
1ab0bb9834 cma: fix calculation of aligned offset
commit e048cb32f6 upstream.

The align_offset parameter is used by bitmap_find_next_zero_area_off()
to represent the offset of map's base from the previous alignment
boundary; the function ensures that the returned index, plus the
align_offset, honors the specified align_mask.

The logic introduced by commit b5be83e308 ("mm: cma: align to physical
address, not CMA region position") has the cma driver calculate the
offset to the *next* alignment boundary.  In most cases, the base
alignment is greater than that specified when making allocations,
resulting in a zero offset whether we align up or down.  In the example
given with the commit, the base alignment (8MB) was half the requested
alignment (16MB) so the math also happened to work since the offset is
8MB in both directions.  However, when requesting allocations with an
alignment greater than twice that of the base, the returned index would
not be correctly aligned.

Also, the align_order arguments of cma_bitmap_aligned_mask() and
cma_bitmap_aligned_offset() should not be negative so the argument type
was made unsigned.

Fixes: b5be83e308 ("mm: cma: align to physical address, not CMA region position")
Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
Signed-off-by: Angus Clark <angus@angusclark.org>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
Cc: Doug Berger <opendmb@gmail.com>
Cc: Angus Clark <angus@angusclark.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Shiraz Hashim <shashim@codeaurora.org>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:09 +01:00
Michal Hocko
361376a534 hwpoison, memcg: forcibly uncharge LRU pages
commit 18365225f0 upstream.

Laurent Dufour has noticed that hwpoinsoned pages are kept charged.  In
his particular case he has hit a bad_page("page still charged to
cgroup") when onlining a hwpoison page.  While this looks like something
that shouldn't happen in the first place because onlining hwpages and
returning them to the page allocator makes only little sense it shows a
real problem.

hwpoison pages do not get freed usually so we do not uncharge them (at
least not since commit 0a31bc97c8 ("mm: memcontrol: rewrite uncharge
API")).  Each charge pins memcg (since e8ea14cc6e ("mm: memcontrol:
take a css reference for each charged page")) as well and so the
mem_cgroup and the associated state will never go away.  Fix this leak
by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
We also have to tweak uncharge_list because it cannot rely on zero ref
count for these pages.

[akpm@linux-foundation.org: coding-style fixes]
Fixes: 0a31bc97c8 ("mm: memcontrol: rewrite uncharge API")
Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:09 +01:00
Michal Hocko
7175e56fa7 mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
commit 561b5e0709 upstream.

Commit 1be7107fbe ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page.  They are punching a new
MAP_FIXED mapping inside the existing stack Vma.

This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety.  It is a real regression on
the other hand.

Let's work around the problem by considering PROT_NONE mapping as a part
of the stack.  This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.

Fixes: 1be7107fbe ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-31 12:06:09 +01:00
Amit Pundir
c210bc406d LSK-ANDROID: memcg: Remove wrong ->attach callback
mem_cgroup_move_task() is now called from ->post_attach
instead of ->attach thanks to LTS commit 52526076a5
("memcg: relocate charge moving from ->attach to ->post_attach").

Hence remove ->attach callback which sneaked back into
lsk-v4.4-android tree in the LSK merge commit 334ca3ed18
("Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android").

Otherwise we run into following build warning reported on KernelCI
https://kernelci.org/build/lsk/branch/linux-linaro-lsk-v4.4-android/kernel/lsk-v4.4-17.11-android-844-g6a7d9fbcf946/

mm/memcontrol.c:5337:12: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]

Fixes: LSK commit 334ca3ed18
       ("Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2018-01-24 15:45:12 +05:30
Dmitry Vyukov
8c8a5ee21a BACKPORT: kernel: add kcov code coverage
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing).  Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system.  A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/).  However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.

kcov does not aim to collect as much coverage as possible.  It aims to
collect more or less stable coverage that is function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts and instrumentation of some inherently non-deterministic or
non-interesting parts of kernel is disbled (e.g.  scheduler, locking).

Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes.  Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch).  I've
dropped the second mode for simplicity.

This patch adds the necessary support on kernel side.  The complimentary
compiler support was added in gcc revision 231296.

We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:

  https://github.com/google/syzkaller/wiki/Found-Bugs

We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation".  For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.

Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
typical coverage can be just a dozen of basic blocks (e.g.  an invalid
input).  In such context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M).  Cost of
kcov depends only on number of executed basic blocks/edges.  On top of
that, kernel requires per-thread coverage because there are always
background threads and unrelated processes that also produce coverage.
With inlined gcov instrumentation per-thread coverage is not possible.

kcov exposes kernel PCs and control flow to user-space which is
insecure.  But debugfs should not be mapped as user accessible.

Based on a patch by Quentin Casasnovas.

[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 5c9a8750a6)
Change-Id: I17b5e04f6e89b241924e78ec32ead79c38b860ce
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Colin Ian King
814f77833e UPSTREAM: kasan: make get_wild_bug_type() static
The helper function get_wild_bug_type() does not need to be in global
scope, so make it static.

Cleans up sparse warning:

  "symbol 'get_wild_bug_type' was not declared. Should it be static?"

Link: http://lkml.kernel.org/r/20170622090049.10658-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 822d5ec258)
Change-Id: If89c8ba8ee3bdb0db7ecb67e773bfbf3179514f3
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
d0d379d27e UPSTREAM: kasan: separate report parts by empty lines
Makes the report easier to read.

Link: http://lkml.kernel.org/r/20170302134851.101218-10-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from b193859936)
Change-Id: I8cc010a73e257cb08c7a2537d7aabd3c9c2c8116
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
2db44a19ba UPSTREAM: kasan: improve double-free report format
Changes double-free report header from

  BUG: Double free or freeing an invalid pointer
  Unexpected shadow byte: 0xFB

to

  BUG: KASAN: double-free or invalid-free in kmalloc_oob_left+0xe5/0xef

This makes a bug uniquely identifiable by the first report line.  To
account for removing of the unexpected shadow value, print shadow bytes
at the end of the report as in reports for other kinds of bugs.

Link: http://lkml.kernel.org/r/20170302134851.101218-9-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 5ab6d91ac9)
Change-Id: I02dee92190216601d65866eb1c27f7381a22b0da
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
617cf5e697 UPSTREAM: kasan: print page description after stacks
Moves page description after the stacks since it's less important.

Link: http://lkml.kernel.org/r/20170302134851.101218-8-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 430a05f91d)
Change-Id: Ia20b7d6cf5602531072e9bd4fd478737f8623db1
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
28ab6eaea1 UPSTREAM: kasan: improve slab object description
Changes slab object description from:

  Object at ffff880068388540, in cache kmalloc-128 size: 128

to:

  The buggy address belongs to the object at ffff880068388540
   which belongs to the cache kmalloc-128 of size 128
  The buggy address is located 123 bytes inside of
   128-byte region [ffff880068388540, ffff8800683885c0)

Makes it more explanatory and adds information about relative offset of
the accessed address to the start of the object.

Link: http://lkml.kernel.org/r/20170302134851.101218-7-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 0c06f1f86c)
Change-Id: I23928984dbe5a614b84c57e42b20ec13e7c739a4
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
9472afc62e UPSTREAM: kasan: change report header
Change report header format from:

  BUG: KASAN: use-after-free in unwind_get_return_address+0x28a/0x2c0 at addr ffff880069437950
  Read of size 8 by task insmod/3925

to:

  BUG: KASAN: use-after-free in unwind_get_return_address+0x28a/0x2c0
  Read of size 8 at addr ffff880069437950 by task insmod/3925

The exact access address is not usually important, so move it to the
second line.  This also makes the header look visually balanced.

Link: http://lkml.kernel.org/r/20170302134851.101218-6-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 7f0a84c23b)
Change-Id: If9cacce637c317538d813b05ef2647707300d310
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
7079544af3 UPSTREAM: kasan: simplify address description logic
Simplify logic for describing a memory address.  Add addr_to_page()
helper function.

Makes the code easier to follow.

Link: http://lkml.kernel.org/r/20170302134851.101218-5-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from db429f16e0)
Change-Id: Ie688a8fe0da5d1012e64bdbd26b5e7cb2ed43ff8
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
6aa280e225 UPSTREAM: kasan: change allocation and freeing stack traces headers
Change stack traces headers from:

  Allocated:
  PID = 42

to:

  Allocated by task 42:

Makes the report one line shorter and look better.

Link: http://lkml.kernel.org/r/20170302134851.101218-4-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from b6b72f4919)
Change-Id: Iab66777f16016b5a3a8ce85f7cc62d4572fcf5b0
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
b6fa387bdb UPSTREAM: kasan: unify report headers
Unify KASAN report header format for different kinds of bad memory
accesses.  Makes the code simpler.

Link: http://lkml.kernel.org/r/20170302134851.101218-3-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 7d418f7b0d)
Change-Id: I81577ad4617e8c4624fc0701f45a197d211f12a6
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Konovalov
6d34bde21c UPSTREAM: kasan: introduce helper functions for determining bug type
Patch series "kasan: improve error reports", v2.

This patchset improves KASAN reports by making them easier to read and a
little more detailed.  Also improves mm/kasan/report.c readability.

Effectively changes a use-after-free report to:

  ==================================================================
  BUG: KASAN: use-after-free in kmalloc_uaf+0xaa/0xb6 [test_kasan]
  Write of size 1 at addr ffff88006aa59da8 by task insmod/3951

  CPU: 1 PID: 3951 Comm: insmod Tainted: G    B           4.10.0+ #84
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   dump_stack+0x292/0x398
   print_address_description+0x73/0x280
   kasan_report.part.2+0x207/0x2f0
   __asan_report_store1_noabort+0x2c/0x30
   kmalloc_uaf+0xaa/0xb6 [test_kasan]
   kmalloc_tests_init+0x4f/0xa48 [test_kasan]
   do_one_initcall+0xf3/0x390
   do_init_module+0x215/0x5d0
   load_module+0x54de/0x82b0
   SYSC_init_module+0x3be/0x430
   SyS_init_module+0x9/0x10
   entry_SYSCALL_64_fastpath+0x1f/0xc2
  RIP: 0033:0x7f22cfd0b9da
  RSP: 002b:00007ffe69118a78 EFLAGS: 00000206 ORIG_RAX: 00000000000000af
  RAX: ffffffffffffffda RBX: 0000555671242090 RCX: 00007f22cfd0b9da
  RDX: 00007f22cffcaf88 RSI: 000000000004df7e RDI: 00007f22d0399000
  RBP: 00007f22cffcaf88 R08: 0000000000000003 R09: 0000000000000000
  R10: 00007f22cfd07d0a R11: 0000000000000206 R12: 0000555671243190
  R13: 000000000001fe81 R14: 0000000000000000 R15: 0000000000000004

  Allocated by task 3951:
   save_stack_trace+0x16/0x20
   save_stack+0x43/0xd0
   kasan_kmalloc+0xad/0xe0
   kmem_cache_alloc_trace+0x82/0x270
   kmalloc_uaf+0x56/0xb6 [test_kasan]
   kmalloc_tests_init+0x4f/0xa48 [test_kasan]
   do_one_initcall+0xf3/0x390
   do_init_module+0x215/0x5d0
   load_module+0x54de/0x82b0
   SYSC_init_module+0x3be/0x430
   SyS_init_module+0x9/0x10
   entry_SYSCALL_64_fastpath+0x1f/0xc2

  Freed by task 3951:
   save_stack_trace+0x16/0x20
   save_stack+0x43/0xd0
   kasan_slab_free+0x72/0xc0
   kfree+0xe8/0x2b0
   kmalloc_uaf+0x85/0xb6 [test_kasan]
   kmalloc_tests_init+0x4f/0xa48 [test_kasan]
   do_one_initcall+0xf3/0x390
   do_init_module+0x215/0x5d0
   load_module+0x54de/0x82b0
   SYSC_init_module+0x3be/0x430
   SyS_init_module+0x9/0x10
   entry_SYSCALL_64_fastpath+0x1f/0xc

  The buggy address belongs to the object at ffff88006aa59da0
   which belongs to the cache kmalloc-16 of size 16
  The buggy address is located 8 bytes inside of
   16-byte region [ffff88006aa59da0, ffff88006aa59db0)
  The buggy address belongs to the page:
  page:ffffea0001aa9640 count:1 mapcount:0 mapping:          (null) index:0x0
  flags: 0x100000000000100(slab)
  raw: 0100000000000100 0000000000000000 0000000000000000 0000000180800080
  raw: ffffea0001abe380 0000000700000007 ffff88006c401b40 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff88006aa59c80: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
   ffff88006aa59d00: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
  >ffff88006aa59d80: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
                                    ^
   ffff88006aa59e00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
   ffff88006aa59e80: fb fb fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
  ==================================================================

from:

  ==================================================================
  BUG: KASAN: use-after-free in kmalloc_uaf+0xaa/0xb6 [test_kasan] at addr ffff88006c4dcb28
  Write of size 1 by task insmod/3984
  CPU: 1 PID: 3984 Comm: insmod Tainted: G    B           4.10.0+ #83
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   dump_stack+0x292/0x398
   kasan_object_err+0x1c/0x70
   kasan_report.part.1+0x20e/0x4e0
   __asan_report_store1_noabort+0x2c/0x30
   kmalloc_uaf+0xaa/0xb6 [test_kasan]
   kmalloc_tests_init+0x4f/0xa48 [test_kasan]
   do_one_initcall+0xf3/0x390
   do_init_module+0x215/0x5d0
   load_module+0x54de/0x82b0
   SYSC_init_module+0x3be/0x430
   SyS_init_module+0x9/0x10
   entry_SYSCALL_64_fastpath+0x1f/0xc2
  RIP: 0033:0x7feca0f779da
  RSP: 002b:00007ffdfeae5218 EFLAGS: 00000206 ORIG_RAX: 00000000000000af
  RAX: ffffffffffffffda RBX: 000055a064c13090 RCX: 00007feca0f779da
  RDX: 00007feca1236f88 RSI: 000000000004df7e RDI: 00007feca1605000
  RBP: 00007feca1236f88 R08: 0000000000000003 R09: 0000000000000000
  R10: 00007feca0f73d0a R11: 0000000000000206 R12: 000055a064c14190
  R13: 000000000001fe81 R14: 0000000000000000 R15: 0000000000000004
  Object at ffff88006c4dcb20, in cache kmalloc-16 size: 16
  Allocated:
  PID = 3984
   save_stack_trace+0x16/0x20
   save_stack+0x43/0xd0
   kasan_kmalloc+0xad/0xe0
   kmem_cache_alloc_trace+0x82/0x270
   kmalloc_uaf+0x56/0xb6 [test_kasan]
   kmalloc_tests_init+0x4f/0xa48 [test_kasan]
   do_one_initcall+0xf3/0x390
   do_init_module+0x215/0x5d0
   load_module+0x54de/0x82b0
   SYSC_init_module+0x3be/0x430
   SyS_init_module+0x9/0x10
   entry_SYSCALL_64_fastpath+0x1f/0xc2
  Freed:
  PID = 3984
   save_stack_trace+0x16/0x20
   save_stack+0x43/0xd0
   kasan_slab_free+0x73/0xc0
   kfree+0xe8/0x2b0
   kmalloc_uaf+0x85/0xb6 [test_kasan]
   kmalloc_tests_init+0x4f/0xa48 [test_kasan]
   do_one_initcall+0xf3/0x390
   do_init_module+0x215/0x5d0
   load_module+0x54de/0x82b0
   SYSC_init_module+0x3be/0x430
   SyS_init_module+0x9/0x10
   entry_SYSCALL_64_fastpath+0x1f/0xc2
  Memory state around the buggy address:
   ffff88006c4dca00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
   ffff88006c4dca80: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
  >ffff88006c4dcb00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
                                    ^
   ffff88006c4dcb80: fb fb fc fc 00 00 fc fc fb fb fc fc fb fb fc fc
   ffff88006c4dcc00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
  ==================================================================

This patch (of 9):

Introduce get_shadow_bug_type() function, which determines bug type
based on the shadow value for a particular kernel address.  Introduce
get_wild_bug_type() function, which determines bug type for addresses
which don't have a corresponding shadow value.

Link: http://lkml.kernel.org/r/20170302134851.101218-2-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 5e82cd1203)
Change-Id: I3359775858891c9c66d11d2a520831e329993ae9
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Mark Rutland
b6d435cf89 BACKPORT: kasan: report only the first error by default
Disable kasan after the first report.  There are several reasons for
this:

 - Single bug quite often has multiple invalid memory accesses causing
   storm in the dmesg.

 - Write OOB access might corrupt metadata so the next report will print
   bogus alloc/free stacktraces.

 - Reports after the first easily could be not bugs by itself but just
   side effects of the first one.

Given that multiple reports usually only do harm, it makes sense to
disable kasan after the first one.  If user wants to see all the
reports, the boot-time parameter kasan_multi_shot must be used.

[aryabinin@virtuozzo.com: wrote changelog and doc, added missing include]
Link: http://lkml.kernel.org/r/20170323154416.30257-1-aryabinin@virtuozzo.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from b0845ce583)
Change-Id: Ia8c6d40dd0d4f5b944bf3501c08d7a825070b116
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Dmitry Vyukov
bc025fd6de UPSTREAM: kasan: fix races in quarantine_remove_cache()
quarantine_remove_cache() frees all pending objects that belong to the
cache, before we destroy the cache itself.  However there are currently
two possibilities how it can fail to do so.

First, another thread can hold some of the objects from the cache in
temp list in quarantine_put().  quarantine_put() has a windows of
enabled interrupts, and on_each_cpu() in quarantine_remove_cache() can
finish right in that window.  These objects will be later freed into the
destroyed cache.

Then, quarantine_reduce() has the same problem.  It grabs a batch of
objects from the global quarantine, then unlocks quarantine_lock and
then frees the batch.  quarantine_remove_cache() can finish while some
objects from the cache are still in the local to_free list in
quarantine_reduce().

Fix the race with quarantine_put() by disabling interrupts for the whole
duration of quarantine_put().  In combination with on_each_cpu() in
quarantine_remove_cache() it ensures that quarantine_remove_cache()
either sees the objects in the per-cpu list or in the global list.

Fix the race with quarantine_reduce() by protecting quarantine_reduce()
with srcu critical section and then doing synchronize_srcu() at the end
of quarantine_remove_cache().

I've done some assessment of how good synchronize_srcu() works in this
case.  And on a 4 CPU VM I see that it blocks waiting for pending read
critical sections in about 2-3% of cases.  Which looks good to me.

I suspect that these races are the root cause of some GPFs that I
episodically hit.  Previously I did not have any explanation for them.

  BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
  IP: qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
  PGD 6aeea067
  PUD 60ed7067
  PMD 0
  Oops: 0000 [#1] SMP KASAN
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Modules linked in:
  CPU: 0 PID: 13667 Comm: syz-executor2 Not tainted 4.10.0+ #60
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  task: ffff88005f948040 task.stack: ffff880069818000
  RIP: 0010:qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
  RSP: 0018:ffff88006981f298 EFLAGS: 00010246
  RAX: ffffea0000ffff00 RBX: 0000000000000000 RCX: ffffea0000ffff1f
  RDX: 0000000000000000 RSI: ffff88003fffc3e0 RDI: 0000000000000000
  RBP: ffff88006981f2c0 R08: ffff88002fed7bd8 R09: 00000001001f000d
  R10: 00000000001f000d R11: ffff88006981f000 R12: ffff88003fffc3e0
  R13: ffff88006981f2d0 R14: ffffffff81877fae R15: 0000000080000000
  FS:  00007fb911a2d700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000c8 CR3: 0000000060ed6000 CR4: 00000000000006f0
  Call Trace:
   quarantine_reduce+0x10e/0x120 mm/kasan/quarantine.c:239
   kasan_kmalloc+0xca/0xe0 mm/kasan/kasan.c:590
   kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
   slab_post_alloc_hook mm/slab.h:456 [inline]
   slab_alloc_node mm/slub.c:2718 [inline]
   kmem_cache_alloc_node+0x1d3/0x280 mm/slub.c:2754
   __alloc_skb+0x10f/0x770 net/core/skbuff.c:219
   alloc_skb include/linux/skbuff.h:932 [inline]
   _sctp_make_chunk+0x3b/0x260 net/sctp/sm_make_chunk.c:1388
   sctp_make_data net/sctp/sm_make_chunk.c:1420 [inline]
   sctp_make_datafrag_empty+0x208/0x360 net/sctp/sm_make_chunk.c:746
   sctp_datamsg_from_user+0x7e8/0x11d0 net/sctp/chunk.c:266
   sctp_sendmsg+0x2611/0x3970 net/sctp/socket.c:1962
   inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
   sock_sendmsg_nosec net/socket.c:633 [inline]
   sock_sendmsg+0xca/0x110 net/socket.c:643
   SYSC_sendto+0x660/0x810 net/socket.c:1685
   SyS_sendto+0x40/0x50 net/socket.c:1653

I am not sure about backporting.  The bug is quite hard to trigger, I've
seen it few times during our massive continuous testing (however, it
could be cause of some other episodic stray crashes as it leads to
memory corruption...).  If it is triggered, the consequences are very
bad -- almost definite bad memory corruption.  The fix is non trivial
and has chances of introducing new bugs.  I am also not sure how
actively people use KASAN on older releases.

[dvyukov@google.com: - sorted includes[
  Link: http://lkml.kernel.org/r/20170309094028.51088-1-dvyukov@google.com
Link: http://lkml.kernel.org/r/20170308151532.5070-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from ce5bec54bb)
Change-Id: I9199861f005d7932c37397b3ae23a123a4cff89b
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Dmitry Vyukov
3e8d429c89 UPSTREAM: kasan: resched in quarantine_remove_cache()
We see reported stalls/lockups in quarantine_remove_cache() on machines
with large amounts of RAM.  quarantine_remove_cache() needs to scan
whole quarantine in order to take out all objects belonging to the
cache.  Quarantine is currently 1/32-th of RAM, e.g.  on a machine with
256GB of memory that will be 8GB.  Moreover quarantine scanning is a
walk over uncached linked list, which is slow.

Add cond_resched() after scanning of each non-empty batch of objects.
Batches are specifically kept of reasonable size for quarantine_put().
On a machine with 256GB of RAM we should have ~512 non-empty batches,
each with 16MB of objects.

Link: http://lkml.kernel.org/r/20170308154239.25440-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 68fd814a33)
Change-Id: I8a38466a9b9544bb303202c94bfba6201251e3f3
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Ingo Molnar
b2667168ca BACKPORT: kasan, sched/headers: Uninline kasan_enable/disable_current()
<linux/kasan.h> is a low level header that is included early
in affected kernel headers. But it includes <linux/sched.h>
which complicates the cleanup of sched.h dependencies.

But kasan.h has almost no need for sched.h: its only use of
scheduler functionality is in two inline functions which are
not used very frequently - so uninline kasan_enable_current()
and kasan_disable_current().

Also add a <linux/sched.h> dependency to a .c file that depended
on kasan.h including it.

This paves the way to remove the <linux/sched.h> include from kasan.h.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Bug: 64145065
(cherry-picked from af8601ad42)
Change-Id: I13fd2d3927f663d694ea0d5bf44f18e2c62ae013
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Greg Thelen
29efb55f0a BACKPORT: kasan: drain quarantine of memcg slab objects
Per memcg slab accounting and kasan have a problem with kmem_cache
destruction.
 - kmem_cache_create() allocates a kmem_cache, which is used for
   allocations from processes running in root (top) memcg.
 - Processes running in non root memcg and allocating with either
   __GFP_ACCOUNT or from a SLAB_ACCOUNT cache use a per memcg
   kmem_cache.
 - Kasan catches use-after-free by having kfree() and kmem_cache_free()
   defer freeing of objects. Objects are placed in a quarantine.
 - kmem_cache_destroy() destroys root and non root kmem_caches. It takes
   care to drain the quarantine of objects from the root memcg's
   kmem_cache, but ignores objects associated with non root memcg. This
   causes leaks because quarantined per memcg objects refer to per memcg
   kmem cache being destroyed.

To see the problem:

 1) create a slab cache with kmem_cache_create(,,,SLAB_ACCOUNT,)
 2) from non root memcg, allocate and free a few objects from cache
 3) dispose of the cache with kmem_cache_destroy() kmem_cache_destroy()
    will trigger a "Slab cache still has objects" warning indicating
    that the per memcg kmem_cache structure was leaked.

Fix the leak by draining kasan quarantined objects allocated from non
root memcg.

Racing memcg deletion is tricky, but handled.  kmem_cache_destroy() =>
shutdown_memcg_caches() => __shutdown_memcg_cache() => shutdown_cache()
flushes per memcg quarantined objects, even if that memcg has been
rmdir'd and gone through memcg_deactivate_kmem_caches().

This leak only affects destroyed SLAB_ACCOUNT kmem caches when kasan is
enabled.  So I don't think it's worth patching stable kernels.

Link: http://lkml.kernel.org/r/1482257462-36948-1-git-send-email-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from f9fa1d919c)
Change-Id: Ie054d9cde7fb1ce62e65776bff5a70f72925d037
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Dmitry Vyukov
626e014bcc UPSTREAM: kasan: eliminate long stalls during quarantine reduction
Currently we dedicate 1/32 of RAM for quarantine and then reduce it by
1/4 of total quarantine size.  This can be a significant amount of
memory.  For example, with 4GB of RAM total quarantine size is 128MB and
it is reduced by 32MB at a time.  With 128GB of RAM total quarantine
size is 4GB and it is reduced by 1GB.  This leads to several problems:

 - freeing 1GB can take tens of seconds, causes rcu stall warnings and
   just introduces unexpected long delays at random places
 - if kmalloc() is called under a mutex, other threads stall on that
   mutex while a thread reduces quarantine
 - threads wait on quarantine_lock while one thread grabs a large batch
   of objects to evict
 - we walk the uncached list of object to free twice which makes all of
   the above worse
 - when a thread frees objects, they are already not accounted against
   global_quarantine.bytes; as the result we can have quarantine_size
   bytes in quarantine + unbounded amount of memory in large batches in
   threads that are in process of freeing

Reduce size of quarantine in smaller batches to reduce the delays.  The
only reason to reduce it in batches is amortization of overheads, the
new batch size of 1MB should be well enough to amortize spinlock
lock/unlock and few function calls.

Plus organize quarantine as a FIFO array of batches.  This allows to not
walk the list in quarantine_reduce() under quarantine_lock, which in
turn reduces contention and is just faster.

This improves performance of heavy load (syzkaller fuzzing) by ~20% with
4 CPUs and 32GB of RAM.  Also this eliminates frequent (every 5 sec)
drops of CPU consumption from ~400% to ~100% (one thread reduces
quarantine while others are waiting on a mutex).

Some reference numbers:
1. Machine with 4 CPUs and 4GB of memory. Quarantine size 128MB.
   Currently we free 32MB at at time.
   With new code we free 1MB at a time (1024 batches, ~128 are used).
2. Machine with 32 CPUs and 128GB of memory. Quarantine size 4GB.
   Currently we free 1GB at at time.
   With new code we free 8MB at a time (1024 batches, ~512 are used).
3. Machine with 4096 CPUs and 1TB of memory. Quarantine size 32GB.
   Currently we free 8GB at at time.
   With new code we free 4MB at a time (16K batches, ~8K are used).

Link: http://lkml.kernel.org/r/1478756952-18695-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 64abdcb243)
Change-Id: Idf73cb292453ceffc437121b7a5e152cde1901ff
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Dmitry Vyukov
a989294733 UPSTREAM: kasan: support panic_on_warn
If user sets panic_on_warn, he wants kernel to panic if there is
anything barely wrong with the kernel.  KASAN-detected errors are
definitely not less benign than an arbitrary kernel WARNING.

Panic after KASAN errors if panic_on_warn is set.

We use this for continuous fuzzing where we want kernel to stop and
reboot on any error.

Link: http://lkml.kernel.org/r/1476694764-31986-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 5c5c1f36ce)
Change-Id: Iee7cbc4ffbce8eb8d827447fdf960a6520d10b00
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Josh Poimboeuf
438bf40d28 UPSTREAM: x86/suspend: fix false positive KASAN warning on suspend/resume
Resuming from a suspend operation is showing a KASAN false positive
warning:

  BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878
  Read of size 8 by task pm-suspend/7774
  page:ffffea000e19f5c0 count:0 mapcount:0 mapping:          (null) index:0x0
  flags: 0x2ffff0000000000()
  page dumped because: kasan: bad access detected
  CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G    B           4.9.0-rc7+ #8
  Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
  Call Trace:
    dump_stack+0x63/0x82
    kasan_report_error+0x4b4/0x4e0
    ? acpi_hw_read_port+0xd0/0x1ea
    ? kfree_const+0x22/0x30
    ? acpi_hw_validate_io_request+0x1a6/0x1a6
    __asan_report_load8_noabort+0x61/0x70
    ? unwind_get_return_address+0x11d/0x130
    unwind_get_return_address+0x11d/0x130
    ? unwind_next_frame+0x97/0xf0
    __save_stack_trace+0x92/0x100
    save_stack_trace+0x1b/0x20
    save_stack+0x46/0xd0
    ? save_stack_trace+0x1b/0x20
    ? save_stack+0x46/0xd0
    ? kasan_kmalloc+0xad/0xe0
    ? kasan_slab_alloc+0x12/0x20
    ? acpi_hw_read+0x2b6/0x3aa
    ? acpi_hw_validate_register+0x20b/0x20b
    ? acpi_hw_write_port+0x72/0xc7
    ? acpi_hw_write+0x11f/0x15f
    ? acpi_hw_read_multiple+0x19f/0x19f
    ? memcpy+0x45/0x50
    ? acpi_hw_write_port+0x72/0xc7
    ? acpi_hw_write+0x11f/0x15f
    ? acpi_hw_read_multiple+0x19f/0x19f
    ? kasan_unpoison_shadow+0x36/0x50
    kasan_kmalloc+0xad/0xe0
    kasan_slab_alloc+0x12/0x20
    kmem_cache_alloc_trace+0xbc/0x1e0
    ? acpi_get_sleep_type_data+0x9a/0x578
    acpi_get_sleep_type_data+0x9a/0x578
    acpi_hw_legacy_wake_prep+0x88/0x22c
    ? acpi_hw_legacy_sleep+0x3c7/0x3c7
    ? acpi_write_bit_register+0x28d/0x2d3
    ? acpi_read_bit_register+0x19b/0x19b
    acpi_hw_sleep_dispatch+0xb5/0xba
    acpi_leave_sleep_state_prep+0x17/0x19
    acpi_suspend_enter+0x154/0x1e0
    ? trace_suspend_resume+0xe8/0xe8
    suspend_devices_and_enter+0xb09/0xdb0
    ? printk+0xa8/0xd8
    ? arch_suspend_enable_irqs+0x20/0x20
    ? try_to_freeze_tasks+0x295/0x600
    pm_suspend+0x6c9/0x780
    ? finish_wait+0x1f0/0x1f0
    ? suspend_devices_and_enter+0xdb0/0xdb0
    state_store+0xa2/0x120
    ? kobj_attr_show+0x60/0x60
    kobj_attr_store+0x36/0x70
    sysfs_kf_write+0x131/0x200
    kernfs_fop_write+0x295/0x3f0
    __vfs_write+0xef/0x760
    ? handle_mm_fault+0x1346/0x35e0
    ? do_iter_readv_writev+0x660/0x660
    ? __pmd_alloc+0x310/0x310
    ? do_lock_file_wait+0x1e0/0x1e0
    ? apparmor_file_permission+0x18/0x20
    ? security_file_permission+0x73/0x1c0
    ? rw_verify_area+0xbd/0x2b0
    vfs_write+0x149/0x4a0
    SyS_write+0xd9/0x1c0
    ? SyS_read+0x1c0/0x1c0
    entry_SYSCALL_64_fastpath+0x1e/0xad
  Memory state around the buggy address:
   ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4
                                                                  ^
   ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
   ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00

KASAN instrumentation poisons the stack when entering a function and
unpoisons it when exiting the function.  However, in the suspend path,
some functions never return, so their stack never gets unpoisoned,
resulting in stale KASAN shadow data which can cause later false
positive warnings like the one above.

Reported-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Bug: 64145065
(cherry-picked from b53f40db59)
Change-Id: Iafbf1f7f19bb7db9a49316cf70050a3dae576f15
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Dmitry Vyukov
1b97073821 UPSTREAM: kasan: support use-after-scope detection
Gcc revision 241896 implements use-after-scope detection.  Will be
available in gcc 7.  Support it in KASAN.

Gcc emits 2 new callbacks to poison/unpoison large stack objects when
they go in/out of scope.  Implement the callbacks and add a test.

[dvyukov@google.com: v3]
  Link: http://lkml.kernel.org/r/1479998292-144502-1-git-send-email-dvyukov@google.com
Link: http://lkml.kernel.org/r/1479226045-145148-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 828347f8f9)
Change-Id: Ib9cb585efbe98ba11a7efbd233ebd97cb4214a92
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Dmitry Vyukov
59ab55f0aa BACKPORT: kprobes: Unpoison stack in jprobe_return() for KASAN
I observed false KSAN positives in the sctp code, when
sctp uses jprobe_return() in jsctp_sf_eat_sack().

The stray 0xf4 in shadow memory are stack redzones:

[     ] ==================================================================
[     ] BUG: KASAN: stack-out-of-bounds in memcmp+0xe9/0x150 at addr ffff88005e48f480
[     ] Read of size 1 by task syz-executor/18535
[     ] page:ffffea00017923c0 count:0 mapcount:0 mapping:          (null) index:0x0
[     ] flags: 0x1fffc0000000000()
[     ] page dumped because: kasan: bad access detected
[     ] CPU: 1 PID: 18535 Comm: syz-executor Not tainted 4.8.0+ #28
[     ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[     ]  ffff88005e48f2d0 ffffffff82d2b849 ffffffff0bc91e90 fffffbfff10971e8
[     ]  ffffed000bc91e90 ffffed000bc91e90 0000000000000001 0000000000000000
[     ]  ffff88005e48f480 ffff88005e48f350 ffffffff817d3169 ffff88005e48f370
[     ] Call Trace:
[     ]  [<ffffffff82d2b849>] dump_stack+0x12e/0x185
[     ]  [<ffffffff817d3169>] kasan_report+0x489/0x4b0
[     ]  [<ffffffff817d31a9>] __asan_report_load1_noabort+0x19/0x20
[     ]  [<ffffffff82d49529>] memcmp+0xe9/0x150
[     ]  [<ffffffff82df7486>] depot_save_stack+0x176/0x5c0
[     ]  [<ffffffff817d2031>] save_stack+0xb1/0xd0
[     ]  [<ffffffff817d27f2>] kasan_slab_free+0x72/0xc0
[     ]  [<ffffffff817d05b8>] kfree+0xc8/0x2a0
[     ]  [<ffffffff85b03f19>] skb_free_head+0x79/0xb0
[     ]  [<ffffffff85b0900a>] skb_release_data+0x37a/0x420
[     ]  [<ffffffff85b090ff>] skb_release_all+0x4f/0x60
[     ]  [<ffffffff85b11348>] consume_skb+0x138/0x370
[     ]  [<ffffffff8676ad7b>] sctp_chunk_put+0xcb/0x180
[     ]  [<ffffffff8676ae88>] sctp_chunk_free+0x58/0x70
[     ]  [<ffffffff8677fa5f>] sctp_inq_pop+0x68f/0xef0
[     ]  [<ffffffff8675ee36>] sctp_assoc_bh_rcv+0xd6/0x4b0
[     ]  [<ffffffff8677f2c1>] sctp_inq_push+0x131/0x190
[     ]  [<ffffffff867bad69>] sctp_backlog_rcv+0xe9/0xa20
[ ... ]
[     ] Memory state around the buggy address:
[     ]  ffff88005e48f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]  ffff88005e48f400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ] >ffff88005e48f480: f4 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]                    ^
[     ]  ffff88005e48f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ]  ffff88005e48f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[     ] ==================================================================

KASAN stack instrumentation poisons stack redzones on function entry
and unpoisons them on function exit. If a function exits abnormally
(e.g. with a longjmp like jprobe_return()), stack redzones are left
poisoned. Later this leads to random KASAN false reports.

Unpoison stack redzones in the frames we are going to jump over
before doing actual longjmp in jprobe_return().

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: kasan-dev@googlegroups.com
Cc: surovegin@google.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/1476454043-101898-1-git-send-email-dvyukov@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Bug: 64145065
(cherry-picked from 9f7d416c36)
Change-Id: I84e4fac44265a69f615601266b3415147dade633
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Alexander Potapenko
241f6bfcd4 UPSTREAM: kasan: remove the unnecessary WARN_ONCE from quarantine.c
It's quite unlikely that the user will so little memory that the per-CPU
quarantines won't fit into the given fraction of the available memory.
Even in that case he won't be able to do anything with the information
given in the warning.

Link: http://lkml.kernel.org/r/1470929182-101413-1-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from bcbf0d566b)
Change-Id: I1230018140c32fab7ea1d1dc1d54471aa48ae45f
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Alexander Potapenko
1bd1cfb3bf UPSTREAM: kasan: avoid overflowing quarantine size on low memory systems
If the total amount of memory assigned to quarantine is less than the
amount of memory assigned to per-cpu quarantines, |new_quarantine_size|
may overflow.  Instead, set it to zero.

[akpm@linux-foundation.org: cleanup: use WARN_ONCE return value]
Link: http://lkml.kernel.org/r/1470063563-96266-1-git-send-email-glider@google.com
Fixes: 55834c5909 ("mm: kasan: initial memory quarantine implementation")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from c3cee37228)
Change-Id: I8a647e5ee5d9494698aa2a31d50d587d6ff8b65c
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
80f68e05f9 UPSTREAM: kasan: improve double-free reports
Currently we just dump stack in case of double free bug.
Let's dump all info about the object that we have.

[aryabinin@virtuozzo.com: change double free message per Alexander]
  Link: http://lkml.kernel.org/r/1470153654-30160-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1470062715-14077-6-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 7e08897893)
Change-Id: I733bf6272d44597907bcf01f1d13695b8e9f8cb4
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Joe Perches
40ae2227e9 BACKPORT: mm: coalesce split strings
Kernel style prefers a single string over split strings when the string is
'user-visible'.

Miscellanea:

 - Add a missing newline
 - Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Tejun Heo <tj@kernel.org>	[percpu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 756a025f00)
Change-Id: I377fb1542980c15d2f306924656227ad17b02b5e
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
73345d079c BACKPORT: mm/kasan: get rid of ->state in struct kasan_alloc_meta
The state of object currently tracked in two places - shadow memory, and
the ->state field in struct kasan_alloc_meta.  We can get rid of the
latter.  The will save us a little bit of memory.  Also, this allow us
to move free stack into struct kasan_alloc_meta, without increasing
memory consumption.  So now we should always know when the last time the
object was freed.  This may be useful for long delayed use-after-free
bugs.

As a side effect this fixes following UBSAN warning:
	UBSAN: Undefined behaviour in mm/kasan/quarantine.c:102:13
	member access within misaligned address ffff88000d1efebc for type 'struct qlist_node'
	which requires 8 byte alignment

Link: http://lkml.kernel.org/r/1470062715-14077-5-git-send-email-aryabinin@virtuozzo.com
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from b3cbd9bf77)
Change-Id: Iaa4959a78ffd2e49f9060099df1fb32483df3085
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
bc4e17a386 UPSTREAM: mm/kasan: get rid of ->alloc_size in struct kasan_alloc_meta
Size of slab object already stored in cache->object_size.

Note, that kmalloc() internally rounds up size of allocation, so
object_size may be not equal to alloc_size, but, usually we don't need
to know the exact size of allocated object.  In case if we need that
information, we still can figure it out from the report.  The dump of
shadow memory allows to identify the end of allocated memory, and
thereby the exact allocation size.

Link: http://lkml.kernel.org/r/1470062715-14077-4-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 47b5c2a0f0)
Change-Id: I76b555f9a8469f685607ca50f6c51b2e0ad1b4ab
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
d0d790dbfc UPSTREAM: mm: kasan: remove unused 'reserved' field from struct kasan_alloc_meta
Commit cd11016e5f ("mm, kasan: stackdepot implementation.  Enable
stackdepot for SLAB") added 'reserved' field, but never used it.

Link: http://lkml.kernel.org/r/1464021054-2307-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 9725759a96)
Change-Id: I34d5d28a6f6e1014d234f38c23b6e4aa408d3e84
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
362ec928f5 UPSTREAM: mm/kasan, slub: don't disable interrupts when object leaves quarantine
SLUB doesn't require disabled interrupts to call ___cache_free().

Link: http://lkml.kernel.org/r/1470062715-14077-3-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from f7376aed6c)
Change-Id: I9c8ae37791ab10c746414322a672bdf0ebd1ed9f
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
d03c2b3be0 UPSTREAM: mm/kasan: don't reduce quarantine in atomic contexts
Currently we call quarantine_reduce() for ___GFP_KSWAPD_RECLAIM (implied
by __GFP_RECLAIM) allocation.  So, basically we call it on almost every
allocation.  quarantine_reduce() sometimes is heavy operation, and
calling it with disabled interrupts may trigger hard LOCKUP:

 NMI watchdog: Watchdog detected hard LOCKUP on cpu 2irq event stamp: 1411258
 Call Trace:
  <NMI>   dump_stack+0x68/0x96
   watchdog_overflow_callback+0x15b/0x190
   __perf_event_overflow+0x1b1/0x540
   perf_event_overflow+0x14/0x20
   intel_pmu_handle_irq+0x36a/0xad0
   perf_event_nmi_handler+0x2c/0x50
   nmi_handle+0x128/0x480
   default_do_nmi+0xb2/0x210
   do_nmi+0x1aa/0x220
   end_repeat_nmi+0x1a/0x1e
  <<EOE>>   __kernel_text_address+0x86/0xb0
   print_context_stack+0x7b/0x100
   dump_trace+0x12b/0x350
   save_stack_trace+0x2b/0x50
   set_track+0x83/0x140
   free_debug_processing+0x1aa/0x420
   __slab_free+0x1d6/0x2e0
   ___cache_free+0xb6/0xd0
   qlist_free_all+0x83/0x100
   quarantine_reduce+0x177/0x1b0
   kasan_kmalloc+0xf3/0x100

Reduce the quarantine_reduce iff direct reclaim is allowed.

Fixes: 55834c59098d("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/1470062715-14077-2-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 4b3ec5a3f4)
Change-Id: I7e6ad29acabc2091f98a8aac54ed041b574b5e7e
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
b5a3dd835e UPSTREAM: mm/kasan: fix corruptions and false positive reports
Once an object is put into quarantine, we no longer own it, i.e.  object
could leave the quarantine and be reallocated.  So having set_track()
call after the quarantine_put() may corrupt slab objects.

 BUG kmalloc-4096 (Not tainted): Poison overwritten
 -----------------------------------------------------------------------------
 Disabling lock debugging due to kernel taint
 INFO: 0xffff8804540de850-0xffff8804540de857. First byte 0xb5 instead of 0x6b
...
 INFO: Freed in qlist_free_all+0x42/0x100 age=75 cpu=3 pid=24492
  __slab_free+0x1d6/0x2e0
  ___cache_free+0xb6/0xd0
  qlist_free_all+0x83/0x100
  quarantine_reduce+0x177/0x1b0
  kasan_kmalloc+0xf3/0x100
  kasan_slab_alloc+0x12/0x20
  kmem_cache_alloc+0x109/0x3e0
  mmap_region+0x53e/0xe40
  do_mmap+0x70f/0xa50
  vm_mmap_pgoff+0x147/0x1b0
  SyS_mmap_pgoff+0x2c7/0x5b0
  SyS_mmap+0x1b/0x30
  do_syscall_64+0x1a0/0x4e0
  return_from_SYSCALL_64+0x0/0x7a
 INFO: Slab 0xffffea0011503600 objects=7 used=7 fp=0x          (null) flags=0x8000000000004080
 INFO: Object 0xffff8804540de848 @offset=26696 fp=0xffff8804540dc588
 Redzone ffff8804540de840: bb bb bb bb bb bb bb bb                          ........
 Object ffff8804540de848: 6b 6b 6b 6b 6b 6b 6b 6b b5 52 00 00 f2 01 60 cc  kkkkkkkk.R....`.

Similarly, poisoning after the quarantine_put() leads to false positive
use-after-free reports:

 BUG: KASAN: use-after-free in anon_vma_interval_tree_insert+0x304/0x430 at addr ffff880405c540a0
 Read of size 8 by task trinity-c0/3036
 CPU: 0 PID: 3036 Comm: trinity-c0 Not tainted 4.7.0-think+ #9
 Call Trace:
   dump_stack+0x68/0x96
   kasan_report_error+0x222/0x600
   __asan_report_load8_noabort+0x61/0x70
   anon_vma_interval_tree_insert+0x304/0x430
   anon_vma_chain_link+0x91/0xd0
   anon_vma_clone+0x136/0x3f0
   anon_vma_fork+0x81/0x4c0
   copy_process.part.47+0x2c43/0x5b20
   _do_fork+0x16d/0xbd0
   SyS_clone+0x19/0x20
   do_syscall_64+0x1a0/0x4e0
   entry_SYSCALL64_slow_path+0x25/0x25

Fix this by putting an object in the quarantine after all other
operations.

Fixes: 80a9201a59 ("mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB")
Link: http://lkml.kernel.org/r/1470062715-14077-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 4a3d308d66)
Change-Id: Iaa699c447b97f8cb04afdd2d6a5f572bea439185
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Paul Lawrence
785fd027b0 BACKPORT: mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
For KASAN builds:
 - switch SLUB allocator to using stackdepot instead of storing the
   allocation/deallocation stacks in the objects;
 - change the freelist hook so that parts of the freelist can be put
   into the quarantine.

[aryabinin@virtuozzo.com: fixes]
  Link: http://lkml.kernel.org/r/1468601423-28676-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1468347165-41906-3-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 80a9201a59)
Change-Id: I2b59c6d50d0db62d3609edfdc7be54e48f8afa5c
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Joonsoo Kim
c0d8017a9b UPSTREAM: kasan/quarantine: fix bugs on qlist_move_cache()
There are two bugs on qlist_move_cache().  One is that qlist's tail
isn't set properly.  curr->next can be NULL since it is singly linked
list and NULL value on tail is invalid if there is one item on qlist.
Another one is that if cache is matched, qlist_put() is called and it
will set curr->next to NULL.  It would cause to stop the loop
prematurely.

These problems come from complicated implementation so I'd like to
re-implement it completely.  Implementation in this patch is really
simple.  Iterate all qlist_nodes and put them to appropriate list.

Unfortunately, I got this bug sometime ago and lose oops message.  But,
the bug looks trivial and no need to attach oops.

Fixes: 55834c5909 ("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/1467766348-22419-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Kuthonuzo Luruo <poll.stdin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 0ab686d8c8)
Change-Id: Ifca87bd938c74ff18e7fc2680afb15070cc7019f
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30
Andrey Ryabinin
197c34e95e UPSTREAM: mm: mempool: kasan: don't poot mempool objects in quarantine
Currently we may put reserved by mempool elements into quarantine via
kasan_kfree().  This is totally wrong since quarantine may really free
these objects.  So when mempool will try to use such element,
use-after-free will happen.  Or mempool may decide that it no longer
need that element and double-free it.

So don't put object into quarantine in kasan_kfree(), just poison it.
Rename kasan_kfree() to kasan_poison_kfree() to respect that.

Also, we shouldn't use kasan_slab_alloc()/kasan_krealloc() in
kasan_unpoison_element() because those functions may update allocation
stacktrace.  This would be wrong for the most of the remove_element call
sites.

(The only call site where we may want to update alloc stacktrace is
 in mempool_alloc(). Kmemleak solves this by calling
 kmemleak_update_trace(), so we could make something like that too.
 But this is out of scope of this patch).

Fixes: 55834c5909 ("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/575977C3.1010905@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 9b75a867cc)
Change-Id: Idb6c152dae8f8f2975dbe6acb7165315be8b465b
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2018-01-22 13:15:43 +05:30