Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Conflicts:
fs/f2fs/extent_cache.c
Pick changes from AOSP Change-Id: Icd8a85ac0c19a8aa25cd2591a12b4e9b85bdf1c5
("f2fs: catch up to v4.14-rc1")
fs/f2fs/namei.c
Pick changes from AOSP F2FS backport commit 7d5c08fd91
("f2fs: backport from (4c1fad64 - Merge tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs)")
commit f351574172 upstream.
Provide a function, kmemdup_nul(), that will create a NUL-terminated string
from an unterminated character array where the length is known in advance.
This is better than kstrndup() in situations where we already know the
string length as the strnlen() in kstrndup() is superfluous.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af27d9403f upstream.
We get a warning about some slow configurations in randconfig kernels:
mm/memory.c:83:2: error: #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid. [-Werror=cpp]
The warning is reasonable by itself, but gets in the way of randconfig
build testing, so I'm hiding it whenever CONFIG_COMPILE_TEST is set.
The warning was added in 2013 in commit 75980e97da ("mm: fold
page->_last_nid into page->flags where possible").
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b050e3769c upstream.
Since commit 97a16fc82a ("mm, page_alloc: only enforce watermarks for
order-0 allocations"), __zone_watermark_ok() check for high-order
allocations will shortcut per-migratetype free list checks for
ALLOC_HARDER allocations, and return true as long as there's free page
of any migratetype. The intention is that ALLOC_HARDER can allocate
from MIGRATE_HIGHATOMIC free lists, while normal allocations can't.
However, as a side effect, the watermark check will then also return
true when there are pages only on the MIGRATE_ISOLATE list, or (prior to
CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list. Since the
allocation cannot actually obtain isolated pages, and might not be able
to obtain CMA pages, this can result in a false positive.
The condition should be rare and perhaps the outcome is not a fatal one.
Still, it's better if the watermark check is correct. There also
shouldn't be a performance tradeoff here.
Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz
Fixes: 97a16fc82a ("mm, page_alloc: only enforce watermarks for order-0 allocations")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e048cb32f6 upstream.
The align_offset parameter is used by bitmap_find_next_zero_area_off()
to represent the offset of map's base from the previous alignment
boundary; the function ensures that the returned index, plus the
align_offset, honors the specified align_mask.
The logic introduced by commit b5be83e308 ("mm: cma: align to physical
address, not CMA region position") has the cma driver calculate the
offset to the *next* alignment boundary. In most cases, the base
alignment is greater than that specified when making allocations,
resulting in a zero offset whether we align up or down. In the example
given with the commit, the base alignment (8MB) was half the requested
alignment (16MB) so the math also happened to work since the offset is
8MB in both directions. However, when requesting allocations with an
alignment greater than twice that of the base, the returned index would
not be correctly aligned.
Also, the align_order arguments of cma_bitmap_aligned_mask() and
cma_bitmap_aligned_offset() should not be negative so the argument type
was made unsigned.
Fixes: b5be83e308 ("mm: cma: align to physical address, not CMA region position")
Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
Signed-off-by: Angus Clark <angus@angusclark.org>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
Cc: Doug Berger <opendmb@gmail.com>
Cc: Angus Clark <angus@angusclark.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Shiraz Hashim <shashim@codeaurora.org>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18365225f0 upstream.
Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
his particular case he has hit a bad_page("page still charged to
cgroup") when onlining a hwpoison page. While this looks like something
that shouldn't happen in the first place because onlining hwpages and
returning them to the page allocator makes only little sense it shows a
real problem.
hwpoison pages do not get freed usually so we do not uncharge them (at
least not since commit 0a31bc97c8 ("mm: memcontrol: rewrite uncharge
API")). Each charge pins memcg (since e8ea14cc6e ("mm: memcontrol:
take a css reference for each charged page")) as well and so the
mem_cgroup and the associated state will never go away. Fix this leak
by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
We also have to tweak uncharge_list because it cannot rely on zero ref
count for these pages.
[akpm@linux-foundation.org: coding-style fixes]
Fixes: 0a31bc97c8 ("mm: memcontrol: rewrite uncharge API")
Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 561b5e0709 upstream.
Commit 1be7107fbe ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page. They are punching a new
MAP_FIXED mapping inside the existing stack Vma.
This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety. It is a real regression on
the other hand.
Let's work around the problem by considering PROT_NONE mapping as a part
of the stack. This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.
Fixes: 1be7107fbe ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mem_cgroup_move_task() is now called from ->post_attach
instead of ->attach thanks to LTS commit 52526076a5
("memcg: relocate charge moving from ->attach to ->post_attach").
Hence remove ->attach callback which sneaked back into
lsk-v4.4-android tree in the LSK merge commit 334ca3ed18
("Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android").
Otherwise we run into following build warning reported on KernelCI
https://kernelci.org/build/lsk/branch/linux-linaro-lsk-v4.4-android/kernel/lsk-v4.4-17.11-android-844-g6a7d9fbcf946/
mm/memcontrol.c:5337:12: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
Fixes: LSK commit 334ca3ed18
("Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing). Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system. A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/). However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.
kcov does not aim to collect as much coverage as possible. It aims to
collect more or less stable coverage that is function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts and instrumentation of some inherently non-deterministic or
non-interesting parts of kernel is disbled (e.g. scheduler, locking).
Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes. Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch). I've
dropped the second mode for simplicity.
This patch adds the necessary support on kernel side. The complimentary
compiler support was added in gcc revision 231296.
We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:
https://github.com/google/syzkaller/wiki/Found-Bugs
We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation". For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.
Why not gcov. Typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat. A
typical coverage can be just a dozen of basic blocks (e.g. an invalid
input). In such context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M). Cost of
kcov depends only on number of executed basic blocks/edges. On top of
that, kernel requires per-thread coverage because there are always
background threads and unrelated processes that also produce coverage.
With inlined gcov instrumentation per-thread coverage is not possible.
kcov exposes kernel PCs and control flow to user-space which is
insecure. But debugfs should not be mapped as user accessible.
Based on a patch by Quentin Casasnovas.
[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 5c9a8750a6)
Change-Id: I17b5e04f6e89b241924e78ec32ead79c38b860ce
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Changes slab object description from:
Object at ffff880068388540, in cache kmalloc-128 size: 128
to:
The buggy address belongs to the object at ffff880068388540
which belongs to the cache kmalloc-128 of size 128
The buggy address is located 123 bytes inside of
128-byte region [ffff880068388540, ffff8800683885c0)
Makes it more explanatory and adds information about relative offset of
the accessed address to the start of the object.
Link: http://lkml.kernel.org/r/20170302134851.101218-7-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 0c06f1f86c)
Change-Id: I23928984dbe5a614b84c57e42b20ec13e7c739a4
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change report header format from:
BUG: KASAN: use-after-free in unwind_get_return_address+0x28a/0x2c0 at addr ffff880069437950
Read of size 8 by task insmod/3925
to:
BUG: KASAN: use-after-free in unwind_get_return_address+0x28a/0x2c0
Read of size 8 at addr ffff880069437950 by task insmod/3925
The exact access address is not usually important, so move it to the
second line. This also makes the header look visually balanced.
Link: http://lkml.kernel.org/r/20170302134851.101218-6-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 7f0a84c23b)
Change-Id: If9cacce637c317538d813b05ef2647707300d310
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Patch series "kasan: improve error reports", v2.
This patchset improves KASAN reports by making them easier to read and a
little more detailed. Also improves mm/kasan/report.c readability.
Effectively changes a use-after-free report to:
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xaa/0xb6 [test_kasan]
Write of size 1 at addr ffff88006aa59da8 by task insmod/3951
CPU: 1 PID: 3951 Comm: insmod Tainted: G B 4.10.0+ #84
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x292/0x398
print_address_description+0x73/0x280
kasan_report.part.2+0x207/0x2f0
__asan_report_store1_noabort+0x2c/0x30
kmalloc_uaf+0xaa/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x7f22cfd0b9da
RSP: 002b:00007ffe69118a78 EFLAGS: 00000206 ORIG_RAX: 00000000000000af
RAX: ffffffffffffffda RBX: 0000555671242090 RCX: 00007f22cfd0b9da
RDX: 00007f22cffcaf88 RSI: 000000000004df7e RDI: 00007f22d0399000
RBP: 00007f22cffcaf88 R08: 0000000000000003 R09: 0000000000000000
R10: 00007f22cfd07d0a R11: 0000000000000206 R12: 0000555671243190
R13: 000000000001fe81 R14: 0000000000000000 R15: 0000000000000004
Allocated by task 3951:
save_stack_trace+0x16/0x20
save_stack+0x43/0xd0
kasan_kmalloc+0xad/0xe0
kmem_cache_alloc_trace+0x82/0x270
kmalloc_uaf+0x56/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed by task 3951:
save_stack_trace+0x16/0x20
save_stack+0x43/0xd0
kasan_slab_free+0x72/0xc0
kfree+0xe8/0x2b0
kmalloc_uaf+0x85/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc
The buggy address belongs to the object at ffff88006aa59da0
which belongs to the cache kmalloc-16 of size 16
The buggy address is located 8 bytes inside of
16-byte region [ffff88006aa59da0, ffff88006aa59db0)
The buggy address belongs to the page:
page:ffffea0001aa9640 count:1 mapcount:0 mapping: (null) index:0x0
flags: 0x100000000000100(slab)
raw: 0100000000000100 0000000000000000 0000000000000000 0000000180800080
raw: ffffea0001abe380 0000000700000007 ffff88006c401b40 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88006aa59c80: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
ffff88006aa59d00: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
>ffff88006aa59d80: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
^
ffff88006aa59e00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
ffff88006aa59e80: fb fb fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
==================================================================
from:
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xaa/0xb6 [test_kasan] at addr ffff88006c4dcb28
Write of size 1 by task insmod/3984
CPU: 1 PID: 3984 Comm: insmod Tainted: G B 4.10.0+ #83
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x292/0x398
kasan_object_err+0x1c/0x70
kasan_report.part.1+0x20e/0x4e0
__asan_report_store1_noabort+0x2c/0x30
kmalloc_uaf+0xaa/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x7feca0f779da
RSP: 002b:00007ffdfeae5218 EFLAGS: 00000206 ORIG_RAX: 00000000000000af
RAX: ffffffffffffffda RBX: 000055a064c13090 RCX: 00007feca0f779da
RDX: 00007feca1236f88 RSI: 000000000004df7e RDI: 00007feca1605000
RBP: 00007feca1236f88 R08: 0000000000000003 R09: 0000000000000000
R10: 00007feca0f73d0a R11: 0000000000000206 R12: 000055a064c14190
R13: 000000000001fe81 R14: 0000000000000000 R15: 0000000000000004
Object at ffff88006c4dcb20, in cache kmalloc-16 size: 16
Allocated:
PID = 3984
save_stack_trace+0x16/0x20
save_stack+0x43/0xd0
kasan_kmalloc+0xad/0xe0
kmem_cache_alloc_trace+0x82/0x270
kmalloc_uaf+0x56/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 3984
save_stack_trace+0x16/0x20
save_stack+0x43/0xd0
kasan_slab_free+0x73/0xc0
kfree+0xe8/0x2b0
kmalloc_uaf+0x85/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
Memory state around the buggy address:
ffff88006c4dca00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
ffff88006c4dca80: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
>ffff88006c4dcb00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
^
ffff88006c4dcb80: fb fb fc fc 00 00 fc fc fb fb fc fc fb fb fc fc
ffff88006c4dcc00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
==================================================================
This patch (of 9):
Introduce get_shadow_bug_type() function, which determines bug type
based on the shadow value for a particular kernel address. Introduce
get_wild_bug_type() function, which determines bug type for addresses
which don't have a corresponding shadow value.
Link: http://lkml.kernel.org/r/20170302134851.101218-2-andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 5e82cd1203)
Change-Id: I3359775858891c9c66d11d2a520831e329993ae9
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Disable kasan after the first report. There are several reasons for
this:
- Single bug quite often has multiple invalid memory accesses causing
storm in the dmesg.
- Write OOB access might corrupt metadata so the next report will print
bogus alloc/free stacktraces.
- Reports after the first easily could be not bugs by itself but just
side effects of the first one.
Given that multiple reports usually only do harm, it makes sense to
disable kasan after the first one. If user wants to see all the
reports, the boot-time parameter kasan_multi_shot must be used.
[aryabinin@virtuozzo.com: wrote changelog and doc, added missing include]
Link: http://lkml.kernel.org/r/20170323154416.30257-1-aryabinin@virtuozzo.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from b0845ce583)
Change-Id: Ia8c6d40dd0d4f5b944bf3501c08d7a825070b116
Signed-off-by: Paul Lawrence <paullawrence@google.com>
quarantine_remove_cache() frees all pending objects that belong to the
cache, before we destroy the cache itself. However there are currently
two possibilities how it can fail to do so.
First, another thread can hold some of the objects from the cache in
temp list in quarantine_put(). quarantine_put() has a windows of
enabled interrupts, and on_each_cpu() in quarantine_remove_cache() can
finish right in that window. These objects will be later freed into the
destroyed cache.
Then, quarantine_reduce() has the same problem. It grabs a batch of
objects from the global quarantine, then unlocks quarantine_lock and
then frees the batch. quarantine_remove_cache() can finish while some
objects from the cache are still in the local to_free list in
quarantine_reduce().
Fix the race with quarantine_put() by disabling interrupts for the whole
duration of quarantine_put(). In combination with on_each_cpu() in
quarantine_remove_cache() it ensures that quarantine_remove_cache()
either sees the objects in the per-cpu list or in the global list.
Fix the race with quarantine_reduce() by protecting quarantine_reduce()
with srcu critical section and then doing synchronize_srcu() at the end
of quarantine_remove_cache().
I've done some assessment of how good synchronize_srcu() works in this
case. And on a 4 CPU VM I see that it blocks waiting for pending read
critical sections in about 2-3% of cases. Which looks good to me.
I suspect that these races are the root cause of some GPFs that I
episodically hit. Previously I did not have any explanation for them.
BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
IP: qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
PGD 6aeea067
PUD 60ed7067
PMD 0
Oops: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 13667 Comm: syz-executor2 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88005f948040 task.stack: ffff880069818000
RIP: 0010:qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
RSP: 0018:ffff88006981f298 EFLAGS: 00010246
RAX: ffffea0000ffff00 RBX: 0000000000000000 RCX: ffffea0000ffff1f
RDX: 0000000000000000 RSI: ffff88003fffc3e0 RDI: 0000000000000000
RBP: ffff88006981f2c0 R08: ffff88002fed7bd8 R09: 00000001001f000d
R10: 00000000001f000d R11: ffff88006981f000 R12: ffff88003fffc3e0
R13: ffff88006981f2d0 R14: ffffffff81877fae R15: 0000000080000000
FS: 00007fb911a2d700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000c8 CR3: 0000000060ed6000 CR4: 00000000000006f0
Call Trace:
quarantine_reduce+0x10e/0x120 mm/kasan/quarantine.c:239
kasan_kmalloc+0xca/0xe0 mm/kasan/kasan.c:590
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
slab_post_alloc_hook mm/slab.h:456 [inline]
slab_alloc_node mm/slub.c:2718 [inline]
kmem_cache_alloc_node+0x1d3/0x280 mm/slub.c:2754
__alloc_skb+0x10f/0x770 net/core/skbuff.c:219
alloc_skb include/linux/skbuff.h:932 [inline]
_sctp_make_chunk+0x3b/0x260 net/sctp/sm_make_chunk.c:1388
sctp_make_data net/sctp/sm_make_chunk.c:1420 [inline]
sctp_make_datafrag_empty+0x208/0x360 net/sctp/sm_make_chunk.c:746
sctp_datamsg_from_user+0x7e8/0x11d0 net/sctp/chunk.c:266
sctp_sendmsg+0x2611/0x3970 net/sctp/socket.c:1962
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
SYSC_sendto+0x660/0x810 net/socket.c:1685
SyS_sendto+0x40/0x50 net/socket.c:1653
I am not sure about backporting. The bug is quite hard to trigger, I've
seen it few times during our massive continuous testing (however, it
could be cause of some other episodic stray crashes as it leads to
memory corruption...). If it is triggered, the consequences are very
bad -- almost definite bad memory corruption. The fix is non trivial
and has chances of introducing new bugs. I am also not sure how
actively people use KASAN on older releases.
[dvyukov@google.com: - sorted includes[
Link: http://lkml.kernel.org/r/20170309094028.51088-1-dvyukov@google.com
Link: http://lkml.kernel.org/r/20170308151532.5070-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from ce5bec54bb)
Change-Id: I9199861f005d7932c37397b3ae23a123a4cff89b
Signed-off-by: Paul Lawrence <paullawrence@google.com>
We see reported stalls/lockups in quarantine_remove_cache() on machines
with large amounts of RAM. quarantine_remove_cache() needs to scan
whole quarantine in order to take out all objects belonging to the
cache. Quarantine is currently 1/32-th of RAM, e.g. on a machine with
256GB of memory that will be 8GB. Moreover quarantine scanning is a
walk over uncached linked list, which is slow.
Add cond_resched() after scanning of each non-empty batch of objects.
Batches are specifically kept of reasonable size for quarantine_put().
On a machine with 256GB of RAM we should have ~512 non-empty batches,
each with 16MB of objects.
Link: http://lkml.kernel.org/r/20170308154239.25440-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 68fd814a33)
Change-Id: I8a38466a9b9544bb303202c94bfba6201251e3f3
Signed-off-by: Paul Lawrence <paullawrence@google.com>
<linux/kasan.h> is a low level header that is included early
in affected kernel headers. But it includes <linux/sched.h>
which complicates the cleanup of sched.h dependencies.
But kasan.h has almost no need for sched.h: its only use of
scheduler functionality is in two inline functions which are
not used very frequently - so uninline kasan_enable_current()
and kasan_disable_current().
Also add a <linux/sched.h> dependency to a .c file that depended
on kasan.h including it.
This paves the way to remove the <linux/sched.h> include from kasan.h.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bug: 64145065
(cherry-picked from af8601ad42)
Change-Id: I13fd2d3927f663d694ea0d5bf44f18e2c62ae013
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Per memcg slab accounting and kasan have a problem with kmem_cache
destruction.
- kmem_cache_create() allocates a kmem_cache, which is used for
allocations from processes running in root (top) memcg.
- Processes running in non root memcg and allocating with either
__GFP_ACCOUNT or from a SLAB_ACCOUNT cache use a per memcg
kmem_cache.
- Kasan catches use-after-free by having kfree() and kmem_cache_free()
defer freeing of objects. Objects are placed in a quarantine.
- kmem_cache_destroy() destroys root and non root kmem_caches. It takes
care to drain the quarantine of objects from the root memcg's
kmem_cache, but ignores objects associated with non root memcg. This
causes leaks because quarantined per memcg objects refer to per memcg
kmem cache being destroyed.
To see the problem:
1) create a slab cache with kmem_cache_create(,,,SLAB_ACCOUNT,)
2) from non root memcg, allocate and free a few objects from cache
3) dispose of the cache with kmem_cache_destroy() kmem_cache_destroy()
will trigger a "Slab cache still has objects" warning indicating
that the per memcg kmem_cache structure was leaked.
Fix the leak by draining kasan quarantined objects allocated from non
root memcg.
Racing memcg deletion is tricky, but handled. kmem_cache_destroy() =>
shutdown_memcg_caches() => __shutdown_memcg_cache() => shutdown_cache()
flushes per memcg quarantined objects, even if that memcg has been
rmdir'd and gone through memcg_deactivate_kmem_caches().
This leak only affects destroyed SLAB_ACCOUNT kmem caches when kasan is
enabled. So I don't think it's worth patching stable kernels.
Link: http://lkml.kernel.org/r/1482257462-36948-1-git-send-email-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from f9fa1d919c)
Change-Id: Ie054d9cde7fb1ce62e65776bff5a70f72925d037
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Currently we dedicate 1/32 of RAM for quarantine and then reduce it by
1/4 of total quarantine size. This can be a significant amount of
memory. For example, with 4GB of RAM total quarantine size is 128MB and
it is reduced by 32MB at a time. With 128GB of RAM total quarantine
size is 4GB and it is reduced by 1GB. This leads to several problems:
- freeing 1GB can take tens of seconds, causes rcu stall warnings and
just introduces unexpected long delays at random places
- if kmalloc() is called under a mutex, other threads stall on that
mutex while a thread reduces quarantine
- threads wait on quarantine_lock while one thread grabs a large batch
of objects to evict
- we walk the uncached list of object to free twice which makes all of
the above worse
- when a thread frees objects, they are already not accounted against
global_quarantine.bytes; as the result we can have quarantine_size
bytes in quarantine + unbounded amount of memory in large batches in
threads that are in process of freeing
Reduce size of quarantine in smaller batches to reduce the delays. The
only reason to reduce it in batches is amortization of overheads, the
new batch size of 1MB should be well enough to amortize spinlock
lock/unlock and few function calls.
Plus organize quarantine as a FIFO array of batches. This allows to not
walk the list in quarantine_reduce() under quarantine_lock, which in
turn reduces contention and is just faster.
This improves performance of heavy load (syzkaller fuzzing) by ~20% with
4 CPUs and 32GB of RAM. Also this eliminates frequent (every 5 sec)
drops of CPU consumption from ~400% to ~100% (one thread reduces
quarantine while others are waiting on a mutex).
Some reference numbers:
1. Machine with 4 CPUs and 4GB of memory. Quarantine size 128MB.
Currently we free 32MB at at time.
With new code we free 1MB at a time (1024 batches, ~128 are used).
2. Machine with 32 CPUs and 128GB of memory. Quarantine size 4GB.
Currently we free 1GB at at time.
With new code we free 8MB at a time (1024 batches, ~512 are used).
3. Machine with 4096 CPUs and 1TB of memory. Quarantine size 32GB.
Currently we free 8GB at at time.
With new code we free 4MB at a time (16K batches, ~8K are used).
Link: http://lkml.kernel.org/r/1478756952-18695-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 64abdcb243)
Change-Id: Idf73cb292453ceffc437121b7a5e152cde1901ff
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Kernel style prefers a single string over split strings when the string is
'user-visible'.
Miscellanea:
- Add a missing newline
- Realign arguments
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Tejun Heo <tj@kernel.org> [percpu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 756a025f00)
Change-Id: I377fb1542980c15d2f306924656227ad17b02b5e
Signed-off-by: Paul Lawrence <paullawrence@google.com>
The state of object currently tracked in two places - shadow memory, and
the ->state field in struct kasan_alloc_meta. We can get rid of the
latter. The will save us a little bit of memory. Also, this allow us
to move free stack into struct kasan_alloc_meta, without increasing
memory consumption. So now we should always know when the last time the
object was freed. This may be useful for long delayed use-after-free
bugs.
As a side effect this fixes following UBSAN warning:
UBSAN: Undefined behaviour in mm/kasan/quarantine.c:102:13
member access within misaligned address ffff88000d1efebc for type 'struct qlist_node'
which requires 8 byte alignment
Link: http://lkml.kernel.org/r/1470062715-14077-5-git-send-email-aryabinin@virtuozzo.com
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from b3cbd9bf77)
Change-Id: Iaa4959a78ffd2e49f9060099df1fb32483df3085
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Size of slab object already stored in cache->object_size.
Note, that kmalloc() internally rounds up size of allocation, so
object_size may be not equal to alloc_size, but, usually we don't need
to know the exact size of allocated object. In case if we need that
information, we still can figure it out from the report. The dump of
shadow memory allows to identify the end of allocated memory, and
thereby the exact allocation size.
Link: http://lkml.kernel.org/r/1470062715-14077-4-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 47b5c2a0f0)
Change-Id: I76b555f9a8469f685607ca50f6c51b2e0ad1b4ab
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Currently we call quarantine_reduce() for ___GFP_KSWAPD_RECLAIM (implied
by __GFP_RECLAIM) allocation. So, basically we call it on almost every
allocation. quarantine_reduce() sometimes is heavy operation, and
calling it with disabled interrupts may trigger hard LOCKUP:
NMI watchdog: Watchdog detected hard LOCKUP on cpu 2irq event stamp: 1411258
Call Trace:
<NMI> dump_stack+0x68/0x96
watchdog_overflow_callback+0x15b/0x190
__perf_event_overflow+0x1b1/0x540
perf_event_overflow+0x14/0x20
intel_pmu_handle_irq+0x36a/0xad0
perf_event_nmi_handler+0x2c/0x50
nmi_handle+0x128/0x480
default_do_nmi+0xb2/0x210
do_nmi+0x1aa/0x220
end_repeat_nmi+0x1a/0x1e
<<EOE>> __kernel_text_address+0x86/0xb0
print_context_stack+0x7b/0x100
dump_trace+0x12b/0x350
save_stack_trace+0x2b/0x50
set_track+0x83/0x140
free_debug_processing+0x1aa/0x420
__slab_free+0x1d6/0x2e0
___cache_free+0xb6/0xd0
qlist_free_all+0x83/0x100
quarantine_reduce+0x177/0x1b0
kasan_kmalloc+0xf3/0x100
Reduce the quarantine_reduce iff direct reclaim is allowed.
Fixes: 55834c59098d("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/1470062715-14077-2-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 4b3ec5a3f4)
Change-Id: I7e6ad29acabc2091f98a8aac54ed041b574b5e7e
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Once an object is put into quarantine, we no longer own it, i.e. object
could leave the quarantine and be reallocated. So having set_track()
call after the quarantine_put() may corrupt slab objects.
BUG kmalloc-4096 (Not tainted): Poison overwritten
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: 0xffff8804540de850-0xffff8804540de857. First byte 0xb5 instead of 0x6b
...
INFO: Freed in qlist_free_all+0x42/0x100 age=75 cpu=3 pid=24492
__slab_free+0x1d6/0x2e0
___cache_free+0xb6/0xd0
qlist_free_all+0x83/0x100
quarantine_reduce+0x177/0x1b0
kasan_kmalloc+0xf3/0x100
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0x109/0x3e0
mmap_region+0x53e/0xe40
do_mmap+0x70f/0xa50
vm_mmap_pgoff+0x147/0x1b0
SyS_mmap_pgoff+0x2c7/0x5b0
SyS_mmap+0x1b/0x30
do_syscall_64+0x1a0/0x4e0
return_from_SYSCALL_64+0x0/0x7a
INFO: Slab 0xffffea0011503600 objects=7 used=7 fp=0x (null) flags=0x8000000000004080
INFO: Object 0xffff8804540de848 @offset=26696 fp=0xffff8804540dc588
Redzone ffff8804540de840: bb bb bb bb bb bb bb bb ........
Object ffff8804540de848: 6b 6b 6b 6b 6b 6b 6b 6b b5 52 00 00 f2 01 60 cc kkkkkkkk.R....`.
Similarly, poisoning after the quarantine_put() leads to false positive
use-after-free reports:
BUG: KASAN: use-after-free in anon_vma_interval_tree_insert+0x304/0x430 at addr ffff880405c540a0
Read of size 8 by task trinity-c0/3036
CPU: 0 PID: 3036 Comm: trinity-c0 Not tainted 4.7.0-think+ #9
Call Trace:
dump_stack+0x68/0x96
kasan_report_error+0x222/0x600
__asan_report_load8_noabort+0x61/0x70
anon_vma_interval_tree_insert+0x304/0x430
anon_vma_chain_link+0x91/0xd0
anon_vma_clone+0x136/0x3f0
anon_vma_fork+0x81/0x4c0
copy_process.part.47+0x2c43/0x5b20
_do_fork+0x16d/0xbd0
SyS_clone+0x19/0x20
do_syscall_64+0x1a0/0x4e0
entry_SYSCALL64_slow_path+0x25/0x25
Fix this by putting an object in the quarantine after all other
operations.
Fixes: 80a9201a59 ("mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB")
Link: http://lkml.kernel.org/r/1470062715-14077-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 4a3d308d66)
Change-Id: Iaa699c447b97f8cb04afdd2d6a5f572bea439185
Signed-off-by: Paul Lawrence <paullawrence@google.com>
There are two bugs on qlist_move_cache(). One is that qlist's tail
isn't set properly. curr->next can be NULL since it is singly linked
list and NULL value on tail is invalid if there is one item on qlist.
Another one is that if cache is matched, qlist_put() is called and it
will set curr->next to NULL. It would cause to stop the loop
prematurely.
These problems come from complicated implementation so I'd like to
re-implement it completely. Implementation in this patch is really
simple. Iterate all qlist_nodes and put them to appropriate list.
Unfortunately, I got this bug sometime ago and lose oops message. But,
the bug looks trivial and no need to attach oops.
Fixes: 55834c5909 ("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/1467766348-22419-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Kuthonuzo Luruo <poll.stdin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 0ab686d8c8)
Change-Id: Ifca87bd938c74ff18e7fc2680afb15070cc7019f
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Currently we may put reserved by mempool elements into quarantine via
kasan_kfree(). This is totally wrong since quarantine may really free
these objects. So when mempool will try to use such element,
use-after-free will happen. Or mempool may decide that it no longer
need that element and double-free it.
So don't put object into quarantine in kasan_kfree(), just poison it.
Rename kasan_kfree() to kasan_poison_kfree() to respect that.
Also, we shouldn't use kasan_slab_alloc()/kasan_krealloc() in
kasan_unpoison_element() because those functions may update allocation
stacktrace. This would be wrong for the most of the remove_element call
sites.
(The only call site where we may want to update alloc stacktrace is
in mempool_alloc(). Kmemleak solves this by calling
kmemleak_update_trace(), so we could make something like that too.
But this is out of scope of this patch).
Fixes: 55834c5909 ("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/575977C3.1010905@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 9b75a867cc)
Change-Id: Idb6c152dae8f8f2975dbe6acb7165315be8b465b
Signed-off-by: Paul Lawrence <paullawrence@google.com>