(Upstream commit 3f41b60938).
There are two issues with assigning random percpu seeds right now:
1. We use for_each_possible_cpu() to iterate over cpus, but cpumask is
not set up yet at the moment of kasan_init(), and thus we only set
the seed for cpu #0.
2. A call to get_random_u32() always returns the same number and produces
a message in dmesg, since the random subsystem is not yet initialized.
Fix 1 by calling kasan_init_tags() after cpumask is set up.
Fix 2 by using get_cycles() instead of get_random_u32(). This gives us
lower quality random numbers, but it's good enough, as KASAN is meant to
be used as a debugging tool and not a mitigation.
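A minimal userspace sketch of the idea, assuming a plain array in place of
per-CPU variables and a caller-supplied counter in place of get_cycles().
All names (init_tags_sketch, random_tag_sketch, fake_cycles, TAG_MAX) and
the linear congruential constants are illustrative, not taken from the
patch:

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count for the model */
#define TAG_MAX 0xFD       /* assumed: highest tag the generator may return */

/* One PRNG state per CPU; an array stands in for per-CPU variables. */
static uint32_t prng_state[NR_CPUS_SKETCH];

/* Stand-in for a cycle counter: returns a different value on each call. */
static uint32_t fake_cycles(void)
{
	static uint32_t counter = 12345;

	counter += 977;
	return counter;
}

/* Seed every CPU's state, not only CPU 0, from the cycle counter. */
static void init_tags_sketch(uint32_t (*read_cycles)(void))
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		prng_state[cpu] = read_cycles();
}

/* Advance the state with one linear congruential step and derive a tag. */
static uint8_t random_tag_sketch(int cpu)
{
	uint32_t state = prng_state[cpu];

	state = 1664525u * state + 1013904223u;
	prng_state[cpu] = state;
	return (uint8_t)(state % (TAG_MAX + 1));
}
```

The quality of such numbers is low, which matches the trade-off described
above: the tags only need to differ often enough to expose bugs.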
Link: http://lkml.kernel.org/r/1f815cc914b61f3516ed4cc9bfd9eeca9bd5d9de.1550677973.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: Ic4b29dfd24515f02302d5cbd0d79eab5c6f0642c
(Upstream commit d36a63a943).
When CONFIG_KASAN_SW_TAGS is enabled, ptr_addr might be tagged. Normally,
this doesn't cause any issues, as both set_freepointer() and
get_freepointer() are called with a pointer with the same tag. However,
there are some issues with CONFIG_SLUB_DEBUG code. For example, when
__free_slub() iterates over objects in a cache, it passes untagged
pointers to check_object(). check_object() in turn calls
get_freepointer() with an untagged pointer, which causes the freepointer
to be restored incorrectly.
Add kasan_reset_tag to freelist_ptr(). Also add a detailed comment.
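The effect can be sketched in userspace, assuming the free pointer is
obfuscated by XOR-ing it with a per-cache secret and the address of the
slot it is stored in. The helper names (set_tag, reset_tag,
freelist_ptr_sketch) are hypothetical and model the top-byte tag with
plain 64-bit integers:

```c
#include <stdint.h>

/* Model: the tag lives in the top byte of a 64-bit address. */
static uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & 0x00ffffffffffffffULL) | ((uint64_t)tag << 56);
}

/* Resetting a tag restores the top byte to 0xff, the untagged value. */
static uint64_t reset_tag(uint64_t addr)
{
	return set_tag(addr, 0xff);
}

/*
 * XOR-obfuscate (or, applied a second time, restore) a free pointer.
 * The tag of ptr_addr is dropped first, so tagged and untagged callers
 * compute the same value.
 */
static uint64_t freelist_ptr_sketch(uint64_t secret, uint64_t ptr,
				    uint64_t ptr_addr)
{
	return ptr ^ secret ^ reset_tag(ptr_addr);
}
```

Without the reset_tag() call, storing through a tagged slot address and
reading back through an untagged one would differ in the top byte, which
is the mismatch described above.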
Link: http://lkml.kernel.org/r/bf858f26ef32eb7bd24c665755b3aee4bc58d0e4.1550103861.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Qian Cai <cai@lca.pw>
Tested-by: Qian Cai <cai@lca.pw>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: Ie57f08f676ea7a244a20f1dee98fc725594cafa6
(Upstream commit 0d0c8de878).
When option CONFIG_KASAN is enabled together with ftrace, function
ftrace_graph_caller() gets into a recursion, via functions
kasan_check_read() and kasan_check_write().
Breakpoint 2, ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
179 mcount_get_pc x0 // function's pc
(gdb) bt
#0 ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
#1 0xffffff90101406c8 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151
#2 0xffffff90106fd084 in kasan_check_write (p=0xffffffc06c170878, size=4) at ../mm/kasan/common.c:105
#3 0xffffff90104a2464 in atomic_add_return (v=<optimized out>, i=<optimized out>) at ./include/generated/atomic-instrumented.h:71
#4 atomic_inc_return (v=<optimized out>) at ./include/generated/atomic-fallback.h:284
#5 trace_graph_entry (trace=0xffffffc03f5ff380) at ../kernel/trace/trace_functions_graph.c:441
#6 0xffffff9010481774 in trace_graph_entry_watchdog (trace=<optimized out>) at ../kernel/trace/trace_selftest.c:741
#7 0xffffff90104a185c in function_graph_enter (ret=<optimized out>, func=<optimized out>, frame_pointer=18446743799894897728, retp=<optimized out>) at ../kernel/trace/trace_functions_graph.c:196
#8 0xffffff9010140628 in prepare_ftrace_return (self_addr=18446743592948977792, parent=0xffffffc03f5ff418, frame_pointer=18446743799894897728) at ../arch/arm64/kernel/ftrace.c:231
#9 0xffffff90101406f4 in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
Rework so that the kasan implementation isn't traced.
Link: http://lkml.kernel.org/r/20181212183447.15890-1-anders.roxell@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: Id4dc25b795f81b4193c5f861657e9091acd99cef
(Upstream commit 7fa1e2e6af).
Defining ARCH_SLAB_MINALIGN in arch/arm64/include/asm/cache.h when KASAN
is off is not needed, as it is defined in include/linux/slab.h under an
ifndef guard.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I27d13c0ed9f8ab2488028b526561f90b3e2eec0c
(Upstream commit a3fe7cdf02).
Right now tag-based KASAN can retag the memory that is reallocated via
krealloc and return a differently tagged pointer even if the same slab
object gets used and no reallocation technically happens.
There are a few issues with this approach. One is that krealloc callers
can't rely on comparing the return value with the passed argument to
check whether reallocation happened. Another is that a caller that knows
no reallocation happened may access object memory through the old
pointer, which leads to false positives. Look at nf_ct_ext_add() to see
an example.
Fix this by keeping the same tag if the memory doesn't actually get
reallocated during krealloc.
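The rule can be sketched as follows, with pointers modeled as 64-bit
integers whose top byte is the tag. krealloc_retag_sketch and its helpers
are hypothetical names and do not mirror the kernel implementation:

```c
#include <stdint.h>

#define TAG_SHIFT 56
#define ADDR_MASK 0x00ffffffffffffffULL

static uint8_t get_tag(uint64_t ptr)
{
	return (uint8_t)(ptr >> TAG_SHIFT);
}

static uint64_t set_tag(uint64_t ptr, uint8_t tag)
{
	return (ptr & ADDR_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/*
 * Choose the tag for the pointer returned from a krealloc-like call:
 * keep the old tag when the same object is reused, otherwise use the
 * freshly generated one.
 */
static uint64_t krealloc_retag_sketch(uint64_t old_ptr, uint64_t new_addr,
				      uint8_t fresh_tag)
{
	int same_object = (old_ptr & ADDR_MASK) == (new_addr & ADDR_MASK);

	return set_tag(new_addr, same_object ? get_tag(old_ptr) : fresh_tag);
}
```

With this rule a caller can again compare the returned pointer with the
one it passed in to find out whether the object moved.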
Link: http://lkml.kernel.org/r/bb2a71d17ed072bcc528cbee46fcbd71a6da3be4.1546540962.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: Ie9bd73c43fa76ac2ba946bfbd2cb88e5c0000dfb
(Upstream commit fed84c7852).
Kmemleak does not play well with KASAN (tested on both HPE Apollo 70 and
Huawei TaiShan 2280 aarch64 servers).
After calling start_kernel()->setup_arch()->kasan_init(), the kmemleak
early log buffer went from something like 280 to 260000, which caused
kmemleak to be disabled and the crash dump memory reservation to fail.
The multitude of kmemleak_alloc() calls comes from nested loops that run
while KASAN is setting up full memory mappings, so let early kmemleak
allocations skip the memblock_alloc_internal() calls that come from
kasan_init(), given that those early KASAN memory mappings should not
reference other memory. Hence, no kmemleak false positives.
kasan_init
  kasan_map_populate [1]
    kasan_pgd_populate [2]
      kasan_pud_populate [3]
        kasan_pmd_populate [4]
          kasan_pte_populate [5]
            kasan_alloc_zeroed_page
              memblock_alloc_try_nid
                memblock_alloc_internal
                  kmemleak_alloc
[1] for_each_memblock(memory, reg)
[2] while (pgdp++, addr = next, addr != end)
[3] while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)))
[4] while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)))
[5] while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)))
Link: http://lkml.kernel.org/r/1543442925-17794-1-git-send-email-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I2423d511c3938c882c738341673b13b3beff5475
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
(Upstream commit 2813b9c029).
Tag-based KASAN doesn't check memory accesses through pointers tagged with
0xff. When page_address is used to get a pointer to memory that corresponds
to some page, the tag of the resulting pointer gets set to 0xff, even
though the allocated memory might have been tagged differently.
For slab pages it's impossible to recover the correct tag to return from
page_address, since the page might contain multiple slab objects tagged
with different values, and we can't know in advance which one of them is
going to get accessed. For non slab pages however, we can recover the tag
in page_address, since the whole page was marked with the same tag.
This patch adds tagging to non slab memory allocated with pagealloc. To
set the tag of the pointer returned from page_address, the tag gets stored
to page->flags when the memory gets allocated.
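A rough model of keeping a tag in a flags word and reapplying it in a
page_address-like helper. The struct, the bit position (TAG_SHIFT) and
all function names are assumptions made for illustration; the real layout
of page->flags is not described here:

```c
#include <stdint.h>

#define TAG_SHIFT 56        /* hypothetical position of the tag bits */
#define TAG_MASK  0xffULL

struct page_sketch {
	uint64_t flags;     /* other flag bits share this word */
};

/* Store the tag when the page is allocated, leaving other bits alone. */
static void page_tag_set(struct page_sketch *page, uint8_t tag)
{
	page->flags &= ~(TAG_MASK << TAG_SHIFT);
	page->flags |= ((uint64_t)tag & TAG_MASK) << TAG_SHIFT;
}

static uint8_t page_tag_get(const struct page_sketch *page)
{
	return (uint8_t)((page->flags >> TAG_SHIFT) & TAG_MASK);
}

/* Return the page's address with the stored tag in the top byte. */
static uint64_t page_address_sketch(const struct page_sketch *page,
				    uint64_t untagged_addr)
{
	return (untagged_addr & 0x00ffffffffffffffULL) |
	       ((uint64_t)page_tag_get(page) << 56);
}
```

This only works for non-slab pages, where the whole page carries one tag.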
Link: http://lkml.kernel.org/r/d758ddcef46a5abc9970182b9137e2fbee202a2c.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I500bdf42462fee0ee14495a6be51815e7e44460f
(Upstream commit 7f94ffbc4c).
This commit adds tag-based KASAN specific hooks implementation and
adjusts common generic and tag-based KASAN ones.
1. When a new slab cache is created, tag-based KASAN rounds up the size of
the objects in this cache to KASAN_SHADOW_SCALE_SIZE (== 16).
2. On each kmalloc tag-based KASAN generates a random tag, sets the shadow
memory that corresponds to this object to this tag, and embeds this
tag value into the top byte of the returned pointer.
3. On each kfree tag-based KASAN poisons the shadow memory with a random
tag to allow detection of use-after-free bugs.
The rest of the logic of the hook implementation is very much similar to
the one provided by generic KASAN. Tag-based KASAN saves allocation and
free stack metadata to the slab object the same way generic KASAN does.
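The three steps listed above can be sketched on a toy heap. Offsets into
a 256-byte arena stand in for kernel addresses, tags are passed in by the
caller instead of being generated randomly, and every name below is
hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_SCALE_SIZE 16  /* granule size named in the text */
#define ARENA_SIZE 256        /* hypothetical toy heap size */

/* One shadow byte (memory tag) per 16-byte granule of the arena. */
static uint8_t shadow[ARENA_SIZE / SHADOW_SCALE_SIZE];

/* Step 1: object sizes are rounded up to the shadow granule. */
static size_t round_up_granule(size_t size)
{
	return (size + SHADOW_SCALE_SIZE - 1) & ~(size_t)(SHADOW_SCALE_SIZE - 1);
}

/* Write one tag into each shadow byte of a granule-aligned object. */
static void set_shadow(size_t offset, size_t size, uint8_t tag)
{
	memset(&shadow[offset / SHADOW_SCALE_SIZE], tag,
	       round_up_granule(size) / SHADOW_SCALE_SIZE);
}

/* Step 2: tag the memory and return a pointer with the tag on top. */
static uint64_t kmalloc_hook_sketch(size_t offset, size_t size, uint8_t tag)
{
	set_shadow(offset, size, tag);
	return (uint64_t)offset | ((uint64_t)tag << 56);
}

/* Step 3: retag the memory on free so stale pointers stop matching. */
static void kfree_hook_sketch(size_t offset, size_t size, uint8_t new_tag)
{
	set_shadow(offset, size, new_tag);
}
```

After kfree_hook_sketch() the tag in an old pointer no longer equals the
shadow value, unless the new tag happens to collide with the old one.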
Link: http://lkml.kernel.org/r/bda78069e3b8422039794050ddcb2d53d053ed41.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I23c40472207cba40dc962bb182225e378322249e
(Upstream commit 5b7c414822).
While with SLUB we can actually preassign tags for caches with constructors
and store them in pointers in the freelist, SLAB doesn't allow that since
the freelist is stored as an array of indexes, so there are no pointers to
store the tags.
Instead we compute the tag twice, once when a slab is created before
calling the constructor and then again each time when an object is
allocated with kmalloc. The tag is computed simply by taking the lowest
byte of the index that corresponds to the object. However in kasan_kmalloc
we only have access to the object's pointer, so we need a way to find out
which index this object corresponds to.
This patch moves obj_to_index from slab.c to include/linux/slab_def.h to
be reused by KASAN.
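A simplified sketch of deriving the tag from the object index. It uses a
plain division and ignores reserved tag values; the function names and
parameters are hypothetical and do not reproduce the kernel's
obj_to_index:

```c
#include <stddef.h>
#include <stdint.h>

/* Index of an object within its slab, given the address of object 0. */
static unsigned int obj_to_index_sketch(uintptr_t first_obj, size_t obj_size,
					uintptr_t obj)
{
	return (unsigned int)((obj - first_obj) / obj_size);
}

/* The tag is the lowest byte of the index, so it is stable per object. */
static uint8_t tag_from_index_sketch(unsigned int index)
{
	return (uint8_t)index;
}
```

Because the index of an object never changes, computing the tag at slab
creation and again at allocation time yields the same value.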
Link: http://lkml.kernel.org/r/c02cd9e574cfd93858e43ac94b05e38f891fef64.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I6cb3332ea05e6152ab8de0490171ed5d4947def3
(Upstream commit 4d176711ea).
An object constructor can initialize pointers within this object based on
the address of the object. Since the object address might be tagged, we
need to assign a tag before calling constructor.
The implemented approach is to assign tags to objects with constructors
when a slab is allocated and call constructors once as usual. The
downside is that such an object would always have the same tag when it is
reallocated, so we won't catch use-after-frees on it.
Also preassign tags for objects from SLAB_TYPESAFE_BY_RCU caches, since
they can be validly accessed after having been freed.
Link: http://lkml.kernel.org/r/f158a8a74a031d66f0a9398a5b0ed453c37ba09a.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I48b1de0516b9998f3b3e3917a15dd0bde27897cf
The conflict during backport is caused by the
include/linux/compiler_attributes.h file not being present.
(Upstream commit 2bd926b439).
This commit splits the current CONFIG_KASAN config option into two:
1. CONFIG_KASAN_GENERIC, that enables the generic KASAN mode (the one
that exists now);
2. CONFIG_KASAN_SW_TAGS, that enables the software tag-based KASAN mode.
The name CONFIG_KASAN_SW_TAGS is chosen as in the future we will have
another hardware tag-based KASAN mode, that will rely on hardware memory
tagging support in arm64.
With CONFIG_KASAN_SW_TAGS enabled, compiler options are changed to
instrument kernel files with -fsanitize=kernel-hwaddress (except the ones
for which KASAN_SANITIZE := n is set).
Both CONFIG_KASAN_GENERIC and CONFIG_KASAN_SW_TAGS support both
CONFIG_KASAN_INLINE and CONFIG_KASAN_OUTLINE instrumentation modes.
This commit also adds empty placeholder (for now) implementation of
tag-based KASAN specific hooks inserted by the compiler and adjusts
common hooks implementation.
While this commit adds the CONFIG_KASAN_SW_TAGS config option, this option
is not selectable, as it depends on HAVE_ARCH_KASAN_SW_TAGS, which we will
enable once all the infrastructure code has been added.
Link: http://lkml.kernel.org/r/b2550106eb8a68b10fefbabce820910b115aa853.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Id95c0c0b6857c6b30f2bea4597aea6c90273ef89
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
(Upstream commit 0116523cff).
Patch series "kasan: add software tag-based mode for arm64", v13.
This patchset adds a new software tag-based mode to KASAN [1]. (Initially
this mode was called KHWASAN, but it got renamed, see the naming rationale
at the end of this section).
The plan is to implement HWASan [2] for the kernel, with the incentive
that it will have performance comparable to KASAN while consuming much
less memory, trading that off for somewhat imprecise bug detection and
being supported only for arm64.
The underlying ideas of the approach used by software tag-based KASAN are:
1. By using the Top Byte Ignore (TBI) arm64 CPU feature, we can store
pointer tags in the top byte of each kernel pointer.
2. Using shadow memory, we can store memory tags for each chunk of kernel
memory.
3. On each memory allocation, we can generate a random tag, embed it into
the returned pointer and set the memory tags that correspond to this
chunk of memory to the same value.
4. By using compiler instrumentation, before each memory access we can add
a check that the pointer tag matches the tag of the memory that is being
accessed.
5. On a tag mismatch we report an error.
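The five ideas above can be put together in a small userspace model.
Offsets into a 256-byte arena stand in for kernel addresses, the tag is
passed in rather than generated randomly, and all names are hypothetical:

```c
#include <stdint.h>

#define TAG_SHIFT 56            /* idea 1: the tag sits in the top byte */
#define SHADOW_SCALE_SHIFT 4    /* one memory tag per 16 bytes */
#define ARENA_SIZE 256          /* hypothetical toy address space */

/* Idea 2: shadow memory holds one memory tag per 16-byte chunk. */
static uint8_t shadow[ARENA_SIZE >> SHADOW_SCALE_SHIFT];

/*
 * Idea 3: on allocation, set the memory tags and embed the same tag in
 * the returned pointer. offset and size are assumed 16-byte aligned.
 */
static uint64_t alloc_sketch(uint64_t offset, uint64_t size, uint8_t tag)
{
	for (uint64_t a = offset; a < offset + size;
	     a += 1u << SHADOW_SCALE_SHIFT)
		shadow[a >> SHADOW_SCALE_SHIFT] = tag;
	return offset | ((uint64_t)tag << TAG_SHIFT);
}

/*
 * Ideas 4 and 5: the check that instrumentation would run before each
 * access. Returns 1 when the pointer tag matches the memory tag and 0
 * when a bug report would be raised.
 */
static int access_ok_sketch(uint64_t ptr)
{
	uint8_t ptr_tag = (uint8_t)(ptr >> TAG_SHIFT);
	uint64_t offset = ptr & ((1ULL << TAG_SHIFT) - 1);

	return ptr_tag == shadow[offset >> SHADOW_SCALE_SHIFT];
}
```

An access one granule past the object fails the check, and so does an
access through a stale pointer once the same memory has been handed out
again with a different tag.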
With this patchset the existing KASAN mode gets renamed to generic KASAN,
with the word "generic" meaning that the implementation can be supported
by any architecture as it is purely software.
The new mode this patchset adds is called software tag-based KASAN. The
word "tag-based" refers to the fact that this mode uses tags embedded into
the top byte of kernel pointers and the TBI arm64 CPU feature that allows
such pointers to be dereferenced. The word "software" here means that shadow
memory manipulation and tag checking on pointer dereference is done in
software. As it is the only tag-based implementation right now, "software
tag-based" KASAN is sometimes referred to as simply "tag-based" in this
patchset.
A potential expansion of this mode is a hardware tag-based mode, which
would use hardware memory tagging support (announced by Arm [3]) instead
of compiler instrumentation and manual shadow memory manipulation.
Same as generic KASAN, software tag-based KASAN is strictly a debugging
feature.
[1] https://www.kernel.org/doc/html/latest/dev-tools/kasan.html
[2] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[3] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a
====== Rationale
On mobile devices generic KASAN's memory usage is a significant problem.
One of the main reasons to have tag-based KASAN is to be able to perform a
similar set of checks as the generic one does, but with lower memory
requirements.
Comment from Vishwath Mohan <vishwath@google.com>:
I don't have data on-hand, but anecdotally both ASAN and KASAN have proven
problematic to enable for environments that don't tolerate the increased
memory pressure well. This includes
(a) Low-memory form factors - Wear, TV, Things, lower-tier phones like Go,
(b) Connected components like Pixel's visual core [1].
These are both places I'd love to have a low(er) memory footprint option at
my disposal.
Comment from Evgenii Stepanov <eugenis@google.com>:
Looking at a live Android device under load, slab (according to
/proc/meminfo) + kernel stack take 8-10% available RAM (~350MB). KASAN's
overhead of 2x - 3x on top of it is not insignificant.
Not having this overhead enables near-production use - ex. running
KASAN/KHWASAN kernel on a personal, daily-use device to catch bugs that do
not reproduce in test configuration. These are the ones that often cost
the most engineering time to track down.
CPU overhead is bad, but generally tolerable. RAM is critical, in our
experience. Once it gets low enough, OOM-killer makes your life
miserable.
[1] https://www.blog.google/products/pixel/pixel-visual-core-image-processing-and-machine-learning-pixel-2/
====== Technical details
Software tag-based KASAN mode is implemented in a very similar way to the
generic one. This patchset essentially does the following:
1. TCR_TBI1 is set to enable Top Byte Ignore.
2. Shadow memory is used (with a different scale, 1:16, so each shadow
byte corresponds to 16 bytes of kernel memory) to store memory tags.
3. All slab objects are aligned to shadow scale, which is 16 bytes.
4. All pointers returned from the slab allocator are tagged with a random
tag and the corresponding shadow memory is poisoned with the same value.
5. Compiler instrumentation is used to insert tag checks. Either by
calling callbacks or by inlining them (CONFIG_KASAN_OUTLINE and
CONFIG_KASAN_INLINE flags are reused).
6. When a tag mismatch is detected in callback instrumentation mode
KASAN simply prints a bug report. In case of inline instrumentation,
clang inserts a brk instruction, and KASAN has its own brk handler,
which reports the bug.
7. The memory in between slab objects is marked with a reserved tag, and
acts as a redzone.
8. When a slab object is freed it's marked with a reserved tag.
Bug detection is imprecise for two reasons:
1. We won't catch some small out-of-bounds accesses that fall into the
same shadow cell as the last byte of a slab object.
2. We only have 1 byte to store tags, which means we have a 1/256
probability of a tag match for an incorrect access (actually even
slightly less due to reserved tag values).
Despite that, there's a particular type of bug that tag-based KASAN can
detect and generic KASAN cannot: use-after-free after the object has been
allocated by someone else.
====== Testing
Some kernel developers voiced a concern that changing the top byte of
kernel pointers may lead to subtle bugs that are difficult to discover.
To address this concern deliberate testing has been performed.
It doesn't seem feasible to do some kind of static checking to find
potential issues with pointer tagging, so a dynamic approach was taken.
All pointer comparisons/subtractions have been instrumented in an LLVM
compiler pass and a kernel module that would print a bug report whenever
two pointers with different tags are being compared/subtracted (ignoring
comparisons with NULL pointers and with pointers obtained by casting an
error code to a pointer type) has been used. Then the kernel has been
booted in QEMU and on an Odroid C2 board and syzkaller has been run.
This yielded the following results.
The two places that look interesting are:
is_vmalloc_addr in include/linux/mm.h
is_kernel_rodata in mm/util.c
Here we compare a pointer with some fixed untagged values to make sure
that the pointer lies in a particular part of the kernel address space.
Since tag-based KASAN doesn't add tags to pointers that belong to rodata
or vmalloc regions, this should work as is. To make sure, debug checks
have been added to those two functions that verify that the result
doesn't change whether we operate on pointers with or without untagging.
A few other cases that don't look that interesting:
Comparing pointers to achieve unique sorting order of pointee objects
(e.g. sorting locks addresses before performing a double lock):
tty_ldisc_lock_pair_timeout in drivers/tty/tty_ldisc.c
pipe_double_lock in fs/pipe.c
unix_state_double_lock in net/unix/af_unix.c
lock_two_nondirectories in fs/inode.c
mutex_lock_double in kernel/events/core.c
ep_cmp_ffd in fs/eventpoll.c
fsnotify_compare_groups in fs/notify/mark.c
Nothing needs to be done here, since the tags embedded into pointers
don't change, so the sorting order would still be unique.
Checks that a pointer belongs to some particular allocation:
is_sibling_entry in lib/radix-tree.c
object_is_on_stack in include/linux/sched/task_stack.h
Nothing needs to be done here either, since two pointers can only belong
to the same allocation if they have the same tag.
Overall, since the kernel boots and works, there are no critical bugs.
As for the rest, the traditional kernel testing way (use until it fails)
is the only one that looks feasible.
Another point here is that tag-based KASAN is available under a separate
config option that needs to be deliberately enabled. Even though it might
be used in a "near-production" environment to find bugs that are not found
during fuzzing or running tests, it is still a debug tool.
====== Benchmarks
The following numbers were collected on an Odroid C2 board. Both generic and
tag-based KASAN were used in inline instrumentation mode.
Boot time [1]:
* ~1.7 sec for clean kernel
* ~5.0 sec for generic KASAN
* ~5.0 sec for tag-based KASAN
Network performance [2]:
* 8.33 Gbits/sec for clean kernel
* 3.17 Gbits/sec for generic KASAN
* 2.85 Gbits/sec for tag-based KASAN
Slab memory usage after boot [3]:
* ~40 kb for clean kernel
* ~105 kb (~260% overhead) for generic KASAN
* ~47 kb (~20% overhead) for tag-based KASAN
KASAN memory overhead consists of three main parts:
1. Increased slab memory usage due to redzones.
2. Shadow memory (reserved in full once during boot).
3. Quarantine (grows gradually until some preset limit; the higher the
limit, the higher the chance to detect a use-after-free).
Comparing tag-based vs generic KASAN for each of these points:
1. 20% vs 260% overhead.
2. 1/16th vs 1/8th of physical memory.
3. Tag-based KASAN doesn't require quarantine.
[1] Time before the ext4 driver is initialized.
[2] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
[3] Measured as `cat /proc/meminfo | grep Slab`.
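The shadow memory ratios quoted above (1/16th vs 1/8th of physical
memory) translate into sizes as in the sketch below. shadow_bytes and the
two shift constants are illustrative names, and the 2 GiB figure in the
usage note is a made-up example, not a measurement:

```c
#include <stdint.h>

#define GENERIC_SCALE_SHIFT 3   /* 1 shadow byte per 8 bytes of memory */
#define TAGS_SCALE_SHIFT    4   /* 1 shadow byte per 16 bytes of memory */

/* Size of the shadow region needed to cover mem_bytes of memory. */
static uint64_t shadow_bytes(uint64_t mem_bytes, unsigned int scale_shift)
{
	return mem_bytes >> scale_shift;
}
```

For a hypothetical device with 2 GiB of RAM this gives 256 MiB of shadow
for generic KASAN and 128 MiB for tag-based KASAN.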
====== Some notes
A few notes:
1. The patchset can be found here:
https://github.com/xairy/kasan-prototype/tree/khwasan
2. Building requires a recent Clang version (7.0.0 or later).
3. Stack instrumentation is not supported yet and will be added later.
This patch (of 25):
Tag-based KASAN changes the value of the top byte of pointers returned
from the kernel allocation functions (such as kmalloc). This patch
updates KASAN hooks signatures and their usage in SLAB and SLUB code to
reflect that.
Link: http://lkml.kernel.org/r/aec2b5e3973781ff8a6bb6760f8543643202c451.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I62e554e732ec79ffd195e2269c8a50aed14381c0
(Upstream commit 386b3c7bda).
So that we can export symbols directly from assembly files, let's make
use of the generic <asm/export.h>. We have a few symbols that we'll want
to conditionally export for !KASAN kernel builds, so we add a helper for
that in <asm/assembler.h>.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I7cfd1b717c9b172487f7a872a3b9a4b4485e454a
(Upstream commit 163c8d54a9).
The __no_sanitize_address_or_inline and __no_kasan_or_inline defines
are almost identical. The only difference is that __no_kasan_or_inline
does not have the 'notrace' attribute.
To be able to replace __no_sanitize_address_or_inline with the older
definition, add 'notrace' to __no_kasan_or_inline and change the two
users of __no_sanitize_address_or_inline in the s390 code.
The 'notrace' option is necessary for e.g. the __load_psw_mask function
in arch/s390/include/asm/processor.h. Without the option it is possible
to trace __load_psw_mask which leads to kernel stack overflow.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pointed-out-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I27af631729f8ea52e55f31c02f584c01a0918073
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
(Upstream commit 026d1eaf5e).
The static lock quarantine_lock is used in quarantine.c to protect the
quarantine queue data structures. It is taken inside quarantine queue
manipulation routines (quarantine_put(), quarantine_reduce() and
quarantine_remove_cache()), with IRQs disabled. This is not a problem on
a stock kernel but is problematic on an RT kernel, where spin locks are
sleeping locks that cannot be acquired with interrupts disabled.
Convert the quarantine_lock to a raw_spinlock_t. The usage of
quarantine_lock is confined to quarantine.c and the work performed while
the lock is held is used for debugging purposes.
[bigeasy@linutronix.de: slightly altered the commit message]
Link: http://lkml.kernel.org/r/20181010214945.5owshc3mlrh74z4b@linutronix.de
Signed-off-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: I12f35246b81b23cad5ce8b90407c00b86bf90cc0
(Upstream commit dde709d136).
Due to a conflict between kasan instrumentation and inlining
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368), functions that are
defined as inline cannot be called from functions defined with
__no_sanitize_address.
Introduce __no_sanitize_address_or_inline which would expand to
__no_sanitize_address when the kernel is built with kasan support and
to inline otherwise. This helps to avoid disabling kasan
instrumentation for entire files.
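A userspace approximation of such a macro, assuming GCC or Clang
attribute spellings. CONFIG_KASAN_SKETCH, the macro name and
read_word_sketch are hypothetical stand-ins, not the kernel definitions:

```c
/*
 * With the (hypothetical) CONFIG_KASAN_SKETCH symbol defined, the function
 * is left uninstrumented and not forced inline; otherwise it is forced
 * inline like an ordinary small helper.
 */
#ifdef CONFIG_KASAN_SKETCH
#define no_sanitize_address_or_inline_sketch \
	__attribute__((no_sanitize_address)) __attribute__((unused))
#else
#define no_sanitize_address_or_inline_sketch \
	inline __attribute__((always_inline))
#endif

/* A helper that may be called from code built without instrumentation. */
static no_sanitize_address_or_inline_sketch
unsigned long read_word_sketch(const unsigned long *p)
{
	return *p;
}
```

Only the annotated helpers lose instrumentation, so the rest of the file
stays covered by the sanitizer.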
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Change-Id: If81ec5a63ae788bfe1a31a1678f9509daa76b01f
(Upstream commit 0293c8ba80).
"bellow" -> "below"
The recommendation from kegel.com/kerspell is to only fix the howlers.
"Bellow" is a synonym of "howl" so this should be appropriate.
Signed-off-by: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 128674696
Change-Id: I1c4a983d3d08ab1194ae541bfdaa7e5074ddea0e
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Leaf changes summary: 6 artifacts changed
Changed leaf types summary: 4 leaf types changed
Removed/Changed/Added functions summary: 0 Removed, 1 Changed, 0 Added function (156 filtered out)
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable (4 filtered out)
1 function with some sub-type change:
'struct fb_info at fb.h:464:1' changed:
type size changed from 6144 to 6912 (in bits)
2 data member insertions:
'delayed_work fb_info::deferred_work', at offset 5376 (in bits) at fb.h:496:1
'fb_deferred_io* fb_info::fbdefio', at offset 6080 (in bits) at fb.h:497:1
there are data member changes:
'fb_ops* fb_info::fbops' offset changed from 5376 to 6144 (in bits) (by +768 bits)
'device* fb_info::device' offset changed from 5440 to 6208 (in bits) (by +768 bits)
'device* fb_info::dev' offset changed from 5504 to 6272 (in bits) (by +768 bits)
'int fb_info::class_flag' offset changed from 5568 to 6336 (in bits) (by +768 bits)
while looking at anonymous data member 'union {char* screen_base; char* screen_buffer;}':
the internal name of that anonymous data memberchanged from:
__anonymous_union__4
to:
__anonymous_union__1
This is usually due to an anonymous member type being added or removed from the containing type
offset changed from 5632 to 6400 (in bits) (by +768 bits)
'unsigned long int fb_info::screen_size' offset changed from 5696 to 6464 (in bits) (by +768 bits)
'void* fb_info::pseudo_palette' offset changed from 5760 to 6528 (in bits) (by +768 bits)
'u32 fb_info::state' offset changed from 5824 to 6592 (in bits) (by +768 bits)
'void* fb_info::fbcon_par' offset changed from 5888 to 6656 (in bits) (by +768 bits)
'void* fb_info::par' offset changed from 5952 to 6720 (in bits) (by +768 bits)
'apertures_struct* fb_info::apertures' offset changed from 6016 to 6784 (in bits) (by +768 bits)
'bool fb_info::skip_vt_switch' offset changed from 6080 to 6848 (in bits) (by +768 bits)
411 impacted interfaces:
'struct net at net_namespace.h:51:1' changed:
type size hasn't changed
1 data member insertion:
'sk_buff_head net::wext_nlevents', at offset 33984 (in bits) at net_namespace.h:145:1
there are data member changes:
'net_generic* net::gen' offset changed from 33984 to 34176 (in bits) (by +192 bits)
1495 impacted interfaces:
'struct net_device at netdevice.h:1745:1' changed:
type size hasn't changed
2 data member insertions:
'const iw_handler_def* net_device::wireless_handlers', at offset 3904 (in bits) at netdevice.h:1800:1
'iw_public_data* net_device::wireless_data', at offset 3968 (in bits) at netdevice.h:1801:1
there are data member changes:
'const net_device_ops* net_device::netdev_ops' offset changed from 3904 to 4032 (in bits) (by +128 bits)
'const ethtool_ops* net_device::ethtool_ops' offset changed from 3968 to 4096 (in bits) (by +128 bits)
'const ndisc_ops* net_device::ndisc_ops' offset changed from 4032 to 4160 (in bits) (by +128 bits)
'const header_ops* net_device::header_ops' offset changed from 4096 to 4224 (in bits) (by +128 bits)
'unsigned int net_device::flags' offset changed from 4160 to 4288 (in bits) (by +128 bits)
'unsigned int net_device::priv_flags' offset changed from 4192 to 4320 (in bits) (by +128 bits)
'unsigned short int net_device::gflags' offset changed from 4224 to 4352 (in bits) (by +128 bits)
'unsigned short int net_device::padded' offset changed from 4240 to 4368 (in bits) (by +128 bits)
'unsigned char net_device::operstate' offset changed from 4256 to 4384 (in bits) (by +128 bits)
'unsigned char net_device::link_mode' offset changed from 4264 to 4392 (in bits) (by +128 bits)
'unsigned char net_device::if_port' offset changed from 4272 to 4400 (in bits) (by +128 bits)
'unsigned char net_device::dma' offset changed from 4280 to 4408 (in bits) (by +128 bits)
'unsigned int net_device::mtu' offset changed from 4288 to 4416 (in bits) (by +128 bits)
'unsigned int net_device::min_mtu' offset changed from 4320 to 4448 (in bits) (by +128 bits)
'unsigned int net_device::max_mtu' offset changed from 4352 to 4480 (in bits) (by +128 bits)
'unsigned short int net_device::type' offset changed from 4384 to 4512 (in bits) (by +128 bits)
'unsigned short int net_device::hard_header_len' offset changed from 4400 to 4528 (in bits) (by +128 bits)
'unsigned char net_device::min_header_len' offset changed from 4416 to 4544 (in bits) (by +128 bits)
'unsigned short int net_device::needed_headroom' offset changed from 4432 to 4560 (in bits) (by +128 bits)
'unsigned short int net_device::needed_tailroom' offset changed from 4448 to 4576 (in bits) (by +128 bits)
'unsigned char net_device::perm_addr[32]' offset changed from 4464 to 4592 (in bits) (by +128 bits)
'unsigned char net_device::addr_assign_type' offset changed from 4720 to 4848 (in bits) (by +128 bits)
'unsigned char net_device::addr_len' offset changed from 4728 to 4856 (in bits) (by +128 bits)
'unsigned short int net_device::neigh_priv_len' offset changed from 4736 to 4864 (in bits) (by +128 bits)
'unsigned short int net_device::dev_id' offset changed from 4752 to 4880 (in bits) (by +128 bits)
'unsigned short int net_device::dev_port' offset changed from 4768 to 4896 (in bits) (by +128 bits)
'spinlock_t net_device::addr_list_lock' offset changed from 4800 to 4928 (in bits) (by +128 bits)
'unsigned char net_device::name_assign_type' offset changed from 4832 to 4960 (in bits) (by +128 bits)
'bool net_device::uc_promisc' offset changed from 4840 to 4968 (in bits) (by +128 bits)
'netdev_hw_addr_list net_device::uc' offset changed from 4864 to 4992 (in bits) (by +128 bits)
'netdev_hw_addr_list net_device::mc' offset changed from 5056 to 5184 (in bits) (by +128 bits)
'netdev_hw_addr_list net_device::dev_addrs' offset changed from 5248 to 5376 (in bits) (by +128 bits)
'kset* net_device::queues_kset' offset changed from 5440 to 5568 (in bits) (by +128 bits)
'unsigned int net_device::promiscuity' offset changed from 5504 to 5632 (in bits) (by +128 bits)
'unsigned int net_device::allmulti' offset changed from 5536 to 5664 (in bits) (by +128 bits)
'tipc_bearer* net_device::tipc_ptr' offset changed from 5568 to 5696 (in bits) (by +128 bits)
'in_device* net_device::ip_ptr' offset changed from 5632 to 5760 (in bits) (by +128 bits)
'inet6_dev* net_device::ip6_ptr' offset changed from 5696 to 5824 (in bits) (by +128 bits)
'wireless_dev* net_device::ieee80211_ptr' offset changed from 5760 to 5888 (in bits) (by +128 bits)
'wpan_dev* net_device::ieee802154_ptr' offset changed from 5824 to 5952 (in bits) (by +128 bits)
'unsigned char* net_device::dev_addr' offset changed from 5888 to 6016 (in bits) (by +128 bits)
'netdev_rx_queue* net_device::_rx' offset changed from 5952 to 6080 (in bits) (by +128 bits)
'unsigned int net_device::num_rx_queues' offset changed from 6016 to 6144 (in bits) (by +128 bits)
'unsigned int net_device::real_num_rx_queues' offset changed from 6048 to 6176 (in bits) (by +128 bits)
'bpf_prog* net_device::xdp_prog' offset changed from 6080 to 6208 (in bits) (by +128 bits)
'unsigned long int net_device::gro_flush_timeout' offset changed from 6144 to 6272 (in bits) (by +128 bits)
'rx_handler_func_t* net_device::rx_handler' offset changed from 6208 to 6336 (in bits) (by +128 bits)
'void* net_device::rx_handler_data' offset changed from 6272 to 6400 (in bits) (by +128 bits)
'mini_Qdisc* net_device::miniq_ingress' offset changed from 6336 to 6464 (in bits) (by +128 bits)
'netdev_queue* net_device::ingress_queue' offset changed from 6400 to 6528 (in bits) (by +128 bits)
'nf_hook_entries* net_device::nf_hooks_ingress' offset changed from 6464 to 6592 (in bits) (by +128 bits)
'unsigned char net_device::broadcast[32]' offset changed from 6528 to 6656 (in bits) (by +128 bits)
'cpu_rmap* net_device::rx_cpu_rmap' offset changed from 6784 to 6912 (in bits) (by +128 bits)
'hlist_node net_device::index_hlist' offset changed from 6848 to 6976 (in bits) (by +128 bits)
1332 impacted interfaces:
'struct pinctrl_dev at core.h:43:1' changed:
type size changed from 1152 to 1536 (in bits)
4 data member insertions:
'radix_tree_root pinctrl_dev::pin_group_tree', at offset 320 (in bits) at core.h:48:1
'unsigned int pinctrl_dev::num_groups', at offset 448 (in bits) at core.h:49:1
'radix_tree_root pinctrl_dev::pin_function_tree', at offset 512 (in bits) at core.h:52:1
'unsigned int pinctrl_dev::num_functions', at offset 640 (in bits) at core.h:53:1
there are data member changes:
'list_head pinctrl_dev::gpio_ranges' offset changed from 320 to 704 (in bits) (by +384 bits)
'device* pinctrl_dev::dev' offset changed from 448 to 832 (in bits) (by +384 bits)
'module* pinctrl_dev::owner' offset changed from 512 to 896 (in bits) (by +384 bits)
'void* pinctrl_dev::driver_data' offset changed from 576 to 960 (in bits) (by +384 bits)
'pinctrl* pinctrl_dev::p' offset changed from 640 to 1024 (in bits) (by +384 bits)
'pinctrl_state* pinctrl_dev::hog_default' offset changed from 704 to 1088 (in bits) (by +384 bits)
'pinctrl_state* pinctrl_dev::hog_sleep' offset changed from 768 to 1152 (in bits) (by +384 bits)
'mutex pinctrl_dev::mutex' offset changed from 832 to 1216 (in bits) (by +384 bits)
'dentry* pinctrl_dev::device_root' offset changed from 1088 to 1472 (in bits) (by +384 bits)
29 impacted interfaces:
function pinctrl_dev* devm_pinctrl_register(device*, pinctrl_desc*, void*)
function int devm_pinctrl_register_and_init(device*, pinctrl_desc*, void*, pinctrl_dev**)
function void devm_pinctrl_unregister(device*, pinctrl_dev*)
function bool pin_is_valid(pinctrl_dev*, int)
function void pinconf_generic_dt_free_map(pinctrl_dev*, pinctrl_map*, unsigned int)
function int pinconf_generic_dt_node_to_map(pinctrl_dev*, device_node*, pinctrl_map**, unsigned int*, pinctrl_map_type)
function int pinconf_generic_dt_subnode_to_map(pinctrl_dev*, device_node*, pinctrl_map**, unsigned int*, unsigned int*, pinctrl_map_type)
function void pinconf_generic_dump_config(pinctrl_dev*, seq_file*, unsigned long int)
function void pinctrl_add_gpio_range(pinctrl_dev*, pinctrl_gpio_range*)
function void pinctrl_add_gpio_ranges(pinctrl_dev*, pinctrl_gpio_range*, unsigned int)
function const char* pinctrl_dev_get_devname(pinctrl_dev*)
function void* pinctrl_dev_get_drvdata(pinctrl_dev*)
function const char* pinctrl_dev_get_name(pinctrl_dev*)
function int pinctrl_enable(pinctrl_dev*)
function pinctrl_dev* pinctrl_find_and_add_gpio_range(const char*, pinctrl_gpio_range*)
function pinctrl_gpio_range* pinctrl_find_gpio_range_from_pin(pinctrl_dev*, unsigned int)
function pinctrl_gpio_range* pinctrl_find_gpio_range_from_pin_nolock(pinctrl_dev*, unsigned int)
function int pinctrl_force_default(pinctrl_dev*)
function int pinctrl_force_sleep(pinctrl_dev*)
function int pinctrl_get_group_pins(pinctrl_dev*, const char*, const unsigned int**, unsigned int*)
function pinctrl_dev* pinctrl_register(pinctrl_desc*, device*, void*)
function int pinctrl_register_and_init(pinctrl_desc*, device*, void*, pinctrl_dev**)
function void pinctrl_remove_gpio_range(pinctrl_dev*, pinctrl_gpio_range*)
function void pinctrl_unregister(pinctrl_dev*)
function int pinctrl_utils_add_config(pinctrl_dev*, unsigned long int**, unsigned int*, unsigned long int)
function int pinctrl_utils_add_map_configs(pinctrl_dev*, pinctrl_map**, unsigned int*, unsigned int*, const char*, unsigned long int*, unsigned int, pinctrl_map_type)
function int pinctrl_utils_add_map_mux(pinctrl_dev*, pinctrl_map**, unsigned int*, unsigned int*, const char*, const char*)
function void pinctrl_utils_free_map(pinctrl_dev*, pinctrl_map*, unsigned int)
function int pinctrl_utils_reserve_map(pinctrl_dev*, pinctrl_map**, unsigned int*, unsigned int*, unsigned int)
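Note on reading the offsets above: every member that follows an insertion
should move by the same amount, and for pinctrl_dev that amount (+384 bits)
equals the growth in type size (1152 to 1536 bits). A minimal sketch of how
this could be checked mechanically is shown below. It is not part of the
abidiff tooling. The names OFFSET_RE, parse_offset_changes, check_shift and
SAMPLE are hypothetical helpers introduced only for illustration, and the
regular expression assumes the line layout seen in this report.

```python
import re

# Assumed layout of an abidiff member-offset line, taken from this report:
#   'device* pinctrl_dev::dev' offset changed from 448 to 832 (in bits) (by +384 bits)
OFFSET_RE = re.compile(
    r"'(?P<member>[^']+)' offset changed from (?P<old>\d+) to (?P<new>\d+) "
    r"\(in bits\) \(by (?P<delta>[+-]\d+) bits\)"
)


def parse_offset_changes(report):
    """Yield (member, old_bits, new_bits, delta_bits) for each matching line."""
    for line in report.splitlines():
        m = OFFSET_RE.search(line)
        if m:
            yield (m["member"], int(m["old"]), int(m["new"]), int(m["delta"]))


def check_shift(report):
    """Verify new - old == delta on every line; return the distinct shifts in bytes."""
    shifts = set()
    for member, old, new, delta in parse_offset_changes(report):
        if new - old != delta:
            raise ValueError(f"inconsistent offsets for {member}")
        shifts.add(delta // 8)  # the report is in bits; 8 bits per byte
    return shifts


# Three lines copied from the pinctrl_dev section of the report.
SAMPLE = """\
'list_head pinctrl_dev::gpio_ranges' offset changed from 320 to 704 (in bits) (by +384 bits)
'device* pinctrl_dev::dev' offset changed from 448 to 832 (in bits) (by +384 bits)
'dentry* pinctrl_dev::device_root' offset changed from 1088 to 1472 (in bits) (by +384 bits)
"""

if __name__ == "__main__":
    # A single value means all listed members moved by the same number of bytes.
    print(check_shift(SAMPLE))
```

A single value in the returned set means the listed members all moved
together, which is what one would expect when new members are inserted ahead
of them. For the sample lines the shift is 48 bytes, matching
(1536 - 1152) / 8.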
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I940293527b5aa1875b49fad611bac9fc01c2931a