Adds compile coverage for TIPC in the TH builds, runtime not so much.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 140406060
Change-Id: I7731157396372683c906f1f4eb2fdbdb7015f446
Adds compile coverage for SPI in the TH builds, runtime not so much.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 140290328
Change-Id: I2d7777ab0e671248084880e5c1770b6ec6d7a650
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 1 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable
'struct zs_pool at zsmalloc.c:250:1' changed:
type size changed from 17472 to 17792 (in bits)
3 data member insertions:
'wait_queue_head zs_pool::migration_wait', at offset 17472 (in bits) at zsmalloc.c:272:1
'atomic_long_t zs_pool::isolated_pages', at offset 17664 (in bits) at zsmalloc.c:273:1
'bool zs_pool::destroying', at offset 17728 (in bits) at zsmalloc.c:274:1
10 impacted interfaces:
function unsigned long int zs_compact(zs_pool*)
function zs_pool* zs_create_pool(const char*)
function void zs_destroy_pool(zs_pool*)
function void zs_free(zs_pool*, unsigned long int)
function unsigned long int zs_get_total_pages(zs_pool*)
function size_t zs_huge_class_size(zs_pool*)
function unsigned long int zs_malloc(zs_pool*, size_t, gfp_t)
function void* zs_map_object(zs_pool*, unsigned long int, zs_mapmode)
function void zs_pool_stats(zs_pool*, zs_pool_stats*)
function void zs_unmap_object(zs_pool*, unsigned long int)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I423bb47fed17b28512f64beb5fd34c3a8dc241d3
Changes in 4.19.69
HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
MIPS: kernel: only use i8253 clocksource with periodic clockevent
mips: fix cacheinfo
netfilter: ebtables: fix a memory leak bug in compat
ASoC: dapm: Fix handling of custom_stop_condition on DAPM graph walks
selftests/bpf: fix sendmsg6_prog on s390
bonding: Force slave speed check after link state recovery for 802.3ad
net: mvpp2: Don't check for 3 consecutive Idle frames for 10G links
selftests: forwarding: gre_multipath: Enable IPv4 forwarding
selftests: forwarding: gre_multipath: Fix flower filters
can: dev: call netif_carrier_off() in register_candev()
can: mcp251x: add error check when wq alloc failed
can: gw: Fix error path of cgw_module_init
ASoC: Fail card instantiation if DAI format setup fails
st21nfca_connectivity_event_received: null check the allocation
st_nci_hci_connectivity_event_received: null check the allocation
ASoC: rockchip: Fix mono capture
ASoC: ti: davinci-mcasp: Correct slot_width posed constraint
net: usb: qmi_wwan: Add the BroadMobi BM818 card
qed: RDMA - Fix the hw_ver returned in device attributes
isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain()
mac80211_hwsim: Fix possible null-pointer dereferences in hwsim_dump_radio_nl()
netfilter: ipset: Actually allow destination MAC address for hash:ip,mac sets too
netfilter: ipset: Copy the right MAC address in bitmap:ip,mac and hash:ip,mac sets
netfilter: ipset: Fix rename concurrency with listing
rxrpc: Fix potential deadlock
rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet
isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
net: phy: phy_led_triggers: Fix a possible null-pointer dereference in phy_led_trigger_change_speed()
perf bench numa: Fix cpu0 binding
can: sja1000: force the string buffer NULL-terminated
can: peak_usb: force the string buffer NULL-terminated
net/ethernet/qlogic/qed: force the string buffer NULL-terminated
NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts
HID: quirks: Set the INCREMENT_USAGE_ON_DUPLICATE quirk on Saitek X52
HID: input: fix a4tech horizontal wheel custom usage
drm/rockchip: Suspend DP late
SMB3: Fix potential memory leak when processing compound chain
SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL
s390: put _stext and _etext into .text section
net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
net: stmmac: Fix issues when number of Queues >= 4
net: stmmac: tc: Do not return a fragment entry
net: hisilicon: make hip04_tx_reclaim non-reentrant
net: hisilicon: fix hip04-xmit never return TX_BUSY
net: hisilicon: Fix dma_map_single failed on arm64
libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
libata: add SG safety checks in SFF pio transfers
x86/lib/cpu: Address missing prototypes warning
drm/vmwgfx: fix memory leak when too many retries have occurred
block, bfq: handle NULL return value by bfq_init_rq()
perf ftrace: Fix failure to set cpumask when only one cpu is present
perf cpumap: Fix writing to illegal memory in handling cpumap mask
perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
KVM: arm64: Don't write junk to sysregs on reset
KVM: arm: Don't write junk to CP15 registers on reset
selftests: kvm: Adding config fragments
HID: wacom: correct misreported EKR ring values
HID: wacom: Correct distance scale for 2nd-gen Intuos devices
Revert "dm bufio: fix deadlock with loop device"
clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
ceph: clear page dirty before invalidate page
ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
libceph: fix PG split vs OSD (re)connect race
drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
gpiolib: never report open-drain/source lines as 'input' to user-space
Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
x86/apic: Handle missing global clockevent gracefully
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
x86/boot: Save fields explicitly, zero out everything else
x86/boot: Fix boot regression caused by bootparam sanitizing
dm kcopyd: always complete failed jobs
dm btree: fix order of block initialization in btree_split_beneath
dm integrity: fix a crash due to BUG_ON in __journal_read_write()
dm raid: add missing cleanup in raid_ctr()
dm space map metadata: fix missing store of apply_bops() return value
dm table: fix invalid memory accesses with too high sector number
dm zoned: improve error handling in reclaim
dm zoned: improve error handling in i/o map code
dm zoned: properly handle backing device failure
genirq: Properly pair kobject_del() with kobject_add()
mm, page_owner: handle THP splits correctly
mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
mm/zsmalloc.c: fix race condition in zs_destroy_pool
xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
xfs: don't trip over uninitialized buffer on extent read of corrupted inode
xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
xfs: Add helper function xfs_attr_try_sf_addname
xfs: Add attibute set and helper functions
xfs: Add attibute remove and helper functions
xfs: always rejoin held resources during defer roll
dm zoned: fix potential NULL dereference in dmz_do_reclaim()
powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB
rxrpc: Fix local endpoint refcounting
rxrpc: Fix read-after-free in rxrpc_queue_local()
rxrpc: Fix local endpoint replacement
rxrpc: Fix local refcounting
Linux 4.19.69
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9824a29e0434a6a80e2f32fdb88c0ac1fe8e5af5
Upstream commit 733d1d1a77 ("lib/test_meminit.c: use GFP_ATOMIC in RCU
critical section").
kmalloc() shouldn't sleep while in RCU critical section, therefore use
GFP_ATOMIC instead of GFP_KERNEL.
The bug was spotted by the 0day kernel testing robot.
Link: http://lkml.kernel.org/r/20190725121703.210874-1-glider@google.com
Fixes: 7e659650cbda ("lib: introduce test_meminit module")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I0cc435a0a5478e590180f720e3548d6e27789a1e
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit 1b7e816fc8 ("mm: slub: Fix slab walking for
init_on_free").
To properly clear the slab on free with slab_want_init_on_free, we walk
the list of free objects using get_freepointer/set_freepointer.
The value we get from get_freepointer may not be valid. This isn't an
issue since an actual value will get written later but this means
there's a chance of triggering a bug if we use this value with
set_freepointer:
kernel BUG at mm/slub.c:306!
invalid opcode: 0000 [#1] PREEMPT PTI
CPU: 0 PID: 0 Comm: swapper Not tainted 5.2.0-05754-g6471384a #4
RIP: 0010:kfree+0x58a/0x5c0
Code: 48 83 05 78 37 51 02 01 0f 0b 48 83 05 7e 37 51 02 01 48 83 05 7e 37 51 02 01 48 83 05 7e 37 51 02 01 48 83 05 d6 37 51 02 01 <0f> 0b 48 83 05 d4 37 51 02 01 48 83 05 d4 37 51 02 01 48 83 05 d4
RSP: 0000:ffffffff82603d90 EFLAGS: 00010002
RAX: ffff8c3976c04320 RBX: ffff8c3976c04300 RCX: 0000000000000000
RDX: ffff8c3976c04300 RSI: 0000000000000000 RDI: ffff8c3976c04320
RBP: ffffffff82603db8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff8c3976c04320 R11: ffffffff8289e1e0 R12: ffffd52cc8db0100
R13: ffff8c3976c01a00 R14: ffffffff810f10d4 R15: ffff8c3976c04300
FS: 0000000000000000(0000) GS:ffffffff8266b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8c397ffff000 CR3: 0000000125020000 CR4: 00000000000406b0
Call Trace:
apply_wqattrs_prepare+0x154/0x280
apply_workqueue_attrs_locked+0x4e/0xe0
apply_workqueue_attrs+0x36/0x60
alloc_workqueue+0x25a/0x6d0
workqueue_init_early+0x246/0x348
start_kernel+0x3c7/0x7ec
x86_64_start_reservations+0x40/0x49
x86_64_start_kernel+0xda/0xe4
secondary_startup_64+0xb6/0xc0
Modules linked in:
---[ end trace f67eb9af4d8d492b ]---
Fix this by ensuring the value we set with set_freepointer is either NULL
or another value in the chain.
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Fixes: 6471384af2 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I89414f6bf1559771d5d7c14853ddfe9f769989e5
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit 4ab7ace465 ("lib/test_meminit.c: minor test fixes").
Fix the following issues in test_meminit.c:
- |size| in fill_with_garbage_skip() should be signed so that it
doesn't overflow if it's not aligned on sizeof(*p);
- fill_with_garbage_skip() should actually skip |skip| bytes;
- do_kmem_cache_size() should deallocate memory in the RCU case.
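A minimal userspace sketch of the first two fixes, for illustration only. The helper name fill_garbage_skip() and the constant values are assumptions for this sketch, not the code in lib/test_meminit.c. The remaining size is kept signed so an unaligned tail cannot wrap around, and filling starts |skip| bytes into the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative garbage patterns (assumed values for this sketch). */
#define GARBAGE_INT  0x09A7BA9Eu
#define GARBAGE_BYTE 0x9E

/* Fill buf with garbage, leaving the first `skip` bytes untouched. */
static void fill_garbage_skip(void *buf, long size, size_t skip)
{
	unsigned char *p = (unsigned char *)buf + skip;
	const unsigned int word = GARBAGE_INT;

	assert(size >= 0 && skip <= (size_t)size);
	size -= (long)skip;

	/*
	 * size is signed, so the loop ends cleanly once fewer than
	 * sizeof(word) bytes remain instead of wrapping around.
	 */
	while (size >= (long)sizeof(word)) {
		memcpy(p, &word, sizeof(word));
		p += sizeof(word);
		size -= (long)sizeof(word);
	}
	/* Unaligned tail: fill the leftover bytes one by one. */
	if (size > 0)
		memset(p, GARBAGE_BYTE, (size_t)size);
}
```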
Link: http://lkml.kernel.org/r/20190626133135.217355-1-glider@google.com
Fixes: 7e659650cbda ("lib: introduce test_meminit module")
Fixes: 94e8988d91c7 ("lib/test_meminit.c: fix -Wmaybe-uninitialized false positive")
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Icb31f93f509a25e1fc8d05286a5ba56e720d6b03
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit d3a811617a ("lib/test_meminit.c: fix
-Wmaybe-uninitialized false positive").
The conditional logic is too complicated for the compiler to fully
comprehend:
lib/test_meminit.c: In function 'test_meminit_init':
lib/test_meminit.c:236:5: error: 'buf_copy' may be used uninitialized in this function [-Werror=maybe-uninitialized]
kfree(buf_copy);
^~~~~~~~~~~~~~~
lib/test_meminit.c:201:14: note: 'buf_copy' was declared here
Simplify it by splitting out the non-rcu section.
Link: http://lkml.kernel.org/r/20190617131210.2190280-1-arnd@arndb.de
Fixes: af734ee6ec85 ("lib: introduce test_meminit module")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I810a4b8f115f387c8ffe30937ac1225849520132
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit 5015a300a5 ("lib: introduce test_meminit module").
Add tests for heap and pagealloc initialization. These can be used to
check init_on_alloc and init_on_free implementations as well as other
approaches to initialization.
Expected test output in the case the kernel provides heap initialization
(e.g. when running with either init_on_alloc=1 or init_on_free=1):
test_meminit: all 10 tests in test_pages passed
test_meminit: all 40 tests in test_kvmalloc passed
test_meminit: all 60 tests in test_kmemcache passed
test_meminit: all 10 tests in test_rcu_persistent passed
test_meminit: all 120 tests passed!
Link: http://lkml.kernel.org/r/20190529123812.43089-4-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Sandeep Patil <sspatil@android.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I815efdc79304ae1ad6ef60277ac5b88a4c41d479
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit 23a5c8cb7a ("mm: init: report memory
auto-initialization features at boot time").
Print the currently enabled stack and heap initialization modes.
Stack initialization is enabled by a config flag, while heap
initialization is configured at boot time with defaults being set in the
config. It's more convenient for the user to have all information about
these hardening measures in one place at boot, so the user can reason
about the expected behavior of the running system.
The possible options for stack are:
- "all" for CONFIG_INIT_STACK_ALL;
- "byref_all" for CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL;
- "byref" for CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF;
- "__user" for CONFIG_GCC_PLUGIN_STRUCTLEAK_USER;
- "off" otherwise.
Depending on the values of init_on_alloc and init_on_free boottime options
we also report "heap alloc" and "heap free" as "on"/"off".
In the init_on_free mode initializing pages at boot time may take a while,
so print a notice about that as well. This depends on how much memory is
installed, the memory bandwidth, etc. On a relatively modern x86 system,
it takes about 0.75s/GB to wipe all memory:
[ 0.418722] mem auto-init: stack:byref_all, heap alloc:off, heap free:on
[ 0.419765] mem auto-init: clearing system memory may take some time...
[ 12.376605] Memory: 16408564K/16776672K available (14339K kernel code, 1397K rwdata, 3756K rodata, 1636K init, 11460K bss, 368108K reserved, 0K cma-reserved)
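The 0.75s/GB figure can be cross-checked against the log above. This is a rough sketch that assumes the gap between the "clearing system memory" and "Memory:" messages is dominated by the wipe. The helper name is hypothetical:

```c
#include <assert.h>

/* Seconds spent per GiB, given two dmesg timestamps and a size in KiB. */
static double wipe_seconds_per_gib(double t_start, double t_end,
				   double mem_kib)
{
	assert(t_end >= t_start && mem_kib > 0.0);
	return (t_end - t_start) / (mem_kib / (1024.0 * 1024.0));
}
```

With the values from the log, wipe_seconds_per_gib(0.419765, 12.376605, 16776672.0) comes out at roughly 0.75.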
Link: http://lkml.kernel.org/r/20190617151050.92663-3-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Sandeep Patil <sspatil@android.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: Kaiwan N Billimoria <kaiwan@kaiwantech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I12ba67dd5d97491a331e7d5597b60b8e08bf453f
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit 6471384af2 ("mm: security: introduce init_on_alloc=1
and init_on_free=1 boot options").
Patch series "add init_on_alloc/init_on_free boot options", v10.
Provide init_on_alloc and init_on_free boot options.
These are aimed at preventing possible information leaks and making the
control-flow bugs that depend on uninitialized values more deterministic.
Enabling either of the options guarantees that the memory returned by the
page allocator and SL[AU]B is initialized with zeroes. SLOB allocator
isn't supported at the moment, as its emulation of kmem caches complicates
handling of SLAB_TYPESAFE_BY_RCU caches correctly.
Enabling init_on_free also guarantees that pages and heap objects are
initialized right after they're freed, so it won't be possible to access
stale data by using a dangling pointer.
As suggested by Michal Hocko, right now we don't let heap users
disable initialization for certain allocations. There's not enough
evidence that doing so can speed up real-life cases, and introducing ways
to opt-out may result in things going out of control.
This patch (of 2):
The new options are needed to prevent possible information leaks and make
control-flow bugs that depend on uninitialized values more deterministic.
This is expected to be on-by-default on Android and Chrome OS. And it
gives the opportunity for anyone else to use it under distros too via the
boot args. (The init_on_free feature is regularly requested by folks
where memory forensics is included in their threat models.)
init_on_alloc=1 makes the kernel initialize newly allocated pages and heap
objects with zeroes. Initialization is done at allocation time at the
places where checks for __GFP_ZERO are performed.
init_on_free=1 makes the kernel initialize freed pages and heap objects
with zeroes upon their deletion. This helps to ensure sensitive data
doesn't leak via use-after-free accesses.
Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator
returns zeroed memory. The two exceptions are slab caches with
constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never
zero-initialized to preserve their semantics.
Both init_on_alloc and init_on_free default to zero, but those defaults
can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and
CONFIG_INIT_ON_FREE_DEFAULT_ON.
If either SLUB poisoning or page poisoning is enabled, those options take
precedence over init_on_alloc and init_on_free: initialization is only
applied to unpoisoned allocations.
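The rules above for init_on_alloc can be condensed into a small decision function. This is a simplified userspace model of the behaviour as described in this message, not the kernel implementation. All names prefixed with sk_/SK_ are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, for this sketch only. */
#define SK_GFP_ZERO        0x1u
#define SK_TYPESAFE_BY_RCU 0x2u

struct sk_cache {
	unsigned int flags;	/* SK_TYPESAFE_BY_RCU or 0 */
	bool has_ctor;		/* cache was created with a constructor */
};

struct sk_config {
	bool init_on_alloc;	/* boot option or its Kconfig default */
	bool poisoning;		/* SLUB or page poisoning is active */
};

/* Should a slab object be zeroed at allocation time? */
static bool sk_want_init_on_alloc(const struct sk_config *cfg,
				  const struct sk_cache *c,
				  unsigned int gfp)
{
	assert(cfg != NULL && c != NULL);

	if (cfg->init_on_alloc && !cfg->poisoning) {
		/* Constructor and RCU-typesafe caches keep their contents. */
		if (c->has_ctor || (c->flags & SK_TYPESAFE_BY_RCU))
			return false;
		return true;
	}
	/* Option off or poisoning wins: zero only on explicit request. */
	return (gfp & SK_GFP_ZERO) != 0;
}
```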
Slowdown for the new features compared to init_on_free=0, init_on_alloc=0:
hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%)
hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%)
Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%)
Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%)
Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%)
Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%)
The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline
is within the standard error.
The new features are also going to pave the way for hardware memory
tagging (e.g. arm64's MTE), which will require both on_alloc and on_free
hooks to set the tags for heap objects. With MTE, tagging will have the
same cost as memory initialization.
Although init_on_free is rather costly, there are paranoid use-cases where
in-memory data lifetime is desired to be minimized. There are various
arguments for/against the realism of the associated threat models, but
given that we'll need the infrastructure for MTE anyway, and there are
people who want wipe-on-free behavior no matter what the performance cost,
it seems reasonable to include it in this series.
[glider@google.com: v8]
Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com
[glider@google.com: v9]
Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com
[glider@google.com: v10]
Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com
Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts]
Acked-by: James Morris <jamorris@linux.microsoft.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Sandeep Patil <sspatil@android.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: If0620a6a8aed34c21e98458c965e94f5a9dfd297
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
Upstream commit ba5c5e4a5d ("arm64: move jump_label_init() before
parse_early_param()").
While jump_label_init() was moved earlier in the boot process in commit
efd9e03fac ("arm64: Use static keys for CPU features"), it wasn't early
enough for early params to use it. The old state of things was as
described here...
init/main.c calls out to arch-specific things before general jump label
and early param handling:
	asmlinkage __visible void __init start_kernel(void)
	{
		...
		setup_arch(&command_line);
		...
		smp_prepare_boot_cpu();
		...
		/* parameters may set static keys */
		jump_label_init();
		parse_early_param();
		...
	}
x86 setup_arch() wants those earlier, so it handles jump label and
early param:
	void __init setup_arch(char **cmdline_p)
	{
		...
		jump_label_init();
		...
		parse_early_param();
		...
	}
arm64 setup_arch() only had early param:
	void __init setup_arch(char **cmdline_p)
	{
		...
		parse_early_param();
		...
	}
with jump label later in smp_prepare_boot_cpu():
	void __init smp_prepare_boot_cpu(void)
	{
		...
		jump_label_init();
		...
	}
This moves arm64 jump_label_init() from smp_prepare_boot_cpu() to
setup_arch(), as done already on x86, in preparation for early param
usage in the init_on_alloc/free() series:
https://lkml.kernel.org/r/1561572949.5154.81.camel@lca.pw
Link: http://lkml.kernel.org/r/201906271003.005303B52@keescook
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I50c0ba78cc366ea465377c80ed5a477b50d15bf9
Bug: 138435492
Test: Boot cuttlefish with and without
Test: CONFIG_INIT_ON_ALLOC_DEFAULT_ON/CONFIG_INIT_ON_FREE_DEFAULT_ON
Signed-off-by: Alexander Potapenko <glider@google.com>
ABI test is complaining. I can't repro locally. Maybe updating ABI will
help.
Change-Id: I9a320517042d8487d08ec9290efc4b15d6e97b61
Signed-off-by: Tri Vo <trong@google.com>
CONFIG_ARCH_QCOM is a dependency of the above and selects
CONFIG_{PINCTRL, REGULATOR, TMPFS}.
Bug: 133441279
Bug: 133441092
Bug: 133440650
Change-Id: I22c37946ec3a62ccbd3fa65bbc09076964d86475
Signed-off-by: Tri Vo <trong@google.com>
Commit 3f0d9e2984 ("ANDROID: Remove unused cuttlefish build infra")
caused a regression in adb-remount-test.sh by removing
CONFIG_OVERLAY_FS from the x86 configuration.
The gki/cuttlefish configuration for arm64 was not affected
by the regression because CONFIG_OVERLAY_FS is still enabled there.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Test: adb-remount-test.sh
Bug: 138649540
Change-Id: I9e51475f1025f726e69c6f513099248bf452cc2d
The combination of the legacy Ion driver and SPARSEMEM for carveout
regions results in invalid page structures, breaking page_to_pfn().
This can be temporarily resolved with SPARSEMEM_VMEMMAP until the Ion
driver is refactored and the problem can be reinvestigated.
At that point we can check whether the issue is solved, or perhaps
correct it using fewer resources than SPARSEMEM_VMEMMAP requires. The
ABI does not change, so we have the flexibility to adjust this
configuration.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 138851285
Bug: 138149732
Test: ABI_DEFINITION=common/abi_gki_aarch64.xml \
BUILD_CONFIG=common/build.config.gki.aarch64 ./build/build_abi.sh
Change-Id: I25cc8ebe9e25260b9869c5e8d8667b280f83ca51
[ Upstream commit 68553f1a6f ]
Fix rxrpc_unuse_local() to handle a NULL local pointer as it can be called
on an unbound socket on which rx->local is not yet set.
The following reproducer (includes omitted):
	int main(void)
	{
		socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
		return 0;
	}
causes the following oops to occur:
BUG: kernel NULL pointer dereference, address: 0000000000000010
...
RIP: 0010:rxrpc_unuse_local+0x8/0x1b
...
Call Trace:
rxrpc_release+0x2b5/0x338
__sock_release+0x37/0xa1
sock_close+0x14/0x17
__fput+0x115/0x1e9
task_work_run+0x72/0x98
do_exit+0x51b/0xa7a
? __context_tracking_exit+0x4e/0x10e
do_group_exit+0xab/0xab
__x64_sys_exit_group+0x14/0x17
do_syscall_64+0x89/0x1d4
entry_SYSCALL_64_after_hwframe+0x49/0xbe
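The shape of the fix can be sketched in userspace as follows. struct sk_local and sk_unuse_local() are hypothetical stand-ins for illustration, not the rxrpc code. The point is only that the release path tolerates an endpoint that was never set:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a local endpoint. */
struct sk_local {
	atomic_int active_users;
	bool destroy_queued;
};

/* Drop one active user; a NULL endpoint (unbound socket) is a no-op. */
static void sk_unuse_local(struct sk_local *local)
{
	int prev;

	if (!local)
		return;

	/* atomic_fetch_sub() returns the value before the decrement. */
	prev = atomic_fetch_sub(&local->active_users, 1);
	assert(prev > 0);
	if (prev == 1)
		local->destroy_queued = true;
}
```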
Reported-by: syzbot+20dee719a2e090427b5f@syzkaller.appspotmail.com
Fixes: 730c5fd42c ("rxrpc: Fix local endpoint refcounting")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b00df840fb ]
When a local endpoint (struct rxrpc_local) ceases to be in use by any
AF_RXRPC sockets, it starts the process of being destroyed, but this
doesn't cause it to be removed from the namespace endpoint list immediately
as tearing it down isn't trivial and can't be done in softirq context, so
it gets deferred.
If a new socket comes along that wants to bind to the same endpoint, a new
rxrpc_local object will be allocated and rxrpc_lookup_local() will use
list_replace() to substitute the new one for the old.
Then, when the dying object gets to rxrpc_local_destroyer(), it is removed
unconditionally from whatever list it is on by calling list_del_init().
However, list_replace() doesn't reset the pointers in the replaced
list_head and so the list_del_init() will likely corrupt the local
endpoints list.
Fix this by using list_replace_init() instead.
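The difference can be reproduced with a minimal userspace re-implementation of the list helpers. This is a sketch modelled on the kernel's list API, not the kernel header itself, and list_contains() is a helper added only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

/* Put repl where old was; old keeps its stale next/prev pointers. */
static void list_replace(struct list_head *old, struct list_head *repl)
{
	assert(old != repl);
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
}

/* As above, but also turn old into an empty list pointing at itself. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	list_replace(old, repl);
	INIT_LIST_HEAD(old);
}

/* Illustration-only helper: is e reachable from head? */
static bool list_contains(const struct list_head *head,
			  const struct list_head *e)
{
	for (const struct list_head *p = head->next; p != head; p = p->next)
		if (p == e)
			return true;
	return false;
}
```

In this model, replacing the only entry with list_replace() and then calling list_del_init() on the old entry unlinks the new entry as well, because the old entry's stale pointers still reference the list head. With list_replace_init() the later list_del_init() only touches the old entry.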
Fixes: 730c5fd42c ("rxrpc: Fix local endpoint refcounting")
Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 06d9532fa6 upstream.
rxrpc_queue_local() attempts to queue the local endpoint it is given and
then, if successful, prints a trace line. The trace line includes the
current usage count - but we're not allowed to look at the local endpoint
at this point as we passed our ref on it to the workqueue.
Fix this by reading the usage count before queuing the work item.
Also fix the reading of local->debug_id for trace lines, which must be done
with the same consideration as reading the usage count.
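The ordering rule can be sketched as follows. All names are hypothetical, and the "queue" function here simply scribbles over the object to model the worker running and releasing it immediately:

```c
#include <assert.h>
#include <stdbool.h>

struct sk_obj { int usage; int debug_id; };
struct sk_trace { int usage; int debug_id; bool queued; };

/* Models handing our reference to a worker that may free the object. */
static bool sk_queue_consume(struct sk_obj *obj)
{
	obj->usage = -1;
	obj->debug_id = -1;
	return true;
}

/* Snapshot everything the trace line needs before giving up the ref. */
static struct sk_trace sk_queue_and_trace(struct sk_obj *obj)
{
	struct sk_trace t;

	assert(obj->usage > 0);
	t.usage = obj->usage;
	t.debug_id = obj->debug_id;
	/* After this call obj must not be dereferenced again. */
	t.queued = sk_queue_consume(obj);
	return t;
}
```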
Fixes: 09d2bf595d ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 730c5fd42c upstream.
The object lifetime management on the rxrpc_local struct is broken in that
the rxrpc_local_processor() function is expected to clean up and remove an
object - but it may get requeued by packets coming in on the backing UDP
socket once it starts running.
This may result in the assertion in rxrpc_local_rcu() firing because the
memory has been scheduled for RCU destruction whilst still queued:
rxrpc: Assertion failed
------------[ cut here ]------------
kernel BUG at net/rxrpc/local_object.c:468!
Note that if the processor comes around before the RCU free function, it
will just do nothing because ->dead is true.
Fix this by adding a separate refcount to count active users of the
endpoint that causes the endpoint to be destroyed when it reaches 0.
The original refcount can then be used to refcount objects through the work
processor and cause the memory to be rcu freed when that reaches 0.
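A single-threaded model of the two counters, for illustration only. The struct and function names are hypothetical, plain ints stand in for atomics, and the assumption that each active user also holds one usage reference is a simplification made for this sketch:

```c
#include <assert.h>
#include <stdbool.h>

struct sk_endpoint {
	int usage;		/* refs held by anyone, incl. queued work */
	int active_users;	/* sockets currently using the endpoint */
	bool torn_down;		/* teardown has been requested */
	bool freed;		/* memory would have been handed to RCU */
};

/* Drop a plain reference; the memory goes away with the last one. */
static void sk_put(struct sk_endpoint *ep)
{
	assert(ep->usage > 0);
	if (--ep->usage == 0)
		ep->freed = true;
}

/* Drop an active user; the last one starts teardown, not the free. */
static void sk_unuse(struct sk_endpoint *ep)
{
	assert(ep->active_users > 0);
	if (--ep->active_users == 0)
		ep->torn_down = true;
	sk_put(ep);
}
```

Here teardown begins when the last socket stops using the endpoint, while the memory stays valid until every remaining reference, such as one held by a queued work item, has been dropped.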
Fixes: 4f95dd78a7 ("rxrpc: Rework local endpoint management")
Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The upstream commit:
22e9c88d48 ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
has a similar effect, but since it is a rewrite of the assembler to C, is
too invasive for stable. This patch is a minimal fix to address the issue in
assembler.
This patch applies cleanly to v5.2, v4.19 & v4.14.
When calling flush_(inval_)dcache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.
This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.
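The effect of the truncation can be shown with a small C model. This is a sketch of the arithmetic only, not the assembler being patched. The function names and the 128-byte block size used in the example are assumptions:

```c
#include <assert.h>
#include <stdint.h>

/* Blocks to flush when only the low 32 bits of the size are shifted. */
static uint64_t blocks_32bit_shift(uint64_t size, unsigned int log_block)
{
	assert(log_block < 32);
	return (uint32_t)size >> log_block;
}

/* Blocks to flush when the full 64-bit size is shifted. */
static uint64_t blocks_64bit_shift(uint64_t size, unsigned int log_block)
{
	assert(log_block < 32);
	return size >> log_block;
}
```

For a size of 4GB + 256 bytes and 128-byte blocks (log_block = 7), the 32-bit variant yields 2 blocks while the 64-bit variant yields 33554434.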
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e0702d90b7 ]
This function is supposed to return error pointers so it matches the
dmz_get_rnd_zone_for_reclaim() function. The current code could lead to
a NULL dereference in dmz_do_reclaim().
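Why NULL is a problem for a caller that only checks for error pointers can be seen in a userspace imitation of the error-pointer convention. The sk_* helpers are hypothetical re-implementations for illustration, and -EBUSY is an assumed error code for this sketch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static void *sk_err_ptr(long err)
{
	assert(err < 0 && err >= -SK_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static bool sk_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SK_MAX_ERRNO;
}

static long sk_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct sk_zone { int id; };

/* Return a zone, or an error pointer when none is found; never NULL. */
static struct sk_zone *sk_get_zone(struct sk_zone *candidate)
{
	if (!candidate)
		return sk_err_ptr(-EBUSY);
	return candidate;
}
```

Note that sk_is_err(NULL) is false, so a callee that returns NULL on failure slips past an error-pointer check and is dereferenced by the caller.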
Fixes: b234c6d7a7 ("dm zoned: improve error handling in reclaim")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 710d707d2f upstream.
During testing of xfs/141 on a V4 filesystem, I observed some
inconsistent behavior with regards to resources that are held (i.e.
remain locked) across a defer roll. The transaction roll always gives
the defer roll function a new transaction, even if committing the old
transaction fails. However, the defer roll function only rejoins the
held resources if the transaction commit succeeded. This means that
callers of defer roll have to figure out whether the held resources are
attached to the transaction being passed back.
Worse yet, if the defer roll was part of a defer finish call, we have a
third possibility: the defer finish could pass back a dirty transaction
with dirty held resources and an error code.
The only sane way to handle all of these scenarios is to require that
the code that held the resource either cancel the transaction before
unlocking and releasing the resources, or use functions that detach
resources from a transaction properly (e.g. xfs_trans_brelse) if they
need to drop the reference before committing or cancelling the
transaction.
In order to make this so, change the defer roll code to join held
resources to the new transaction unconditionally and fix all the bhold
callers to release the held buffers correctly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[mcgrof: fixes kz#204223 ]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 068f985a9e upstream.
This patch adds xfs_attr_remove_args. These sub-routines remove
the attributes specified in @args. We will use this later for setting
parent pointers as a deferred attribute operation.
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2f3cd80919 upstream.
This patch adds xfs_attr_set_args and xfs_bmap_set_attrforkoff.
These sub-routines set the attributes specified in @args.
We will use this later for setting parent pointers as a deferred
attribute operation.
[dgc: remove attr fork init code from xfs_attr_set_args().]
[dgc: xfs_attr_try_sf_addname() NULLs args.trans after commit.]
[dgc: correct sf add error handling.]
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 4c74a56b9d upstream.
This patch adds a subroutine xfs_attr_try_sf_addname
used by xfs_attr_set. This subroutine will attempt to
add the attribute name specified in args in shortform,
as well as perform error handling previously done in
xfs_attr_set.
This patch helps to pre-simplify xfs_attr_set for reviewing
purposes and reduce indentation. New function will be added
in the next patch.
[dgc: moved commit to helper function, too.]
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6958d11f77 upstream.
We've had rather rare reports of bmap btree block corruption where
the bmap root block has a level count of zero. The root cause of the
corruption is so far unknown. We do have verifier checks to detect
this form of on-disk corruption, but this doesn't cover a memory
corruption variant of the problem. The latter is a reasonable
possibility because the root block is part of the inode fork and can
reside in-core for some time before inode extents are read.
If this occurs, it leads to a system crash such as the following:
BUG: unable to handle kernel paging request at ffffffff00000221
PF error: [normal kernel read fault]
...
RIP: 0010:xfs_trans_brelse+0xf/0x200 [xfs]
...
Call Trace:
xfs_iread_extents+0x379/0x540 [xfs]
xfs_file_iomap_begin_delay+0x11a/0xb40 [xfs]
? xfs_attr_get+0xd1/0x120 [xfs]
? iomap_write_begin.constprop.40+0x2d0/0x2d0
xfs_file_iomap_begin+0x4c4/0x6d0 [xfs]
? __vfs_getxattr+0x53/0x70
? iomap_write_begin.constprop.40+0x2d0/0x2d0
iomap_apply+0x63/0x130
? iomap_write_begin.constprop.40+0x2d0/0x2d0
iomap_file_buffered_write+0x62/0x90
? iomap_write_begin.constprop.40+0x2d0/0x2d0
xfs_file_buffered_aio_write+0xe4/0x3b0 [xfs]
__vfs_write+0x150/0x1b0
vfs_write+0xba/0x1c0
ksys_pwrite64+0x64/0xa0
do_syscall_64+0x5a/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The crash occurs because xfs_iread_extents() attempts to release an
uninitialized buffer pointer as the level == 0 value prevented the
buffer from ever being allocated or read. Change the level > 0
assert to an explicit error check in xfs_iread_extents() to avoid
crashing the kernel in the event of localized, in-core inode
corruption.
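A simplified sketch of turning the assert into an explicit error check. The types, the function names and the use of -EIO as the error code are assumptions for illustration; the real xfs_iread_extents() is considerably more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct buf { int refcount; };		/* hypothetical buffer */
struct root_block { int level; };	/* hypothetical in-core root */

static void buf_release(struct buf *bp)
{
	bp->refcount--;		/* would fault if bp were never set */
}

static int read_extents(const struct root_block *root, struct buf *leaf)
{
	struct buf *bp = NULL;

	/*
	 * Explicit check instead of an assert: a level of zero means
	 * no buffer would ever be read, so bail out before bp is used.
	 */
	if (root->level <= 0)
		return -EIO;

	/* With level > 0 the walk reads at least one block, so bp is set. */
	bp = leaf;
	bp->refcount++;

	buf_release(bp);
	return 0;
}
```

An assert compiles away on non-debug builds, so only the explicit check protects the release path in production.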
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 1fb254aa98 upstream.
Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
fails on account of being out of disk quota. I ran his reproducer
script:
# adduser dummy
# adduser dummy plugdev
# dd if=/dev/zero bs=1M count=100 of=test.img
# mkfs.xfs test.img
# mount -t xfs -o gquota test.img /mnt
# mkdir -p /mnt/dummy
# chown -c dummy /mnt/dummy
# xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt
(and then as user dummy)
$ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
$ chgrp plugdev /mnt/dummy/foo
and saw:
================================================
WARNING: lock held when returning to user space!
5.3.0-rc5 #rc5 Tainted: G W
------------------------------------------------
chgrp/47006 is leaving the kernel with locks still held!
1 lock held by chgrp/47006:
#0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]
...which is clearly caused by xfs_setattr_nonsize failing to unlock the
ILOCK after the xfs_qm_vop_chown_reserve call fails. Add the missing
unlock.
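The shape of the fix can be illustrated with a small userspace sketch. The lock counter, reserve_quota() and change_group() are hypothetical names that only model the control flow; they are not the XFS functions.

```c
#include <assert.h>
#include <errno.h>

struct inode_lock { int held; };	/* hypothetical lock model */

static void ilock(struct inode_lock *l)   { l->held++; }
static void iunlock(struct inode_lock *l) { l->held--; }

/* Fails when the request exceeds the quota limit. */
static int reserve_quota(long blocks, long limit)
{
	return blocks > limit ? -EDQUOT : 0;
}

static int change_group(struct inode_lock *l, long blocks, long limit)
{
	int error;

	ilock(l);

	error = reserve_quota(blocks, limit);
	if (error)
		goto out_unlock;	/* error path must drop the lock too */

	/* ... apply the ownership change here ... */

out_unlock:
	iunlock(l);
	return error;
}
```

Both the success path and the quota failure path leave the function with the lock released, which is the property the lockdep warning above was complaining about.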
Reported-by: benjamin.moody@gmail.com
Fixes: 253f4911f2 ("xfs: better xfs_trans_alloc interface")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 701d678599 upstream.
In zs_destroy_pool() we call flush_work(&pool->free_work). However, we
have no guarantee that migration isn't happening in the background at
that time.
Since migration can't directly free pages, it relies on free_work being
scheduled to free the pages. But there's nothing preventing an
in-progress migrate from queuing the work *after*
zs_unregister_migration() has called flush_work(). Which would mean
pages still pointing at the inode when we free it.
Since we know at destroy time all objects should be free, no new
migrations can come in (since zs_page_isolate() fails for fully-free
zspages). This means it is sufficient to track a "# isolated zspages"
count by class, and have the destroy logic ensure all such pages have
drained before proceeding. Keeping that state under the class spinlock
keeps the logic straightforward.
In this case a memory leak could lead to an eventual crash if compaction
hits the leaked page. This crash would only occur if people are
changing their zswap backend at runtime (which eventually starts
destruction).
Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
Fixes: 48b4800a1c ("zsmalloc: page migration support")
Signed-off-by: Henry Burns <henryburns@google.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Jonathan Adams <jwadams@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0ff14fdc9 upstream.
If alloc_descs() fails before irq_sysfs_init() has run, free_desc() in the
cleanup path will call kobject_del() even though the kobject has not been
added with kobject_add().
Fix this by making the call to kobject_del() conditional on whether
irq_sysfs_init() has run.
This problem surfaced because commit aa30f47cf6 ("kobject: Add support
for default attribute groups to kobj_type") makes kobject_del() stricter
about pairing with kobject_add(). If the pairing is incorrect, a WARNING
and backtrace occur in sysfs_remove_group() because there is no parent.
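A rough model of the conditional delete, assuming a flag that records whether the sysfs setup has run. struct fake_kobj, fake_kobject_del() and sysfs_initialised are illustrative stand-ins, not the genirq code.

```c
#include <assert.h>
#include <stdbool.h>

struct fake_kobj { int del_calls; };	/* hypothetical kobject model */

/* Stand-in for "irq_sysfs_init() has run". */
static bool sysfs_initialised;

static void fake_kobject_del(struct fake_kobj *k)
{
	k->del_calls++;
}

/* Only undo the add if the add could have happened. */
static void desc_sysfs_del(struct fake_kobj *k)
{
	if (sysfs_initialised)
		fake_kobject_del(k);
}
```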
[ tglx: Add a comment to the code and make it work with CONFIG_SYSFS=n ]
Fixes: ecb3f394c5 ("genirq: Expose interrupt information through sysfs")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564703564-4116-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75d66ffb48 upstream.
dm-zoned is observed to lock up or livelock in case of hardware
failure or some misconfiguration of the backing zoned device.
This patch adds a new dm-zoned target function that checks the status of
the backing device. If the request queue of the backing device is found
to be in dying state or the SCSI backing device enters offline state,
the health check code sets a dm-zoned target flag prompting all further
incoming I/O to be rejected. In order to detect backing device failures
timely, this new function is called in the request mapping path, at the
beginning of every reclaim run and before performing any metadata I/O.
The proper way out of this situation is to do
dmsetup remove <dm-zoned target>
and recreate the target when the problem with the backing device
is resolved.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7428c5011 upstream.
Some errors are ignored in the I/O path during queueing chunks
for processing by chunk works. Since at least these errors are
transient in nature, it should be possible to retry the failed
incoming commands.
The fix -
Errors that can happen while queueing chunks are carried upwards
to the main mapping function and it now returns DM_MAPIO_REQUEUE
for any incoming requests that can not be properly queued.
Error logging/debug messages are added where needed.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b234c6d7a7 upstream.
There are several places in reclaim code where errors are not
propagated to the main function, dmz_reclaim(). This function
is responsible for unlocking zones that might be still locked
at the end of any failed reclaim iterations. As a result,
some device zones may be left permanently locked for reclaim,
degrading target's capability to reclaim zones.
This patch fixes these issues as follows -
Make sure that dmz_reclaim_buf(), dmz_reclaim_seq_data() and
dmz_reclaim_rnd_data() return error codes to the caller.
dmz_reclaim() function is renamed to dmz_do_reclaim() to avoid
clashing with "struct dmz_reclaim" and is modified to return the
error to the caller.
dmz_get_zone_for_reclaim() now returns an error instead of NULL
pointer and reclaim code checks for that error.
Error logging/debug messages are added where necessary.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cfd5d3399 upstream.
If the sector number is too high, dm_table_find_target() should return a
pointer to a zeroed dm_target structure (the caller should test it with
dm_target_is_valid).
However, for some table sizes, the code in dm_table_find_target() that
performs btree lookup will access out of bound memory structures.
Fix this bug by testing the sector number at the beginning of
dm_table_find_target(). Also, add an "inline" keyword to the function
dm_table_get_size() because this is a hot path.
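A sketch of checking the sector against the table size before any lookup. The structure layout is an assumption, and a linear scan is swapped in for the btree lookup to keep the example short; only the up-front bounds test reflects the fix described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: highs[i] is the last sector of target i. */
struct table {
	const uint64_t *highs;
	size_t num_targets;
	uint64_t size;		/* total sectors covered by the table */
};

/* Returns the target index, or -1 if the sector is beyond the table. */
static long find_target(const struct table *t, uint64_t sector)
{
	size_t i;

	/* Reject out-of-range sectors before touching the lookup data. */
	if (sector >= t->size)
		return -1;

	for (i = 0; i < t->num_targets; i++)
		if (sector <= t->highs[i])
			return (long)i;

	return -1;
}
```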
Fixes: 512875bd96 ("dm: table detect io beyond device")
Cc: stable@vger.kernel.org
Reported-by: Zhang Tao <kontais@zoho.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae148243d3 upstream.
In commit 6096d91af0 ("dm space map metadata: fix occasional leak
of a metadata block on resize"), we refactor the commit logic to a new
function 'apply_bops'. But when that logic was replaced in out() the
return value was not stored. This may lead to out() returning a wrong
value to the caller.
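The dropped return value can be shown with a pair of hypothetical functions. apply_ops(), out_buggy() and out_fixed() are illustrative only and do not reproduce the space map code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper that can fail. */
static int apply_ops(int fail)
{
	return fail ? -EIO : 0;
}

/* Models the regression: the helper's result is discarded. */
static int out_buggy(int fail)
{
	int r = 0;

	apply_ops(fail);
	return r;
}

/* Models the fix: the result is stored and returned. */
static int out_fixed(int fail)
{
	int r = 0;

	r = apply_ops(fail);
	return r;
}
```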
Fixes: 6096d91af0 ("dm space map metadata: fix occasional leak of a metadata block on resize")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc1a3e8e0c upstream.
If rs_prepare_reshape() fails, no cleanup is executed, leading to
leak of the raid_set structure allocated at the beginning of
raid_ctr(). To fix this issue, go to the label 'bad' if the error
occurs.
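A small model of the leak and its fix, assuming a constructor that allocates a structure up front. The names loosely mirror the description above but the code is a sketch, not dm-raid; live_sets exists only so the leak is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_sets;	/* allocations not yet freed */

struct raid_set { int devs; };	/* hypothetical */

static struct raid_set *set_alloc(void)
{
	struct raid_set *rs = calloc(1, sizeof(*rs));

	if (rs)
		live_sets++;
	return rs;
}

static void set_free(struct raid_set *rs)
{
	free(rs);
	live_sets--;
}

static int prepare_reshape(int fail)
{
	return fail ? -EINVAL : 0;
}

static int ctr(int fail_reshape, struct raid_set **out)
{
	struct raid_set *rs = set_alloc();
	int r;

	if (!rs)
		return -ENOMEM;

	r = prepare_reshape(fail_reshape);
	if (r)
		goto bad;	/* returning r directly here would leak rs */

	*out = rs;
	return 0;

bad:
	set_free(rs);
	return r;
}
```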
Fixes: 11e4723206 ("dm raid: stop keeping raid set frozen altogether")
Cc: stable@vger.kernel.org
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5729b6e5a1 upstream.
Fix a crash that was introduced by the commit 724376a04d. The crash is
reported here: https://gitlab.com/cryptsetup/cryptsetup/issues/468
When reading from the integrity device, the function
dm_integrity_map_continue calls find_journal_node to find out if the
location to read is present in the journal. Then, it calculates how many
sectors are consecutively stored in the journal. Then, it locks the range
with add_new_range and wait_and_add_new_range.
The problem is that during wait_and_add_new_range, we hold no locks (we
don't hold ic->endio_wait.lock and we don't hold a range lock), so the
journal may change arbitrarily while wait_and_add_new_range sleeps.
The code then goes to __journal_read_write and hits
BUG_ON(journal_entry_get_sector(je) != logical_sector); because the
journal has changed.
In order to fix this bug, we need to re-check the journal location after
wait_and_add_new_range. We restrict the length to one block in order to
not complicate the code too much.
Fixes: 724376a04d ("dm integrity: implement fair range locks")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4f9d60138 upstream.
When btree_split_beneath() splits a node to two new children, it will
allocate two blocks: left and right. If right block's allocation
failed, the left block will be unlocked and marked dirty. If this
happened, the left block's content is zero, because it wasn't
initialized with the btree struct before the attempt to allocate the
right block. Upon return, when flushing the left block to disk, the
validator will fail when checking this block, and a BUG_ON is raised.
Fix this by completely initializing the left block before allocating and
initializing the right block.
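The ordering change can be sketched as below. The node layout, the magic value and the right_fails parameter (which simulates the failed allocation) are assumptions for illustration and are unrelated to the real dm btree on-disk format.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NODE_MAGIC 0x4e4f4445u	/* hypothetical validator magic */

struct node {
	uint32_t magic;
	uint32_t nr_entries;
};

/* Stand-in for the validator that runs when a block is flushed. */
static int node_valid(const struct node *n)
{
	return n->magic == NODE_MAGIC;
}

static void node_init(struct node *n, uint32_t nr)
{
	n->magic = NODE_MAGIC;
	n->nr_entries = nr;
}

static int split(struct node *left, struct node *right, int right_fails)
{
	memset(left, 0, sizeof(*left));	/* a new block starts out zeroed */
	node_init(left, 1);		/* finish left before anything can fail */

	if (right_fails)
		return -ENOSPC;		/* left is dirty, but already valid */

	memset(right, 0, sizeof(*right));
	node_init(right, 1);
	return 0;
}
```

Because the left node is complete before the second allocation is attempted, a failure there can no longer leave a zeroed block for the validator to reject.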
Fixes: 4dcb8b57df ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1fef41465 upstream.
This patch fixes a problem in dm-kcopyd that may leave jobs in
complete queue indefinitely in the event of backing storage failure.
This behavior has been observed while running 100% write file fio
workload against an XFS volume created on top of a dm-zoned target
device. If the underlying storage of dm-zoned goes to offline state
under I/O, kcopyd sometimes never issues the end copy callback and
dm-zoned reclaim work hangs indefinitely waiting for that completion.
This behavior was traced down to the error handling code in
process_jobs() function that places the failed job to complete_jobs
queue, but doesn't wake up the job handler. In case of backing device
failure, all outstanding jobs may end up going to complete_jobs queue
via this code path and then stay there forever because there are no
more successful I/O jobs to wake up the job handler.
This patch adds a wake() call to always wake up kcopyd job wait queue
for all I/O jobs that fail before dm_io() gets called for that job.
The patch also sets the write error status in all sub jobs that are
failed because their master job has failed.
Fixes: b73c67c2cb ("dm kcopyd: add sequential write feature")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a90118c445 upstream.
Recent gcc compilers (gcc 9.1) generate warnings about an out of bounds
memset, if the memset goes across several fields of a struct. This
generated a couple of warnings on x86_64 builds in sanitize_boot_params().
Fix this by explicitly saving the fields in struct boot_params
that are intended to be preserved, and zeroing all the rest.
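One way to express "save the fields to keep, zero the rest" without a memset that spans several members is a table of preserved ranges. The struct, its field names and the PRESERVE() macro below are hypothetical; struct boot_params itself is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct params {			/* hypothetical stand-in for the real struct */
	int keep_a;
	int scratch1;
	int keep_b;
	int scratch2;
};

struct preserve {
	size_t start;
	size_t len;
};

#define PRESERVE(field) \
	{ offsetof(struct params, field), sizeof(((struct params *)0)->field) }

static void sanitize(struct params *p)
{
	static const struct preserve to_save[] = {
		PRESERVE(keep_a),
		PRESERVE(keep_b),
	};
	struct params scratch;
	size_t i;

	/* Start from an all-zero copy, then copy back only the kept fields. */
	memset(&scratch, 0, sizeof(scratch));
	for (i = 0; i < sizeof(to_save) / sizeof(to_save[0]); i++)
		memcpy((char *)&scratch + to_save[i].start,
		       (const char *)p + to_save[i].start,
		       to_save[i].len);

	memcpy(p, &scratch, sizeof(*p));
}
```

Each memset and memcpy here covers either the whole object or exactly one member, so the compiler has no cross-field write to warn about.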
[ tglx: Tagged for stable as it breaks the warning free build there as well ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190731054627.5627-2-jhubbard@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f897e60a12 upstream.
Some newer machines do not advertise legacy timers. The kernel can handle
that situation if the TSC and the CPU frequency are enumerated by CPUID or
MSRs and the CPU supports TSC deadline timer. If the CPU does not support
TSC deadline timer the local APIC timer frequency has to be known as well.
Some Ryzen machines do not advertise legacy timers, but there is no
reliable way to determine the bus frequency which feeds the local APIC
timer when the machine allows overclocking of that frequency.
As there is no legacy timer the local APIC timer calibration crashes due to
a NULL pointer dereference when accessing the not installed global clock
event device.
Switch the calibration loop to a non interrupt based one, which polls
either TSC (if frequency is known) or jiffies. The latter requires a global
clockevent. As the machines which do not have a global clockevent installed
have a known TSC frequency this is a non issue. For older machines where
TSC frequency is not known, there is no known case where the legacy timers
do not exist as that would have been reported long ago.
Reported-by: Daniel Drake <drake@endlessm.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b63f20a778 upstream.
Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to
avoid clobbering flags.
KVM's emulator makes indirect calls into a jump table of sorts, where
the destination of the CALL_NOSPEC is a small blob of code that performs
fast emulation by executing the target instruction with fixed operands.
adcb_al_dl:
0x000339f8 <+0>: adc %dl,%al
0x000339fa <+2>: ret
A major motivation for doing fast emulation is to leverage the CPU to
handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
both an input and output to the target of CALL_NOSPEC. Clobbering flags
results in all sorts of incorrect emulation, e.g. Jcc instructions often
take the wrong path. Sans the nops...
asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
0x0003595a <+58>: mov 0xc0(%ebx),%eax
0x00035960 <+64>: mov 0x60(%ebx),%edx
0x00035963 <+67>: mov 0x90(%ebx),%ecx
0x00035969 <+73>: push %edi
0x0003596a <+74>: popf
0x0003596b <+75>: call *%esi
0x000359a0 <+128>: pushf
0x000359a1 <+129>: pop %edi
0x000359a2 <+130>: mov %eax,0xc0(%ebx)
0x000359b1 <+145>: mov %edx,0x60(%ebx)
ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
0x000359a8 <+136>: mov -0x10(%ebp),%eax
0x000359ab <+139>: and $0x8d5,%edi
0x000359b4 <+148>: and $0xfffff72a,%eax
0x000359b9 <+153>: or %eax,%edi
0x000359bd <+157>: mov %edi,0x4(%ebx)
For the most part this has gone unnoticed as emulation of guest code
that can trigger fast emulation is effectively limited to MMIO when
running on modern hardware, and MMIO is rarely, if ever, accessed by
instructions that affect or consume flags.
Breakage is almost instantaneous when running with unrestricted guest
disabled, in which case KVM must emulate all instructions when the guest
has invalid state, e.g. when the guest is in Big Real Mode during early
BIOS.
Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
Fixes: 1a29b5b7f3 ("KVM: x86: Make indirect calls in emulator speculation safe")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>