The snapshot dm target is required to implement the Virtual-AB
mechanism.
Introduce CONFIG_DM_SNAPSHOT in arm64 and x86 gki defconfigs to
enable this feature.
Bug: 142527064
Test: kernel build
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I69bc614509eaff259a2aa9195e7e1d406b36bbb2
Enable AES and SHA-256 accelerated with the ARM Cryptography Extensions
or with AES-NI, as recommended by
kernel-configs/android-4.19/android-recommended-{arm64,x86}.config.
These are ~10x faster than the other software implementations of AES and
SHA-256 on most devices, and often are required to get acceptable
fscrypt and dm-verity performance.
Bug: 142410832
Change-Id: Ia2794f47711132d5caa9021e6e81fb625e02be8d
Signed-off-by: Eric Biggers <ebiggers@google.com>
Writing large values to remove_uid_range can cause an overflow &
system hang. To prevent this, read in the start and end of the range
using kstrtouint to ensure the full range can fit in a uid_t, and use
a u64 for our loop counters to avoid overflow when the range ends at
UINT_MAX.
Test: "echo '9223372036854775807-9223372036854775807' > \
/proc/uid_cputime/remove_uid_range" now returns error instead of hanging
Bug: 139902843
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I30138f95a1c56366a79eec27bbc476c9ea5773ae
A previously backported commit 8712cc728d ("BACKPORT: kasan: add
CONFIG_KASAN_GENERIC and CONFIG_KASAN_SW_TAGS") introduced a
__has_attribute(__no_sanitize_address__) check into
include/linux/compiler-gcc.h. Unfortunately older GCCs (before version 5)
don't support support this macro, which results in a compilation error.
Fix by checking GCC version instead. The same fix was applied to the 4.14
common kernel during the backport, but was forgotten here.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 128674696
Bug: 142406440
Change-Id: I79cbc3fdc568c0d4905d296b8e5f979da5b41bbb
CONFIG_HYPERVISOR_GUEST is needed by cuttlefish to pass the CTS Bionic tests:
clock_getres_CLOCK_MONOTONIC, clock_getres_CLOCK_REALTIME, and
clock_getres_CLOCK_BOOTTIME
Bug: 138199351
Change-Id: I3e209035cdd7a29e020078f8aba567d9c8446347
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
CONFIG_NLS_CODEPAGE_437 and CONFIG_NLS_ISO8859_1 are needed by cuttlefish
to pass the CTS test - StorageManagerTest#testMountAndUnmountObbNormal
Adding the remaining 47 NLS Configs to account for possible i18n
requirements not caught by CTS/VTS
Bug: 138199351
Change-Id: I897873c8454f8329f4dc53d9af71907ba71790b7
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
A parent device can have child devices that it adds when it probes. But
this probing of the parent device can happen way after kernel init is done
-- for example, when the parent device's driver is loaded as a module.
In such cases, if the child devices depend on a supplier in the system, we
need to make sure the supplier gets the sync_state() callback only after
these child devices are added and probed.
To achieve this, when creating device links for a device by looking at its
DT node, don't just look at DT references at the top node level. Look at DT
references in all the descendant nodes too and create device links from the
ancestor device to all these supplier devices.
This way, when the parent device probes and adds child devices, the child
devices can then create their own device links to the suppliers and further
delay the supplier's sync_state() callback to after the child devices are
probed.
Example:
In this illustration, -> denotes DT references and indentation
represents child status.
Device node A
Device node B -> D
Device node C -> B, D
Device node D
Assume all these devices have their drivers loaded as modules.
Without this patch, this is the sequence of events:
1. D is added.
2. A is added.
3. Device D probes.
4. Device D gets its sync_state() callback.
5. Device B and C might malfunction because their resources got
altered/turned off before they can make active requests for them.
With this patch, this is the sequence of events:
1. D is added.
2. A is added and creates device links to D.
3. Device link from A to B is not added because A is a parent of B.
4. Device D probes.
5. Device D does not get it's sync_state() callback because consumer A
hasn't probed yet.
5. Device A probes.
5. a. Devices B and C are added.
5. b. Device links from B and C to D are added.
5. c. Device A's probe completes.
6. Device D does not get it's sync_state() callback because consumer A
has probed but consumers B and C haven't probed yet.
7. Device B and C probe.
8. Device D gets it's sync_state() callback because all its consumers
have probed.
9. None of the devices malfunction.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190904211126.47518-7-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry-picked from commit d4387cd117https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git driver-core-next)
Change-Id: I38186d6cc4671bd8747dae8c440b09717a487088
When all the top level devices are populated from DT during kernel
init, the supplier devices could be added and probed before the
consumer devices are added and linked to the suppliers. To avoid the
sync_state() callback from being called prematurely, pause the
sync_state() callbacks before populating the devices and resume them
at late_initcall_sync().
Similarly, when children devices are populated from a module using
of_platform_populate(), there could be supplier-consumer dependencies
between the children devices that are populated. To avoid the same
problem with sync_state() being called prematurely, pause and resume
sync_state() callbacks across of_platform_populate().
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190904211126.47518-6-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry-picked from commit 5e6669387ehttps://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git driver-core-next)
Change-Id: Ia43ebbc9071d4d6a59b347ccae99fcf0ab192269
This sync_state driver/bus callback is called once all the consumers
of a supplier have probed successfully.
This allows the supplier device's driver/bus to sync the supplier
device's state to the software state with the guarantee that all the
consumers are actively managing the resources provided by the supplier
device.
To maintain backwards compatibility and ease transition from existing
frameworks and resource cleanup schemes, late_initcall_sync is the
earliest when the sync_state callback might be called.
There is no upper bound on the time by which the sync_state callback
has to be called. This is because if a consumer device never probes,
the supplier has to maintain its resources in the state left by the
bootloader. For example, if the bootloader leaves the display
backlight at a fixed voltage and the backlight driver is never probed,
you don't want the backlight to ever be turned off after boot up.
Also, when multiple devices are added after kernel init, some
suppliers could be added before their consumer devices get added. In
these instances, the supplier devices could get their sync_state
callback called right after they probe because the consumers devices
haven't had a chance to create device links to the suppliers.
To handle this correctly, this change also provides APIs to
pause/resume sync state callbacks so that when multiple devices are
added, their sync_state callback evaluation can be postponed to happen
after all of them are added.
kbuild test robot reported missing documentation for device.state_synced
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190904211126.47518-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry-picked from commit fc5a251d0fhttps://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git driver-core-next)
Conflicts: include/linux/device.h
[Rewrite an "if" to avoid use of DL_FLAG_MANAGED]
Signed-off-by: Saravana Kannan <saravanak@google.com>
Change-Id: I3e0a483c76d6ae3e5fe5aa6b3df65d0417c65ebb
Add device links after the devices are created (but before they are
probed) by looking at common DT bindings like clocks and
interconnects.
Automatically adding device links for functional dependencies at the
framework level provides the following benefits:
- Optimizes device probe order and avoids the useless work of
attempting probes of devices that will not probe successfully
(because their suppliers aren't present or haven't probed yet).
For example, in a commonly available mobile SoC, registering just
one consumer device's driver at an initcall level earlier than the
supplier device's driver causes 11 failed probe attempts before the
consumer device probes successfully. This was with a kernel with all
the drivers statically compiled in. This problem gets a lot worse if
all the drivers are loaded as modules without direct symbol
dependencies.
- Supplier devices like clock providers, interconnect providers, etc
need to keep the resources they provide active and at a particular
state(s) during boot up even if their current set of consumers don't
request the resource to be active. This is because the rest of the
consumers might not have probed yet and turning off the resource
before all the consumers have probed could lead to a hang or
undesired user experience.
Some frameworks (Eg: regulator) handle this today by turning off
"unused" resources at late_initcall_sync and hoping all the devices
have probed by then. This is not a valid assumption for systems with
loadable modules. Other frameworks (Eg: clock) just don't handle
this due to the lack of a clear signal for when they can turn off
resources. This leads to downstream hacks to handle cases like this
that can easily be solved in the upstream kernel.
By linking devices before they are probed, we give suppliers a clear
count of the number of dependent consumers. Once all of the
consumers are active, the suppliers can turn off the unused
resources without making assumptions about the number of consumers.
By default we just add device-links to track "driver presence" (probe
succeeded) of the supplier device. If any other functionality provided
by device-links are needed, it is left to the consumer/supplier
devices to change the link when they probe.
kbuild test robot reported clang error about missing const
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190904211126.47518-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry-picked from commit a3e1d1a7f5https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git driver-core-next)
[Removed use of DL_FLAG_AUTOPROBE_CONSUMER flag. Not strictly necessary.]
Signed-off-by: Saravana Kannan <saravanak@google.com>
Change-Id: I54a2dc3c4e91e418bcf1895c16bb8658dcfe1bee
The firmware corresponding to a device (dev.fwnode) might be able to
provide functional dependency information between a device and its
supplier and consumer devices. Tracking this functional dependency
allows optimizing device probe order and informing a supplier when all
its consumers have probed (and thereby actively managing their
resources).
The existing device links feature allows tracking and using
supplier-consumer relationships. So, this patch adds the add_links()
fwnode callback to allow firmware to create device links for each
device as the device is added.
However, when consumer devices are added, they might not have a supplier
device to link to despite needing mandatory resources/functionality from
one or more suppliers. A waiting_for_suppliers list is created to track
such consumers and retry linking them when new devices get added.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20190904211126.47518-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry-picked from commit e2ae9bcc4ahttps://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git driver-core-next)
Change-Id: I97ffa57fa71588bf198e78d8b5a313b860b17bf5
* aosp/upstream-f2fs-stable-linux-4.19.y:
f2fs: add a condition to detect overflow in f2fs_ioc_gc_range()
f2fs: fix to add missing F2FS_IO_ALIGNED() condition
f2fs: fix to fallback to buffered IO in IO aligned mode
f2fs: fix to handle error path correctly in f2fs_map_blocks
f2fs: fix extent corrupotion during directIO in LFS mode
f2fs: check all the data segments against all node ones
f2fs: Add a small clarification to CONFIG_FS_F2FS_FS_SECURITY
f2fs: fix inode rwsem regression
f2fs: fix to avoid accessing uninitialized field of inode page in is_alive()
f2fs: avoid infinite GC loop due to stale atomic files
f2fs: Fix indefinite loop in f2fs_gc()
f2fs: convert inline_data in prior to i_size_write
f2fs: fix error path of f2fs_convert_inline_page()
f2fs: add missing documents of reserve_root/resuid/resgid
f2fs: fix flushing node pages when checkpoint is disabled
f2fs: enhance f2fs_is_checkpoint_ready()'s readability
f2fs: clean up __bio_alloc()'s parameter
f2fs: fix wrong error injection path in inc_valid_block_count()
f2fs: fix to writeout dirty inode during node flush
f2fs: optimize case-insensitive lookups
f2fs: introduce f2fs_match_name() for cleanup
f2fs: Fix indefinite loop in f2fs_gc()
f2fs: allocate memory in batch in build_sit_info()
f2fs: support FS_IOC_{GET,SET}FSLABEL
f2fs: fix to avoid data corruption by forbidding SSR overwrite
f2fs: Fix build error while CONFIG_NLS=m
Revert "f2fs: avoid out-of-range memory access"
f2fs: cleanup the code in build_sit_entries.
f2fs: fix wrong available node count calculation
f2fs: remove duplicate code in f2fs_file_write_iter
f2fs: fix to migrate blocks correctly during defragment
f2fs: use wrapped f2fs_cp_error()
f2fs: fix to use more generic EOPNOTSUPP
f2fs: use wrapped IS_SWAPFILE()
f2fs: Support case-insensitive file name lookups
f2fs: include charset encoding information in the superblock
fs: Reserve flag for casefolding
f2fs: fix to avoid call kvfree under spinlock
fs: f2fs: Remove unnecessary checks of SM_I(sbi) in update_general_status()
f2fs: disallow direct IO in atomic write
f2fs: fix to handle quota_{on,off} correctly
f2fs: fix to detect cp error in f2fs_setxattr()
f2fs: fix to spread f2fs_is_checkpoint_ready()
f2fs: support fiemap() for directory inode
f2fs: fix to avoid discard command leak
f2fs: fix to avoid tagging SBI_QUOTA_NEED_REPAIR incorrectly
f2fs: fix to drop meta/node pages during umount
f2fs: disallow switching io_bits option during remount
f2fs: fix panic of IO alignment feature
f2fs: introduce {page,io}_is_mergeable() for readability
f2fs: fix livelock in swapfile writes
f2fs: add fs-verity support
ext4: update on-disk format documentation for fs-verity
ext4: add fs-verity read support
ext4: add basic fs-verity support
fs-verity: support builtin file signatures
fs-verity: add SHA-512 support
fs-verity: implement FS_IOC_MEASURE_VERITY ioctl
fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
fs-verity: add data verification hooks for ->readpages()
fs-verity: add the hook for file ->setattr()
fs-verity: add the hook for file ->open()
fs-verity: add inode and superblock fields
fs-verity: add Kconfig and the helper functions for hashing
fs: uapi: define verity bit for FS_IOC_GETFLAGS
fs-verity: add UAPI header
fs-verity: add MAINTAINERS file entry
fs-verity: add a documentation file
ext4: fix kernel oops caused by spurious casefold flag
ext4: fix coverity warning on error path of filename setup
ext4: optimize case-insensitive lookups
ext4: fix dcache lookup of !casefolded directories
unicode: update to Unicode 12.1.0 final
unicode: add missing check for an error return from utf8lookup()
ext4: export /sys/fs/ext4/feature/casefold if Unicode support is present
unicode: refactor the rule for regenerating utf8data.h
ext4: Support case-insensitive file name lookups
ext4: include charset encoding information in the superblock
unicode: update unicode database unicode version 12.1.0
unicode: introduce test module for normalized utf8 implementation
unicode: implement higher level API for string handling
unicode: reduce the size of utf8data[]
unicode: introduce code for UTF-8 normalization
unicode: introduce UTF-8 character database
ext4 crypto: fix to check feature status before get policy
fscrypt: document the new ioctls and policy version
ubifs: wire up new fscrypt ioctls
f2fs: wire up new fscrypt ioctls
ext4: wire up new fscrypt ioctls
fscrypt: require that key be added when setting a v2 encryption policy
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl
fscrypt: allow unprivileged users to add/remove keys for v2 policies
fscrypt: v2 encryption policy support
fscrypt: add an HKDF-SHA512 implementation
fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl
fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl
fscrypt: rename keyinfo.c to keysetup.c
fscrypt: move v1 policy key setup to keysetup_v1.c
fscrypt: refactor key setup code in preparation for v2 policies
fscrypt: rename fscrypt_master_key to fscrypt_direct_key
fscrypt: add ->ci_inode to fscrypt_info
fscrypt: use FSCRYPT_* definitions, not FS_*
fscrypt: use FSCRYPT_ prefix for uapi constants
fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>
fscrypt: use ENOPKG when crypto API support missing
fscrypt: improve warnings for missing crypto API support
fscrypt: improve warning messages for unsupported encryption contexts
fscrypt: make fscrypt_msg() take inode instead of super_block
fscrypt: clean up base64 encoding/decoding
fscrypt: remove loadable module related code
Conflicts:
fs/ext4/ioctl.c
fs/ext4/readpage.c
Bug: 141329812
Change-Id: I2e10c22a7c52982d073ac6897cc8aa4d5a811a38
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Backport: drop sparc changes.
(Upstream commit 903f433f8f).
Patch series "arm64: untag user pointers passed to the kernel", v19.
=== Overview
arm64 has a feature called Top Byte Ignore, which allows to embed pointer
tags into the top byte of each pointer. Userspace programs (such as
HWASan, a memory debugging tool [1]) might use this feature and pass
tagged user pointers to the kernel through syscalls or other interfaces.
Right now the kernel is already able to handle user faults with tagged
pointers, due to these patches:
1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a
tagged pointer")
2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged
pointers")
3. 276e9327 ("arm64: entry: improve data abort handling of tagged
pointers")
This patchset extends tagged pointer support to syscall arguments.
As per the proposed ABI change [3], tagged pointers are only allowed to be
passed to syscalls when they point to memory ranges obtained by anonymous
mmap() or sbrk() (see the patchset [3] for more details).
For non-memory syscalls this is done by untaging user pointers when the
kernel performs pointer checking to find out whether the pointer comes
from userspace (most notably in access_ok). The untagging is done only
when the pointer is being checked, the tag is preserved as the pointer
makes its way through the kernel and stays tagged when the kernel
dereferences the pointer when perfoming user memory accesses.
The mmap and mremap (only new_addr) syscalls do not currently accept
tagged addresses. Architectures may interpret the tag as a background
colour for the corresponding vma.
Other memory syscalls (mprotect, etc.) don't do user memory accesses but
rather deal with memory ranges, and untagged pointers are better suited to
describe memory ranges internally. Thus for memory syscalls we untag
pointers completely when they enter the kernel.
=== Other approaches
One of the alternative approaches to untagging that was considered is to
completely strip the pointer tag as the pointer enters the kernel with
some kind of a syscall wrapper, but that won't work with the countless
number of different ioctl calls. With this approach we would need a
custom wrapper for each ioctl variation, which doesn't seem practical.
An alternative approach to untagging pointers in memory syscalls prologues
is to inspead allow tagged pointers to be passed to find_vma() (and other
vma related functions) and untag them there. Unfortunately, a lot of
find_vma() callers then compare or subtract the returned vma start and end
fields against the pointer that was being searched. Thus this approach
would still require changing all find_vma() callers.
=== Testing
The following testing approaches has been taken to find potential issues
with user pointer untagging:
1. Static testing (with sparse [2] and separately with a custom static
analyzer based on Clang) to track casts of __user pointers to integer
types to find places where untagging needs to be done.
2. Static testing with grep to find parts of the kernel that call
find_vma() (and other similar functions) or directly compare against
vm_start/vm_end fields of vma.
3. Static testing with grep to find parts of the kernel that compare
user pointers with TASK_SIZE or other similar consts and macros.
4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running
a modified syzkaller version that passes tagged pointers to the kernel.
Based on the results of the testing the requried patches have been added
to the patchset.
=== Notes
This patchset is meant to be merged together with "arm64 relaxed ABI" [3].
This patchset is a prerequisite for ARM's memory tagging hardware feature
support [4].
This patchset has been merged into the Pixel 2 & 3 kernel trees and is
now being used to enable testing of Pixel phones with HWASan.
Thanks!
[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] 5f960cb10f
[3] https://lkml.org/lkml/2019/6/12/745
[4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a
This patch (of 11)
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.
strncpy_from_user and strnlen_user accept user addresses as arguments, and
do not go through the same path as copy_from_user and others, so here we
need to handle the case of tagged user addresses separately.
Untag user pointers passed to these functions.
Note, that this patch only temporarily untags the pointers to perform
validity checks, but then uses them as is to perform user memory accesses.
[andreyknvl@google.com: fix sparc4 build]
Link: http://lkml.kernel.org/r/CAAeHK+yx4a-P0sDrXTUxMvO2H0CJZUFPffBrg_cU7oJOZyC7ew@mail.gmail.com
Link: http://lkml.kernel.org/r/c5a78bcad3e94d6cda71fcaa60a423231ae71e4c.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jens Wiklander <jens.wiklander@linaro.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Iece8763b3a9548c8a4f52184117f6ca5f49b4b3e
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
(Upstream commit 799c851052).
The referenced file does not exist, but tagged-address-abi.rst does.
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: Id87593870acd680e37254db234da3536be1b0028
(Upstream commit bd3841cd3b).
tags_test.c relies on PR_SET_TAGGED_ADDR_CTRL/PR_TAGGED_ADDR_ENABLE being
present in system headers. When this is not the case the build of this
test fails with undeclared identifier errors.
Fix by providing the path to the KSFT installed kernel headers in CFLAGS.
Reported-by: Cristian Marussi <cristian.marussi@arm.com>
Suggested-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I60d1538e2fc391bcf74823ddcdce7c425d6bf707
(Upstream commit 92af2b6961).
On AArch64 the TCR_EL1.TBI0 bit is set by default, allowing userspace
(EL0) to perform memory accesses through 64-bit pointers with a non-zero
top byte. However, such pointers were not allowed at the user-kernel
syscall ABI boundary.
With the Tagged Address ABI patchset, it is now possible to pass tagged
pointers to the syscalls. Relax the requirements described in
tagged-pointers.rst to be compliant with the behaviours guaranteed by
the AArch64 Tagged Address ABI.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Bug: 135692346
Change-Id: I9b1b2aa2fa06ab58f2946dfe4628473e2fadacde
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
(Upstream commit e1b832503e).
On AArch64 the TCR_EL1.TBI0 bit is set by default, allowing userspace
(EL0) to perform memory accesses through 64-bit pointers with a non-zero
top byte. Introduce the document describing the relaxation of the
syscall ABI that allows userspace to pass certain tagged pointers to
kernel syscalls.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I14a597504c54fb623194e56e1a471a67eb93f697
(Upstream commit 413235fced).
First rename the sysctl control to abi.tagged_addr_disabled and make it
default off (zero). When abi.tagged_addr_disabled == 1, only block the
enabling of the TBI ABI via prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE).
Getting the status of the ABI or disabling it is still allowed.
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I80462b93d8cb92b2abd4d1f3ac0f6fbba419590b
(Upstream commit 3e91ec89f5).
Require that arg{3,4,5} of the PR_{SET,GET}_TAGGED_ADDR_CTRL prctl and
arg2 of the PR_GET_TAGGED_ADDR_CTRL prctl() are zero rather than ignored
for future extensions.
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I8bb5c3eb4728440880c971d77904f7e45b571ddc
(Upstream commit 9c1cac424c).
untagged_addr() can be called with a '__user' pointer parameter and must
therefore use '__force' casts both when passing this parameter through
to sign_extend64() as a 'u64', but also when casting the 's64' return
value back to the '__user' pointer type.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I8806aa6911cda3ab0489dff59b75ae214b02d48e
(Upstream commit 9ce1263033).
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.
This patch adds a simple test, that calls the uname syscall with a
tagged user pointer as an argument. Without the kernel accepting tagged
user pointers the test fails with EFAULT.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: I150694bf72706b5dd39939e38435c3aebf6ab2b3
(Upstream commit 63f0c60379).
It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Change-Id: I2d52c5589b05415faab315c116245f1058d64750
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
(Upstream commit 2b835e24b5).
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.
copy_from_user (and a few other similar functions) are used to copy data
from user memory into the kernel memory or vice versa. Since a user can
provided a tagged pointer to one of the syscalls that use copy_from_user,
we need to correctly handle such pointers.
Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr,
before performing access validity checks.
Note, that this patch only temporarily untags the pointers to perform the
checks, but then passes them as is into the kernel internals.
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
[will: Add __force to casting in untagged_addr() to kill sparse warning]
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: Ia97d1d311c2e2bf8e4584005b5085204db6d8955
(Upstream commit d93445225c).
Architectures that support memory tagging have a need to perform untagging
(stripping the tag) in various parts of the kernel. This patch adds an
untagged_addr() macro, which is defined as noop for architectures that do
not support memory tagging. The oncoming patch series will define it at
least for sparc64 and arm64.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Bug: 135692346
Change-Id: If3932582c82e302fe8d602fabac9a8f4cde6388d
psi tracks the time tasks wait for refaulting pages to become
uptodate, but it does not track the time spent submitting the IO. The
submission part can be significant if backing storage is contended or
when cgroup throttling (io.latency) is in effect - a lot of time is
spent in submit_bio(). In that case, we underreport memory pressure.
Annotate submit_bio() to account submission time as memory stall when
the bio is reading userspace workingset pages.
Tested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit b8e24a9300)
Conflicts:
include/linux/blk_types.h
(1. Manually resolved BIO_WORKINGSET being definition instead of enum.)
Bug: 141131229
Test: boot and run act test suite
Change-Id: I99cef039844e219f1dc8196feead54b6f5fb26bb
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Changes in 4.19.78
tpm: use tpm_try_get_ops() in tpm-sysfs.c.
tpm: Fix TPM 1.2 Shutdown sequence to prevent future TPM operations
drm/bridge: tc358767: Increase AUX transfer length limit
drm/panel: simple: fix AUO g185han01 horizontal blanking
video: ssd1307fb: Start page range at page_offset
drm/stm: attach gem fence to atomic state
drm/panel: check failure cases in the probe func
drm/rockchip: Check for fast link training before enabling psr
drm/radeon: Fix EEH during kexec
gpu: drm: radeon: Fix a possible null-pointer dereference in radeon_connector_set_property()
PCI: rpaphp: Avoid a sometimes-uninitialized warning
ipmi_si: Only schedule continuously in the thread in maintenance mode
clk: qoriq: Fix -Wunused-const-variable
clk: sunxi-ng: v3s: add missing clock slices for MMC2 module clocks
drm/amd/display: fix issue where 252-255 values are clipped
drm/amd/display: reprogram VM config when system resume
powerpc/powernv/ioda2: Allocate TCE table levels on demand for default DMA window
clk: actions: Don't reference clk_init_data after registration
clk: sirf: Don't reference clk_init_data after registration
clk: sprd: Don't reference clk_init_data after registration
clk: zx296718: Don't reference clk_init_data after registration
powerpc/xmon: Check for HV mode when dumping XIVE info from OPAL
powerpc/rtas: use device model APIs and serialization during LPM
powerpc/futex: Fix warning: 'oldval' may be used uninitialized in this function
powerpc/pseries/mobility: use cond_resched when updating device tree
pinctrl: tegra: Fix write barrier placement in pmx_writel
powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
vfio_pci: Restore original state on release
drm/nouveau/volt: Fix for some cards having 0 maximum voltage
pinctrl: amd: disable spurious-firing GPIO IRQs
clk: renesas: mstp: Set GENPD_FLAG_ALWAYS_ON for clock domain
clk: renesas: cpg-mssr: Set GENPD_FLAG_ALWAYS_ON for clock domain
drm/amd/display: support spdif
drm/amdgpu/si: fix ASIC tests
powerpc/64s/exception: machine check use correct cfar for late handler
pstore: fs superblock limits
clk: qcom: gcc-sdm845: Use floor ops for sdcc clks
powerpc/pseries: correctly track irq state in default idle
pinctrl: meson-gxbb: Fix wrong pinning definition for uart_c
arm64: fix unreachable code issue with cmpxchg
clk: at91: select parent if main oscillator or bypass is enabled
powerpc: dump kernel log before carrying out fadump or kdump
mbox: qcom: add APCS child device for QCS404
clk: sprd: add missing kfree
scsi: core: Reduce memory required for SCSI logging
dma-buf/sw_sync: Synchronize signal vs syncpt free
ext4: fix potential use after free after remounting with noblock_validity
MIPS: Ingenic: Disable broken BTB lookup optimization.
MIPS: tlbex: Explicitly cast _PAGE_NO_EXEC to a boolean
i2c-cht-wc: Fix lockdep warning
mfd: intel-lpss: Remove D3cold delay
PCI: tegra: Fix OF node reference leak
HID: wacom: Fix several minor compiler warnings
livepatch: Nullify obj->mod in klp_module_coming()'s error path
ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes
soundwire: intel: fix channel number reported by hardware
ARM: 8875/1: Kconfig: default to AEABI w/ Clang
rtc: snvs: fix possible race condition
rtc: pcf85363/pcf85263: fix regmap error in set_time
HID: apple: Fix stuck function keys when using FN
PCI: rockchip: Propagate errors for optional regulators
PCI: histb: Propagate errors for optional regulators
PCI: imx6: Propagate errors for optional regulators
PCI: exynos: Propagate errors for optional PHYs
security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb()
ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address
fat: work around race with userspace's read via blockdev while mounting
pktcdvd: remove warning on attempting to register non-passthrough dev
hypfs: Fix error number left in struct pointer member
crypto: hisilicon - Fix double free in sec_free_hw_sgl()
kbuild: clean compressed initramfs image
ocfs2: wait for recovering done after direct unlock request
kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K
arm64: consider stack randomization for mmap base only when necessary
mips: properly account for stack randomization and stack guard gap
arm: properly account for stack randomization and stack guard gap
arm: use STACK_TOP when computing mmap base address
block: mq-deadline: Fix queue restart handling
bpf: fix use after free in prog symbol exposure
cxgb4:Fix out-of-bounds MSI-X info array access
erspan: remove the incorrect mtu limit for erspan
hso: fix NULL-deref on tty open
ipv6: drop incoming packets having a v4mapped source address
ipv6: Handle missing host route in __ipv6_ifa_notify
net: ipv4: avoid mixed n_redirects and rate_tokens usage
net: qlogic: Fix memory leak in ql_alloc_large_buffers
net: Unpublish sk from sk_reuseport_cb before call_rcu
nfc: fix memory leak in llcp_sock_bind()
qmi_wwan: add support for Cinterion CLS8 devices
rxrpc: Fix rxrpc_recvmsg tracepoint
sch_dsmark: fix potential NULL deref in dsmark_init()
udp: fix gso_segs calculations
vsock: Fix a lockdep warning in __vsock_release()
net: dsa: rtl8366: Check VLAN ID and not ports
udp: only do GSO if # of segs > 1
net/rds: Fix error handling in rds_ib_add_one()
xen-netfront: do not use ~0U as error return value for xennet_fill_frags()
tipc: fix unlimited bundling of small messages
sch_cbq: validate TCA_CBQ_WRROPT to avoid crash
soundwire: Kconfig: fix help format
soundwire: fix regmap dependencies and align with other serial links
Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
smack: use GFP_NOFS while holding inode_smack::smk_lock
NFC: fix attrs checks in netlink interface
kexec: bail out upon SIGKILL when allocating memory.
9p/cache.c: Fix memory leak in v9fs_cache_session_get_cookie
Linux 4.19.78
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I02db9c4a0cc1f784e6ac1523599fcbed747a6375
commit 18917d5147 upstream.
nfc_genl_deactivate_target() relies on the NFC_ATTR_TARGET_INDEX
attribute being present, but doesn't check whether it is actually
provided by the user. Same goes for nfc_genl_fw_download() and
NFC_ATTR_FIRMWARE_NAME.
This patch adds appropriate checks.
Found with syzkaller.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3675f052b4 upstream.
There is a logic bug in the current smack_bprm_set_creds():
If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be
acceptable (e.g. because the ptracer detached in the meantime), the other
->unsafe flags aren't checked. As far as I can tell, this means that
something like the following could work (but I haven't tested it):
- task A: create task B with fork()
- task B: set NO_NEW_PRIVS
- task B: install a seccomp filter that makes open() return 0 under some
conditions
- task B: replace fd 0 with a malicious library
- task A: attach to task B with PTRACE_ATTACH
- task B: execve() a file with an SMACK64EXEC extended attribute
- task A: while task B is still in the middle of execve(), exit (which
destroys the ptrace relationship)
Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in
bprm->unsafe, we reject the execve().
Cc: stable@vger.kernel.org
Fixes: 5663884caa ("Smack: unify all ptrace accesses in the smack")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e95584a889 ]
We have identified a problem with the "oversubscription" policy in the
link transmission code.
When small messages are transmitted, and the sending link has reached
the transmit window limit, those messages will be bundled and put into
the link backlog queue. However, bundles of data messages are counted
at the 'CRITICAL' level, so that the counter for that level, instead of
the counter for the real, bundled message's level is the one being
increased.
Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
to be tested against the unchanged counter for their own level, while
contributing to an unrestrained increase at the CRITICAL backlog level.
This leaves a gap in congestion control algorithm for small messages
that can result in starvation for other users or a "real" CRITICAL
user. Even that eventually can lead to buffer exhaustion & link reset.
We fix this by keeping a 'target_bskb' buffer pointer at each levels,
then when bundling, we only bundle messages at the same importance
level only. This way, we know exactly how many slots a certain level
have occupied in the queue, so can manage level congestion accurately.
By bundling messages at the same level, we even have more benefits. Let
consider this:
- One socket sends 64-byte messages at the 'CRITICAL' level;
- Another sends 4096-byte messages at the 'LOW' level;
When a 64-byte message comes and is bundled the first time, we put the
overhead of message bundle to it (+ 40-byte header, data copy, etc.)
for later use, but the next message can be a 4096-byte one that cannot
be bundled to the previous one. This means the last bundle carries only
one payload message which is totally inefficient, as for the receiver
also! Later on, another 64-byte message comes, now we make a new bundle
and the same story repeats...
With the new bundling algorithm, this will not happen, the 64-byte
messages will be bundled together even when the 4096-byte message(s)
comes in between. However, if the 4096-byte messages are sent at the
same level i.e. 'CRITICAL', the bundling algorithm will again cause the
same overhead.
Also, the same will happen even with only one socket sending small
messages at a rate close to the link transmit's one, so that, when one
message is bundled, it's transmitted shortly. Then, another message
comes, a new bundle is created and so on...
We will solve this issue radically by another patch.
Fixes: 365ad353c2 ("tipc: reduce risk of user starvation during link congestion")
Reported-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>