Pull arm64 updates from Catalin Marinas:
"Apart from the core arm64 and perf changes, the Spectre v4 mitigation
touches the arm KVM code and the ACPI PPTT support touches drivers/
(acpi and cacheinfo). I should have the maintainers' acks in place.
Summary:
- Spectre v4 mitigation (Speculative Store Bypass Disable) support
for arm64 using SMC firmware call to set a hardware chicken bit
- ACPI PPTT (Processor Properties Topology Table) parsing support and
enable the feature for arm64
- Report signal frame size to user via auxv (AT_MINSIGSTKSZ). The
primary motivation is Scalable Vector Extensions which requires
more space on the signal frame than the currently defined
MINSIGSTKSZ
- ARM perf patches: allow building arm-cci as module, demote
dev_warn() to dev_dbg() in arm-ccn event_init(), miscellaneous
cleanups
- cmpwait() WFE optimisation to avoid some spurious wakeups
- L1_CACHE_BYTES reverted back to 64 (for performance reasons that
have to do with some network allocations) while keeping
ARCH_DMA_MINALIGN to 128. cache_line_size() returns the actual
hardware Cache Writeback Granule
- Turn LSE atomics on by default in Kconfig
- Kernel fault reporting tidying
- Some #include and miscellaneous cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (53 commits)
arm64: Fix syscall restarting around signal suppressed by tracer
arm64: topology: Avoid checking numa mask for scheduler MC selection
ACPI / PPTT: fix build when CONFIG_ACPI_PPTT is not enabled
arm64: cpu_errata: include required headers
arm64: KVM: Move VCPU_WORKAROUND_2_FLAG macros to the top of the file
arm64: signal: Report signal frame size to userspace via auxv
arm64/sve: Thin out initialisation sanity-checks for sve_max_vl
arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID
arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests
arm64: KVM: Add ARCH_WORKAROUND_2 support for guests
arm64: KVM: Add HYP per-cpu accessors
arm64: ssbd: Add prctl interface for per-thread mitigation
arm64: ssbd: Introduce thread flag to control userspace mitigation
arm64: ssbd: Restore mitigation status on CPU resume
arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
arm64: ssbd: Add global mitigation state accessor
arm64: Add 'ssbd' command-line option
arm64: Add ARCH_WORKAROUND_2 probing
arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
...
Pull dmaengine updates from Vinod Koul:
- updates to sprd, bam_dma, stm drivers
- remove VLAs in dmatest
- move TI drivers to their own subdir
- switch to SPDX tags for ima/mxs dma drivers
- simplify getting .drvdata on bunch of drivers by Wolfram Sang
* tag 'dmaengine-4.18-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (32 commits)
dmaengine: sprd: Add Spreadtrum DMA configuration
dmaengine: sprd: Optimize the sprd_dma_prep_dma_memcpy()
dmaengine: imx-dma: Switch to SPDX identifier
dmaengine: mxs-dma: Switch to SPDX identifier
dmaengine: imx-sdma: Switch to SPDX identifier
dmaengine: usb-dmac: Document R8A7799{0,5} bindings
dmaengine: qcom: bam_dma: fix some doc warnings.
dmaengine: qcom: bam_dma: fix invalid assignment warning
dmaengine: sprd: fix an NULL vs IS_ERR() bug
dmaengine: sprd: Use devm_ioremap_resource() to map memory
dmaengine: sprd: Fix potential NULL dereference in sprd_dma_probe()
dmaengine: pl330: flush before wait, and add dev burst support.
dmaengine: axi-dmac: Request IRQ with IRQF_SHARED
dmaengine: stm32-mdma: fix spelling mistake: "avalaible" -> "available"
dmaengine: rcar-dmac: Document R-Car D3 bindings
dmaengine: sprd: Move DMA request mode and interrupt type into head file
dmaengine: sprd: Define the DMA data width type
dmaengine: sprd: Define the DMA transfer step type
dmaengine: ti: New directory for Texas Instruments DMA drivers
dmaengine: shdmac: Change platform check to CONFIG_ARCH_RENESAS
...
Pull IOMMU updates from Joerg Roedel:
"Nothing big this time. In particular:
- Debugging code for Tegra-GART
- Improvement in Intel VT-d fault printing to prevent soft-lockups
when on fault storms
- Improvements in AMD IOMMU event reporting
- NUMA aware allocation in io-pgtable code for ARM
- Various other small fixes and cleanups all over the place"
* tag 'iommu-updates-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/io-pgtable-arm: Make allocations NUMA-aware
iommu/amd: Prevent possible null pointer dereference and infinite loop
iommu/amd: Fix grammar of comments
iommu: Clean up the comments for iommu_group_alloc
iommu/vt-d: Remove unnecessary parentheses
iommu/vt-d: Clean up pasid quirk for pre-production devices
iommu/vt-d: Clean up unused variable in find_or_alloc_domain
iommu/vt-d: Fix iotlb psi missing for mappings
iommu/vt-d: Introduce __mapping_notify_one()
iommu: Remove extra NULL check when call strtobool()
iommu/amd: Update logging information for new event type
iommu/amd: Update the PASID information printed to the system log
iommu/tegra: gart: Fix gart_iommu_unmap()
iommu/tegra: gart: Add debugging facility
iommu/io-pgtable-arm: Use for_each_set_bit to simplify code
iommu/qcom: Simplify getting .drvdata
iommu: Remove depends on HAS_DMA in case of platform dependency
iommu/vt-d: Ratelimit each dmar fault printing
Pull MTD updates from Boris Brezillon:
"Core changes:
- Add a sysfs attribute to expose available OOB size
Driver changes:
- Remove HAS_DMA dependency on various drivers
- Use dev_get_drvdata() instead of platform_get_drvdata() in docg3
- Replace msleep by usleep_range() in the dataflash driver
- Avoid VLA usage in nftl layers
- Remove useless .owner assignment in pismo
- Fix various issues in the CFI driver
- Improve TRX partition handling expose a DT compat for this part
parser
- Clarify OFFSET_CONTINUOUS meaning
NAND core changes:
- Add Miquel as a NAND maintainer
- Add access mode to the nand_page_io_req struct
- Fix kernel-doc in rawnand.h
- Support bit-wise majority to recover from corrupted ONFI parameter
pages
- Stop checking FAIL bit after a SET_FEATURES, as documented in the
ONFI spec
Raw NAND Driver changes:
- Fix and cleanup the error path of many NAND controller drivers
- GPMI:
+ Cleanup/simplification of a few aspects in the driver
+ Take ECC setup specified in the DT into account
- sunxi: remove support for GPIO-based R/B polling
- MTK:
+ Use of_device_get_match_data() instead of of_match_device()
+ Add an entry in MAINTAINERS for this driver
+ Fix nand-ecc-step-size and nand-ecc-strength description in the
DT bindings doc
- fsl_ifc: fix ->cmdfunc() to read more than one ONFI parameter page
OneNAND driver changes:
- samsung: use dev_get_drvdata() instead of platform_get_drvdata()
SPI NOR core changes:
- Add support for a bunch of SPI NOR chips
- Clear EAR reg when switching to 3-byte addressing mode on Winbond
chips
SPI NOR controller driver changes:
- cadence: Add DMA support for direct mode reads
- hisi: Prefix a few functions with hisi_
- intel:
+ Mark the driver as "dangerous" in Kconfig
+ Fix atomic sequence handling
+ Pass a 40us delay (instead of 0us) to readl_poll_timeout()
- fsl:
+ fix a typo in a function name
+ add support for IP variants embedded in the ls2080a and ls1080a
SoCs
- stm32: request exclusive control of the reset line"
* tag 'mtd/for-4.18' of git://git.infradead.org/linux-mtd: (66 commits)
mtd: nand: Pass mode information to nand_page_io_req
mtd: cfi_cmdset_0002: Change erase one block to enable XIP once
mtd: cfi_cmdset_0002: Change erase functions to check chip good only
mtd: cfi_cmdset_0002: Change erase functions to retry for error
mtd: cfi_cmdset_0002: Change definition naming to retry write operation
mtd: cfi_cmdset_0002: Change write buffer to check correct value
mtd: cmdlinepart: Update comment for introduction of OFFSET_CONTINUOUS
mtd: bcm47xxpart: add of_match_table with a new DT binding
dt-bindings: mtd: document Broadcom's BCM47xx partitions
mtd: spi-nor: Add support for EN25QH32
mtd: spi-nor: Add support for is25wp series chips
mtd: spi-nor: Add Winbond w25q32jv support
mtd: spi-nor: fsl-quadspi: add support for ls2080a/ls1080a
mtd: spi-nor: stm32-quadspi: explicitly request exclusive reset control
mtd: spi-nor: intel: provide a range for poll_timout
mtd: spi-nor: fsl-quadspi: fix api naming typo _init_ahb_read
mtd: spi-nor: intel-spi: Explicitly mark the driver as dangerous in Kconfig
mtd: spi-nor: intel-spi: Fix atomic sequence handling
mtd: rawnand: Do not check FAIL bit when executing a SET_FEATURES op
mtd: rawnand: use bit-wise majority to recover the ONFI param page
...
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.18 development cycle.
Core changes:
- We have killed off VLA from the core library and all drivers.
The background should be clear for everyone at this point:
https://lwn.net/Articles/749064/
Also I just don't like VLA's, kernel developers hate it when
compilers do things behind their back. It's as simple as that.
I'm sorry that they even slipped in to begin with. Kudos to Laura
Abbott for exorcising them.
- Support GPIO hogs in machines/board files.
New drivers and chip support:
- R-Car r8a77470 (RZ/G1C)
- R-Car r8a77965 (M3-N)
- R-Car r8a77990 (E3)
- PCA953x driver improvements to accomodate more variants.
Improvements and new features:
- Support one interrupt per line on port A in the DesignWare dwapb
driver.
Misc:
- Random cleanups, right header files in the drivers, some size
optimizations etc"
* tag 'gpio-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (73 commits)
gpio: davinci: fix build warning when !CONFIG_OF
gpio: dwapb: Fix rework support for 1 interrupt per port A GPIO
gpio: pxa: Include the right header
gpio: pl061: Include the right header
gpio: pch: Include the right header
gpio: pcf857x: Include the right header
gpio: pca953x: Include the right header
gpio: palmas: Include the right header
gpio: omap: Include the right header
gpio: octeon: Include the right header
gpio: mxs: Switch to SPDX identifier
gpio: Remove VLA from stmpe driver
gpio: mxc: Switch to SPDX identifier
gpio: mxc: add clock operation
gpio: Remove VLA from gpiolib
gpio: aspeed: Use a cache of output data registers
gpio: aspeed: Set output latch before changing direction
gpio: pca953x: fix address calculation for pcal6524
gpio: pca953x: define masks for addressing common and extended registers
gpio: pca953x: set the PCA_PCAL flag also when matching by DT
...
Pull HID updates from Jiri Kosina:
- Valve Steam Controller support from Rodrigo Rivas Costa
- Redragon Asura support from Robert Munteanu
- improvement of duplicate usage handling in generic hid-input from
Benjamin Tissoires
- Win 8.1 precisioun touchpad spec implementation from Benjamin
Tissoires
- Support for "In Range" flag for Wacom Intuos/Bamboo devices from
Jason Gerecke
- other various assorted smaller fixes and improvements
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (27 commits)
HID: rmi: use HID_QUIRK_NO_INPUT_SYNC
HID: multitouch: fix calculation of last slot field in multi-touch reports
HID: quirks: remove Delcom Visual Signal Indicator from hid_have_special_driver[]
HID: steam: select CONFIG_POWER_SUPPLY
HID: i2c-hid: remove i2c_hid_open_mut
HID: wacom: Support "in range" for Intuos/Bamboo tablets where possible
HID: core: fix hid_hw_open() comment
HID: hid-plantronics: Re-resend Update to map button for PTT products
HID: multitouch: fix types returned from mt_need_to_apply_feature()
HID: i2c-hid: check if device is there before really probing
HID: steam: add missing fields in client initialization
HID: steam: add battery device.
HID: add driver for Valve Steam Controller
HID: alps: Fix some style in 't4_read_write_register()'
HID: alps: Check errors returned by 't4_read_write_register()'
HID: alps: Save a memory allocation in 't4_read_write_register()' when writing data
HID: alps: Report an error if we receive invalid data in 't4_read_write_register()'
HID: multitouch: implement precision touchpad latency and switches
HID: multitouch: simplify the settings of the various features
HID: multitouch: make use of HID_QUIRK_INPUT_PER_APP
...
Pull livepatching fixlet from Jiri Kosina:
"livepatching documentation fix from Petr Mladek"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Remove not longer valid limitations from the documentation
Pull aio iopriority support from Al Viro:
"The rest of aio stuff for this cycle - Adam's aio ioprio series"
* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: aio ioprio use ioprio_check_cap ret val
fs: aio ioprio add explicit block layer dependence
fs: iomap dio set bio prio from kiocb prio
fs: blkdev set bio prio from kiocb prio
fs: Add aio iopriority support
fs: Convert kiocb rw_hint from enum to u16
block: add ioprio_check_cap function
Pull proc_fill_cache regression fix from Al Viro:
"Regression fix for proc_fill_cache() braino introduced when switching
instantiate() callback to d_splice_alias()"
* 'work.lookup' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix proc_fill_cache() in case of d_alloc_parallel() failure
Pull xen updates from Juergen Gross:
"This contains some minor code cleanups (fixing return types of
functions), some fixes for Linux running as Xen PVH guest, and adding
of a new guest resource mapping feature for Xen tools"
* tag 'for-linus-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/PVH: Make GDT selectors PVH-specific
xen/PVH: Set up GS segment for stack canary
xen/store: do not store local values in xen_start_info
xen-netfront: fix xennet_start_xmit()'s return type
xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE
xen: Change return type to vm_fault_t
Commit 17c2895 ("arm64: Abstract syscallno manipulation") abstracts
out the pt_regs.syscallno value for a syscall cancelled by a tracer
as NO_SYSCALL, and provides helpers to set and check for this
condition. However, the way this was implemented has the
unintended side-effect of disabling part of the syscall restart
logic.
This comes about because the second in_syscall() check in
do_signal() re-evaluates the "in a syscall" condition based on the
updated pt_regs instead of the original pt_regs. forget_syscall()
is explicitly called prior to the second check in order to prevent
restart logic in the ret_to_user path being spuriously triggered,
which means that the second in_syscall() check always yields false.
This triggers a failure in
tools/testing/selftests/seccomp/seccomp_bpf.c, when using ptrace to
suppress a signal that interrups a nanosleep() syscall.
Misbehaviour of this type is only expected in the case where a
tracer suppresses a signal and the target process is either being
single-stepped or the interrupted syscall attempts to restart via
-ERESTARTBLOCK.
This patch restores the old behaviour by performing the
in_syscall() check only once at the start of the function.
Fixes: 17c2895860 ("arm64: Abstract syscallno manipulation")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reported-by: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.14.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- improvement of duplicate usage handling in hid-input from Benjamin Tissoires
- Win 8.1 precisioun touchpad spec implementation from Benjamin Tissoires
If d_alloc_parallel() returns ERR_PTR(...), we don't want to dput()
that. Small reorganization allows to have all error-in-lookup
cases rejoin the main codepath after dput(child), avoiding the
entire problem.
Spotted-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: 0168b9e38c "procfs: switch instantiate_t to d_splice_alias()"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- v9fs updates
- MM
- procfs updates
- lib/ updates
- autofs updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
autofs: small cleanup in autofs_getpath()
autofs: clean up includes
autofs: comment on selinux changes needed for module autoload
autofs: update MAINTAINERS entry for autofs
autofs: use autofs instead of autofs4 in documentation
autofs: rename autofs documentation files
autofs: create autofs Kconfig and Makefile
autofs: delete fs/autofs4 source files
autofs: update fs/autofs4/Makefile
autofs: update fs/autofs4/Kconfig
autofs: copy autofs4 to autofs
autofs4: use autofs instead of autofs4 everywhere
autofs4: merge auto_fs.h and auto_fs4.h
fs/binfmt_misc.c: do not allow offset overflow
checkpatch: improve patch recognition
lib/ucs2_string.c: add MODULE_LICENSE()
lib/mpi: headers cleanup
lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock
lib/idr.c: remove simple_ida_lock
lib/bitmap.c: micro-optimization for __bitmap_complement()
...
Due to the autofs4 module using a file system type name of autofs
different from the module containing directory name autoload did not
function properly. To work around this kernel configurations have often
elected to build the module into the kernel.
This can result in selinux policies that prohibit autoloading of the
autofs module which need to be changed.
Add a comment about this to "possible changes" section of the autofs4
module help.
Link: http://lkml.kernel.org/r/152686474171.6155.1239659539983577463.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WHen registering a new binfmt_misc handler, it is possible to overflow
the offset to get a negative value, which might crash the system, or
possibly leak kernel data.
Here is a crash log when 2500000000 was used as an offset:
BUG: unable to handle kernel paging request at ffff989cfd6edca0
IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
PGD 1ef3e067 P4D 1ef3e067 PUD 0
Oops: 0000 [#1] SMP NOPTI
Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
Call Trace:
search_binary_handler+0x97/0x1d0
do_execveat_common.isra.34+0x667/0x810
SyS_execve+0x31/0x40
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Use kstrtoint instead of simple_strtoul. It will work as the code
already set the delimiter byte to '\0' and we only do it when the field
is not empty.
Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX. Also tested
with examples documented at Documentation/admin-guide/binfmt-misc.rst
and other registrations from packages on Ubuntu.
Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we get a hung task it can often be valuable to see _all_ the hung
tasks on the system before calling panic().
Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&id=5316056503549952
----------------------------------------
INFO: task syz-executor0:6540 blocked for more than 120 seconds.
Not tainted 4.16.0+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0 D23560 6540 4521 0x80000004
Call Trace:
context_switch kernel/sched/core.c:2848 [inline]
__schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
schedule+0xf5/0x430 kernel/sched/core.c:3549
schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
__mutex_lock_common kernel/locking/mutex.c:833 [inline]
__mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
__blkdev_driver_ioctl block/ioctl.c:303 [inline]
blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
isofs_get_last_session fs/isofs/inode.c:567 [inline]
isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
mount_bdev+0x2b7/0x370 fs/super.c:1119
isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
mount_fs+0x66/0x2d0 fs/super.c:1222
vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
vfs_kern_mount fs/namespace.c:2514 [inline]
do_new_mount fs/namespace.c:2517 [inline]
do_mount+0xea4/0x2b90 fs/namespace.c:2847
ksys_mount+0xab/0x120 fs/namespace.c:3063
SYSC_mount fs/namespace.c:3077 [inline]
SyS_mount+0x39/0x50 fs/namespace.c:3074
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
(...snipped...)
Showing all locks held in the system:
(...snipped...)
2 locks held by syz-executor0/6540:
#0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline]
#0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); */
#1: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
(...snipped...)
3 locks held by syz-executor7/6541:
#0: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
#1: 000000007bf3d3f9 (&bdev->bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192
#2: 00000000566d4c39 (&type->s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(&sb->s_umount); */
----------------------------------------
When reporting an AB-BA deadlock like shown above, it would be nice if
trace of PID=6541 is printed as well as trace of PID=6540 before calling
panic().
Showing hung tasks up to /proc/sys/kernel/hung_task_warnings could delay
calling panic() but normally there should not be so many hung tasks.
Link: http://lkml.kernel.org/r/201804050705.BHE57833.HVFOFtSOMQJFOL@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Test lookup in /proc/self/fd.
"map_files" lookup story showed that lookup is not that simple.
* Test that all those symlinks open the same file.
Check with (st_dev, st_info).
* Test that kernel threads do not have anything in their /proc/*/fd/
directory.
Now this is where things get interesting.
First, kernel threads aren't pinned by /proc/self or equivalent,
thus some "atomicity" is required.
Second, ->comm can contain whitespace and ')'.
No, they are not escaped.
Third, the only reliable way to check if process is kernel thread
appears to be field #9 in /proc/*/stat.
This field is struct task_struct::flags in decimal!
Check is done by testing PF_KTHREAD flags like we do in kernel.
PF_KTREAD value is a part of userspace ABI !!!
Other methods for determining kernel threadness are not reliable:
* RSS can be 0 if everything is swapped, even while reading
from /proc/self.
* ->total_vm CAN BE ZERO if process is finishing
munmap(NULL, whole address space);
* /proc/*/maps and similar files can be empty because unmapping
everything works. Read returning 0 can't distinguish between
kernel thread and such suicide process.
Link: http://lkml.kernel.org/r/20180505000414.GA15090@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>