[ Upstream commit df2474a22c ]
Since 9e8925b67a ("locks: Allow disabling mandatory locking at compile
time"), attempts to mount filesystems with "-o mand" will fail.
Unfortunately, there is no other indiciation of the reason for the
failure.
Change how the function is defined for better readability. When
CONFIG_MANDATORY_FILE_LOCKING is disabled, printk a warning when
someone attempts to mount with -o mand.
Also, add a blurb to the mandatory-locking.txt file to explain about
the "mand" option, and the behavior one should expect when it is
disabled.
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2efc459d06 upstream.
Output defects can exist in sysfs content using sprintf and snprintf.
sprintf does not know the PAGE_SIZE maximum of the temporary buffer
used for outputting sysfs content and it's possible to overrun the
PAGE_SIZE buffer length.
Add a generic sysfs_emit function that knows that the size of the
temporary buffer and ensures that no overrun is done.
Add a generic sysfs_emit_at function that can be used in multiple
call situations that also ensures that no overrun is done.
Validate the output buffer argument to be page aligned.
Validate the offset len argument to be within the PAGE_SIZE buf.
Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/884235202216d464d61ee975f7465332c86f76b2.1600285923.git.joe@perches.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d3a84a8d0d ]
The basic permission bits (protection bits in AmigaOS) have been broken
in Linux' AFFS - it would only set bits, but never delete them.
Also, contrary to the documentation, the Archived bit was not handled.
Let's fix this for good, and set the bits such that Linux and classic
AmigaOS can coexist in the most peaceful manner.
Also, update the documentation to represent the current state of things.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Max Staudt <max@enpas.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit d9a9f4849f upstream.
several iterations of ->atomic_open() calling conventions ago, we
used to need fput() if ->atomic_open() failed at some point after
successful finish_open(). Now (since 2016) it's not needed -
struct file carries enough state to make fput() work regardless
of the point in struct file lifecycle and discarding it on
failure exits in open() got unified. Unfortunately, I'd missed
the fact that we had an instance of ->atomic_open() (cifs one)
that used to need that fput(), as well as the stale comment in
finish_open() demanding such late failure handling. Trivially
fixed...
Fixes: fe9ec8291f "do_last(): take fput() on error after opening to out:"
Cc: stable@kernel.org # v4.7+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7550c60798 ]
Patch series "THP eligibility reporting via proc".
This series of three patches aims at making THP eligibility reporting much
more robust and long term sustainable. The trigger for the change is a
regression report [2] and the long follow up discussion. In short the
specific application didn't have good API to query whether a particular
mapping can be backed by THP so it has used VMA flags to workaround that.
These flags represent a deep internal state of VMAs and as such they
should be used by userspace with a great deal of caution.
A similar has happened for [3] when users complained that VM_MIXEDMAP is
no longer set on DAX mappings. Again a lack of a proper API led to an
abuse.
The first patch in the series tries to emphasise that that the semantic of
flags might change and any application consuming those should be really
careful.
The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well. [1]
http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org [2]
http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
This patch (of 3):
Even though vma flags exported via /proc/<pid>/smaps are explicitly
documented to be not guaranteed for future compatibility the warning
doesn't go far enough because it doesn't mention semantic changes to those
flags. And they are important as well because these flags are a deep
implementation internal to the MM code and the semantic might change at
any time.
Let's consider two recent examples:
http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a0864 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.
http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps. After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.
In both cases userspace was relying on a semantic of a specific VMA flag.
The primary reason why that happened is a lack of a proper interface.
While this has been worked on and it will be fixed properly, it seems that
our wording could see some refinement and be more vocal about semantic
aspect of these flags as well.
Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Pull f2fs update from Jaegeuk Kim:
"In this round, we've mainly focused on performance tuning and critical
bug fixes occurred in low-end devices. Sheng Yong introduced
lost_found feature to keep missing files during recovery instead of
thrashing them. We're preparing coming fsverity implementation. And,
we've got more features to communicate with users for better
performance. In low-end devices, some memory-related issues were
fixed, and subtle race condtions and corner cases were addressed as
well.
Enhancements:
- large nat bitmaps for more free node ids
- add three block allocation policies to pass down write hints given by user
- expose extension list to user and introduce hot file extension
- tune small devices seamlessly for low-end devices
- set readdir_ra by default
- give more resources under gc_urgent mode regarding to discard and cleaning
- introduce fsync_mode to enforce posix or not
- nowait aio support
- add lost_found feature to keep dangling inodes
- reserve bits for future fsverity feature
- add test_dummy_encryption for FBE
Bug fixes:
- don't use highmem for dentry pages
- align memory boundary for bitops
- truncate preallocated blocks in write errors
- guarantee i_times on fsync call
- clear CP_TRIMMED_FLAG correctly
- prevent node chain loop during recovery
- avoid data race between atomic write and background cleaning
- avoid unnecessary selinux violation warnings on resgid option
- GFP_NOFS to avoid deadlock in quota and read paths
- fix f2fs_skip_inode_update to allow i_size recovery
In addition to the above, there are several minor bug fixes and clean-ups"
Cherry-pick from origin/upstream-f2fs-stable-linux-4.9.y:
ac389af190fb f2fs: remain written times to update inode during fsync
270deeb87125 f2fs: make assignment of t->dentry_bitmap more readable
a4fa11c8da10 f2fs: truncate preallocated blocks in error case
4478970f0e73 f2fs: fix a wrong condition in f2fs_skip_inode_update
29cead58f5ea f2fs: reserve bits for fs-verity
848b293a5d95 f2fs: Add a segment type check in inplace write
2dc8f5a3a640 f2fs: no need to initialize zero value for GFP_F2FS_ZERO
83b9bb95a628 f2fs: don't track new nat entry in nat set
a33ce03ac477 f2fs: clean up with F2FS_BLK_ALIGN
a3f8ec8082e3 f2fs: check blkaddr more accuratly before issue a bio
034f11eadb16 f2fs: Set GF_NOFS in read_cache_page_gfp while doing f2fs_quota_read
aa5bcfd8f488 f2fs: introduce a new mount option test_dummy_encryption
9b880fe6e6e2 f2fs: introduce F2FS_FEATURE_LOST_FOUND feature
80d6489a08c1 f2fs: release locks before return in f2fs_ioc_gc_range()
9f1896c490eb f2fs: align memory boundary for bitops
c7930ee88334 f2fs: remove unneeded set_cold_node()
355d2346409a f2fs: add nowait aio support
e9a50e6b9479 f2fs: wrap all options with f2fs_sb_info.mount_opt
b6d2ec83e0c0 f2fs: Don't overwrite all types of node to keep node chain
9a954816298c f2fs: introduce mount option for fsync mode
4ce4eb697068 f2fs: fix to restore old mount option in ->remount_fs
8f711c344e61 f2fs: wrap sb_rdonly with f2fs_readonly
c07478ee84bf f2fs: avoid selinux denial on CAP_SYS_RESOURCE
ac734c416fa9 f2fs: support hot file extension
f4f10221accc f2fs: fix to avoid race in between atomic write and background GC
e87b13ec160b f2fs: do gc in greedy mode for whole range if gc_urgent mode is set
e9878588de94 f2fs: issue discard aggressively in the gc_urgent mode
ad3ce479e6e4 f2fs: set readdir_ra by default
5aae2026bbd2 f2fs: add auto tuning for small devices
78c1fc2d8f27 f2fs: add mount option for segment allocation policy
ecd02f564631 f2fs: don't stop GC if GC is contended
1e72cb27d2d6 f2fs: expose extension_list sysfs entry
061839d178ab f2fs: fix to set KEEP_SIZE bit in f2fs_zero_range
4951ebcbc4e2 f2fs: introduce sb_lock to make encrypt pwsalt update exclusive
939f6be0420f f2fs: remove redundant initialization of pointer 'p'
39bea4bc8ef2 f2fs: flush cp pack except cp pack 2 page at first
770611eb2ab4 f2fs: clean up f2fs_sb_has_xxx functions
4d8e4a8965f9 f2fs: remove redundant check of page type when submit bio
e9878588de94 f2fs: issue discard aggressively in the gc_urgent mode
ad3ce479e6e4 f2fs: set readdir_ra by default
5aae2026bbd2 f2fs: add auto tuning for small devices
78c1fc2d8f27 f2fs: add mount option for segment allocation policy
ecd02f564631 f2fs: don't stop GC if GC is contended
1e72cb27d2d6 f2fs: expose extension_list sysfs entry
061839d178ab f2fs: fix to set KEEP_SIZE bit in f2fs_zero_range
4951ebcbc4e2 f2fs: introduce sb_lock to make encrypt pwsalt update exclusive
939f6be0420f f2fs: remove redundant initialization of pointer 'p'
39bea4bc8ef2 f2fs: flush cp pack except cp pack 2 page at first
770611eb2ab4 f2fs: clean up f2fs_sb_has_xxx functions
4d8e4a8965f9 f2fs: remove redundant check of page type when submit bio
b57a37f01fda f2fs: fix to handle looped node chain during recovery
9ac5b8c54083 f2fs: handle quota for orphan inodes
87c18066016a f2fs: support passing down write hints to block layer with F2FS policy
bcdc571e8d8b f2fs: support passing down write hints given by users to block layer
92413bc12e32 f2fs: fix to clear CP_TRIMMED_FLAG
a1afb55f9784 f2fs: support large nat bitmap
636039140493 f2fs: fix to check extent cache in f2fs_drop_extent_tree
7de4fccdbce1 f2fs: restrict inline_xattr_size configuration
aae506a8b704 f2fs: fix heap mode to reset it back
8fa455bb6ea0 f2fs: fix potential corruption in area before F2FS_SUPER_OFFSET
9d9cb0ef73f9 fscrypt: fix build with pre-4.6 gcc versions
401052ffc6b4 fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()
549b2061b3b5 fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
c440b5091a0c fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
7d82f0e1c39a ext4: switch to fscrypt ->symlink() helper functions
ba4efe560438 ext4: switch to fscrypt_get_symlink()
b0edc2f22d24 fscrypt: calculate NUL-padding length in one place only
62cfdd9868c7 fscrypt: move fscrypt_symlink_data to fscrypt_private.h
e4e6776522bc fscrypt: remove fscrypt_fname_usr_to_disk()
45028b5aaa4e f2fs: switch to fscrypt_get_symlink()
f62d3d31e0c7 f2fs: switch to fscrypt ->symlink() helper functions
da32a1633ad3 fscrypt: new helper function - fscrypt_get_symlink()
a7e05c731d11 fscrypt: new helper functions for ->symlink()
eb9c5fd896de fscrypt: trim down fscrypt.h includes
0a02472d8ae2 fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
9d51ca80274c fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
efbfa8c6a056 fscrypt: move fscrypt_operations declaration to fscrypt_supp.h
616dbd2bdc6a fscrypt: split fscrypt_dummy_context_enabled() into supp/notsupp versions
f0c472bcbf1c fscrypt: move fscrypt_ctx declaration to fscrypt_supp.h
bc76f39109b1 fscrypt: move fscrypt_info_cachep declaration to fscrypt_private.h
b67b07ec4964 fscrypt: move fscrypt_control_page() to supp/notsupp headers
d8dfb89961d0 fscrypt: move fscrypt_has_encryption_key() to supp/notsupp headers
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Changes in 4.9.83
scsi: smartpqi: allow static build ("built-in")
drm/radeon: Add dpm quirk for Jet PRO (v2)
drm/radeon: adjust tested variable
rtc-opal: Fix handling of firmware error codes, prevent busy loops
mbcache: initialize entry->e_referenced in mb_cache_entry_create()
jbd2: fix sphinx kernel-doc build warnings
ext4: fix a race in the ext4 shutdown path
ext4: save error to disk in __ext4_grp_locked_error()
ext4: correct documentation for grpid mount option
mm: hide a #warning for COMPILE_TEST
mm: Fix memory size alignment in devm_memremap_pages_release()
MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN
PCI: keystone: Fix interrupt-controller-node lookup
video: fbdev: atmel_lcdfb: fix display-timings lookup
console/dummy: leave .con_font_get set to NULL
rtlwifi: rtl8821ae: Fix connection lost problem correctly
target/iscsi: avoid NULL dereference in CHAP auth error path
Btrfs: fix deadlock in run_delalloc_nocow
Btrfs: fix crash due to not cleaning up tree log block's dirty bits
Btrfs: fix extent state leak from tree log
Btrfs: fix btrfs_evict_inode to handle abnormal inodes correctly
Btrfs: fix unexpected -EEXIST when creating new inode
9p/trans_virtio: discard zero-length reply
mtd: nand: vf610: set correct ooblayout
ALSA: hda - Fix headset mic detection problem for two Dell machines
ALSA: usb-audio: Fix UAC2 get_ctl request with a RANGE attribute
ALSA: hda/realtek - Enable Thinkpad Dock device for ALC298 platform
ALSA: hda/realtek: PCI quirk for Fujitsu U7x7
ALSA: usb-audio: add implicit fb quirk for Behringer UFX1204
ALSA: seq: Fix racy pool initializations
mvpp2: fix multicast address filter
usb: Move USB_UHCI_BIG_ENDIAN_* out of USB_SUPPORT
dm: correctly handle chained bios in dec_pending()
powerpc: fix build errors in stable tree
IB/qib: Fix comparison error with qperf compare/swap test
IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports
kselftest: fix OOM in memory compaction test
RDMA/rxe: Fix a race condition related to the QP error state
cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin
PM / devfreq: Propagate error from devfreq_add_device()
ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE
s390: fix handling of -1 in set{,fs}[gu]id16 syscalls
arm64: dts: msm8916: Correct ipc references for smsm
ARM: lpc3250: fix uda1380 gpio numbers
ARM: dts: STi: Add gpio polarity for "hdmi,hpd-gpio" property
ARM: dts: nomadik: add interrupt-parent for clcd
arm: spear600: Add missing interrupt-parent of rtc
arm: spear13xx: Fix dmas cells
arm: spear13xx: Fix spics gpio controller's warning
x86/entry/64/compat: Clear registers for compat syscalls, to reduce speculation attack surface
compiler-gcc.h: Introduce __optimize function attribute
x86/speculation: Update Speculation Control microcode blacklist
x86/speculation: Correct Speculation Control microcode blacklist again
KVM/x86: Reduce retpoline performance impact in slot_handle_level_range(), by always inlining iterator helper methods
X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs
x86/speculation: Clean up various Spectre related details
selftests/x86/pkeys: Remove unused functions
selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
x86/speculation: Fix up array_index_nospec_mask() asm constraint
nospec: Move array_index_nospec() parameter checking into separate macro
x86/speculation: Add <asm/msr-index.h> dependency
selftests/x86/mpx: Fix incorrect bounds with old _sigfault
x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
x86/spectre: Fix an error message
x86/cpu: Change type of x86_cache_size variable to unsigned int
x86: fix build warnign with 32-bit PAE
vfs: don't do RCU lookup of empty pathnames
ARM: dts: exynos: fix RTC interrupt for exynos5410
ARM: pxa/tosa-bt: add MODULE_LICENSE tag
arm64: dts: msm8916: Add missing #phy-cells
ARM: dts: s5pv210: add interrupt-parent for ohci
arm: dts: mt2701: Add reset-cells
ARM: dts: Delete bogus reference to the charlcd
media: r820t: fix r820t_write_reg for KASAN
Linux 4.9.83
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Cherry-picked from upstream-f2fs-stable-linux-4.9.y
Changes include:
commit 30da3a4de96733 ("f2fs: hurry up to issue discard after io interruption")
commit d1c363b48398d4 ("f2fs: fix to show correct discard_granularity in sysfs")
...
commit e6b120d4d01ab0 ("f2fs/fscrypt: catch up to v4.12")
commit 4d7931d72758db ("KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()")
Signed-off-by: Hyojun Kim <hyojun@google.com>
Userspace processes often have multiple allocators that each do
anonymous mmaps to get memory. When examining memory usage of
individual processes or systems as a whole, it is useful to be
able to break down the various heaps that were allocated by
each layer and examine their size, RSS, and physical memory
usage.
This patch adds a user pointer to the shared union in
vm_area_struct that points to a null terminated string inside
the user process containing a name for the vma. vmas that
point to the same address will be merged, but vmas that
point to equivalent strings at different addresses will
not be merged.
Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);
Setting the name to NULL clears it.
The names of named anonymous vmas are shown in /proc/pid/maps
as [anon:<name>] and in /proc/pid/smaps in a new "Name" field
that is only present for named vmas. If the userspace pointer
is no longer valid all or part of the name will be replaced
with "<fault>".
The idea to store a userspace pointer to reduce the complexity
within mm (at the expense of the complexity of reading
/proc/pid/mem) came from Dave Hansen. This results in no
runtime overhead in the mm subsystem other than comparing
the anon_name pointers when considering vma merging. The pointer
is stored in a union with fieds that are only used on file-backed
mappings, so it does not increase memory usage.
Includes fix from Jed Davis <jld@mozilla.com> for typo in
prctl_set_vma_anon_name, which could attempt to set the name
across two vmas at the same time due to a typo, which might
corrupt the vma list. Fix it to use tmp instead of end to limit
the name setting to a single vma at a time.
Change-Id: I9aa7b6b5ef536cd780599ba4e2fba8ceebe8b59f
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
This reverts more of:
b76437579d ("procfs: mark thread stack correctly in proc/<pid>/maps")
... which was partially reverted by:
65376df582 ("proc: revert /proc/<pid>/maps [stack:TID] annotation")
Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps.
In current kernels, /proc/PID/maps (or /proc/TID/maps even for
threads) shows "[stack]" for VMAs in the mm's stack address range.
In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the
target thread's stack's VMA. This is racy, probably returns garbage
and, on arches with CONFIG_TASK_INFO_IN_THREAD=y, is also crash-prone:
KSTK_ESP is not safe to use on tasks that aren't known to be running
ordinary process-context kernel code.
This patch removes the difference and just shows "[stack]" for VMAs
in the mm's stack range. This is IMO much more sensible -- the
actual "stack" address really is treated specially by the VM code,
and the current thread stack isn't even well-defined for programs
that frequently switch stacks on their own.
Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/3e678474ec14e0a0ec34c611016753eea2e1b8ba.1475257877.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull kselftest updates from Shuah Khan:
"This update consists of:
- Fixes and improvements to existing tests
- Moving code from Documentation to selftests, samples, and tools:
* Moves dnotify_test, prctl, ptp, vDSO, ia64, watchdog, and
networking tests from Documentation to selftests.
* Moves mic/mpssd, misc-devices/mei, timers, watchdog, auxdisplay,
and blackfin examples from Documentation to samples.
* Moves accounting, laptops/dslm, and pcmcia/crc32hash tools from
Documentation to tools.
* Deletes BUILD_DOCSRC and its dependencies"
* tag 'linux-kselftest-4.9-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (21 commits)
selftests/futex: Check ANSI terminal color support
Doc: update 00-INDEX files to reflect the runnable code move
samples: move blackfin gptimers-example from Documentation
tools: move pcmcia crc32hash tool from Documentation
tools: move laptops dslm tool from Documentation
tools: move accounting tool from Documentation
samples: move auxdisplay example code from Documentation
samples: move watchdog example code from Documentation
samples: move timers example code from Documentation
samples: move misc-devices/mei example code from Documentation
samples: move mic/mpssd example code from Documentation
selftests: Move networking/timestamping from Documentation
selftests: move watchdog tests from Documentation/watchdog
selftests: move ia64 tests from Documentation/ia64
selftests: move vDSO tests from Documentation/vDSO
selftests: move ptp tests from Documentation/ptp
selftests: move prctl tests from Documentation/prctl
selftests: move dnotify_test from Documentation/filesystems
selftests/timers: Add missing error code assignment before test
selftests/zram: replace ZRAM_LZ4_COMPRESS
...
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
Pull vfs xattr updates from Al Viro:
"xattr stuff from Andreas
This completes the switch to xattr_handler ->get()/->set() from
->getxattr/->setxattr/->removexattr"
* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Remove {get,set,remove}xattr inode operations
xattr: Stop calling {get,set,remove}xattr inode operations
vfs: Check for the IOP_XATTR flag in listxattr
xattr: Add __vfs_{get,set,remove}xattr helpers
libfs: Use IOP_XATTR flag for empty directory handling
vfs: Use IOP_XATTR flag for bad-inode handling
vfs: Add IOP_XATTR inode operations flag
vfs: Move xattr_resolve_name to the front of fs/xattr.c
ecryptfs: Switch to generic xattr handlers
sockfs: Get rid of getxattr iop
sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
kernfs: Switch to generic xattr handlers
hfs: Switch to generic xattr handlers
jffs2: Remove jffs2_{get,set,remove}xattr macros
xattr: Remove unnecessary NULL attribute name check
Pull Ceph updates from Ilya Dryomov:
"The big ticket item here is support for rbd exclusive-lock feature,
with maintenance operations offloaded to userspace (Douglas Fuller,
Mike Christie and myself). Another block device bullet is a series
fixing up layering error paths (myself).
On the filesystem side, we've got patches that improve our handling of
buffered vs dio write races (Neil Brown) and a few assorted fixes from
Zheng. Also included a couple of random cleanups and a minor CRUSH
update"
* tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-client: (39 commits)
crush: remove redundant local variable
crush: don't normalize input of crush_ln iteratively
libceph: ceph_build_auth() doesn't need ceph_auth_build_hello()
libceph: use CEPH_AUTH_UNKNOWN in ceph_auth_build_hello()
ceph: fix description for rsize and rasize mount options
rbd: use kmalloc_array() in rbd_header_from_disk()
ceph: use list_move instead of list_del/list_add
ceph: handle CEPH_SESSION_REJECT message
ceph: avoid accessing / when mounting a subpath
ceph: fix mandatory flock check
ceph: remove warning when ceph_releasepage() is called on dirty page
ceph: ignore error from invalidate_inode_pages2_range() in direct write
ceph: fix error handling of start_read()
rbd: add rbd_obj_request_error() helper
rbd: img_data requests don't own their page array
rbd: don't call rbd_osd_req_format_read() for !img_data requests
rbd: rework rbd_img_obj_exists_submit() error paths
rbd: don't crash or leak on errors in rbd_img_obj_parent_read_full_callback()
rbd: move bumping img_request refcount into rbd_obj_request_submit()
rbd: mark the original request as done if stat request fails
...
Pull misc vfs updates from Al Viro:
"Assorted misc bits and pieces.
There are several single-topic branches left after this (rename2
series from Miklos, current_time series from Deepa Dinamani, xattr
series from Andreas, uaccess stuff from from me) and I'd prefer to
send those separately"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
proc: switch auxv to use of __mem_open()
hpfs: support FIEMAP
cifs: get rid of unused arguments of CIFSSMBWrite()
posix_acl: uapi header split
posix_acl: xattr representation cleanups
fs/aio.c: eliminate redundant loads in put_aio_ring_file
fs/internal.h: add const to ns_dentry_operations declaration
compat: remove compat_printk()
fs/buffer.c: make __getblk_slow() static
proc: unsigned file descriptors
fs/file: more unsigned file descriptors
fs: compat: remove redundant check of nr_segs
cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
cifs: don't use memcpy() to copy struct iov_iter
get rid of separate multipage fault-in primitives
fs: Avoid premature clearing of capabilities
fs: Give dentry to inode_change_ok() instead of inode
fuse: Propagate dentry down to inode_change_ok()
ceph: Propagate dentry down to inode_change_ok()
xfs: Propagate dentry down to inode_change_ok()
...
All filesystems that support xattrs by now do so via xattr handlers.
They all define sb->s_xattr, and their getxattr, setxattr, and
removexattr inode operations use the generic inode operations. On
filesystems that don't support xattrs, the xattr inode operations are
all NULL, and sb->s_xattr is also NULL.
This means that we can remove the getxattr, setxattr, and removexattr
inode operations and directly call the generic handlers, or better,
inline expand those handlers into fs/xattr.c.
Filesystems that do not support xattrs on some inodes should clear the
IOP_XATTR i_opflags flag in those inodes. (Right now, some filesystems
have checks to disable xattrs on some inodes in the ->list, ->get, and
->set xattr handler operations instead.) The IOP_XATTR flag is
automatically cleared in inodes of filesystems that don't have xattr
support.
In orangefs, symlinks do have a setxattr iop but no getxattr iop. Add a
check for symlinks to orangefs_inode_getxattr to preserve the current,
weird behavior; that check may not be necessary though.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've investigated how f2fs deals with errors given by
our fault injection facility. With this, we could fix several corner
cases. And, in order to improve the performance, we set inline_dentry
by default and enhance the exisiting discard issue flow. In addition,
we added f2fs_migrate_page for better memory management.
Enhancements:
- set inline_dentry by default
- improve discard issue flow
- add more fault injection cases in f2fs
- allow block preallocation for encrypted files
- introduce migrate_page callback function
- avoid truncating the next direct node block at every checkpoint
Bug fixes:
- set page flag correctly between write_begin and write_end
- missing error handling cases detected by fault injection
- preallocate blocks regarding to 4KB alignement correctly
- dentry and filename handling of encryption
- lost xattrs of directories"
* tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (69 commits)
f2fs: introduce update_ckpt_flags to clean up
f2fs: don't submit irrelevant page
f2fs: fix to commit bio cache after flushing node pages
f2fs: introduce get_checkpoint_version for cleanup
f2fs: remove dead variable
f2fs: remove redundant io plug
f2fs: support checkpoint error injection
f2fs: fix to recover old fault injection config in ->remount_fs
f2fs: do fault injection initialization in default_options
f2fs: remove redundant value definition
f2fs: support configuring fault injection per superblock
f2fs: adjust display format of segment bit
f2fs: remove dirty inode pages in error path
f2fs: do not unnecessarily null-terminate encrypted symlink data
f2fs: handle errors during recover_orphan_inodes
f2fs: avoid gc in cp_error case
f2fs: should put_page for summary page
f2fs: assign return value in f2fs_gc
f2fs: add customized migrate_page callback
f2fs: introduce cp_lock to protect updating of ckpt_flags
...
Pull xfs and iomap updates from Dave Chinner:
"The main things in this update are the iomap-based DAX infrastructure,
an XFS delalloc rework, and a chunk of fixes to how log recovery
schedules writeback to prevent spurious corruption detections when
recovery of certain items was not required.
The other main chunk of code is some preparation for the upcoming
reflink functionality. Most of it is generic and cleanups that stand
alone, but they were ready and reviewed so are in this pull request.
Speaking of reflink, I'm currently planning to send you another pull
request next week containing all the new reflink functionality. I'm
working through a similar process to the last cycle, where I sent the
reverse mapping code in a separate request because of how large it
was. The reflink code merge is even bigger than reverse mapping, so
I'll be doing the same thing again....
Summary for this update:
- change of XFS mailing list to linux-xfs@vger.kernel.org
- iomap-based DAX infrastructure w/ XFS and ext2 support
- small iomap fixes and additions
- more efficient XFS delayed allocation infrastructure based on iomap
- a rework of log recovery writeback scheduling to ensure we don't
fail recovery when trying to replay items that are already on disk
- some preparation patches for upcoming reflink support
- configurable error handling fixes and documentation
- aio access time update race fixes for XFS and
generic_file_read_iter"
* tag 'xfs-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (40 commits)
fs: update atime before I/O in generic_file_read_iter
xfs: update atime before I/O in xfs_file_dio_aio_read
ext2: fix possible integer truncation in ext2_iomap_begin
xfs: log recovery tracepoints to track current lsn and buffer submission
xfs: update metadata LSN in buffers during log recovery
xfs: don't warn on buffers not being recovered due to LSN
xfs: pass current lsn to log recovery buffer validation
xfs: rework log recovery to submit buffers on LSN boundaries
xfs: quiesce the filesystem after recovery on readonly mount
xfs: remote attribute blocks aren't really userdata
ext2: use iomap to implement DAX
ext2: stop passing buffer_head to ext2_get_blocks
xfs: use iomap to implement DAX
xfs: refactor xfs_setfilesize
xfs: take the ilock shared if possible in xfs_file_iomap_begin
xfs: fix locking for DAX writes
dax: provide an iomap based fault handler
dax: provide an iomap based dax read/write path
dax: don't pass buffer_head to copy_user_dax
dax: don't pass buffer_head to dax_insert_mapping
...
Pull documentation updates from Jonathan Corbet:
"This is the documentation update pull for the 4.9 merge window.
The Sphinx transition is still creating a fair amount of work. Here we
have a number of fixes and, importantly, a proper PDF output solution,
thanks to Jani Nikula, Mauro Carvalho Chehab and Markus Heiser.
I've started a couple of new books: a driver API book (based on the
old device-drivers.tmpl) and a development tools book. Both are meant
to show how we can integrate together our existing documentation into
a more coherent and accessible whole. It involves moving some stuff
around and formatting changes, but, I think, the results are worth it.
The good news is that most of our existing Documentation/*.txt files
are *almost* in RST format already; the amount of messing around
required is minimal.
And, of course, there's the usual set of updates, typo fixes, and
more"
* tag 'docs-4.9' of git://git.lwn.net/linux: (120 commits)
URL changed for Linux Foundation TAB
dax : Fix documentation with respect to struct pages
iio: Documentation: Correct the path used to create triggers.
docs: Remove space-before-label guidance from CodingStyle
docs-rst: add inter-document cross references
Documentation/email-clients.txt: convert it to ReST markup
Documentation/kernel-docs.txt: reorder based on timestamp
Documentation/kernel-docs.txt: Add dates for online docs
Documentation/kernel-docs.txt: get rid of broken docs
Documentation/kernel-docs.txt: move in-kernel docs
Documentation/kernel-docs.txt: remove more legacy references
Documentation/kernel-docs.txt: add two published books
Documentation/kernel-docs.txt: sort books per publication date
Documentation/kernel-docs.txt: adjust LDD references
Documentation/kernel-docs.txt: some improvements on the ReST output
Documentation/kernel-docs.txt: Consistent indenting: 4 spaces
Documentation/kernel-docs.txt: Add 4 paper/book references
Documentation/kernel-docs.txt: Improve layouting of book list
Documentation/kernel-docs.txt: Remove offline or outdated entries
docs: Clean up bare :: lines
...
The documentation for dax is not up to date with respect to the struct
page support available in some of the device drivers that utilize
it.
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Move dnotify_test.c, Makefile, and .gitignore from Documentation/filesystems
to selftests/filesystems.
Remove filesystems build target from Documentation/Makefile and update
selftests/filesystems/Makefile to work under selftests. dnotify_test will
not be run as part of selftests suite and will not be included in install
targets. It can be built separately for now.
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Document the implementation of error handlers into sysfs.
[dchinner: Added lots more detail.]
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Make inline_dentry as default mount option to improve space usage and
IO performance in scenario of numerous small directory.
It adds noinline_dentry mount option, instead.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull documentation fixes from Jonathan Corbet:
"Three fixes for the docs build, including removing an annoying warning
on 'make help' if sphinx isn't present"
* tag 'doc-4.8-fixes' of git://git.lwn.net/linux:
DocBook: use DOCBOOKS="" to ignore DocBooks instead of IGNORE_DOCBOOKS=1
Documenation: update cgroup's document path
Documentation/sphinx: do not warn about missing tools in 'make help'
Pull more vfs updates from Al Viro:
"Assorted cleanups and fixes.
In the "trivial API change" department - ->d_compare() losing 'parent'
argument"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
cachefiles: Fix race between inactivating and culling a cache object
9p: use clone_fid()
9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
vfs: make dentry_needs_remove_privs() internal
vfs: remove file_needs_remove_privs()
vfs: fix deadlock in file_remove_privs() on overlayfs
get rid of 'parent' argument of ->d_compare()
cifs, msdos, vfat, hfs+: don't bother with parent in ->d_compare()
affs ->d_compare(): don't bother with ->d_inode
fold _d_rehash() and __d_rehash() together
fold dentry_rcuwalk_invalidate() into its only remaining caller
cgroup's document path is changed to "cgroup-v1". update it.
Signed-off-by: seokhoon.yoon <iamyooon@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Merge yet more updates from Andrew Morton:
- the rest of ocfs2
- various hotfixes, mainly MM
- quite a bit of misc stuff - drivers, fork, exec, signals, etc.
- printk updates
- firmware
- checkpatch
- nilfs2
- more kexec stuff than usual
- rapidio updates
- w1 things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
ipc: delete "nr_ipc_ns"
kcov: allow more fine-grained coverage instrumentation
init/Kconfig: add clarification for out-of-tree modules
config: add android config fragments
init/Kconfig: ban CONFIG_LOCALVERSION_AUTO with allmodconfig
relay: add global mode support for buffer-only channels
init: allow blacklisting of module_init functions
w1:omap_hdq: fix regression
w1: add helper macro module_w1_family
w1: remove need for ida and use PLATFORM_DEVID_AUTO
rapidio/switches: add driver for IDT gen3 switches
powerpc/fsl_rio: apply changes for RIO spec rev 3
rapidio: modify for rev.3 specification changes
rapidio: change inbound window size type to u64
rapidio/idt_gen2: fix locking warning
rapidio: fix error handling in mbox request/release functions
rapidio/tsi721_dma: advance queue processing from transfer submit call
rapidio/tsi721: add messaging mbox selector parameter
rapidio/tsi721: add PCIe MRRS override parameter
rapidio/tsi721_dma: add channel mask and queue size parameters
...
The header file "include/linux/nilfs2_fs.h" is composed of parts for
ioctl and disk format, and both are intended to be shared with user
space programs.
This moves them to the uapi directory "include/uapi/linux" splitting the
file to "nilfs2_api.h" and "nilfs2_ondisk.h". The following minor
changes are accompanied by this migration:
- nilfs_direct_node struct in nilfs2/direct.h is converged to
nilfs2_ondisk.h because it's an on-disk structure.
- inline functions nilfs_rec_len_from_disk() and
nilfs_rec_len_to_disk() are moved to nilfs2/dir.c.
Link: http://lkml.kernel.org/r/1465825507-3407-4-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Describe use of jiffy-based timeout values involved in inode maintenance.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Pull libnvdimm updates from Dan Williams:
- Replace pcommit with ADR / directed-flushing.
The pcommit instruction, which has not shipped on any product, is
deprecated. Instead, the requirement is that platforms implement
either ADR, or provide one or more flush addresses per nvdimm.
ADR (Asynchronous DRAM Refresh) flushes data in posted write buffers
to the memory controller on a power-fail event.
Flush addresses are defined in ACPI 6.x as an NVDIMM Firmware
Interface Table (NFIT) sub-structure: "Flush Hint Address Structure".
A flush hint is an mmio address that when written and fenced assures
that all previous posted writes targeting a given dimm have been
flushed to media.
- On-demand ARS (address range scrub).
Linux uses the results of the ACPI ARS commands to track bad blocks
in pmem devices. When latent errors are detected we re-scrub the
media to refresh the bad block list, userspace can also request a
re-scrub at any time.
- Support for the Microsoft DSM (device specific method) command
format.
- Support for EDK2/OVMF virtual disk device memory ranges.
- Various fixes and cleanups across the subsystem.
* tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (41 commits)
libnvdimm-btt: Delete an unnecessary check before the function call "__nd_device_register"
nfit: do an ARS scrub on hitting a latent media error
nfit: move to nfit/ sub-directory
nfit, libnvdimm: allow an ARS scrub to be triggered on demand
libnvdimm: register nvdimm_bus devices with an nd_bus driver
pmem: clarify a debug print in pmem_clear_poison
x86/insn: remove pcommit
Revert "KVM: x86: add pcommit support"
nfit, tools/testing/nvdimm/: unify shutdown paths
libnvdimm: move ->module to struct nvdimm_bus_descriptor
nfit: cleanup acpi_nfit_init calling convention
nfit: fix _FIT evaluation memory leak + use after free
tools/testing/nvdimm: add manufacturing_{date|location} dimm properties
tools/testing/nvdimm: add virtual ramdisk range
acpi, nfit: treat virtual ramdisk SPA as pmem region
pmem: kill __pmem address space
pmem: kill wmb_pmem()
libnvdimm, pmem: use nvdimm_flush() for namespace I/O writes
fs/dax: remove wmb_pmem()
libnvdimm, pmem: flush posted-write queues on shutdown
...
Pull vfs updates from Al Viro:
"Assorted cleanups and fixes.
Probably the most interesting part long-term is ->d_init() - that will
have a bunch of followups in (at least) ceph and lustre, but we'll
need to sort the barrier-related rules before it can get used for
really non-trivial stuff.
Another fun thing is the merge of ->d_iput() callers (dentry_iput()
and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
except the one in __d_lookup_lru())"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
fs/dcache.c: avoid soft-lockup in dput()
vfs: new d_init method
vfs: Update lookup_dcache() comment
bdev: get rid of ->bd_inodes
Remove last traces of ->sync_page
new helper: d_same_name()
dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
vfs: clean up documentation
vfs: document ->d_real()
vfs: merge .d_select_inode() into .d_real()
unify dentry_iput() and dentry_unlink_inode()
binfmt_misc: ->s_root is not going anywhere
drop redundant ->owner initializations
ufs: get rid of redundant checks
orangefs: constify inode_operations
missed comment updates from ->direct_IO() prototype change
file_inode(f)->i_mapping is f->f_mapping
trim fsnotify hooks a bit
9p: new helper - v9fs_parent_fid()
debugfs: ->d_parent is never NULL or negative
...