[ Upstream commit 28e18ee636 ]
The variable dn.node_changed is not set when a call to
f2fs_get_node_page() fails, so it is left uninitialized. That
uninitialized value is then passed to f2fs_balance_fs(), which may or
may not balance dirty node and dentry pages depending on whatever the
variable happens to hold. Fix this by only calling f2fs_balance_fs()
if err is not set.
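The shape of the fix can be sketched in user-space C. Everything below is
a hypothetical stand-in (dnode_sketch, lookup_stub, balance_stub and the
-5 error value are not the kernel's code); it only shows that the flag is
read on the success path alone.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the dnode structure; only the flag matters. */
struct dnode_sketch {
	bool node_changed;
};

static int balance_calls;	/* counts calls to the balance stub */

/* Stub for the balance call: it only records that it was invoked. */
static void balance_stub(bool need)
{
	(void)need;
	balance_calls++;
}

/* Stub lookup: on failure it returns early and never writes the flag. */
static int lookup_stub(struct dnode_sketch *dn, bool fail)
{
	if (fail)
		return -5;	/* stands in for -EIO */
	dn->node_changed = true;
	return 0;
}

/* Fixed caller: dn.node_changed is read only when the lookup succeeded. */
static int caller_sketch(bool fail)
{
	struct dnode_sketch dn;	/* deliberately left uninitialized */
	int err = lookup_stub(&dn, fail);

	if (!err)
		balance_stub(dn.node_changed);
	return err;
}
```

Calling caller_sketch(true) returns the error without touching the flag,
while caller_sketch(false) reaches the balance stub exactly once.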
Thanks to Jaegeuk Kim for suggesting an appropriate fix.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 2a34076070 ("f2fs: call f2fs_balance_fs only when node was changed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
PD#SWPL-1773
Problem:
After adding the vmap stack optimization, we can see the stack usage
of each function when handling a vmap fault. The test log shows that
some functions use a large stack frame of over 256 bytes, especially
on common call paths from fs. We need to optimize the stack usage of
these functions to reduce the probability of stack faults and to save
stack memory.
Solution:
1. Remove CONFIG_CC_STACKPROTECTOR_STRONG and set STACKPROTECTOR to
NONE. This saves the stack space the compiler adds to most functions.
Kernel code size also shrinks by over 1MB.
2. Add some noinline functions for the android_fs_data rw trace calls.
Each of these trace calls allocates a 256-byte local buffer.
3. Add a wrapper function for the mem abort handler. By default, it
defines a local siginfo struct (over 100 bytes in size) that is only
used when the fault can't be handled.
4. Reduce the cached page size for the vmap stack, since the probability
of a page fault caused by stack overflow is lower after the function
stack usage has been optimized.
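The idea behind items 2 and 3 can be sketched as follows: a large local
is moved out of the hot function into a noinline helper, so its stack
space is only reserved on the rare path. The names big_info,
report_unhandled and handle_fault are hypothetical, and the attribute
syntax assumes GCC or Clang.

```c
#include <string.h>

/* Hypothetical stand-in for siginfo: a struct of over 100 bytes. */
struct big_info {
	int signo;
	int code;
	char pad[120];
};

static int last_signo;	/* records what the slow path reported */

/*
 * Slow path: the large struct lives only in this frame. noinline keeps
 * the compiler from folding the frame back into the caller.
 */
static __attribute__((noinline)) void report_unhandled(int signo, int code)
{
	struct big_info info;

	memset(&info, 0, sizeof(info));
	info.signo = signo;
	info.code = code;
	last_signo = info.signo;
}

/* Fast path: no large locals; the helper is called only on failure. */
static int handle_fault(int handled)
{
	if (handled)
		return 0;
	report_unhandled(11, 1);	/* 11 stands in for SIGSEGV */
	return -1;
}
```

With this split, the common handled case never pays for the big struct
in its own stack frame.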
Monkey tests show that the real stack usage ratio, compared with the
first vmap implementation, dropped from 35% ~ 38% to 26% ~ 27%, which
is very close to the theoretical limit of 25%.
Verify:
P212
Change-Id: I5505cacc1cab51f88654052902852fd648b6a036
Signed-off-by: tao zeng <tao.zeng@amlogic.com>
This patch clears PageError on pages that were tagged by the read path:
when we write such pages with valid contents, writepage should clear the
bit, as ext4 does.
Change-Id: I434b22132f29f7243ab9170296a6e0b52e40701d
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit f453147e9315b3bc1050b590278a63d91fc2a681)
Cherry-pick from origin/upstream-f2fs-stable-linux-4.9.y:
f69e814ccf1e ("f2fs: refactor read path to allow multiple postprocessing steps")
Currently f2fs's ->readpage() and ->readpages() assume that either the
data undergoes no postprocessing, or decryption only. But with
fs-verity, there will be an additional authenticity verification step,
and it may be needed either by itself, or combined with decryption.
To support this, store a 'struct bio_post_read_ctx' in ->bi_private
which contains a work struct, a bitmask of postprocessing steps that are
enabled, and an indicator of the current step. The bio completion
routine, if there was no I/O error, enqueues the first postprocessing
step. When that completes, it continues to the next step. Pages that
fail any postprocessing step have PageError set. Once all steps have
completed, pages without PageError set are set Uptodate, and all pages
are unlocked.
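A minimal single-page sketch of the step sequencing described above,
written as plain user-space C. The step numbering, the struct layout and
the choice to stop at the first failing step are assumptions made for
illustration; they are not the layout of 'struct bio_post_read_ctx'.

```c
#include <stdbool.h>

/* Hypothetical step numbering; the text only says a bitmask exists. */
enum step_sketch { STEP_DECRYPT = 0, STEP_VERIFY = 1, STEP_MAX = 2 };

struct post_read_ctx_sketch {
	unsigned int enabled_steps;	/* bitmask of steps to run */
	unsigned int cur_step;		/* index of the current step */
	bool page_error;		/* stands in for PageError */
	bool uptodate;			/* stands in for Uptodate */
	unsigned int ran;		/* bitmask of steps that actually ran */
};

/* Runs the current step; fail_mask selects which steps should fail. */
static bool run_step(struct post_read_ctx_sketch *ctx, unsigned int fail_mask)
{
	ctx->ran |= 1u << ctx->cur_step;
	return !(fail_mask & (1u << ctx->cur_step));
}

/* Walks the enabled steps in order, then sets the final page state. */
static void post_read_sketch(struct post_read_ctx_sketch *ctx,
			     unsigned int fail_mask)
{
	for (ctx->cur_step = 0; ctx->cur_step < STEP_MAX; ctx->cur_step++) {
		if (!(ctx->enabled_steps & (1u << ctx->cur_step)))
			continue;	/* step not enabled for this read */
		if (!run_step(ctx, fail_mask)) {
			ctx->page_error = true;
			break;		/* simplification: stop at first failure */
		}
	}
	if (!ctx->page_error)
		ctx->uptodate = true;
}
```

A context with only the verify bit set skips decryption entirely, which
is the "by itself, or combined with decryption" case from the text.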
Also replace f2fs_encrypted_file() with a new function
f2fs_post_read_required() in places like direct I/O and garbage
collection that really should be testing whether the file needs special
I/O processing, not whether it is encrypted specifically.
This may also be useful for other future f2fs features such as
compression.
Change-Id: I9e1d7a21b8a7d89029c509df7edd895887993ab1
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cherry-pick from origin/upstream-f2fs-stable-linux-4.9.y:
975c5679a2d7 ("f2fs: don't put dentry page in pagecache into highmem")
Previously the dentry page could be allocated from highmem, which causes
a panic on platforms that use highmem (such as arm), because the address
of a highmem dentry page goes directly into the decryption path via
fscrypt_fname_disk_to_usr. sg_init_one assumes the address is not from
highmem, and the panic follows because kmap_high is never called but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids putting dentry pages in the pagecache into highmem.
Change-Id: Ia22ed1e5503e6c15d63e4ab3b02a747a47cbc9b1
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull f2fs updates from Jaegeuk Kim:
"In this round, we introduce sysfile-based quota support which is
required for Android by default. In addition, we allow users to
reserve some blocks at runtime to mitigate performance drops
when free space is low.
Enhancements:
- assign proper data segments according to write_hints given by user
- issue cache_flush on dirty devices only among multiple devices
- exploit cp_error flag and add more faults to enhance fault
injection test
- conduct more readaheads during f2fs_readdir
- add a range for discard commands
Bug fixes:
- fix zero stat->st_blocks when inline_data is set
- drop crypto key and free stale memory pointer while evict_inode is
failing
- fix some corner cases in free space and segment management
- fix wrong last_disk_size
This series includes lots of clean-ups and code enhancement in terms
of xattr operations, discard/flush command control. In addition, it
adds versatile debugfs entries to monitor f2fs status"
Cherry-picked from origin/upstream-f2fs-stable-linux-4.9.y:
5b2b7f7dd87f f2fs: deny accessing encryption policy if encryption is off
05dac2e89867 f2fs: inject fault in inc_valid_node_count
2e08de4fda00 f2fs: fix to clear FI_NO_PREALLOC
931ecc22b402 f2fs: expose quota information in debugfs
45d6e702d3a9 f2fs: separate nat entry mem alloc from nat_tree_lock
8e2f721703b4 f2fs: validate before set/clear free nat bitmap
27d50282d073 f2fs: avoid opened loop codes in __add_ino_entry
b1823df0e68f f2fs: apply write hints to select the type of segments for buffered write
b561061c067b f2fs: introduce scan_curseg_cache for cleanup
5772e0c102b0 f2fs: optimize the way of traversing free_nid_bitmap
a51e85eae2c3 f2fs: keep scanning until enough free nids are acquired
d75eb8d7345e f2fs: trace checkpoint reason in fsync()
bed6cffdf7e4 f2fs: keep isize once block is reserved cross EOF
5f3fdd2afc9b f2fs: avoid race in between GC and block exchange
51cb399e7ead f2fs: save a multiplication for last_nid calculation
7f41aab3d61d f2fs: fix summary info corruption
148c518517fc f2fs: remove dead code in update_meta_page
c3bc6e5183f0 f2fs: remove unneeded semicolon
9e71a0321f32 f2fs: don't bother with inode->i_version
49f72728e708 f2fs: check curseg space before foreground GC
25d0becffa0a f2fs: use rw_semaphore to protect SIT cache
0108c481d7af f2fs: support quota sys files
d4c292db7b81 f2fs: add quota_ino feature infra
1033eee92c41 f2fs: optimize __update_nat_bits
247e8951164a f2fs: modify for accurate fggc node io stat
c7272f8aebe7 Revert "f2fs: handle dirty segments inside refresh_sit_entry"
068868fc7e26 f2fs: add a function to move nid
b9f73875af11 f2fs: export SSR allocation threshold
ab30204bb9d8 f2fs: give correct trimmed blocks in fstrim
b5db2de4623f f2fs: support bio allocation error injection
58ddec85e417 f2fs: support get_page error injection
ef216e610a14 f2fs: add missing sysfs description
68ab6f8dd541 f2fs: support soft block reservation
d7947e2a3118 f2fs: handle error case when adding xattr entry
50ffaa980f98 f2fs: support flexible inline xattr size
5a8ed073c7fa f2fs: show current cp state
d888fcd74c18 f2fs: add missing quota_initialize
af1cc1ea2309 f2fs: show # of dirty segments via sysfs
6663422a3642 f2fs: stop all the operations by cp_error flag
872d8e3af080 f2fs: remove several redundant assignments
bf823c82e3fe f2fs: avoid using timespec
c70ab1b99321 f2fs: fix to correct no_fggc_candidate
0e6275dc317b Revert "f2fs: return wrong error number on f2fs_quota_write"
41d59230e302 f2fs: remove obsolete pointer for truncate_xattr_node
8c12a10f2ee4 f2fs: retry ENOMEM for quota_read|write
35e13ca2e9d9 f2fs: limit # of inmemory pages
9ca57a7e96e0 f2fs: update ctx->pos correctly when hitting hole in directory
a04208e54b9c f2fs: relocate readahead codes in readdir()
905d0370e6ab f2fs: allow readdir() to be interrupted
2dfbda03f941 f2fs: trace f2fs_readdir
d67586ddf3e9 f2fs: trace f2fs_lookup
4c94f14b3c8b f2fs: skip searching non-exist range in truncate_hole
ac5d4b425739 f2fs: expose some sectors to user in inline data or dentry case
5ded3b82dc2b f2fs: avoid stale fi->gdirty_list pointer
f6b708e25fb5 f2fs/crypto: drop crypto key at evict_inode only
33fdebbb0e7e f2fs: fix to avoid race when accessing last_disk_size
595046758d8e f2fs: Fix bool initialization/comparison
1e5305afa81e f2fs: give up CP_TRIMMED_FLAG if it drops discards
8258fd3054c1 f2fs: trace f2fs_remove_discard
6c46b37d9b43 f2fs: reduce cmd_lock coverage in __issue_discard_cmd
daf437d37cff f2fs: split discard policy
69a596797adf f2fs: wrap discard policy
28e1023e8e8a f2fs: support issuing/waiting discard in range
fd6422ea9264 f2fs: fix to flush multiple device in checkpoint
f014be822ce7 f2fs: enhance multiple device flush
0597a6e4bdcd f2fs: fix to show ino management cache size correctly
cacc1ed0c46a f2fs: drop FI_UPDATE_WRITE tag after f2fs_issue_flush
84af6aeceb49 f2fs: obsolete ALLOC_NID_LIST list
8456d343780d f2fs: convert inline data for direct I/O & FI_NO_PREALLOC
3f01af786c84 f2fs: allow readpages with NULL file pointer
2f0df25e6529 f2fs: show flush list status in sysfs
20ef20fbf78e f2fs: introduce read_xattr_block
126221de375b f2fs: introduce read_inline_xattr
127faa71f6a6 Revert "f2fs: reuse nids more aggressively"
c19928e660fb Revert "f2fs: node segment is prior to data segment selected victim"
Change-Id: I2f892e6ee75c41e84241f37b1903e0c32387d95b
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Cherry-picked from upstream-f2fs-stable-linux-4.9.y
Changes include:
commit 30da3a4de96733 ("f2fs: hurry up to issue discard after io interruption")
commit d1c363b48398d4 ("f2fs: fix to show correct discard_granularity in sysfs")
...
commit e6b120d4d01ab0 ("f2fs/fscrypt: catch up to v4.12")
commit 4d7931d72758db ("KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()")
Signed-off-by: Hyojun Kim <hyojun@google.com>
Changes in 4.9.30
usb: misc: legousbtower: Fix buffers on stack
usb: misc: legousbtower: Fix memory leak
USB: ene_usb6250: fix DMA to the stack
watchdog: pcwd_usb: fix NULL-deref at probe
char: lp: fix possible integer overflow in lp_setup()
USB: core: replace %p with %pK
tpm_tis_core: Choose appropriate timeout for reading burstcount
ALSA: hda: Fix cpu lockup when stopping the cmd dmas
ARM: tegra: paz00: Mark panel regulator as enabled on boot
fanotify: don't expose EOPENSTALE to userspace
tpm_tis_spi: Use single function to transfer data
tpm_tis_spi: Abort transfer when too many wait states are signaled
tpm_tis_spi: Check correct byte for wait state indicator
tpm_tis_spi: Remove limitation of transfers to MAX_SPI_FRAMESIZE bytes
tpm_tis_spi: Add small delay after last transfer
tpm: msleep() delays - replace with usleep_range() in i2c nuvoton driver
tpm: add sleep only for retry in i2c_nuvoton_write_status()
tpm_crb: check for bad response size
ASoC: cs4271: configure reset GPIO as output
mlx5: Fix mlx5_ib_map_mr_sg mr length
infiniband: call ipv6 route lookup via the stub interface
dm btree: fix for dm_btree_find_lowest_key()
dm raid: select the Kconfig option CONFIG_MD_RAID0
dm bufio: avoid a possible ABBA deadlock
dm bufio: check new buffer allocation watermark every 30 seconds
dm mpath: split and rename activate_path() to prepare for its expanded use
dm cache metadata: fail operations if fail_io mode has been established
dm bufio: make the parameter "retain_bytes" unsigned long
dm thin metadata: call precommit before saving the roots
dm space map disk: fix some book keeping in the disk space map
md: update slab_cache before releasing new stripes when stripes resizing
md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop
rtlwifi: rtl8821ae: setup 8812ae RFE according to device type
mwifiex: MAC randomization should not be persistent
mwifiex: pcie: fix cmd_buf use-after-free in remove/reset
ima: accept previously set IMA_NEW_FILE
KVM: x86: Fix load damaged SSEx MXCSR register
KVM: x86: Fix potential preemption when get the current kvmclock timestamp
KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation
x86: fix 32-bit case of __get_user_asm_u64()
regulator: rk808: Fix RK818 LDO2
regulator: tps65023: Fix inverted core enable logic.
s390/kdump: Add final note
s390/cputime: fix incorrect system time
ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device
ath9k_htc: fix NULL-deref at probe
drm/amdgpu: Make display watermark calculations more accurate
drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations.
drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path.
drm/nouveau/therm: remove ineffective workarounds for alarm bugs
drm/nouveau/tmr: ack interrupt before processing alarms
drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
drm/nouveau/tmr: avoid processing completed alarms when adding a new one
drm/nouveau/tmr: handle races with hw when updating the next alarm time
gpio: omap: return error if requested debounce time is not possible
cdc-acm: fix possible invalid access when processing notification
ohci-pci: add qemu quirk
cxl: Force context lock during EEH flow
cxl: Route eeh events to all drivers in cxl_pci_error_detected()
proc: Fix unbalanced hard link numbers
of: fix sparse warning in of_pci_range_parser_one
of: fix "/cpus" reference leak in of_numa_parse_cpu_nodes()
of: fdt: add missing allocation-failure check
ibmvscsis: Do not send aborted task response
iio: dac: ad7303: fix channel description
IIO: bmp280-core.c: fix error in humidity calculation
IB/hfi1: Return an error on memory allocation failure
IB/hfi1: Fix a subcontext memory leak
pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()
USB: serial: ftdi_sio: fix setting latency for unprivileged users
USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs
USB: chaoskey: fix Alea quirk on big-endian hosts
f2fs: check entire encrypted bigname when finding a dentry
fscrypt: avoid collisions when presenting long encrypted filenames
libnvdimm: fix clear length of nvdimm_forget_poison()
xhci: remove GFP_DMA flag from allocation
usb: host: xhci-plat: propagate return value of platform_get_irq()
xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
net: irda: irda-usb: fix firmware name on big-endian hosts
usbvision: fix NULL-deref at probe
mceusb: fix NULL-deref at probe
ttusb2: limit messages to buffer size
dvb-usb-dibusb-mc-common: Add MODULE_LICENSE
usb: dwc3: gadget: Prevent losing events in event cache
usb: musb: tusb6010_omap: Do not reset the other direction's packet size
usb: musb: Fix trying to suspend while active for OTG configurations
USB: iowarrior: fix info ioctl on big-endian hosts
usb: serial: option: add Telit ME910 support
USB: serial: qcserial: add more Lenovo EM74xx device IDs
USB: serial: mct_u232: fix big-endian baud-rate handling
USB: serial: io_ti: fix div-by-zero in set_termios
USB: hub: fix SS hub-descriptor handling
USB: hub: fix non-SS hub-descriptor handling
ipx: call ipxitf_put() in ioctl error path
iio: proximity: as3935: fix as3935_write
iio: hid-sensor: Store restore poll and hysteresis on S3
s5p-mfc: Fix race between interrupt routine and device functions
gspca: konica: add missing endpoint sanity check
s5p-mfc: Fix unbalanced call to clock management
dib0700: fix NULL-deref at probe
zr364xx: enforce minimum size when reading header
dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops
digitv: limit messages to buffer size
dw2102: limit messages to buffer size
cx231xx-audio: fix init error path
cx231xx-audio: fix NULL-deref at probe
cx231xx-cards: fix NULL-deref at probe
powerpc/mm: Ensure IRQs are off in switch_mm()
powerpc/eeh: Avoid use after free in eeh_handle_special_event()
powerpc/book3s/mce: Move add_taint() later in virtual mode
powerpc/pseries: Fix of_node_put() underflow during DLPAR remove
powerpc/iommu: Do not call PageTransHuge() on tail pages
powerpc/64e: Fix hang when debugging programs with relocated kernel
powerpc/tm: Fix FP and VMX register corruption
arm64: KVM: Do not use stack-protector to compile EL2 code
arm: KVM: Do not use stack-protector to compile HYP code
KVM: arm: plug potential guest hardware debug leakage
ARM: 8662/1: module: split core and init PLT sections
ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call
ARM: dts: at91: sama5d3_xplained: fix ADC vref
ARM: dts: at91: sama5d3_xplained: not all ADC channels are available
ARM: dts: imx6sx-sdb: Remove OPP override
arm64: dts: hi6220: Reset the mmc hosts
arm64: xchg: hazard against entire exchange variable
arm64: ensure extension of smp_store_release value
arm64: armv8_deprecated: ensure extension of addr
arm64: uaccess: ensure extension of access_ok() addr
arm64: documentation: document tagged pointer stack constraints
staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
staging: rtl8192e: fix 2 byte alignment of register BSSIDR.
staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.
staging: rtl8192e: GetTs Fix invalid TID 7 warning.
iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings
metag/uaccess: Fix access_ok()
metag/uaccess: Check access_ok in strncpy_from_user
stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
uwb: fix device quirk on big-endian hosts
genirq: Fix chained interrupt data ordering
nvme: unmap CMB and remove sysfs file in reset path
MIPS: Loongson-3: Select MIPS_L1_CACHE_SHIFT_6
osf_wait4(): fix infoleak
um: Fix to call read_initrd after init_bootmem
tracing/kprobes: Enforce kprobes teardown after testing
PCI: hv: Allocate interrupt descriptors with GFP_ATOMIC
PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs
PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms
PCI: Fix another sanity check bug in /proc/pci mmap
PCI: Only allow WC mmap on prefetchable resources
PCI: Freeze PME scan before suspending devices
mtd: nand: orion: fix clk handling
mtd: nand: omap2: Fix partition creation via cmdline mtdparts
mtd: nand: add ooblayout for old hamming layout
drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2
NFSv4: Fix a hang in OPEN related to server reboot
NFS: Fix use after free in write error path
NFS: Use GFP_NOIO for two allocations in writeback
nfsd: fix undefined behavior in nfsd4_layout_verify
nfsd: encoders mustn't use unitialized values in error cases
drivers: char: mem: Check for address space wraparound with mmap()
drm/i915/gvt: Disable access to stolen memory as a guest
Linux 4.9.30
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 6332cd32c8 upstream.
If the user has no key for an encrypted dir, fscrypt presents digested
dentries. Previously, when looking up a dentry, f2fs only checked its hash
value against the first 4 bytes of the digested dentry, which did not fully
handle hash collisions. This patch also checks the entire dentry bytes, as
ext4 does.
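The difference between the two checks can be sketched in user-space C.
The struct and both helpers are hypothetical simplifications, not the
f2fs lookup code: one compares the hash alone, the other also compares
every byte of the stored name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stored directory entry. */
struct digest_dentry_sketch {
	unsigned int hash;		/* hash of the name */
	const unsigned char *name;	/* stored name bytes */
	size_t len;			/* length of the stored name */
};

/* Old behaviour in this sketch: a matching hash is taken as a match. */
static bool match_hash_only(const struct digest_dentry_sketch *de,
			    unsigned int hash)
{
	return de->hash == hash;
}

/* New behaviour in this sketch: the whole name must match as well. */
static bool match_full(const struct digest_dentry_sketch *de,
		       unsigned int hash,
		       const unsigned char *name, size_t len)
{
	return de->hash == hash && de->len == len &&
	       memcmp(de->name, name, len) == 0;
}
```

Two names that collide on the hash are indistinguishable to
match_hash_only() but are told apart by match_full().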
Eric reported how to reproduce this issue by:
# seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
# find edir -type f | xargs stat -c %i | sort | uniq | wc -l
100000
# sync
# echo 3 > /proc/sys/vm/drop_caches
# keyctl new_session
# find edir -type f | xargs stat -c %i | sort | uniq | wc -l
99999
Cc: <stable@vger.kernel.org>
Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(fixed f2fs_dentry_hash() to work even when the hash is 0)
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactor the fs readpage/write tracepoints to move the
inode->path lookup outside the tracepoint code, and pass a pointer
to the path into the tracepoint code instead. This is necessary
because the tracepoint code runs non-preemptible. Thanks to
Trilok Soni for catching this in 4.4.
Signed-off-by: Mohan Srinivasan <srmohan@google.com>
Adds tracepoints in ext4/f2fs/mpage to track readpages/buffered
write()s. This allows us to track files that are being read/written
to PIDs. (Merged from android4.4-common).
Signed-off-by: Mohan Srinivasan <srmohan@google.com>
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
Previously, we only supported a global fault injection configuration, so
configuring the type/rate of fault injection through sysfs or a mount
option affected every f2fs partition in use.
This does not make sense: it is inconvenient for a developer who wants
to test separate partitions with different fault injection rates/types
simultaneously, and it is not possible to enable fault injection on one
partition while disabling it on another.
From now on, we move the global fault injection configuration from the
module into the per-superblock info, so injection testing can be more
flexible.
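A minimal sketch of per-superblock fault state in user-space C. The
struct and field names, the bit numbering of fault types and the
"every Nth call" policy are assumptions for illustration, not the f2fs
definitions.

```c
#include <stdbool.h>

/* Hypothetical fault types, used as bit positions in inject_type. */
enum { FAULT_KMALLOC_SKETCH = 0, FAULT_PAGE_ALLOC_SKETCH = 1 };

struct fault_info_sketch {
	unsigned int inject_ops;	/* calls seen since the last injection */
	unsigned int inject_rate;	/* inject on every Nth call; 0 disables */
	unsigned int inject_type;	/* bitmask of enabled fault types */
};

/* One copy of the fault state per mounted filesystem, not per module. */
struct sb_info_sketch {
	struct fault_info_sketch fault;
};

/* Returns true when a fault of the given type should be injected now. */
static bool time_to_inject_sketch(struct sb_info_sketch *sbi, int type)
{
	struct fault_info_sketch *f = &sbi->fault;

	if (!f->inject_rate)
		return false;
	if (!(f->inject_type & (1u << type)))
		return false;
	if (++f->inject_ops >= f->inject_rate) {
		f->inject_ops = 0;
		return true;
	}
	return false;
}
```

Because each sb_info_sketch carries its own counters and masks, one
instance can inject faults while another stays quiet.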
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.
CURRENT_TIME is also not y2038 safe.
This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.
Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.
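What "the right granularity" means can be sketched in user-space C: a
timestamp is rounded down to a multiple of the filesystem's granularity
in nanoseconds. The type and function names are hypothetical and the
sketch is not the kernel's current_time() implementation.

```c
/* Hypothetical timestamp type: seconds plus nanoseconds. */
struct ts_sketch {
	long long sec;
	long nsec;
};

/*
 * Rounds the nanosecond part down to a multiple of gran (in ns).
 * gran == 1 keeps full resolution; gran == 1000000000 keeps whole
 * seconds only. Other out-of-range values leave the input unchanged.
 */
static struct ts_sketch trunc_sketch(struct ts_sketch t, long gran)
{
	if (gran == 1000000000L)
		t.nsec = 0;
	else if (gran > 1 && gran < 1000000000L)
		t.nsec -= t.nsec % gran;
	return t;
}
```

A filesystem that stores microseconds would pass 1000 here, so two
inodes on it always see timestamps with the same resolution.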
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch sets encryption name flag in the add inline entry path
if filename is encrypted.
Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When creating a new inode, security_inode_init_security is called to
initialize the security info related to the inode. The filename is passed
to the security module, which helps a security module such as SELinux
decide which rule or label to apply to the inode with the specified name.
Previously, if the new inode was created as an encrypted one, f2fs passed
the encrypted filename to the security module, which could fail the check
of the security policy belonging to the inode. To fix this issue, pass
the original unencrypted filename instead.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull f2fs updates from Jaegeuk Kim:
"The major change in this version is mitigating cpu overheads on write
paths by replacing redundant inode page updates with mark_inode_dirty
calls. And we tried to reduce lock contentions as well to improve
filesystem scalability. Other feature is setting F2FS automatically
when detecting host-managed SMR.
Enhancements:
- ioctl to move a range of data between files
- inject orphan inode errors
- avoid flush commands congestion
- support lazytime
Bug fixes:
- return proper results for some dentry operations
- fix deadlock in add_link failure
- disable extent_cache for fcollapse/finsert"
* tag 'for-f2fs-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits)
f2fs: clean up coding style and redundancy
f2fs: get victim segment again after new cp
f2fs: handle error case with f2fs_bug_on
f2fs: avoid data race when deciding checkpoin in f2fs_sync_file
f2fs: support an ioctl to move a range of data blocks
f2fs: fix to report error number of f2fs_find_entry
f2fs: avoid memory allocation failure due to a long length
f2fs: reset default idle interval value
f2fs: use blk_plug in all the possible paths
f2fs: fix to avoid data update racing between GC and DIO
f2fs: add maximum prefree segments
f2fs: disable extent_cache for fcollapse/finsert inodes
f2fs: refactor __exchange_data_block for speed up
f2fs: fix ERR_PTR returned by bio
f2fs: avoid mark_inode_dirty
f2fs: move i_size_write in f2fs_write_end
f2fs: fix to avoid redundant discard during fstrim
f2fs: avoid mismatching block range for discard
f2fs: fix incorrect f_bfree calculation in ->statfs
f2fs: use percpu_rw_semaphore
...
If the dotdot directory is corrupted, its slot may be occupied by another
file. In this case, dentry[1] is not the parent directory, and rename and
cross-rename will update the inode in dentry[1] incorrectly. This
patch finds the dotdot dentry by name.
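Looking the entry up by name rather than by slot can be sketched in
user-space C. The dirent_sketch type and find_dotdot helper are
hypothetical, not the f2fs dentry block layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry. */
struct dirent_sketch {
	const char *name;
	unsigned int ino;
};

/* Returns the entry named "..", or NULL if the directory has none. */
static const struct dirent_sketch *
find_dotdot(const struct dirent_sketch *ents, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(ents[i].name, "..") == 0)
			return &ents[i];
	return NULL;
}
```

If slot 1 holds an ordinary file, a slot-based lookup would return that
file's inode, while the name-based search still finds the parent or
reports that it is missing.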
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
[Jaegeuk Kim: remove wrong bug_on]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Separate the op from the rq_flag_bits and have f2fs
set/get the bio using bio_set_op_attrs/bio_op.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away.
Otherwise, for example, we can get a duplicate directory entry through
->chash and ->clevel.
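Why a failed lookup must not be treated as "not found" can be sketched
in user-space C. The array-based directory, the fail_at parameter and
the numeric error values are all hypothetical; the sketch does not model
->chash or ->clevel.

```c
#include <stddef.h>

/*
 * Looks for key; returns its index, or -1 if it was not found.
 * fail_at simulates an allocation or I/O error at that index; in that
 * case *err is set and the search result says nothing about the key.
 */
static int find_entry_sketch(const int *keys, size_t n, int key,
			     int fail_at, int *err)
{
	*err = 0;
	for (size_t i = 0; i < n; i++) {
		if ((int)i == fail_at) {
			*err = -12;	/* stands in for -ENOMEM */
			return -1;
		}
		if (keys[i] == key)
			return (int)i;
	}
	return -1;
}

/* Adds key only if the lookup finished cleanly and found nothing. */
static int add_unique_sketch(int *keys, size_t *n, size_t cap, int key,
			     int fail_at)
{
	int err;

	if (find_entry_sketch(keys, *n, key, fail_at, &err) >= 0)
		return -17;	/* stands in for -EEXIST */
	if (err)
		return err;	/* stop right away, do not add */
	if (*n >= cap)
		return -28;	/* stands in for -ENOSPC */
	keys[(*n)++] = key;
	return 0;
}
```

Without the early return on err, the interrupted search would look like
a miss and the key could be added a second time.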
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch reduces calls to the following functions across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()
Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that this is doable, since we call mark_inode_dirty_sync() for every
inode field change that also needs to update the on-disk inode.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.
-> largest
-> ctime/mtime/atime
-> i_current_depth
-> i_xattr_nid
-> i_pino
-> i_advise
-> i_flags
-> i_mode
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Fix two bugs in error path of f2fs_move_rehashed_dirents:
- release dir's inode page if kmalloc fails
- recover i_current_depth if the conversion fails
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
With the steps below, the dentry pages become inaccessible later.
This is because we forgot to update i_current_depth in the inode during
inline dentry conversion; if we then fail somewhere afterwards,
i_current_depth is left as 0 in a non-inline directory. Then, during
->lookup, that current_depth value makes all dentry pages in the first
level invisible. Fix it.
1) mount f2fs with inline_dentry option
2) mkdir dir
3) touch 180 files named [0-179] in dir
4) touch 180 in dir (fail after inline dir conversion)
5) ll dir
ls: cannot access /mnt/f2fs/dir/0: No such file or directory
ls: cannot access /mnt/f2fs/dir/1: No such file or directory
ls: cannot access /mnt/f2fs/dir/2: No such file or directory
ls: cannot access /mnt/f2fs/dir/3: No such file or directory
ls: cannot access /mnt/f2fs/dir/4: No such file or directory
drwxr-xr-x 2 root root 4096 may 13 21:47 ./
drwxr-xr-x 3 root root 4096 may 13 21:46 ../
-????????? ? ? ? ? ? 0
-????????? ? ? ? ? ? 1
-????????? ? ? ? ? ? 10
-????????? ? ? ? ? ? 100
-????????? ? ? ? ? ? 101
-????????? ? ? ? ? ? 102
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The filename length in a dirent may become zero after random junk data
injection. Once such a dirent is encountered, find_target_dentry or
f2fs_add_inline_entries will run into an infinite loop. So make f2fs
aware of this case to avoid the dead loop.
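How a zero-length name stalls such a walk can be sketched in user-space
C. The slot size, the array layout and the choice to skip one slot on a
zero length are assumptions of this sketch, not a copy of the f2fs code.

```c
#include <stddef.h>

#define SLOT_LEN_SKETCH 8	/* name bytes per slot; assumed for the sketch */

/* Number of slots a name of the given length occupies; 0 for length 0. */
static size_t slots_for(size_t name_len)
{
	return (name_len + SLOT_LEN_SKETCH - 1) / SLOT_LEN_SKETCH;
}

/*
 * Walks a dentry block where name_len[pos] is the name length of the
 * entry starting at slot pos. Returns the number of valid entries.
 */
static int walk_sketch(const unsigned short *name_len, size_t nslots)
{
	size_t pos = 0;
	int count = 0;

	while (pos < nslots) {
		if (name_len[pos] == 0) {
			/* slots_for(0) is 0, so pos would never advance */
			pos++;
			continue;
		}
		pos += slots_for(name_len[pos]);
		count++;
	}
	return count;
}
```

The explicit check is what guarantees forward progress; without it the
loop adds zero to pos forever.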
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
With the steps below, we will lose some of the dirents:
1) mount f2fs with inline_dentry option
2) echo 1 > /sys/fs/f2fs/sdX/dir_level
3) mkdir dir
4) touch 180 files named [1-180] in dir
5) touch 181 in dir
6) echo 3 > /proc/sys/vm/drop_caches
7) ll dir
ls: cannot access 2: No such file or directory
ls: cannot access 4: No such file or directory
ls: cannot access 5: No such file or directory
ls: cannot access 6: No such file or directory
ls: cannot access 8: No such file or directory
ls: cannot access 9: No such file or directory
...
total 360
drwxr-xr-x 2 root root 4096 Feb 19 15:12 ./
drwxr-xr-x 3 root root 4096 Feb 19 15:11 ../
-rw-r--r-- 1 root root 0 Feb 19 15:12 1
-rw-r--r-- 1 root root 0 Feb 19 15:12 10
-rw-r--r-- 1 root root 0 Feb 19 15:12 100
-????????? ? ? ? ? ? 101
-????????? ? ? ? ? ? 102
-????????? ? ? ? ? ? 103
...
The reason is: when doing the inline dir conversion, we didn't consider
that a directory has a hierarchical hash structure which can be configured
through the sysfs interface 'dir_level'.
By default, dir_level of a directory inode is 0, which means we have one
bucket in the hash table located in the first level, and all dirents are
hashed into this bucket, so simply duplicating dirents between the inline
dentry page and the converted normal dentry page causes no problem.
However, if we configure dir_level with a value N (greater than 0), it
expands the bucket number of the first level hash table by 2^N - 1 and
hashes dirents into different buckets according to their hash value. If we
still move all dirents to the first bucket, the inline dirents end up in
the wrong location. As a result, although we can iterate over all dirents
through ->readdir, we can't stat some of them in ->lookup, which is based
on hash table searching.
This patch fixes this issue by rehashing dirents into the correct position
when converting an inline directory.
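The effect of dir_level on bucket selection can be sketched in
user-space C. The helpers are simplified assumptions: they ignore any
upper limit on the bucket count and use a plain modulo to pick a bucket.

```c
/* Number of buckets at a hash level for a given dir_level (simplified). */
static unsigned int buckets_sketch(unsigned int level, unsigned int dir_level)
{
	return 1u << (level + dir_level);
}

/* Bucket that ->lookup would search for a dirent with this hash. */
static unsigned int bucket_of(unsigned int hash, unsigned int level,
			      unsigned int dir_level)
{
	return hash % buckets_sketch(level, dir_level);
}
```

With dir_level 0 every hash lands in bucket 0 of the first level, so a
plain copy works. With dir_level 1 there are two first-level buckets,
and a dirent whose hash is odd is searched for in bucket 1; if the
conversion left it in bucket 0, ->lookup misses it.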
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
ago with the promise that one day it would be possible to implement the
page cache with bigger chunks than PAGE_SIZE.
This promise never materialized, and it is unlikely it ever will.
We have many places where PAGE_CACHE_SIZE is assumed to be equal to
PAGE_SIZE. And it's a constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
the script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is a revert of the changes to the
PAGE_CACHE_ALIGN definition: we are going to drop it later.
There are a few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation will
also be addressed in a separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the renamed functions moved from the f2fs crypto files.
1. definitions for per-file encryption used by ext4 and f2fs.
2. crypto.c for encrypt/decrypt functions
 a. IO preparation:
  - fscrypt_get_ctx / fscrypt_release_ctx
 b. before IOs:
  - fscrypt_encrypt_page
  - fscrypt_decrypt_page
  - fscrypt_zeroout_range
 c. after IOs:
  - fscrypt_decrypt_bio_pages
  - fscrypt_pullback_bio_page
  - fscrypt_restore_control_page
3. policy.c supporting context management.
 a. For ioctls:
  - fscrypt_process_policy
  - fscrypt_get_policy
 b. For context permission
  - fscrypt_has_permitted_context
  - fscrypt_inherit_context
4. keyinfo.c to handle permissions
  - fscrypt_get_encryption_info
  - fscrypt_free_encryption_info
5. fname.c to support filename encryption
 a. general wrapper functions
  - fscrypt_fname_disk_to_usr
  - fscrypt_fname_usr_to_disk
  - fscrypt_setup_filename
  - fscrypt_free_filename
 b. specific filename handling functions
  - fscrypt_fname_alloc_buffer
  - fscrypt_fname_free_buffer
6. Makefile and Kconfig
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Ildar Muslukhov <ildarm@google.com>
Signed-off-by: Uday Savagaonkar <savagaon@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add a new helper, f2fs_update_data_blkaddr, to clean up redundant code.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs_convert_inline_page duplicates what read_inline_data
already does for copying out the inline data from inode_page.
We can use read_inline_data instead to simplify the code.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
There is no need to wait for writeback of an inline file page, since no
one uses it, so this patch deletes the unnecessary wait.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In write_begin, if storage supports stable_page, we don't need to wait for
writeback to update its contents.
This patch uses wait_for_stable_page instead of
wait_on_page_writeback.
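The difference between the two waits can be modelled with a toy userspace
sketch. Everything here is hypothetical (struct sketch_page and both
functions are stand-ins, not the kernel API); it assumes that the stable-page
wait blocks only when the backing device needs page contents to stay
unchanged during writeback.

```c
#include <stdbool.h>

/* Toy model of a page; the fields are hypothetical, not struct page. */
struct sketch_page {
	bool under_writeback;   /* page is currently being written back */
	bool needs_stable;      /* backing device requires stable pages */
	int waits;              /* number of times a caller had to block */
};

/* Always blocks while the page is under writeback. */
static void sketch_wait_on_writeback(struct sketch_page *p)
{
	if (p->under_writeback) {
		p->waits++;
		p->under_writeback = false;
	}
}

/* Blocks only when the device needs page contents to stay stable. */
static void sketch_wait_for_stable(struct sketch_page *p)
{
	if (p->needs_stable)
		sketch_wait_on_writeback(p);
}
```

In this model, a page under writeback on a device that does not need stable
pages can be updated without blocking, which is the saving the patch is after.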
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The scenario is:
1. create fully node blocks
2. flush node blocks
3. write inline_data for all the node blocks again
4. flush node blocks redundantly
So, this patch tries to flush inline_data when flushing node blocks.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If the inline_data option is disabled, then when truncating an inline
inode to a size that does not exceed the maximum inline size, we should
not convert the inline inode to a regular one, to avoid the overhead of
synchronous conversion.
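The rule above can be sketched as a size-only predicate. This is an
assumption-laden illustration: sketch_need_convert is a hypothetical name, and
the sketch assumes conversion is still wanted once the new size exceeds the
inline limit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: should truncating an inline inode to new_size
 * convert it to a regular inode? The decision depends only on the size,
 * not on whether the inline_data option is enabled. */
static bool sketch_need_convert(size_t new_size, size_t max_inline_size)
{
	return new_size > max_inline_size;
}
```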
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If the user tries to update or read data, we don't need to call
f2fs_balance_fs, which triggers f2fs_gc and adds unnecessarily long latency.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We can check inode's inline_data flag when calling to convert it.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This fixes error handling in the function recover_inline_data. The calls
to truncate_inline_inode and truncate_blocks are now checked for failure,
which these functions signal either with an error code or with the boolean
value false. If either call fails, recover_inline_data immediately returns
false to its caller, as we cannot continue after such a failure.
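The checking pattern can be sketched in userspace C as below. All names are
hypothetical stand-ins: which callee returns a boolean and which returns an
error code is an assumption made for the example, not something stated in
this log.

```c
#include <stdbool.h>

/* Hypothetical callee types: one reports failure as false, the other
 * as a negative error code. */
typedef bool (*sketch_bool_step)(void);
typedef int (*sketch_int_step)(void);

/* Stub callees used only to exercise the sketch. */
static bool sketch_bool_ok(void)   { return true; }
static bool sketch_bool_fail(void) { return false; }
static int sketch_int_ok(void)     { return 0; }
static int sketch_int_fail(void)   { return -5; }

/* Return false as soon as either step reports failure. */
static bool sketch_recover(sketch_bool_step truncate_inline,
			   sketch_int_step truncate_rest)
{
	if (!truncate_inline())
		return false;
	if (truncate_rest() < 0)
		return false;
	return true;
}
```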
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>