Commit Graph

570174 Commits

Author SHA1 Message Date
Jaegeuk Kim
48743604b6 f2fs: return fs_trim if there is no candidate
commit 25290fa559 upstream.

If there is no candidate to submit discard command during f2sf_trim_fs, let's
return without checkpoint.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
bab2a03c46 f2fs: avoid needless checkpoint in f2fs_trim_fs
commit 0333ad4e4f upstream.

The f2fs_trim_fs() doesn't need to do checkpoint if there are newly allocated
data blocks only which didn't change the critical checkpoint data such as nat
and sit entries.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
551836d68b f2fs: relax async discard commands more
commit 4e6a8d9b22 upstream.

This patch relaxes async discard commands to avoid waiting its end_io during
checkpoint.
Instead of waiting them during checkpoint, it will be done when actually reusing
them.

Test on initial partition of nvme drive.

 # time fstrim /mnt/test

Before : 6.158s
After : 4.822s

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
3e60c843ef f2fs: drop exist_data for inline_data when truncated to 0
commit bb95d9ab2a upstream.

A test program gets the SEEK_DATA with two values between
a new created file and the exist file on f2fs filesystem.

F2FS filesystem,  (the first "test1" is a new file)
SEEK_DATA size != 0 (offset = 8192)
SEEK_DATA size != 0 (offset = 4096)

PNFS filesystem, (the first "test1" is a new file)
SEEK_DATA size != 0 (offset = 4096)
SEEK_DATA size != 0 (offset = 4096)

int main(int argc, char **argv)
{
        char *filename = argv[1];
        int offset = 1, i = 0, fd = -1;

        if (argc < 2) {
                printf("Usage: %s f2fsfilename\n", argv[0]);
                return -1;
        }

        /*
        if (!access(filename, F_OK) || errno != ENOENT) {
                printf("Needs a new file for test, %m\n");
                return -1;
        }*/

        fd = open(filename, O_RDWR | O_CREAT, 0777);
        if (fd < 0) {
                printf("Create test file %s failed, %m\n", filename);
                return -1;
        }

        for (i = 0; i < 20; i++) {
                offset = 1 << i;
                ftruncate(fd, 0);
                lseek(fd, offset, SEEK_SET);
                write(fd, "test", 5);
                /* Get the alloc size by seek data equal zero*/
                if (lseek(fd, 0, SEEK_DATA)) {
                        printf("SEEK_DATA size != 0 (offset = %d)\n", offset);
                        break;
                }
        }

        close(fd);
        return 0;
}

Reported-and-Tested-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
e447d60a08 f2fs: don't allow encrypted operations without keys
commit 363fa4e078 upstream.

This patch fixes the renaming bug on encrypted filenames, which was pointed by

 (ext4: don't allow encrypted operations without keys)

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
2c4e5c2eb0 f2fs: show the max number of atomic operations
commit 26a28a0c1e upstream.

This patch adds to show the max number of atomic operations which are
conducting concurrently.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
d595df2468 f2fs: get io size bit from mount option
commit ec91538dcc upstream.

This patch adds to set io_size_bits from mount option.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
caa5a4a163 f2fs: support IO alignment for DATA and NODE writes
commit 0a595ebaaa upstream.

This patch implements IO alignment by filling dummy blocks in DATA and NODE
write bios. If we can guarantee, for example, 32KB or 64KB for such the IOs,
we can eliminate underlying dummy page problem which FTL conducts in order to
close MLC or TLC partial written pages.

Note that,
 - it requires "-o mode=lfs".
 - IO size should be power of 2, not exceed BIO_MAX_PAGES, 256.
 - read IO is still 4KB.
 - do checkpoint at fsync, if dummy NODE page was written.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
1c2238bfa9 f2fs: add submit_bio tracepoint
commit 554b5125f5 upstream.

This patch adds final submit_bio() tracepoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
919fe37467 f2fs: reassign new segment for mode=lfs
commit 9d52a504db upstream.

Otherwise we can remain wrong curseg->next_blkoff, resulting in fsck failure.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Yunlei He
b0bc7fc867 f2fs: fix a missing discard prefree segments
commit 650d3c4e56 upstream.

If userspace issue a fstrim with a range not involve prefree segments,
it will reuse these segments without discard. This patch fix it.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Geliang Tang
7c033b7c61 f2fs: use rb_entry_safe
commit ed0b56209f upstream.

Use rb_entry_safe() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Yunlei He
71744660e1 f2fs: add a case of no need to read a page in write begin
commit 746e240392 upstream.

If the range we write cover the whole valid data in the last page,
we do not need to read it.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
[Jaegeuk Kim: nullify the remaining area (fix: xfstests/f2fs/001)]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Yunlei He
b1e97c8a7e f2fs: fix a problem of using memory after free
commit 7855eba4d6 upstream.

This patch fix a problem of using memory after free
in function __try_merge_extent_node.

Fixes: 0f825ee6e8 ("f2fs: add new interfaces for extent tree")
Cc: <stable@vger.kernel.org>
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Dan Carpenter
9e36c28b49 f2fs: remove unneeded condition
commit 07fe8d4440 upstream.

We checked that "inode" is not an error pointer earlier so there is
no need to check again here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Chao Yu
c85ef3084f f2fs: don't cache nat entry if out of memory
commit 5c9e418436 upstream.

If we run out of memory, in cache_nat_entry, it's better to avoid loop
for allocating memory to cache nat entry, so in low memory scenario, for
read path of node block, I expect this can avoid unneeded latency.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Yunlei He
cb1f89c83e f2fs: remove unused values in recover_fsync_data
commit fed2466848 upstream.

This patch remove unused values in function recover_fsync_data

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
64d8b297c4 f2fs: support async discard based on v4.9
commit 275b66b09e upstream.

This patch is based on commit 275b66b09e (f2fs: support async discard).

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
cfbece0226 f2fs: resolve op and op_flags confilcts
commit 70fd76140a upstream.

This patch backported ("block,fs: use REQ_* flags directly")

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Jaegeuk Kim
80bd5344d7 f2fs: remove wrong backported codes
Kconfig and dentry RCU mode fixes.

Fixes: c1286ff41c ("f2fs: backport from 'for-f2fs-4.9'")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:32 +05:30
Todd Kjos
ca027af9a9 FROMLIST: binder: fix use-after-free in binder_transaction()
(from https://patchwork.kernel.org/patch/9978801/)

User-space normally keeps the node alive when creating a transaction
since it has a reference to the target. The local strong ref keeps it
alive if the sending process dies before the target process processes
the transaction. If the source process is malicious or has a reference
counting bug, this can fail.

In this case, when we attempt to decrement the node in the failure
path, the node has already been freed.

This is fixed by taking a tmpref on the node while constructing
the transaction. To avoid re-acquiring the node lock and inner
proc lock to increment the proc's tmpref, a helper is used that
does the ref increments on both the node and proc.

Bug: 66899329
Change-Id: Iad40e1e0bccee88234900494fb52a510a37fe8d7
Signed-off-by: Todd Kjos <tkjos@google.com>
2017-10-15 23:28:32 +05:30
Ido Schimmel
d8e59293ff UPSTREAM: ipv6: fib: Unlink replaced routes from their nodes
When a route is deleted its node pointer is set to NULL to indicate it's
no longer linked to its node. Do the same for routes that are replaced.

This will later allow us to test if a route is still in the FIB by
checking its node pointer instead of its reference count.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Cherry-pick from: 7483cea799
Bug: 64978549

Change-Id: Ibfa54cf918084138b6b19437e9ef86bfaea5deae
2017-10-15 23:28:32 +05:30
Yunlei He
cc0b17f6cc f2fs: fix a missing size change in f2fs_setattr
commit c0ed4405a9 upstream.

This patch fix a missing size change in f2fs_setattr

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
f913880bf2 f2fs: fix to access nullified flush_cmd_control pointer
commit 5eba8c5d1f upstream.

f2fs_sync_file()             remount_ro
 - f2fs_readonly
                               - destroy_flush_cmd_control
 - f2fs_issue_flush
   - no fcc pointer!

So, this patch doesn't free fcc in this case, but just stop its kernel thread
which sends flush commands.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
3cc247fdfa f2fs: free meta pages if sanity check for ckpt is failed
commit a2125ff7dd upstream.

This fixes missing freeing meta pages in the error case.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
5a28184ecf f2fs: detect wrong layout
commit 2040fce83f upstream.

Previous mkfs.f2fs allows small partition inappropriately, so f2fs should detect
that as well.

Refer this in f2fs-tools.

mkfs.f2fs: detect small partition by overprovision ratio and # of segments

Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
323d6b4b69 f2fs: call sync_fs when f2fs is idle
commit f455c8a5f0 upstream.

The sync_fs in f2fs_balance_fs_bg must avoid interrupting current user requests.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
9d8116ef64 Revert "f2fs: use percpu_counter for # of dirty pages in inode"
commit 204706c7ac upstream.

This reverts commit 1beba1b3a9.

The perpcu_counter doesn't provide atomicity in single core and consume more
DRAM. That incurs fs_mark test failure due to ENOMEM.

Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Chao Yu
9c9f348d72 f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage
commit 0002b61bda upstream.

We should use AOP_WRITEPAGE_ACTIVATE when we bypass writing pages.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
f5cd931dc0 f2fs: do not activate auto_recovery for fallocated i_size
commit 26787236b3 upstream.

If a file needs to keep its i_size by fallocate, we need to turn off auto
recovery during roll-forward recovery.

This will resolve the below scenario.

1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
3. md5sum /mnt/f2fs/file;
4. godown /mnt/f2fs/
5. umount /mnt/f2fs/
6. mount -t f2fs /dev/sdx /mnt/f2fs
7. md5sum /mnt/f2fs/file

Reported-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Arnd Bergmann
248e942e55 f2fs: fix 32-bit build
commit 19c526515f upstream.

The addition of multiple-device support broke CONFIG_BLK_DEV_ZONED
on 32-bit machines because of a 64-bit division:

fs/f2fs/f2fs.o: In function `__issue_discard_async':
extent_cache.c:(.text.__issue_discard_async+0xd4): undefined reference to `__aeabi_uldivmod'

Fortunately, bdev_zone_size() is guaranteed to return a power-of-two
number, so we can replace the % operator with a cheaper bit mask.

Fixes: 792b84b74b54 ("f2fs: support multiple devices")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Chao Yu
8d0c7ea775 f2fs: fix incorrect free inode count in ->statfs
commit b08b12d2dd upstream.

While calculating inode count that we can create at most in the left space,
we should consider space which data/node blocks occupied, since we create
data/node mixly in main area. So fix the wrong calculation in ->statfs.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Geliang Tang
4816405177 f2fs: drop duplicate header timer.h
commit b4ceec2921 upstream.

Drop duplicate header timer.h from segment.c.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
4935b166e9 f2fs: fix wrong AUTO_RECOVER condition
commit 97dd26ad83 upstream.

If i_size is not aligned to the f2fs's block size, we should not skip inode
update during fsync.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
65bdf690a3 f2fs: do not recover i_size if it's valid
commit 3a3a5ead7b upstream.

If i_size is already valid during roll_forward recovery, we should not update
it according to the block alignment.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Chao Yu
8cc3e395c9 f2fs: fix fdatasync
commit 281518c694 upstream.

For below two cases, we can't guarantee data consistence:

a)
1. xfs_io "pwrite 0 4195328" "fsync"
2. xfs_io "pwrite 4195328 1024" "fdatasync"
3. godown
4. umount & mount
--> isize we updated before fdatasync won't be recovered

b)
1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync"
2. xfs_io "fpunch 4194304 4096" "fdatasync"
3. godown
4. umount & mount
--> dnode we punched before fdatasync won't be recovered

The reason is that normally fdatasync won't be aware of modification
of metadata in file, e.g. isize changing, dnode updating, so in ->fsync
we will skip flushing node pages for above cases, result in making
fdatasynced file being lost during recovery.

Currently we have introduced DIRTY_META global list in sbi for tracking
dirty inode selectively, so in fdatasync we can choose to flush nodes
depend on dirty state of current inode in the list.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Chao Yu
c2cec7e4ab f2fs: fix to account total free nid correctly
commit 04d47e6738 upstream.

Thread A		Thread B		Thread C
- f2fs_create
 - f2fs_new_inode
  - f2fs_lock_op
   - alloc_nid
    alloc last nid
  - f2fs_unlock_op
			- f2fs_create
			 - f2fs_new_inode
			  - f2fs_lock_op
			   - alloc_nid
			    as node count still not
			    be increased, we will
			    loop in alloc_nid
						- f2fs_write_node_pages
						 - f2fs_balance_fs_bg
						  - f2fs_sync_fs
						   - write_checkpoint
						    - block_operations
						     - f2fs_lock_all
 - f2fs_lock_op

While creating new inode, we do not allocate and account nid atomically,
so that when there is almost no free nids left, we may encounter deadloop
like above stack.

In order to avoid that, reuse nm_i::available_nids for accounting free nids
and make nid allocation and counting being atomical during node creation.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Yunlei He
d197d6732e f2fs: fix an infinite loop when flush nodes in cp
commit d40a43af0a upstream.

Thread A			Thread B

- write_checkpoint
 - block_operations
   -blk_start_plug
    -sync_node_pages		- f2fs_do_sync_file
				 - fsync_node_pages
				  - f2fs_wait_on_page_writeback

Thread A wait for global F2FS_DIRTY_NODES decreased to zero,
it start a plug list, some requests have been added to this list.
Thread B lock one dirty node page, and wait this page write back.
But this page has been in plug list of thread A with PG_writeback flag.
Thread A keep on running and its plug list has no chance to finish,
so it seems a deadlock between cp and fsync path.

This patch add a wait on page write back before set node page dirty
to avoid this problem.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Chao Yu
c6e489e887 f2fs: don't wait writeback for datas during checkpoint
commit 36951b38d1 upstream.

Normally, while committing checkpoint, we will wait on all pages to be
writebacked no matter the page is data or metadata, so in scenario where
there are lots of data IO being submitted with metadata, we may suffer
long latency for waiting writeback during checkpoint.

Indeed, we only care about persistence for pages with metadata, but not
pages with data, as file system consistent are only related to metadate,
so in order to avoid encountering long latency in above scenario, let's
recognize and reference metadata in submitted IOs, wait writeback only
for metadatas.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
028f924c99 f2fs: fix wrong written_valid_blocks counting
commit c79b7ff1d3 upstream.

Previously, written_valid_blocks was got by ckpt->valid_block_count. But if
the last checkpoint has some NEW_ADDR due to power-cut, we can get wrong value.
Fix it to get the number from actual written block count from sit entries.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
3857ba036b f2fs: avoid BG_GC in f2fs_balance_fs
commit 7702bdbe50 upstream.

If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same
time, all the threads would do FG_GC or BG_GC.
In this critical path, we totally don't need to do BG_GC at all.
Let's avoid that.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
c678a1a897 f2fs: fix redundant block allocation
commit c040ff9d69 upstream.

In direct_IO path of f2fs_file_write_iter(),
1. f2fs_preallocate_blocks(F2FS_GET_BLOCK_PRE_DIO)
   -> allocate LBA X
2. f2fs_direct_IO()
   -> return 0;

Then,
f2fs_write_data_page() will allocate another LBA X+1.

This makes EIO triggered by HM-SMR.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
9f13883d39 f2fs: use err for f2fs_preallocate_blocks
commit a7de608691 upstream.

This patch has no functional change.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
88d97b16ec f2fs: support multiple devices
commit 3c62be17d4 upstream.

This patch implements multiple devices support for f2fs.
Given multiple devices by mkfs.f2fs, f2fs shows them entirely as one big
volume under one f2fs instance.

Internal block management is very simple, but we will modify block allocation
and background GC policy to boost IO speed by exploiting them accoording to
each device speed.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
983624348f f2fs: allow dio read for LFS mode
commit e57e9ae5b1 upstream.

We can allow dio reads for LFS mode, while doing buffered writes for dio writes.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
b55145a8ee f2fs: revert segment allocation for direct IO
commit 6ae1be13e8 upstream.

Now we don't need to be too much careful about storage alignment for dio, since
its speed becomes quite fast and we'd better avoid any misalignment first.

Revert: 38aa0889b2 (f2fs: align direct_io'ed data to section)

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Yunlei He
c36393b43a f2fs: return directly if block has been removed from the victim
commit 2061471128 upstream.

If one block has been to written to a new place, just return
in move data process. This patch check it again with holding
page lock.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Chao Yu
f762da7ab5 Revert "f2fs: do not recover from previous remained wrong dnodes"
commit d47b871595 upstream.

i_times of inode will be set with current system time which can be
configured through 'date', so it's not safe to judge dnode block as
garbage data or unchanged inode depend on i_times.

Now, we have used enhanced 'cp_ver + cp' crc method to verify valid
dnode block, so I expect recoverying invalid dnode is almost not
possible.

This reverts commit 807b1e1c8e.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
1f74fc2b10 f2fs: remove checkpoint in f2fs_freeze
commit b4b9d34c85 upstream.

The generic freeze_super() calls sync_filesystems() before f2fs_freeze().
So, basically we don't need to do checkpoint in f2fs_freeze(). But, in xfs/068,
it triggers circular locking problem below due to gc_mutex for checkpoint.

======================================================
[ INFO: possible circular locking dependency detected ]
4.9.0-rc1+ #132 Tainted: G           OE
-------------------------------------------------------

1. wait for __sb_start_write() by

 [<ffffffff9845f353>] dump_stack+0x85/0xc2
 [<ffffffff980e80bf>] print_circular_bug+0x1cf/0x230
 [<ffffffff980eb4d0>] __lock_acquire+0x19e0/0x1bc0
 [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220
 [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs]
 [<ffffffff9826bdd0>] __sb_start_write+0x130/0x200
 [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs]
 [<ffffffffc08c7c3b>] f2fs_drop_inode+0x9b/0x160 [f2fs]
 [<ffffffff98289991>] iput+0x171/0x2c0
 [<ffffffffc08cfccf>] f2fs_sync_inode_meta+0x3f/0xf0 [f2fs]
 [<ffffffffc08cfe04>] block_operations+0x84/0x110 [f2fs]
 [<ffffffffc08cff78>] write_checkpoint+0xe8/0xf20 [f2fs]
 [<ffffffff980e979d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs]
 [<ffffffff9803e9d9>] ? sched_clock+0x9/0x10
 [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs]
 [<ffffffffc08c6df5>] f2fs_sync_fs+0x85/0x190 [f2fs]
 [<ffffffff982a4f90>] ? do_fsync+0x70/0x70
 [<ffffffff982a4f90>] ? do_fsync+0x70/0x70
 [<ffffffff982a4fb0>] sync_fs_one_sb+0x20/0x30
 [<ffffffff9826ca3e>] iterate_supers+0xae/0x100
 [<ffffffff982a50b5>] sys_sync+0x55/0x90
 [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6

2. wait for sbi->gc_mutex by

 [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220
 [<ffffffff989063d6>] mutex_lock_nested+0x76/0x3f0
 [<ffffffffc08c6de9>] f2fs_sync_fs+0x79/0x190 [f2fs]
 [<ffffffffc08c7a6c>] f2fs_freeze+0x1c/0x20 [f2fs]
 [<ffffffff9826b6ef>] freeze_super+0xcf/0x190
 [<ffffffff9827eebc>] do_vfs_ioctl+0x53c/0x6a0
 [<ffffffff9827f099>] SyS_ioctl+0x79/0x90
 [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30
Jaegeuk Kim
c1c304e68e f2fs: assign segments correctly for direct_io
commit bdb7d964c4 upstream.

Previously, we assigned CURSEG_WARM_DATA for direct_io, but if we have two or
four logs, we do not use that type at all.
Let's fix it.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-15 23:28:17 +05:30