linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 20:07:46 +09:00

Author	SHA1	Message	Date
Chao Yu	39ebaaa82e	f2fs: fix to migrate blocks correctly during defragment During defragment, we missed to trigger fragmented blocks migration for below condition: In defragment region: - total number of valid blocks is smaller than 512; - the tail part of the region are all holes; In addtion, return zero to user via range->len if there is no fragmented blocks. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:12:00 -07:00
Chao Yu	0e87a048ef	f2fs: use wrapped f2fs_cp_error() Just cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:12:00 -07:00
Chao Yu	216de9d1ec	f2fs: fix to use more generic EOPNOTSUPP EOPNOTSUPP is widely used as error number indicating operation is not supported in syscall, and ENOTSUPP was defined and only used for NFSv3 protocol, so use EOPNOTSUPP instead. Fixes: `0a2aa8fbb9` ("f2fs: refactor __exchange_data_block for speed up") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:12:00 -07:00
Chao Yu	e0185895b2	f2fs: use wrapped IS_SWAPFILE() Just cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:12:00 -07:00
Daniel Rosenberg	b1951281ef	f2fs: Support case-insensitive file name lookups Modeled after commit `b886ee3e77` ("ext4: Support case-insensitive file name lookups") """ This patch implements the actual support for case-insensitive file name lookups in f2fs, based on the feature bit and the encoding stored in the superblock. A filesystem that has the casefold feature set is able to configure directories with the +F (F2FS_CASEFOLD_FL) attribute, enabling lookups to succeed in that directory in a case-insensitive fashion, i.e: match a directory entry even if the name used by userspace is not a byte per byte match with the disk name, but is an equivalent case-insensitive version of the Unicode string. This operation is called a case-insensitive file name lookup. The feature is configured as an inode attribute applied to directories and inherited by its children. This attribute can only be enabled on empty directories for filesystems that support the encoding feature, thus preventing collision of file names that only differ by case. * dcache handling: For a +F directory, F2Fs only stores the first equivalent name dentry used in the dcache. This is done to prevent unintentional duplication of dentries in the dcache, while also allowing the VFS code to quickly find the right entry in the cache despite which equivalent string was used in a previous lookup, without having to resort to ->lookup(). d_hash() of casefolded directories is implemented as the hash of the casefolded string, such that we always have a well-known bucket for all the equivalencies of the same string. d_compare() uses the utf8_strncasecmp() infrastructure, which handles the comparison of equivalent, same case, names as well. For now, negative lookups are not inserted in the dcache, since they would need to be invalidated anyway, because we can't trust missing file dentries. This is bad for performance but requires some leveraging of the vfs layer to fix. We can live without that for now, and so does everyone else. * on-disk data: Despite using a specific version of the name as the internal representation within the dcache, the name stored and fetched from the disk is a byte-per-byte match with what the user requested, making this implementation 'name-preserving'. i.e. no actual information is lost when writing to storage. DX is supported by modifying the hashes used in +F directories to make them case/encoding-aware. The new disk hashes are calculated as the hash of the full casefolded string, instead of the string directly. This allows us to efficiently search for file names in the htree without requiring the user to provide an exact name. * Dealing with invalid sequences: By default, when a invalid UTF-8 sequence is identified, ext4 will treat it as an opaque byte sequence, ignoring the encoding and reverting to the old behavior for that unique file. This means that case-insensitive file name lookup will not work only for that file. An optional bit can be set in the superblock telling the filesystem code and userspace tools to enforce the encoding. When that optional bit is set, any attempt to create a file name using an invalid UTF-8 sequence will fail and return an error to userspace. * Normalization algorithm: The UTF-8 algorithms used to compare strings in f2fs is implemented in fs/unicode, and is based on a previous version developed by SGI. It implements the Canonical decomposition (NFD) algorithm described by the Unicode specification 12.1, or higher, combined with the elimination of ignorable code points (NFDi) and full case-folding (CF) as documented in fs/unicode/utf8_norm.c. NFD seems to be the best normalization method for F2FS because: - It has a lower cost than NFC/NFKC (which requires decomposing to NFD as an intermediary step) - It doesn't eliminate important semantic meaning like compatibility decompositions. Although: - This implementation is not completely linguistic accurate, because different languages have conflicting rules, which would require the specialization of the filesystem to a given locale, which brings all sorts of problems for removable media and for users who use more than one language. """ Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Daniel Rosenberg	c69fe4fbf0	f2fs: include charset encoding information in the superblock Add charset encoding to f2fs to support casefolding. It is modeled after the same feature introduced in commit `c83ad55eaa` ("ext4: include charset encoding information in the superblock") Currently this is not compatible with encryption, similar to the current ext4 imlpementation. This will change in the future. >From the ext4 patch: """ The s_encoding field stores a magic number indicating the encoding format and version used globally by file and directory names in the filesystem. The s_encoding_flags defines policies for using the charset encoding, like how to handle invalid sequences. The magic number is mapped to the exact charset table, but the mapping is specific to ext4. Since we don't have any commitment to support old encodings, the only encoding I am supporting right now is utf8-12.1.0. The current implementation prevents the user from enabling encoding and per-directory encryption on the same filesystem at the same time. The incompatibility between these features lies in how we do efficient directory searches when we cannot be sure the encryption of the user provided fname will match the actual hash stored in the disk without decrypting every directory entry, because of normalization cases. My quickest solution is to simply block the concurrent use of these features for now, and enable it later, once we have a better solution. """ Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Daniel Rosenberg	e304fb5ba0	fs: Reserve flag for casefolding In preparation for including the casefold feature within f2fs, elevate the EXT4_CASEFOLD_FL flag to FS_CASEFOLD_FL. Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Chao Yu	a8037a8544	f2fs: fix to avoid call kvfree under spinlock vfree() don't wish to be called from interrupt context, move it out of spin_lock_irqsave() coverage. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Jia-Ju Bai	1cb57e2eea	fs: f2fs: Remove unnecessary checks of SM_I(sbi) in update_general_status() In fill_super() and put_super(), f2fs_destroy_stats() is called in prior to f2fs_destroy_segment_manager(), so if current sbi can still be visited in global stat list, SM_I(sbi) should be released yet. For this reason, SM_I(sbi) does not need to be checked in update_general_status(). Thank Chao Yu for advice. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Chao Yu	eb44d8769b	f2fs: disallow direct IO in atomic write Atomic write needs page cache to cache data of transaction, direct IO should never be allowed in atomic write, detect and deny it when open atomic write file. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Chao Yu	c3d777c7b0	f2fs: fix to handle quota_{on,off} correctly With quota_ino feature on, generic/232 reports an inconsistence issue on the image. The root cause is that the testcase tries to: - use quotactl to shutdown journalled quota based on sysfile; - and then use quotactl to enable/turn on quota based on specific file (aquota.user or aquota.group). Eventually, quota sysfile will be out-of-update due to following specific file creation. Change as below to fix this issue: - deny enabling quota based on specific file if quota sysfile exists. - set SBI_QUOTA_NEED_REPAIR once sysfile based quota shutdowns via ioctl. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:59 -07:00
Chao Yu	fc42f9b12f	f2fs: fix to detect cp error in f2fs_setxattr() It needs to return -EIO if filesystem has been shutdown, fix the miss case in f2fs_setxattr(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	99317081e6	f2fs: fix to spread f2fs_is_checkpoint_ready() We missed to call f2fs_is_checkpoint_ready() in several places, it may allow space allocation even when free space was exhausted during checkpoint is disabled, fix to add them. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	93720242e4	f2fs: support fiemap() for directory inode Adjust f2fs_fiemap() to support fiemap() on directory inode. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	80140c1f21	f2fs: fix to avoid discard command leak ============================================================================= BUG discard_cmd (Tainted: G B OE ): Objects remaining in discard_cmd on __kmem_cache_shutdown() ----------------------------------------------------------------------------- INFO: Slab 0xffffe1ac481d22c0 objects=36 used=2 fp=0xffff936b4748bf50 flags=0x2ffff0000000100 Call Trace: dump_stack+0x63/0x87 slab_err+0xa1/0xb0 __kmem_cache_shutdown+0x183/0x390 shutdown_cache+0x14/0x110 kmem_cache_destroy+0x195/0x1c0 f2fs_destroy_segment_manager_caches+0x21/0x40 [f2fs] exit_f2fs_fs+0x35/0x641 [f2fs] SyS_delete_module+0x155/0x230 ? vtime_user_exit+0x29/0x70 do_syscall_64+0x6e/0x160 entry_SYSCALL64_slow_path+0x25/0x25 INFO: Object 0xffff936b4748b000 @offset=0 INFO: Object 0xffff936b4748b070 @offset=112 kmem_cache_destroy discard_cmd: Slab cache still has objects Call Trace: dump_stack+0x63/0x87 kmem_cache_destroy+0x1b4/0x1c0 f2fs_destroy_segment_manager_caches+0x21/0x40 [f2fs] exit_f2fs_fs+0x35/0x641 [f2fs] SyS_delete_module+0x155/0x230 do_syscall_64+0x6e/0x160 entry_SYSCALL64_slow_path+0x25/0x25 Recovery can cache discard commands, so in error path of fill_super(), we need give a chance to handle them, otherwise it will lead to leak of discard_cmd slab cache. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	d7be5b042f	f2fs: fix to avoid tagging SBI_QUOTA_NEED_REPAIR incorrectly On a quota disabled image, with fault injection, SBI_QUOTA_NEED_REPAIR will be set incorrectly in error path of f2fs_evict_inode(), fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	d7895f7914	f2fs: fix to drop meta/node pages during umount As reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204193 A null pointer dereference bug is triggered in f2fs under kernel-5.1.3. kasan_report.cold+0x5/0x32 f2fs_write_end_io+0x215/0x650 bio_endio+0x26e/0x320 blk_update_request+0x209/0x5d0 blk_mq_end_request+0x2e/0x230 lo_complete_rq+0x12c/0x190 blk_done_softirq+0x14a/0x1a0 __do_softirq+0x119/0x3e5 irq_exit+0x94/0xe0 call_function_single_interrupt+0xf/0x20 During umount, we will access NULL sbi->node_inode pointer in f2fs_write_end_io(): f2fs_bug_on(sbi, page->mapping == NODE_MAPPING(sbi) && page->index != nid_of_node(page)); The reason is if disable_checkpoint mount option is on, meta dirty pages can remain during umount, and then be flushed by iput() of meta_inode, however node_inode has been iput()ed before meta_inode's iput(). Since checkpoint is disabled, all meta/node datas are useless and should be dropped in next mount, so in umount, let's adjust drop_inode() to give a hint to iput_final() to drop all those dirty datas correctly. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	a0a65a4a0e	f2fs: disallow switching io_bits option during remount If IO alignment feature is turned on after remount, we didn't initialize mempool of it, it turns out we will encounter panic during IO submission due to access NULL mempool pointer. This feature should be set only at mount time, so simply deny configuring during remount. This fixes bug reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204135 Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:58 -07:00
Chao Yu	d7ebeff93f	f2fs: fix panic of IO alignment feature Since `07173c3ec2` ("block: enable multipage bvecs"), one bio vector can store multi pages, so that we can not calculate max IO size of bio as PAGE_SIZE * bio->bi_max_vecs. However IO alignment feature of f2fs always has that assumption, so finally, it may cause panic during IO submission as below stack. kernel BUG at fs/f2fs/data.c:317! RIP: 0010:__submit_merged_bio+0x8b0/0x8c0 Call Trace: f2fs_submit_page_write+0x3cd/0xdd0 do_write_page+0x15d/0x360 f2fs_outplace_write_data+0xd7/0x210 f2fs_do_write_data_page+0x43b/0xf30 __write_data_page+0xcf6/0x1140 f2fs_write_cache_pages+0x3ba/0xb40 f2fs_write_data_pages+0x3dd/0x8b0 do_writepages+0xbb/0x1e0 __writeback_single_inode+0xb6/0x800 writeback_sb_inodes+0x441/0x910 wb_writeback+0x261/0x650 wb_workfn+0x1f9/0x7a0 process_one_work+0x503/0x970 worker_thread+0x7d/0x820 kthread+0x1ad/0x210 ret_from_fork+0x35/0x40 This patch adds one extra condition to check left space in bio while trying merging page to bio, to avoid panic. This bug was reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204043 Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:57 -07:00
Chao Yu	6f1c7fb76e	f2fs: introduce {page,io}_is_mergeable() for readability Wrap merge condition into function for readability, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:57 -07:00
Jaegeuk Kim	125f8ac104	f2fs: fix livelock in swapfile writes This patch fixes livelock in the below call path when writing swap pages. [46374.617256] c2 701 __switch_to+0xe4/0x100 [46374.617265] c2 701 __schedule+0x80c/0xbc4 [46374.617273] c2 701 schedule+0x74/0x98 [46374.617281] c2 701 rwsem_down_read_failed+0x190/0x234 [46374.617291] c2 701 down_read+0x58/0x5c [46374.617300] c2 701 f2fs_map_blocks+0x138/0x9a8 [46374.617310] c2 701 get_data_block_dio_write+0x74/0x104 [46374.617320] c2 701 __blockdev_direct_IO+0x1350/0x3930 [46374.617331] c2 701 f2fs_direct_IO+0x55c/0x8bc [46374.617341] c2 701 __swap_writepage+0x1d0/0x3e8 [46374.617351] c2 701 swap_writepage+0x44/0x54 [46374.617360] c2 701 shrink_page_list+0x140/0xe80 [46374.617371] c2 701 shrink_inactive_list+0x510/0x918 [46374.617381] c2 701 shrink_node_memcg+0x2d4/0x804 [46374.617391] c2 701 shrink_node+0x10c/0x2f8 [46374.617400] c2 701 do_try_to_free_pages+0x178/0x38c [46374.617410] c2 701 try_to_free_pages+0x348/0x4b8 [46374.617419] c2 701 __alloc_pages_nodemask+0x7f8/0x1014 [46374.617429] c2 701 pagecache_get_page+0x184/0x2cc [46374.617438] c2 701 f2fs_new_node_page+0x60/0x41c [46374.617449] c2 701 f2fs_new_inode_page+0x50/0x7c [46374.617460] c2 701 f2fs_init_inode_metadata+0x128/0x530 [46374.617472] c2 701 f2fs_add_inline_entry+0x138/0xd64 [46374.617480] c2 701 f2fs_do_add_link+0xf4/0x178 [46374.617488] c2 701 f2fs_create+0x1e4/0x3ac [46374.617497] c2 701 path_openat+0xdc0/0x1308 [46374.617507] c2 701 do_filp_open+0x78/0x124 [46374.617516] c2 701 do_sys_open+0x134/0x248 [46374.617525] c2 701 SyS_openat+0x14/0x20 Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2019-09-23 14:11:57 -07:00
Eric Biggers	b27113a45c	f2fs: add fs-verity support Add fs-verity support to f2fs. fs-verity is a filesystem feature that enables transparent integrity protection and authentication of read-only files. It uses a dm-verity like mechanism at the file level: a Merkle tree is used to verify any block in the file in log(filesize) time. It is implemented mainly by helper functions in fs/verity/. See Documentation/filesystems/fsverity.rst for the full documentation. The f2fs support for fs-verity consists of: - Adding a filesystem feature flag and an inode flag for fs-verity. - Implementing the fsverity_operations to support enabling verity on an inode and reading/writing the verity metadata. - Updating ->readpages() to verify data as it's read from verity files and to support reading verity metadata pages. - Updating ->write_begin(), ->write_end(), and ->writepages() to support writing verity metadata pages. - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl(). Like ext4, f2fs stores the verity metadata (Merkle tree and fsverity_descriptor) past the end of the file, starting at the first 64K boundary beyond i_size. This approach works because (a) verity files are readonly, and (b) pages fully beyond i_size aren't visible to userspace but can be read/written internally by f2fs with only some relatively small changes to f2fs. Extended attributes cannot be used because (a) f2fs limits the total size of an inode's xattr entries to 4096 bytes, which wouldn't be enough for even a single Merkle tree block, and (b) f2fs encryption doesn't encrypt xattrs, yet the verity metadata must be encrypted when the file is because it contains hashes of the plaintext data. Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:57 -07:00
Eric Biggers	9771896780	ext4: update on-disk format documentation for fs-verity Document the format of verity files on ext4, and the corresponding inode and superblock flags. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:57 -07:00
Eric Biggers	f954fe23c5	ext4: add fs-verity read support Make ext4_mpage_readpages() verify data as it is read from fs-verity files, using the helper functions from fs/verity/. To support both encryption and verity simultaneously, this required refactoring the decryption workflow into a generic "post-read processing" workflow which can do decryption, verification, or both. The case where the ext4 block size is not equal to the PAGE_SIZE is not supported yet, since in that case ext4_mpage_readpages() sometimes falls back to block_read_full_page(), which does not support fs-verity yet. Co-developed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:57 -07:00
Eric Biggers	5ed7b76f84	ext4: add basic fs-verity support Add most of fs-verity support to ext4. fs-verity is a filesystem feature that enables transparent integrity protection and authentication of read-only files. It uses a dm-verity like mechanism at the file level: a Merkle tree is used to verify any block in the file in log(filesize) time. It is implemented mainly by helper functions in fs/verity/. See Documentation/filesystems/fsverity.rst for the full documentation. This commit adds all of ext4 fs-verity support except for the actual data verification, including: - Adding a filesystem feature flag and an inode flag for fs-verity. - Implementing the fsverity_operations to support enabling verity on an inode and reading/writing the verity metadata. - Updating ->write_begin(), ->write_end(), and ->writepages() to support writing verity metadata pages. - Calling the fs-verity hooks for ->open(), ->setattr(), and ->ioctl(). ext4 stores the verity metadata (Merkle tree and fsverity_descriptor) past the end of the file, starting at the first 64K boundary beyond i_size. This approach works because (a) verity files are readonly, and (b) pages fully beyond i_size aren't visible to userspace but can be read/written internally by ext4 with only some relatively small changes to ext4. This approach avoids having to depend on the EA_INODE feature and on rearchitecturing ext4's xattr support to support paging multi-gigabyte xattrs into memory, and to support encrypting xattrs. Note that the verity metadata must be encrypted when the file is, since it contains hashes of the plaintext data. This patch incorporates work by Theodore Ts'o and Chandan Rajendra. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:56 -07:00
Eric Biggers	0c22f68fc4	fs-verity: support builtin file signatures To meet some users' needs, add optional support for having fs-verity handle a portion of the authentication policy in the kernel. An ".fs-verity" keyring is created to which X.509 certificates can be added; then a sysctl 'fs.verity.require_signatures' can be set to cause the kernel to enforce that all fs-verity files contain a signature of their file measurement by a key in this keyring. See the "Built-in signature verification" section of Documentation/filesystems/fsverity.rst for the full documentation. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:56 -07:00
Eric Biggers	9b8425a7cd	fs-verity: add SHA-512 support Add SHA-512 support to fs-verity. This is primarily a demonstration of the trivial changes needed to support a new hash algorithm in fs-verity; most users will still use SHA-256, due to the smaller space required to store the hashes. But some users may prefer SHA-512. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:56 -07:00
Eric Biggers	ebeb654881	fs-verity: implement FS_IOC_MEASURE_VERITY ioctl Add a function for filesystems to call to implement the FS_IOC_MEASURE_VERITY ioctl. This ioctl retrieves the file measurement that fs-verity calculated for the given file and is enforcing for reads; i.e., reads that don't match this hash will fail. This ioctl can be used for authentication or logging of file measurements in userspace. See the "FS_IOC_MEASURE_VERITY" section of Documentation/filesystems/fsverity.rst for the documentation. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:56 -07:00
Eric Biggers	43005f9bdb	fs-verity: implement FS_IOC_ENABLE_VERITY ioctl Add a function for filesystems to call to implement the FS_IOC_ENABLE_VERITY ioctl. This ioctl enables fs-verity on a file. See the "FS_IOC_ENABLE_VERITY" section of Documentation/filesystems/fsverity.rst for the documentation. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:56 -07:00
Eric Biggers	1d403c387a	fs-verity: add data verification hooks for ->readpages() Add functions that verify data pages that have been read from a fs-verity file, against that file's Merkle tree. These will be called from filesystems' ->readpage() and ->readpages() methods. Since data verification can block, a workqueue is provided for these methods to enqueue verification work from their bio completion callback. See the "Verifying data" section of Documentation/filesystems/fsverity.rst for more information. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	5dbea586d2	fs-verity: add the hook for file ->setattr() Add a function fsverity_prepare_setattr() which filesystems that support fs-verity must call to deny truncates of verity files. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	0ba4ec719c	fs-verity: add the hook for file ->open() Add the fsverity_file_open() function, which prepares an fs-verity file to be read from. If not already done, it loads the fs-verity descriptor from the filesystem and sets up an fsverity_info structure for the inode which describes the Merkle tree and contains the file measurement. It also denies all attempts to open verity files for writing. This commit also begins the include/linux/fsverity.h header, which declares the interface between fs/verity/ and filesystems. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	806d34a384	fs-verity: add inode and superblock fields Analogous to fs/crypto/, add fields to the VFS inode and superblock for use by the fs/verity/ support layer: - ->s_vop: points to the fsverity_operations if the filesystem supports fs-verity, otherwise is NULL. - ->i_verity_info: points to cached fs-verity information for the inode after someone opens it, otherwise is NULL. - S_VERITY: bit in ->i_flags that identifies verity inodes, even when they haven't been opened yet and thus still have NULL ->i_verity_info. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	981bfe6851	fs-verity: add Kconfig and the helper functions for hashing Add the beginnings of the fs/verity/ support layer, including the Kconfig option and various helper functions for hashing. To start, only SHA-256 is supported, but other hash algorithms can easily be added. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	375b9e1f36	fs: uapi: define verity bit for FS_IOC_GETFLAGS Add FS_VERITY_FL to the flags for FS_IOC_GETFLAGS, so that applications can easily determine whether a file is a verity file at the same time as they're checking other file flags. This flag will be gettable only; FS_IOC_SETFLAGS won't allow setting it, since an ioctl must be used instead to provide more parameters. This flag matches the on-disk bit that was already allocated for ext4. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	489fcc8c99	fs-verity: add UAPI header Add the UAPI header for fs-verity, including two ioctls: - FS_IOC_ENABLE_VERITY - FS_IOC_MEASURE_VERITY These ioctls are documented in the "User API" section of Documentation/filesystems/fsverity.rst. Examples of using these ioctls can be found in fsverity-utils (https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/fsverity-utils.git). I've also written xfstests that test these ioctls (https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/xfstests-dev.git/log/?h=fsverity). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:55 -07:00
Eric Biggers	7b3e23b950	fs-verity: add MAINTAINERS file entry fs-verity will be jointly maintained by Eric Biggers and Theodore Ts'o. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:54 -07:00
Eric Biggers	38852eac4c	fs-verity: add a documentation file Add a documentation file for fs-verity, covering: - Introduction - Use cases - User API - FS_IOC_ENABLE_VERITY - FS_IOC_MEASURE_VERITY - FS_IOC_GETFLAGS - Accessing verity files - File measurement computation - Merkle tree - fs-verity descriptor - Built-in signature verification - Filesystem support - ext4 - f2fs - Implementation details - Verifying data - Pagecache - Block device based filesystems - Userspace utility - Tests - FAQ Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2019-09-23 14:11:54 -07:00
Theodore Ts'o	d375fed0c6	ext4: fix kernel oops caused by spurious casefold flag If an directory has the a casefold flag set without the casefold feature set, s_encoding will not be initialized, and this will cause the kernel to dereference a NULL pointer. In addition to adding checks to avoid these kernel oops, attempts to load inodes with the casefold flag when the casefold feature is not enable will cause the file system to be declared corrupted. [Jaegeuk Kim: use EXT4_ERROR_INODE] Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 14:11:33 -07:00
Gabriel Krisman Bertazi	ad37c94d48	ext4: fix coverity warning on error path of filename setup Fix the following coverity warning reported by Dan Carpenter: fs/ext4/namei.c:1311 ext4_fname_setup_ci_filename() warn: 'cf_name->len' unsigned <= 0 Fixes: `3ae72562ad` ("ext4: optimize case-insensitive lookups") Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>	2019-09-23 13:34:27 -07:00
Gabriel Krisman Bertazi	e9ca36895a	ext4: optimize case-insensitive lookups Temporarily cache a casefolded version of the file name under lookup in ext4_filename, to avoid repeatedly casefolding it. I got up to 30% speedup on lookups of large directories (>100k entries), depending on the length of the string under lookup. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:30 -07:00
Gabriel Krisman Bertazi	e9bc9003b0	ext4: fix dcache lookup of !casefolded directories Found by visual inspection, this wasn't caught by my xfstest, since it's effect is ignoring positive dentries in the cache the fallback just goes to the disk. it was introduced in the last iteration of the case-insensitive patch. d_compare should return 0 when the entries match, so make sure we are correctly comparing the entire string if the encoding feature is set and we are on a case-INsensitive directory. Fixes: `b886ee3e77` ("ext4: Support case-insensitive file name lookups") Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:30 -07:00
Theodore Ts'o	8f7904ac28	unicode: update to Unicode 12.1.0 final Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>	2019-09-23 13:23:29 -07:00
Theodore Ts'o	839b115309	unicode: add missing check for an error return from utf8lookup() Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>	2019-09-23 13:23:29 -07:00
Theodore Ts'o	82eacc0a18	ext4: export /sys/fs/ext4/feature/casefold if Unicode support is present Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:29 -07:00
Masahiro Yamada	2022b4157d	unicode: refactor the rule for regenerating utf8data.h scripts/mkutf8data is used only when regenerating utf8data.h, which never happens in the normal kernel build. However, it is irrespectively built if CONFIG_UNICODE is enabled. Moreover, there is no good reason for it to reside in the scripts/ directory since it is only used in fs/unicode/. Hence, move it from scripts/ to fs/unicode/. In some cases, we bypass build artifacts in the normal build. The conventional way to do so is to surround the code with ifdef REGENERATE_. For example, - `7373f4f83c` ("kbuild: add implicit rules for parser generation") - `6aaf49b495` ("crypto: arm,arm64 - Fix random regeneration of S_shipped") I rewrote the rule in a more kbuild'ish style. In the normal build, utf8data.h is just shipped from the check-in file. $ make [ snip ] SHIPPED fs/unicode/utf8data.h CC fs/unicode/utf8-norm.o CC fs/unicode/utf8-core.o CC fs/unicode/utf8-selftest.o AR fs/unicode/built-in.a If you want to generate utf8data.h based on UCD, put .txt files into fs/unicode/, then pass REGENERATE_UTF8DATA=1 from the command line. The mkutf8data tool will be automatically compiled to generate the utf8data.h from the *.txt files. $ make REGENERATE_UTF8DATA=1 [ snip ] HOSTCC fs/unicode/mkutf8data GEN fs/unicode/utf8data.h CC fs/unicode/utf8-norm.o CC fs/unicode/utf8-core.o CC fs/unicode/utf8-selftest.o AR fs/unicode/built-in.a I renamed the check-in utf8data.h to utf8data.h_shipped so that this will work for the out-of-tree build. You can update it based on the latest UCD like this: $ make REGENERATE_UTF8DATA=1 fs/unicode/ $ cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped Also, I added entries to .gitignore and dontdiff. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:29 -07:00
Gabriel Krisman Bertazi	ad2d18a3b9	ext4: Support case-insensitive file name lookups This patch implements the actual support for case-insensitive file name lookups in ext4, based on the feature bit and the encoding stored in the superblock. A filesystem that has the casefold feature set is able to configure directories with the +F (EXT4_CASEFOLD_FL) attribute, enabling lookups to succeed in that directory in a case-insensitive fashion, i.e: match a directory entry even if the name used by userspace is not a byte per byte match with the disk name, but is an equivalent case-insensitive version of the Unicode string. This operation is called a case-insensitive file name lookup. The feature is configured as an inode attribute applied to directories and inherited by its children. This attribute can only be enabled on empty directories for filesystems that support the encoding feature, thus preventing collision of file names that only differ by case. * dcache handling: For a +F directory, Ext4 only stores the first equivalent name dentry used in the dcache. This is done to prevent unintentional duplication of dentries in the dcache, while also allowing the VFS code to quickly find the right entry in the cache despite which equivalent string was used in a previous lookup, without having to resort to ->lookup(). d_hash() of casefolded directories is implemented as the hash of the casefolded string, such that we always have a well-known bucket for all the equivalencies of the same string. d_compare() uses the utf8_strncasecmp() infrastructure, which handles the comparison of equivalent, same case, names as well. For now, negative lookups are not inserted in the dcache, since they would need to be invalidated anyway, because we can't trust missing file dentries. This is bad for performance but requires some leveraging of the vfs layer to fix. We can live without that for now, and so does everyone else. * on-disk data: Despite using a specific version of the name as the internal representation within the dcache, the name stored and fetched from the disk is a byte-per-byte match with what the user requested, making this implementation 'name-preserving'. i.e. no actual information is lost when writing to storage. DX is supported by modifying the hashes used in +F directories to make them case/encoding-aware. The new disk hashes are calculated as the hash of the full casefolded string, instead of the string directly. This allows us to efficiently search for file names in the htree without requiring the user to provide an exact name. * Dealing with invalid sequences: By default, when a invalid UTF-8 sequence is identified, ext4 will treat it as an opaque byte sequence, ignoring the encoding and reverting to the old behavior for that unique file. This means that case-insensitive file name lookup will not work only for that file. An optional bit can be set in the superblock telling the filesystem code and userspace tools to enforce the encoding. When that optional bit is set, any attempt to create a file name using an invalid UTF-8 sequence will fail and return an error to userspace. * Normalization algorithm: The UTF-8 algorithms used to compare strings in ext4 is implemented lives in fs/unicode, and is based on a previous version developed by SGI. It implements the Canonical decomposition (NFD) algorithm described by the Unicode specification 12.1, or higher, combined with the elimination of ignorable code points (NFDi) and full case-folding (CF) as documented in fs/unicode/utf8_norm.c. NFD seems to be the best normalization method for EXT4 because: - It has a lower cost than NFC/NFKC (which requires decomposing to NFD as an intermediary step) - It doesn't eliminate important semantic meaning like compatibility decompositions. Although: - This implementation is not completely linguistic accurate, because different languages have conflicting rules, which would require the specialization of the filesystem to a given locale, which brings all sorts of problems for removable media and for users who use more than one language. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:29 -07:00
Gabriel Krisman Bertazi	e7fda225d9	ext4: include charset encoding information in the superblock Support for encoding is considered an incompatible feature, since it has potential to create collisions of file names in existing filesystems. If the feature flag is not enabled, the entire filesystem will operate on opaque byte sequences, respecting the original behavior. The s_encoding field stores a magic number indicating the encoding format and version used globally by file and directory names in the filesystem. The s_encoding_flags defines policies for using the charset encoding, like how to handle invalid sequences. The magic number is mapped to the exact charset table, but the mapping is specific to ext4. Since we don't have any commitment to support old encodings, the only encoding I am supporting right now is utf8-12.1.0. The current implementation prevents the user from enabling encoding and per-directory encryption on the same filesystem at the same time. The incompatibility between these features lies in how we do efficient directory searches when we cannot be sure the encryption of the user provided fname will match the actual hash stored in the disk without decrypting every directory entry, because of normalization cases. My quickest solution is to simply block the concurrent use of these features for now, and enable it later, once we have a better solution. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:28 -07:00
Gabriel Krisman Bertazi	9866760844	unicode: update unicode database unicode version 12.1.0 Regenerate utf8data.h based on the latest UCD files and run tests against the latest version. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:28 -07:00
Gabriel Krisman Bertazi	56427c9cf5	unicode: introduce test module for normalized utf8 implementation This implements a in-kernel sanity test module for the utf8 normalization core. At probe time, it will run basic sequences through the utf8n core, to identify problems will equivalent sequences and normalization/casefold code. This is supposed to be useful for regression testing when adding support for a new version of utf8 to linux. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-09-23 13:23:28 -07:00

1 2 3 4 5 ...

784105 Commits