linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 20:07:46 +09:00

Author	SHA1	Message	Date
Kees Cook	aa8893750a	UPSTREAM: seccomp: Fix tracer exit notifications during fatal signals This fixes a ptrace vs fatal pending signals bug as manifested in seccomp now that seccomp was reordered to happen after ptrace. The short version is that seccomp should not attempt to call do_exit() while fatal signals are pending under a tracer. The existing code was trying to be as defensively paranoid as possible, but it now ends up confusing ptrace. Instead, the syscall can just be skipped (which solves the original concern that the do_exit() was addressing) and normal signal handling, tracer notification, and process death can happen. Paraphrasing from the original bug report: If a tracee task is in a PTRACE_EVENT_SECCOMP trap, or has been resumed after such a trap but not yet been scheduled, and another task in the thread-group calls exit_group(), then the tracee task exits without the ptracer receiving a PTRACE_EVENT_EXIT notification. Test case here: https://gist.github.com/khuey/3c43ac247c72cef8c956ca73281c9be7 The bug happens because when __seccomp_filter() detects fatal_signal_pending(), it calls do_exit() without dequeuing the fatal signal. When do_exit() sends the PTRACE_EVENT_EXIT notification and that task is descheduled, __schedule() notices that there is a fatal signal pending and changes its state from TASK_TRACED to TASK_RUNNING. That prevents the ptracer's waitpid() from returning the ptrace event. A more detailed analysis is here: https://github.com/mozilla/rr/issues/1762#issuecomment-237396255. 
Reported-by: Robert O'Callahan <robert@ocallahan.org> Reported-by: Kyle Huey <khuey@kylehuey.com> Tested-by: Kyle Huey <khuey@kylehuey.com> Fixes: `93e35efb8d` ("x86/ptrace: run seccomp after ptrace") Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> (cherry picked from commit `485a252a55`) Bug: 119769499 Change-Id: I444e69093e88d58587b4d5c4f2d777985591c32d Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:47:10 +05:30
Kees Cook	edc64cccc5	UPSTREAM: arm64/ptrace: run seccomp after ptrace Close the hole where ptrace can change a syscall out from under seccomp. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org (cherry picked from commit `a5cd110cb8`) Bug: 119769499 Change-Id: I9fd3e8e6d38122866df434b2676bf7ba0e808e32 Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:47:00 +05:30
Kees Cook	ee785cf928	UPSTREAM: arm/ptrace: run seccomp after ptrace Close the hole where ptrace can change a syscall out from under seccomp. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org (cherry picked from commit `0f3912fd93`) Bug: 119769499 Change-Id: Id82e4137207db42a8af31b2745581c53eaaf1f89 Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:46:49 +05:30
Kees Cook	9782f1fbb8	BACKPORT: x86/ptrace: run seccomp after ptrace This moves seccomp after ptrace on x86 so that seccomp can catch changes made by ptrace. Emulation should skip the rest of processing too. We can get rid of test_thread_flag because there's no longer any opportunity for seccomp to mess with ptrace state before invoking ptrace. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: x86@kernel.org Cc: Andy Lutomirski <luto@kernel.org> (cherry picked from commit `93e35efb8d`) Bug: 119769499 Change-Id: Ie1b9a18360799e68e22f67ce6a819c93433fdeaa [ghackmann@google.com: adjust context] Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:46:40 +05:30
Kees Cook	9cf8a2391a	UPSTREAM: seccomp: recheck the syscall after RET_TRACE When RET_TRACE triggers, a tracer may change a syscall into something that should be filtered by seccomp. This re-runs seccomp after a trace event to make sure things continue to pass. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> (cherry picked from commit `ce6526e8af`) Bug: 119769499 Change-Id: Ib67732df3c2ac8c6b1de87e75f96aaed02f4627d Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:46:31 +05:30
Kees Cook	ceb1e1cda7	UPSTREAM: seccomp: remove 2-phase API Since nothing is using the 2-phase API, and it adds more complexity than benefit, remove it. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> (cherry picked from commit `8112c4f140`) Bug: 119769499 Change-Id: Iff6246c1e6e9dd0161b80b666a5e796f78a5c785 Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:46:22 +05:30
Andy Lutomirski	5b053daef3	BACKPORT: x86/entry: Get rid of two-phase syscall entry work I added two-phase syscall entry work back when the entry slow path was very slow. Nowadays, the entry slow path is fast and two-phase entry work serves no purpose. Remove it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> (cherry picked from commit `c87a85177e`) Bug: 119769499 Change-Id: Ieac4470411f88ca8830794d0322d8d8bb348039e [ghackmann@google.com: - adjust for post-4.4 is_ia32_task() -> in_ia32_syscall() renaming - preserve TF flags fixup in syscall_trace_enter() - keep syscall_trace_enter() exported, since we haven't taken patches to move the calling code from entry_64.S to common.c] Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:46:12 +05:30
Andy Lutomirski	ec8e12f60f	BACKPORT: seccomp: Add a seccomp_data parameter secure_computing() Currently, if arch code wants to supply seccomp_data directly to seccomp (which is generally much faster than having seccomp do it using the syscall_get_xyz() API), it has to use the two-phase seccomp hooks. Add it to the easy hooks, too. Cc: linux-arch@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> (cherry picked from commit `2f275de5d1`) Bug: 119769499 Change-Id: I96876ecd8d1743c289ecef6d2deb65361d1f5baa [ghackmann@google.com: drop changes to parisc, tile, and um, which didn't implement seccomp support in this kernel version] Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:46:03 +05:30
Andy Lutomirski	6b2efe86a1	BACKPORT: x86/entry/64: Always run ptregs-using syscalls on the slow path 64-bit syscalls currently have an optimization in which they are called with partial pt_regs. A small handful require full pt_regs. In the 32-bit and compat cases, I cleaned this up by forcing full pt_regs for all syscalls. The performance hit doesn't really matter as the affected system calls are fundamentally heavy and this is the 32-bit compat case. I want to clean up the 64-bit case as well, but I don't want to hurt fast path performance. To do that, I want to force the syscalls that use pt_regs onto the slow path. This will enable us to make slow path syscalls be real ABI-compliant C functions. Use the new syscall entry qualification machinery for this. 'stub_clone' is now 'stub_clone/ptregs'. The next patch will eliminate the stubs, and we'll just have 'sys_clone/ptregs'. As of this patch, two-phase entry tracing is no longer used. It has served its purpose (namely a huge speedup on some workloads prior to more general opportunistic SYSRET support), and once the dust settles I'll send patches to back it out. The implementation is heavily based on a patch from Brian Gerst: http://lkml.kernel.org/g/1449666173-15366-1-git-send-email-brgerst@gmail.com Originally-From: Brian Gerst <brgerst@gmail.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b9beda88460bcefec6e7d792bd44eca9b760b0c4.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `302f5b260c`) Bug: 119769499 Change-Id: I3e5ac760ef9ca8dcecd8075564118bd10a8be91f [ghackmann@google.com: adjust context] Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:45:54 +05:30
Andy Lutomirski	42c427d727	UPSTREAM: x86/syscalls: Add syscall entry qualifiers This will let us specify something like 'sys_xyz/foo' instead of 'sys_xyz' in the syscall table, where the 'foo' qualifier conveys some extra information to the C code. The intent is to allow things like sys_execve/ptregs to indicate that sys_execve() touches pt_regs. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2de06e33dce62556b3ec662006fcb295504e296e.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `cfcbadb49d`) Bug: 119769499 Change-Id: I39c3b052526991d7958861712f1e3e9bf453225e Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:45:45 +05:30
Andy Lutomirski	db6bbaf0e6	UPSTREAM: x86/syscalls: Move compat syscall entry handling into syscalltbl.sh Rather than duplicating the compat entry handling in all consumers of syscalls_BITS.h, handle it directly in syscalltbl.sh. Now we generate entries in syscalls_32.h like: __SYSCALL_I386(5, sys_open) __SYSCALL_I386(5, compat_sys_open) and all of its consumers implicitly get the right entry point. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b7c2b501dc0e6e43050e916b95807c3e2e16e9bb.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `3e65654e3d`) Bug: 119769499 Change-Id: I7b2b8206f243e33458fe6cc69affe043aaf177ce Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:45:35 +05:30
Andy Lutomirski	30bff6461c	UPSTREAM: x86/syscalls: Remove __SYSCALL_COMMON and __SYSCALL_X32 The common/64/x32 distinction has no effect other than determining which kernels actually support the syscall. Move the logic into syscalltbl.sh. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/58d4a95f40e43b894f93288b4a3633963d0ee22e.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `32324ce15e`) Bug: 119769499 Change-Id: Ib994586ac47f8f4cbc3f746492c2b47b22e03d39 Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:45:26 +05:30
Andy Lutomirski	298668fe9d	UPSTREAM: x86/syscalls: Refactor syscalltbl.sh This splits out the code to emit a syscall line. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1bfcbba991f5cfaa9291ff950a593daa972a205f.1454022279.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `fba324744b`) Bug: 119769499 Change-Id: Ie36f49882c4c3a69d87288795e4525353bb05ec5 Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:45:16 +05:30
Peter Kalauskas	41ee54318f	ANDROID: zram: set comp_len to PAGE_SIZE when page is huge This bug was introduced when two patches were applied out of order. * zram: drop max_zpage_size and use zs_huge_class_size() * zram: mark incompressible page as ZRAM_HUGE Signed-off-by: Peter Kalauskas <peskal@google.com> Bug: 119260394 Change-Id: I437d35c8d23c15237ad9c2d5bd7f99d7bff42872 Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:42:30 +05:30
Benedict Wong	2734aadd19	BACKPORT: xfrm: Allow Output Mark to be Updated Using UPDSA Allow UPDSA to change "output mark" to permit policy separation of packet routing decisions from SA keying in systems that use mark-based routing. The set mark, used as a routing and firewall mark for outbound packets, is made update-able which allows routing decisions to be handled independently of keying/SA creation. To maintain consistency with other optional attributes, the output mark is only updated if sent with a non-zero value. The per-SA lock and the xfrm_state_lock are taken in that order to avoid a deadlock with xfrm_timer_handler(), which also takes the locks in that order. Signed-off-by: Nathan Harold <nharold@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> (cherry picked from commit `6d8e85ffe1`) Backport resolution required using props.output_mark instead of props.smark Change-Id: I08c7bfc114ac9826a8a18f5ac1c3ff17a4e0940b Signed-off-by: Benedict Wong <benedictwong@google.com> Bug: 114060045 Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:42:16 +05:30
Daniel Rosenberg	da81903e96	ANDROID: sdcardfs: Add option to drop unused dentries This adds the nocache mount option, which will cause sdcardfs to always drop dentries that are not in use, preventing cached entries from holding on to lower dentries, which could cause strange behavior when bypassing the sdcardfs layer and directly changing the lower fs. Change-Id: I70268584a20b989ae8cfdd278a2e4fa1605217fb Signed-off-by: Daniel Rosenberg <drosen@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:41:47 +05:30
Amit Pundir	fbbf1a527a	Merge 4.20-rc1-4.4 into android-4.4 * origin/upstream-f2fs-stable-linux-4.4.y: f2fs: guarantee journalled quota data by checkpoint f2fs: cleanup dirty pages if recover failed f2fs: fix data corruption issue with hardware encryption f2fs: fix to recover inode->i_flags of inode block during POR f2fs: spread f2fs_set_inode_flags() f2fs: fix to spread clear_cold_data() Revert "f2fs: fix to clear PG_checked flag in set_page_dirty()" f2fs: account read IOs and use IO counts for is_idle f2fs: fix to account IO correctly for cgroup writeback f2fs: fix to account IO correctly f2fs: remove request_list check in is_idle() f2fs: allow to mount, if quota is failed f2fs: update REQ_TIME in f2fs_cross_rename() f2fs: do not update REQ_TIME in case of error conditions f2fs: remove unneeded disable_nat_bits() f2fs: remove unused sbi->trigger_ssr_threshold f2fs: shrink sbi->sb_lock coverage in set_file_temperature() f2fs: fix to recover cold bit of inode block during POR f2fs: submit cached bio to avoid endless PageWriteback f2fs: checkpoint disabling ... Conflicts: fs/f2fs/data.c Change-Id: I95097a969bbd23c2009106b07be8a1eeec675b1c Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:39:12 +05:30
Alistair Strachan	f111b61338	Revert "BACKPORT, FROMLIST: fscrypt: add Speck128/256 support" This reverts commit `eafbcb2354`. Resolves conflicts with a later CL, `e7724207f7` "fscrypt: log the crypto algorithm implementations". [astrachan: This is an update to `7977ade69d` which reverted only the fscrypt changes, not the ext4 changes, due to a historical bad merge.] Bug: 116008047 Change-Id: I772d78d6e4bd126dcd2b25049c719f8522a1c157 Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:34:12 +05:30
Alistair Strachan	00428ff5a2	Build fix for `7977ade69d`. Bug: 116008047 Change-Id: I053866f49e1ec090fde9a6fbf19a34c5e4e6e8e3 Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:33:50 +05:30
Alistair Strachan	4b20ec3e75	Revert "BACKPORT, FROMGIT: crypto: speck - add support for the Speck block cipher" This reverts commit `18954d93f3`. Bug: 116008047 Change-Id: If9192b30cdb4212fb6c8111d70c532a109695fbd Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:33:27 +05:30
Alistair Strachan	881ff5a2fd	Revert "FROMGIT: crypto: speck - export common helpers" This reverts commit `bc84402781`. Bug: 116008047 Change-Id: I9d0a8357be1ab090a793646716771015299fb7fe Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:32:50 +05:30
Alistair Strachan	3c95c088da	Revert "BACKPORT, FROMGIT: crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS" This reverts commit `8fce82e266`. Bug: 116008047 Change-Id: I3c76a77dfae5894b46e39266f9f8d7c7294ab615 Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:32:11 +05:30
Alistair Strachan	87174b9852	Revert "BACKPORT, FROMGIT: crypto: speck - add test vectors for Speck128-XTS" This reverts commit `239fd75203`. Bug: 116008047 Change-Id: Id68b333c66c89bc69e61bdd96d21fa6bc6dbf3b8 Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:31:38 +05:30
Alistair Strachan	cddb52b89b	Revert "BACKPORT, FROMGIT: crypto: speck - add test vectors for Speck64-XTS" This reverts commit `c21d7a0568`. Bug: 116008047 Change-Id: I96897223f5daced94e3d62ae817c2de2b59b417f Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:30:39 +05:30
Alistair Strachan	eb75269732	Revert "BACKPORT, FROMLIST: crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS" This reverts commit `cf055c4ac0`. Bug: 116008047 Change-Id: Ic47509910c162a35f6fba10a196f4369299451ac Signed-off-by: Alistair Strachan <astrachan@google.com> [AmitP: Updated commit message for correct SHA1 hash] Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:26:24 +05:30
Alistair Strachan	7977ade69d	Revert "fscrypt: add Speck128/256 support" This reverts commit `eb13e0b692`. Resolves conflicts with a later CL, `e7724207f7` "fscrypt: log the crypto algorithm implementations". Also leaves the include/uapi/linux/fs.h constants in place to prevent future accidental re-use. Bug: 116008047 Change-Id: I2d64d8d3e384400b7bdfc06a353c3844d4ebb377 Signed-off-by: Alistair Strachan <astrachan@google.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 23:21:29 +05:30
Evan Green	697a80ff00	UPSTREAM: loop: Add LOOP_SET_BLOCK_SIZE in compat ioctl This change adds LOOP_SET_BLOCK_SIZE as one of the supported ioctls in lo_compat_ioctl. It only takes an unsigned long argument, and in practice a 32-bit value works fine. Bug: 117823094 Change-Id: I0061a082eb2632c47b7d66f35f2c909d33ff1653 Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Evan Green <evgreen@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> (cherry picked from commit `9fea4b3952`) Signed-off-by: Martijn Coenen <maco@android.com> (cherry picked from commit `c82807c7dd`) Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 22:31:55 +05:30
Shaohua Li	3c24e1d194	BACKPORT: block/loop: set hw_sectors Loop can handle any size of request. Limiting it to 255 sectors just burns the CPU for bio split and request merge for underlayer disk and also cause bad fs block allocation in directio mode. Bug: 117823094 Change-Id: Ic4957181433c5a0d15f4cfdbf69dc5558d6dc5bd Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> (cherry picked from commit `54bb0ade66`) Signed-off-by: Martijn Coenen <maco@android.com> (cherry picked from commit `8567ea359c`) Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 22:31:43 +05:30
Omar Sandoval	3881718492	UPSTREAM: loop: add ioctl for changing logical block size This is a different approach from the first attempt in `f2c6df7dbf` ("loop: support 4k physical blocksize"). Rather than extending LOOP_{GET,SET}_STATUS, add a separate ioctl just for setting the block size. Bug: 117823094 Change-Id: I8e69b8839d7fee3be564cbfce1797ce108e1aa1e Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> (cherry picked from commit `89e4fdecb5`) Signed-off-by: Martijn Coenen <maco@android.com> (cherry picked from commit `6edf1ad773`) Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 22:31:33 +05:30
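A minimal userspace sketch of how the ioctl described in the loop commits above might be called. The wrapper name `loop_set_block_size()` is hypothetical and not part of the kernel or any library; the fallback definition of `LOOP_SET_BLOCK_SIZE` (0x4C09) is only there for older uapi headers that predate the ioctl. The block size is passed directly as the `unsigned long` argument, as the commit messages describe.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/loop.h>

/* Older uapi headers may not define the request number yet. */
#ifndef LOOP_SET_BLOCK_SIZE
#define LOOP_SET_BLOCK_SIZE 0x4C09
#endif

/*
 * Hypothetical helper: ask the loop driver to change the logical block
 * size of an already-open loop device (e.g. a descriptor for /dev/loop0).
 * Returns 0 on success or a negative errno value on failure.
 */
int loop_set_block_size(int loop_fd, unsigned long block_size)
{
    /* The argument is the size itself, not a pointer to a struct. */
    if (ioctl(loop_fd, LOOP_SET_BLOCK_SIZE, block_size) < 0)
        return -errno;
    return 0;
}
```

A caller would typically open the loop device read-write, attach a backing file, and then call `loop_set_block_size(fd, 4096)`; which sizes are accepted is decided by the kernel.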
Jerry Zhang	afe4c8d03b	ANDROID: usb: gadget: f_mtp: Return error if count is negative If the user passes in a negative file size in an int64, this will compare to be smaller than buffer length, and it will get truncated to form a read length that is larger than the buffer length. To fix, return -EINVAL if the count argument is negative, so the loop will never happen. Bug: 37429972 Test: Test with PoC Change-Id: I5d52e38e6fbe2c17eb8c493f9eb81df6cfd780a4 Signed-off-by: Jerry Zhang <zhangjerry@google.com> (cherry picked from commit `34e65b671b`) Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 22:30:28 +05:30
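The signedness problem described in the f_mtp commit above can be illustrated with a small standalone sketch. This is not the driver code: `BUF_LEN`, `chunk_len_unchecked()` and `chunk_len_checked()` are hypothetical names chosen for illustration, and the buffer size is an assumed value. It only shows why a negative 64-bit count slips past a "larger than the buffer?" comparison and then becomes a huge unsigned length, and how an early -EINVAL avoids that.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_LEN 16384 /* assumed buffer size, for illustration only */

/*
 * Unchecked pattern: clamp the count to the buffer size, then use it as
 * an unsigned length. A negative count is "smaller" than BUF_LEN, so it
 * is not clamped, and the conversion to size_t turns it into a very
 * large value.
 */
size_t chunk_len_unchecked(int64_t count)
{
    int64_t xfer = count > BUF_LEN ? BUF_LEN : count;
    return (size_t)xfer;
}

/*
 * Checked pattern: reject negative counts before any length is derived.
 * Returns 0 and stores the clamped length in *out, or -EINVAL.
 */
int chunk_len_checked(int64_t count, size_t *out)
{
    if (count < 0)
        return -EINVAL;
    *out = (size_t)(count > BUF_LEN ? BUF_LEN : count);
    return 0;
}
```

With the check in place, a caller never reaches the copy loop for a negative count, which matches the intent stated in the commit message.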
Steve Muckle	8f2aa58a67	ANDROID: x86_64_cuttlefish_defconfig: disable CONFIG_MEMORY_STATE_TIME Bug: 117847156 Change-Id: Idfbac9c1f0dc2617642c30ddb65400083da44b49 Signed-off-by: Steve Muckle <smuckle@google.com> (cherry picked from commit `7a95540418`) Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2018-12-10 22:20:45 +05:30
Chao Yu	0eea3276ee	f2fs: guarantee journalled quota data by checkpoint For journalled quota mode, let checkpoint flush dquot dirty data and quota file data to guarantee persistence of all quota sysfiles in the last checkpoint; this way, we can avoid corrupting quota sysfiles when encountering SPO. The implementation is as below: 1. add a global state SBI_QUOTA_NEED_FLUSH to indicate that there are cached dquot metadata changes in the quota subsystem, and a later checkpoint should: a) flush dquot metadata into the quota file. b) flush the quota file to storage to keep file usage consistent. 2. add a global state SBI_QUOTA_NEED_REPAIR to indicate that a quota operation failed due to -EIO or -ENOSPC, so later, a) checkpoint will skip syncing dquot metadata. b) CP_QUOTA_NEED_FSCK_FLAG will be set in the last cp pack to give a hint for fsck repairing. 3. add a global state SBI_QUOTA_SKIP_FLUSH: in checkpoint, if quota data updating is very heavy, it may cause a hung task in block_operation(). To avoid this, if the number of retries exceeds the threshold, just skip flushing and retry in the next checkpoint(). Signed-off-by: Weichao Guo <guoweichao@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: avoid warnings and set fsck flag] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:25 -07:00
Sheng Yong	ba406b6308	f2fs: cleanup dirty pages if recover failed During recovery, we try to create new dentries for inodes with dentry_mark. But if the parent is missing (e.g. killed by fsck), recovery will break, and the recovered dirty pages are not cleaned up. This will hit f2fs_bug_on: [ 53.519566] F2FS-fs (loop0): Found nat_bits in checkpoint [ 53.539354] F2FS-fs (loop0): recover_inode: ino = 5, name = file, inline = 3 [ 53.539402] F2FS-fs (loop0): recover_dentry: ino = 5, name = file, dir = 0, err = -2 [ 53.545760] F2FS-fs (loop0): Cannot recover all fsync data errno=-2 [ 53.546105] F2FS-fs (loop0): access invalid blkaddr:4294967295 [ 53.546171] WARNING: CPU: 1 PID: 1798 at fs/f2fs/checkpoint.c:163 f2fs_is_valid_blkaddr+0x26c/0x320 [ 53.546174] Modules linked in: [ 53.546183] CPU: 1 PID: 1798 Comm: mount Not tainted 4.19.0-rc2+ #1 [ 53.546186] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 53.546191] RIP: 0010:f2fs_is_valid_blkaddr+0x26c/0x320 [ 53.546195] Code: 85 bb 00 00 00 48 89 df 88 44 24 07 e8 ad a8 db ff 48 8b 3b 44 89 e1 48 c7 c2 40 03 72 a9 48 c7 c6 e0 01 72 a9 e8 84 3c ff ff <0f> 0b 0f b6 44 24 07 e9 8a 00 00 00 48 8d bf 38 01 00 00 e8 7c a8 [ 53.546201] RSP: 0018:ffff88006c067768 EFLAGS: 00010282 [ 53.546208] RAX: 0000000000000000 RBX: ffff880068844200 RCX: ffffffffa83e1a33 [ 53.546211] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88006d51e590 [ 53.546215] RBP: 0000000000000005 R08: ffffed000daa3cb3 R09: ffffed000daa3cb3 [ 53.546218] R10: 0000000000000001 R11: ffffed000daa3cb2 R12: 00000000ffffffff [ 53.546221] R13: ffff88006a1f8000 R14: 0000000000000200 R15: 0000000000000009 [ 53.546226] FS: 00007fb2f3646840(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000 [ 53.546229] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 53.546234] CR2: 00007f0fd77f0008 CR3: 00000000687e6002 CR4: 00000000000206e0 [ 53.546237] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 53.546240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 53.546242] Call Trace: [ 53.546248] f2fs_submit_page_bio+0x95/0x740 [ 53.546253] read_node_page+0x161/0x1e0 [ 53.546271] ? truncate_node+0x650/0x650 [ 53.546283] ? add_to_page_cache_lru+0x12c/0x170 [ 53.546288] ? pagecache_get_page+0x262/0x2d0 [ 53.546292] __get_node_page+0x200/0x660 [ 53.546302] f2fs_update_inode_page+0x4a/0x160 [ 53.546306] f2fs_write_inode+0x86/0xb0 [ 53.546317] __writeback_single_inode+0x49c/0x620 [ 53.546322] writeback_single_inode+0xe4/0x1e0 [ 53.546326] sync_inode_metadata+0x93/0xd0 [ 53.546330] ? sync_inode+0x10/0x10 [ 53.546342] ? do_raw_spin_unlock+0xed/0x100 [ 53.546347] f2fs_sync_inode_meta+0xe0/0x130 [ 53.546351] f2fs_fill_super+0x287d/0x2d10 [ 53.546367] ? vsnprintf+0x742/0x7a0 [ 53.546372] ? f2fs_commit_super+0x180/0x180 [ 53.546379] ? up_write+0x20/0x40 [ 53.546385] ? set_blocksize+0x5f/0x140 [ 53.546391] ? f2fs_commit_super+0x180/0x180 [ 53.546402] mount_bdev+0x181/0x200 [ 53.546406] mount_fs+0x94/0x180 [ 53.546411] vfs_kern_mount+0x6c/0x1e0 [ 53.546415] do_mount+0xe5e/0x1510 [ 53.546420] ? fs_reclaim_release+0x9/0x30 [ 53.546424] ? copy_mount_string+0x20/0x20 [ 53.546428] ? fs_reclaim_acquire+0xd/0x30 [ 53.546435] ? __might_sleep+0x2c/0xc0 [ 53.546440] ? ___might_sleep+0x53/0x170 [ 53.546453] ? __might_fault+0x4c/0x60 [ 53.546468] ? _copy_from_user+0x95/0xa0 [ 53.546474] ? memdup_user+0x39/0x60 [ 53.546478] ksys_mount+0x88/0xb0 [ 53.546482] __x64_sys_mount+0x5d/0x70 [ 53.546495] do_syscall_64+0x65/0x130 [ 53.546503] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 53.547639] ---[ end trace b804d1ea2fec893e ]--- So if recovery fails, we need to drop all recovered data. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:25 -07:00
Sahitya Tummala	f8a7408381	f2fs: fix data corruption issue with hardware encryption Direct IO can be used in case of hardware encryption. The following scenario results into data corruption issue in this path - Thread A - Thread B- -> write file#1 in direct IO -> GC gets kicked in -> GC submitted bio on meta mapping for file#1, but pending completion -> write file#1 again with new data in direct IO -> GC bio gets completed now -> GC writes old data to the new location and thus file#1 is corrupted. Fix this by submitting and waiting for pending io on meta mapping for direct IO case in f2fs_map_blocks(). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:25 -07:00
Chao Yu	797d94b07e	f2fs: fix to recover inode->i_flags of inode block during POR Testcase to reproduce this bug: 1. mkfs.f2fs /dev/sdd 2. mount -t f2fs /dev/sdd /mnt/f2fs 3. touch /mnt/f2fs/file 4. sync 5. chattr +a /mnt/f2fs/file 6. xfs_io -a /mnt/f2fs/file -c "fsync" 7. godown /mnt/f2fs 8. umount /mnt/f2fs 9. mount -t f2fs /dev/sdd /mnt/f2fs 10. xfs_io /mnt/f2fs/file There is no error when opening this file w/o O_APPEND, but actually, we expect the correct result should be: /mnt/f2fs/file: Operation not permitted The root cause is, in recover_inode(), we recover inode->i_flags more than F2FS_I(inode)->i_flags, so fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:24 -07:00
Chao Yu	6a23a85581	f2fs: spread f2fs_set_inode_flags() This patch changes the code as follows: - use f2fs_set_inode_flags() to update i_flags atomically to avoid a potential race. - synchronize F2FS_I(inode)->i_flags to inode->i_flags in f2fs_new_inode(). - use f2fs_set_inode_flags() to simplify the code in f2fs_quota_{on,off}. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:24 -07:00
Chao Yu	fbfc2e102c	f2fs: fix to spread clear_cold_data() We need to drop PG_checked flag on page as well when we clear PG_uptodate flag, in order to avoid treating the page as GCing one later. Signed-off-by: Weichao Guo <guoweichao@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:24 -07:00
Jaegeuk Kim	0246fe9d2e	Revert "f2fs: fix to clear PG_checked flag in set_page_dirty()" This reverts commit `66110abc4c`. If we clear the cold data flag out of the writeback flow, we can miscount -1 by end_io, which incurs a deadlock caused by all I/Os being blocked during heavy GC. Balancing F2FS Async: - IO (CP: 1, Data: -1, Flush: ( 0 0 1), Discard: ( ... GC thread: IRQ - move_data_page() - set_page_dirty() - clear_cold_data() - f2fs_write_end_io() - type = WB_DATA_TYPE(page); here, we get wrong type - dec_page_count(sbi, type); - f2fs_wait_on_page_writeback() Cc: <stable@vger.kernel.org> Reported-and-Tested-by: Park Ju Hyung <qkrwngud825@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:24 -07:00
Jaegeuk Kim	003e915c01	f2fs: account read IOs and use IO counts for is_idle This patch adds issued read IO counts which is under block layer. Chao modified a bit, since: Below race can cause reversed reference on F2FS_RD_DATA, there is the same issue in f2fs_submit_page_bio(), fix them by relocate __submit_bio() and inc_page_count. Thread A Thread B - f2fs_write_begin - f2fs_submit_page_read - __submit_bio - f2fs_read_end_io - __read_end_io - dec_page_count(, F2FS_RD_DATA) - inc_page_count(, F2FS_RD_DATA) Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:24 -07:00
Chao Yu	9204be8a48	f2fs: fix to account IO correctly for cgroup writeback Now, we have supported cgroup writeback, it depends on correctly IO account of specified filesystem. But in commit `d1b3e72d54` ("f2fs: submit bio of in-place-update pages"), we split write paths from f2fs_submit_page_mbio() to two: - f2fs_submit_page_bio() for IPU path - f2fs_submit_page_bio() for OPU path But still we account write IO only in f2fs_submit_page_mbio(), result in incorrect IO account, fix it by adding missing IO account in IPU path. Fixes: `d1b3e72d54` ("f2fs: submit bio of in-place-update pages") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:23 -07:00
Chao Yu	fb363c5db4	f2fs: fix to account IO correctly Below race can cause reversed reference on dirty count, fix it by relocating __submit_bio() and inc_page_count(). Thread A Thread B - f2fs_inplace_write_data - f2fs_submit_page_bio - __submit_bio - f2fs_write_end_io - dec_page_count - inc_page_count Cc: <stable@vger.kernel.org> Fixes: `d1b3e72d54` ("f2fs: submit bio of in-place-update pages") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:23 -07:00
Jens Axboe	c4b87d0c1c	f2fs: remove request_list check in is_idle() This doesn't work on stacked devices, and it doesn't work on blk-mq devices. The request_list is only used on legacy, which we don't have much of anymore, and soon won't have any of. Kill the check. Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: linux-f2fs-devel@lists.sourceforge.net Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:23 -07:00
Jaegeuk Kim	05d4dcf63d	f2fs: allow to mount, if quota is failed Since we can use the filesystem without quotas till next boot. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:23 -07:00
Sahitya Tummala	fe2b3bc0fc	f2fs: update REQ_TIME in f2fs_cross_rename() Update REQ_TIME in the missing path - f2fs_cross_rename(). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> [Jaegeuk Kim: add it in f2fs_rename()] Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:23 -07:00
Sahitya Tummala	4aa5ef7fb1	f2fs: do not update REQ_TIME in case of error conditions The REQ_TIME should be updated only in case of success cases as followed at all other places in the file system. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:23 -07:00
Chao Yu	2dc22029a7	f2fs: remove unneeded disable_nat_bits() Commit `7735730d39` ("f2fs: fix to propagate error from __get_meta_page()") added disable_nat_bits() in error path of __get_nat_bitmaps(), but it's unneeded, because we will fail mount, we won't have chance to change nid usage status w/o nat full/empty bitmaps. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:22 -07:00
Chao Yu	3dd704fff9	f2fs: remove unused sbi->trigger_ssr_threshold Commit `a2a12b679f` ("f2fs: export SSR allocation threshold") introduced two thresholds .min_ssr_sections and .trigger_ssr_threshold, but only .min_ssr_sections is used, so just remove redundant one for cleanup. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:22 -07:00
Chao Yu	9f2917f2bb	f2fs: shrink sbi->sb_lock coverage in set_file_temperature() file_set_{cold,hot} doesn't need holding sbi->sb_lock, so moving them out of the lock. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:22 -07:00
Chao Yu	3296764733	f2fs: fix to recover cold bit of inode block during POR Testcase to reproduce this bug: 1. mkfs.f2fs /dev/sdd 2. mount -t f2fs /dev/sdd /mnt/f2fs 3. touch /mnt/f2fs/file 4. sync 5. chattr +A /mnt/f2fs/file 6. xfs_io -f /mnt/f2fs/file -c "fsync" 7. godown /mnt/f2fs 8. umount /mnt/f2fs 9. mount -t f2fs /dev/sdd /mnt/f2fs 10. chattr -A /mnt/f2fs/file 11. xfs_io -f /mnt/f2fs/file -c "fsync" 12. umount /mnt/f2fs 13. mount -t f2fs /dev/sdd /mnt/f2fs 14. lsattr /mnt/f2fs/file -----------------N- /mnt/f2fs/file But actually, we expect the correct result is: -------A---------N- /mnt/f2fs/file The reason is in step 9) we missed to recover cold bit flag in inode block, so later, in fsync, we will skip write inode block due to below condition check, result in losing data in another SPOR. f2fs_fsync_node_pages() if (!IS_DNODE(page) || !is_cold_node(page)) continue; Note that, I guess that some non-dir inode has already lost cold bit during POR, so in order to reenable recovery for those inode, let's try to recover cold bit in f2fs_iget() to save more fsynced data. Fixes: `c56675750d` ("f2fs: remove unneeded set_cold_node()") Cc: <stable@vger.kernel.org> 4.17+ Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:22 -07:00
Chao Yu	4bb9f775d5	f2fs: submit cached bio to avoid endless PageWriteback When migrating encrypted block from background GC thread, we only add them into f2fs inner bio cache, but forget to submit the cached bio, it may cause potential deadlock when we are waiting page writebacked, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2018-10-29 18:46:22 -07:00


575890 Commits