commit 664ba972df upstream.
We don't need to allocate bio partially in order to maximize sequential writes.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 15d0435455 upstream.
If inode becomes dirty, we need to check the # of dirty inodes whether or not
further checkpoint would be required.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit b9610bdfcb upstream.
If there are a lot of dirty inodes, we need to flush all of them when doing
checkpoint. So, we need to count this for enough free space.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 02110a4fd5 upstream.
This patch makes sure it returns a positive value instead of a probable
casted negative value as shrink count.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 3a2ad5672b upstream.
Let build_free_nids support sync/async methods, in allocation flow of nids,
we use synchronuous method, so that we can avoid looping in alloc_nid when
free memory is low; in unblock_operations and f2fs_balance_fs_bg we use
asynchronuous method in where low memory condition can interrupt us.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit b8559dc242 upstream.
During free nid allocation, in order to do preallocation, we will tag free
nid entry as allocated one and still leave it in free nid list, for other
allocators who want to grab free nids, it needs to traverse the free nid
list for lookup. It becomes overhead in scenario of allocating free nid
intensively by multithreads.
This patch splits free nid list to two list: {free,alloc}_nid_list, to
keep free nids and preallocated free nids separately, after that, traverse
latency will be gone, besides split nid_cnt for separate statistic.
Additionally, introduce __insert_nid_to_list and __remove_nid_from_list for
cleanup.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify f2fs_bug_on to avoid needless branches]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit a11b9f65ea upstream.
We don't need to keep incomplete created inode in cache, so if we fail to
add link into directory during new inode creation, it's better to set
nlink of inode to zero, then we can evict inode immediately. Otherwise
release of nid belong to inode will be delayed until inode cache is being
shrunk, it may cause a seemingly endless loop while allocating free nids
in time of testing generic/269 case of fstest suit.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: add update_inode_page to fix kernel panic]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 0c0b471e43 upstream.
f2fs contained a number of endianness conversion bugs.
Also, one function should have been 'static'.
Found with sparse by running 'make C=2 CF=-D__CHECK_ENDIAN__ fs/f2fs/'
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 9de6927975 upstream.
In fsync_node_pages, if f2fs was taged with CP_ERROR_FLAG, make sure bio
cache was flushed before return.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit b691d98fdd upstream.
In order to avoid racing problem, make largest extent cache being updated
under lock.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 58736fa60f upstream.
f2fs can support fallocating blocks beyond file size without changing the
size, but ->fiemap of f2fs was restricted and can't detect these extents
fallocated past EOF, now relieve the restriction.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 6f2d8ed654 upstream.
In f2fs_map_blocks, let f2fs_balance_fs detects node page modification
with dn.node_changed to avoid miss some corner cases.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 9434fcde1f upstream.
f2fs_balance_fs should be called in between node page updating, otherwise
node page count will exceeded far beyond watermark of triggering
foreground garbage collection, result in facing high risk of hitting LFS
allocation failure.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 933439c8f3 upstream.
If there is no dirty pages in inode, we should give a chance to detach
the inode from global dirty list, otherwise it needs to call another
unnecessary .writepages for detaching.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 2dd15654ac upstream.
In f2fs_fill_super, if there is any IO error occurs during recovery,
cached discard entries will be leaked, in order to avoid this, make
write_checkpoint() handle memory release by itself, besides, move
clear_prefree_segments to write_checkpoint for readability.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 2411cf5bef upstream.
During nid allocation, it needs to exclude building and allocating flow
of free nids, this is because while building free nid cache, there are two
steps: a) load free nids from unused nat entries in NAT pages, b) update
free nid cache by checking nat journal. The two steps should be atomical,
otherwise an used nid can be allocated as free one after a) and before b).
This patch adds missing lock which covers build_free_nids in
unlock_operation and f2fs_balance_fs_bg to avoid that.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit e87f7329bb upstream.
In the last ilen case, i was already increased, resulting in accessing out-
of-boundary entry of do_replace and blkaddr.
Fix to check ilen first to exit the loop.
Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up")
Cc: stable@vger.kernel.org # 4.8+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
commit 073931017b upstream.
Cherry-pick to f2fs only for generic/375 from:
(073931017: posix_acl: Clear SGID bit when setting file permissions)
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This reverts commit c5616f2f87.
If we re-init the per-cpu boostgroup spinlock every time that
we add a new boosted cgroup, we can easily wipe out (reinit)
a spinlock struct while in a critical section. We should only
be setting up the per-cpu boostgroup data, and the spin_lock
initialization need only happen once - which we're already
doing in a postcore_initcall.
For example:
-------- CPU 0 -------- | -------- CPU1 --------
cgroupX boost group added |
schedtune_enqueue_task |
acquires(bg->lock) | cgroupY boost group added
| for_each_cpu()
| raw_spin_lock_init(bg->lock)
releases(bg->lock) |
BUG (already unlocked) |
|
This results in the following BUG from the debug spinlock code:
BUG: spinlock already unlocked on CPU#5, rcuop/6/68
Change-Id: I3016702780b461a0cd95e26c538cd18df27d6316
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
(cherry picked from commit ec8d7c14ea)
Tetsuo has properly noted that mmput slow path might get blocked waiting
for another party (e.g. exit_aio waits for an IO). If that happens the
oom_reaper would be put out of the way and will not be able to process
next oom victim. We should strive for making this context as reliable
and independent on other subsystems as much as possible.
Introduce mmput_async which will perform the slow path from an async
(WQ) context. This will delay the operation but that shouldn't be a
problem because the oom_reaper has reclaimed the victim's address space
for most cases as much as possible and the remaining context shouldn't
bind too much memory anymore. The only exception is when mmap_sem
trylock has failed which shouldn't happen too often.
The issue is only theoretical but not impossible.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Only backports mmput_async.
Change-Id: I5fe54abcc629e7d9eab9fe03908903d1174177f1
Signed-off-by: Arve Hjønnevåg <arve@android.com>
(from https://patchwork.kernel.org/patch/9954125/)
Use binder_alloc struct's mm_struct rather than getting
a reference to the mm struct through get_task_mm to
avoid a potential deadlock between lru lock, task lock and
dentry lock, since a thread can be holding the task lock
and the dentry lock while trying to acquire the lru lock.
Test: ran binderLibTest, throughputtest, interfacetest and
mempressure w/lockdep
Bug: 63926541
Change-Id: Icc661404eb7a4a2ecc5234b1bf8f0104665f9b45
Acked-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Sherry Yang <sherryy@android.com>
(from https://patchwork.kernel.org/patch/9954123/)
The vma argument in update_page_range is no longer
used after 74310e06 ("android: binder: Move buffer
out of area shared with user space"), since mmap_handler
no longer calls update_page_range with a vma.
Test: ran binderLibTest, throughputtest, interfacetest and mempressure
Bug: 36007193
Change-Id: Ibd6f24c11750f8f7e6ed56e40dd18c08e02ace25
Acked-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Sherry Yang <sherryy@android.com>
(from https://patchwork.kernel.org/patch/9945123/)
Drop the global lru lock in isolate callback
before calling zap_page_range which calls
cond_resched, and re-acquire the global lru
lock before returning. Also change return
code to LRU_REMOVED_RETRY.
Use mmput_async when fail to acquire mmap sem
in an atomic context.
Fix "BUG: sleeping function called from invalid context"
errors when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
Bug: 63926541
Change-Id: I45dbada421b715abed9a66d03d30ae2285671ca1
Fixes: f2517eb76f ("android: binder: Add global lru shrinker to binder")
Reported-by: Kyle Yan <kyan@codeaurora.org>
Acked-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Sherry Yang <sherryy@android.com>
If the cpufreq_register_governor fails, resources for thread
speedchange_task should be released.
currently, concerned resources are released in module_exit,
but if module loading fails, exit will not be called
and resources will remain acquired. this may leave kernel
in an unstable state.
Change-Id: Ic33f058c069d30bfd114fa1c1380325c8e00b51c
Signed-off-by: gaurav jindal <gauravjindal1104@gmail.com>
Commit b235beea9e ("Clarify naming of thread info/stack allocators")
breaks the build on some powerpc configs, where THREAD_SIZE < PAGE_SIZE:
kernel/fork.c:235:2: error: implicit declaration of function 'free_thread_stack'
kernel/fork.c:355:8: error: assignment from incompatible pointer type
stack = alloc_thread_stack_node(tsk, node);
^
Fix it by renaming free_stack() to free_thread_stack(), and updating the
return type of alloc_thread_stack_node().
Fixes: b235beea9e ("Clarify naming of thread info/stack allocators")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 38331309
Change-Id: I5b7f920b459fb84adf5fc75f83bb488b855c4deb
(cherry picked from commit 9521d39976)
Signed-off-by: Zubin Mithra <zsm@google.com>
commit 173b8439e1 upstream.
While we allow deletes without the key, the following should not be
permitted:
# cd /vdc/encrypted-dir-without-key
# ls -l
total 4
-rw-r--r-- 1 root root 0 Dec 27 22:35 6,LKNRJsp209FbXoSvJWzB
-rw-r--r-- 1 root root 286 Dec 27 22:35 uRJ5vJh9gE7vcomYMqTAyD
# mv uRJ5vJh9gE7vcomYMqTAyD 6,LKNRJsp209FbXoSvJWzB
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3bb2d5587 upstream.
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__ext4_set_acl() into ext4_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a056bdaae7 upstream.
mpage_submit_page() can race with another process growing i_size and
writing data via mmap to the written-back page. As mpage_submit_page()
samples i_size too early, it may happen that ext4_bio_write_page()
zeroes out too large tail of the page and thus corrupts user data.
Fix the problem by sampling i_size only after the page has been
write-protected in page tables by clear_page_dirty_for_io() call.
Reported-by: Michael Zimmer <michael@swarm64.com>
CC: stable@vger.kernel.org
Fixes: cb20d51883
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50e7663233 upstream.
Cpusets vs. suspend-resume is _completely_ broken. And it got noticed
because it now resulted in non-cpuset usage breaking too.
On suspend cpuset_cpu_inactive() doesn't call into
cpuset_update_active_cpus() because it doesn't want to move tasks about,
there is no need, all tasks are frozen and won't run again until after
we've resumed everything.
But this means that when we finally do call into
cpuset_update_active_cpus() after resuming the last frozen cpu in
cpuset_cpu_active(), the top_cpuset will not have any difference with
the cpu_active_mask and this it will not in fact do _anything_.
So the cpuset configuration will not be restored. This was largely
hidden because we would unconditionally create identity domains and
mobile users would not in fact use cpusets much. And servers what do use
cpusets tend to not suspend-resume much.
An addition problem is that we'd not in fact wait for the cpuset work to
finish before resuming the tasks, allowing spurious migrations outside
of the specified domains.
Fix the rebuild by introducing cpuset_force_rebuild() and fix the
ordering with cpuset_wait_for_hotplug().
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: deb7aa308e ("cpuset: reorganize CPU / memory hotplug handling")
Link: http://lkml.kernel.org/r/20170907091338.orwxrqkbfkki3c24@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77bf25ea70 upstream.
[Back-ported to 4.4. The difference is the file location of the struct
definition that's adding the mutex.
This fixes reported kernel panics in 4.4-stable from simultaneous
controller resets that was never supposed to be allowed to happen.]
Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbf26183b7 upstream.
uwbd_start() calls kthread_run() and checks that the return value is
not NULL. But the return value is not NULL in case kthread_run() fails,
it takes the form of ERR_PTR(-EINTR).
Use IS_ERR() instead.
Also add a check to uwbd_stop().
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0964e40947 upstream.
The driver calls spi_get_drvdata() in its ->remove hook even though it
has never called spi_set_drvdata(). Stack trace for posterity:
Unable to handle kernel NULL pointer dereference at virtual address 00000220
Internal error: Oops: 5 [#1] SMP ARM
[<8072f564>] (mutex_lock) from [<7f1400d0>] (iio_device_unregister+0x24/0x7c [industrialio])
[<7f1400d0>] (iio_device_unregister [industrialio]) from [<7f15e020>] (mcp320x_remove+0x20/0x30 [mcp320x])
[<7f15e020>] (mcp320x_remove [mcp320x]) from [<8055a8cc>] (spi_drv_remove+0x2c/0x44)
[<8055a8cc>] (spi_drv_remove) from [<805087bc>] (__device_release_driver+0x98/0x134)
[<805087bc>] (__device_release_driver) from [<80509180>] (driver_detach+0xdc/0xe0)
[<80509180>] (driver_detach) from [<8050823c>] (bus_remove_driver+0x5c/0xb0)
[<8050823c>] (bus_remove_driver) from [<80509ab0>] (driver_unregister+0x38/0x58)
[<80509ab0>] (driver_unregister) from [<7f15e69c>] (mcp320x_driver_exit+0x14/0x1c [mcp320x])
[<7f15e69c>] (mcp320x_driver_exit [mcp320x]) from [<801a78d0>] (SyS_delete_module+0x184/0x1d0)
[<801a78d0>] (SyS_delete_module) from [<80108100>] (ret_fast_syscall+0x0/0x1c)
Fixes: f5ce4a7a92 ("iio: adc: add driver for MCP3204/08 12-bit ADC")
Cc: Oskar Andero <oskar.andero@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6f4794371 upstream.
Commit f686a36b4b ("iio: adc: mcp320x: Add support for mcp3301")
returns a signed voltage from mcp320x_adc_conversion() but neglects that
the caller interprets a negative return value as failure. Only mcp3301
(and the upcoming mcp3550/1/3) is affected as the other chips are
incapable of measuring negative voltages.
Fix and while at it, add mcp3301 to the list of supported chips at the
top of the file.
Fixes: f686a36b4b ("iio: adc: mcp320x: Add support for mcp3301")
Cc: Andrea Galbusera <gizero@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ee3b7ebcb upstream.
The serial interface can be reset by writing 32 consecutive 1s to the device.
'ret' was initialized correctly but its value was overwritten when
ad7793_check_platform_data() was called. Since a dedicated reset function
is present now, it should be used instead.
Fixes: 2edb769d24 ("iio:ad7793: Add support for the ad7798 and ad7799")
Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d62c78a6e upstream.
If an IIO device returns an error code for a read access via debugfs, it
is currently ignored by the IIO core (other than emitting an error
message). Instead, return this error code to user space, so upper layers
can detect it correctly.
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f790923f14 upstream.
Depends on: 691c4b95d1 ("iio: ad_sigma_delta: Implement a dedicated reset function")
SPI host drivers can use DMA to transfer data, so the buffer should be properly allocated.
Keeping it on the stack could cause an undefined behavior.
The dedicated reset function solves this issue.
Signed-off-by: Stefan Popa <stefan.popa@analog.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7fc10de8d4 upstream.
Since most of the SD ADCs have the option of reseting the serial
interface by sending a number of SCLKs with CS = 0 and DIN = 1,
a dedicated function that can do this is usefull.
Needed for the patch: iio: ad7793: Fix the serial interface reset
Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>