[ Upstream commit 944e096aa2 ]
Dirty znodes will be written on flash in committing process with
following states:
process A | znode state
------------------------------------------------------
do_commit | DIRTY_ZNODE
ubifs_tnc_start_commit | DIRTY_ZNODE
get_znodes_to_commit | DIRTY_ZNODE | COW_ZNODE
layout_commit | DIRTY_ZNODE | COW_ZNODE
fill_gap | 0
write master | 0 or OBSOLETE_ZNODE
process B | znode state
------------------------------------------------------
do_commit | DIRTY_ZNODE[1]
ubifs_tnc_start_commit | DIRTY_ZNODE
get_znodes_to_commit | DIRTY_ZNODE | COW_ZNODE
ubifs_tnc_end_commit | DIRTY_ZNODE | COW_ZNODE
write_index | 0
write master | 0 or OBSOLETE_ZNODE[2] or
| DIRTY_ZNODE[3]
[1] znode is dirtied without concurrent committing process
[2] znode is copied up (re-dirtied by other process) before cleaned
up in committing process
[3] znode is re-dirtied after cleaned up in committing process
Currently, the clean znode count is updated in free_obsolete_znodes(),
which is called only in normal path. If do_commit failed, clean znode
count won't be updated, which triggers a failure ubifs assertion[4] in
ubifs_tnc_close():
ubifs_assert_failed [ubifs]: UBIFS assert failed: freed == n
[4] Commit 380347e9ca ("UBIFS: Add an assertion for clean_zn_cnt").
Fix it by re-statisticing cleaned znode count in tnc_destroy_cnext().
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216704
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c15859bfd3 ]
It willl cause null-ptr-deref in the following case:
uif_init()
ubi_add_volume()
cdev_add() -> if it fails, call kill_volumes()
device_register()
kill_volumes() -> if ubi_add_volume() fails call this function
ubi_free_volume()
cdev_del()
device_unregister() -> trying to delete a not added device,
it causes null-ptr-deref
So in ubi_free_volume(), it delete devices whether they are added
or not, it will causes null-ptr-deref.
Handle the error case whlie calling ubi_add_volume() to fix this
problem. If add volume fails, set the corresponding vol to null,
so it can not be accessed in kill_volumes() and release the
resource in ubi_add_volume() error path.
Fixes: 801c135ce7 ("UBI: Unsorted Block Images")
Suggested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9af31d6ec1 ]
There is an use-after-free problem reported by KASAN:
==================================================================
BUG: KASAN: use-after-free in ubi_eba_copy_table+0x11f/0x1c0 [ubi]
Read of size 8 at addr ffff888101eec008 by task ubirsvol/4735
CPU: 2 PID: 4735 Comm: ubirsvol
Not tainted 6.1.0-rc1-00003-g84fa3304a7fc-dirty #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.14.0-1.fc33 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report+0x171/0x472
kasan_report+0xad/0x130
ubi_eba_copy_table+0x11f/0x1c0 [ubi]
ubi_resize_volume+0x4f9/0xbc0 [ubi]
ubi_cdev_ioctl+0x701/0x1850 [ubi]
__x64_sys_ioctl+0x11d/0x170
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
When ubi_change_vtbl_record() returns an error in ubi_resize_volume(),
"new_eba_tbl" will be freed on error handing path, but it is holded
by "vol->eba_tbl" in ubi_eba_replace_table(). It means that the liftcycle
of "vol->eba_tbl" and "vol" are different, so when resizing volume in
next time, it causing an use-after-free fault.
Fix it by not freeing "new_eba_tbl" after it replaced in
ubi_eba_replace_table(), while will be freed in next volume resizing.
Fixes: 801c135ce7 ("UBI: Unsorted Block Images")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e874dcde1c ]
UBIFS calculates available space by c->main_bytes - c->lst.total_used
(which means non-index lebs' free and dirty space is accounted into
total available), then index lebs and four lebs (one for gc_lnum, one
for deletions, two for journal heads) are deducted.
In following situation, ubifs may get -ENOSPC from make_reservation():
LEB 84: DATAHD free 122880 used 1920 dirty 2176 dark 6144
LEB 110:DELETION free 126976 used 0 dirty 0 dark 6144 (empty)
LEB 201:gc_lnum free 126976 used 0 dirty 0 dark 6144
LEB 272:GCHD free 77824 used 47672 dirty 1480 dark 6144
LEB 356:BASEHD free 0 used 39776 dirty 87200 dark 6144
OTHERS: index lebs, zero-available non-index lebs
UBIFS calculates the available bytes is 6888 (How to calculate it:
126976 * 5[remain main bytes] - 1920[used] - 47672[used] - 39776[used] -
126976 * 1[deletions] - 126976 * 1[gc_lnum] - 126976 * 2[journal heads]
- 6144 * 5[dark] = 6888) after doing budget, however UBIFS cannot use
BASEHD's dirty space(87200), because UBIFS cannot find next BASEHD to
reclaim current BASEHD. (c->bi.min_idx_lebs equals to c->lst.idx_lebs,
the empty leb won't be found by ubifs_find_free_space(), and dirty index
lebs won't be picked as gced lebs. All non-index lebs has dirty space
less then c->dead_wm, non-index lebs won't be picked as gced lebs
either. So new free lebs won't be produced.). See more details in Link.
To fix it, reserve one leb for each journal head while doing budget.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216562
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 25fce616a6 ]
If target inode is a special file (eg. block/char device) with nlink
count greater than 1, the inode with ui->data will be re-written on
disk. However, UBIFS losts target inode's data_len while doing space
budget. Bad space budget may let make_reservation() return with -ENOSPC,
which could turn ubifs to read-only mode in do_writepage() process.
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216494
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b248eaf049 ]
Each dirty inode should reserve 'c->bi.inode_budget' bytes in space
budget calculation. Currently, space budget for dirty inode reports
more space than what UBIFS actually needs to write.
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b2ba09060 ]
There is no space budget for ubifs_xrename(). It may let
make_reservation() return with -ENOSPC, which could turn
ubifs to read-only mode in do_writepage() process.
Fix it by adding space budget for ubifs_xrename().
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216569
Fixes: 9ec64962af ("ubifs: Implement RENAME_EXCHANGE")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2c36cc6ca ]
Fix bad space budget when symlink file is encrypted. Bad space budget
may let make_reservation() return with -ENOSPC, which could turn ubifs
to read-only mode in do_writepage() process.
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216490
Fixes: ca7f85be8d ("ubifs: Add support for encrypted symlinks")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa6d148e6d ]
With CONFIG_UBIFS_FS_AUTHENTICATION not set, the compiler can assume that
ubifs_node_check_hash() is never true and drops the call to ubifs_bad_hash().
Is CONFIG_CC_OPTIMIZE_FOR_SIZE enabled this optimization does not happen anymore.
So When CONFIG_UBIFS_FS and CONFIG_CC_OPTIMIZE_FOR_SIZE is enabled but
CONFIG_UBIFS_FS_AUTHENTICATION is not set, the build errors is as followd:
ERROR: modpost: "ubifs_bad_hash" [fs/ubifs/ubifs.ko] undefined!
Fix it by add no-op ubifs_bad_hash() for the CONFIG_UBIFS_FS_AUTHENTICATION=n case.
Fixes: 16a26b20d2 ("ubifs: authentication: Add hashes to index nodes")
Signed-off-by: Li Hua <hucool.lihua@huawei.com>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f88c73afe ]
If the return value of the uml_parse_vector_ifspec function is NULL,
we should call kfree(params) to prevent memory leak.
Fixes: 49da7e64f3 ("High Performance UML Vector Network Driver")
Signed-off-by: Xiang Yang <xiangyang3@huawei.com>
Acked-By: Anton Ivanov <anton.ivanov@kot-begemot.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae267fc1cf ]
Commit 7a10f0177e ("f2fs: don't give partially written atomic data
from process crash") attempted to drop atomic write data after process
crash, however, f2fs_abort_atomic_write() may be called from noncrash
case, fix it by adding missed PF_EXITING check condition
f2fs_file_flush().
- application crashs
- do_exit
- exit_signals -- sets PF_EXITING
- exit_files
- put_files_struct
- close_files
- filp_close
- flush (f2fs_file_flush)
- check atomic_write_task && PF_EXITING
- f2fs_abort_atomic_write
Fixes: 7a10f0177e ("f2fs: don't give partially written atomic data from process crash")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6261beb0c ]
Files created by truncate have a size but no blocks, so
they can be allowed to set compression option.
Fixes: e1e8debec6 ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1b9896718 ]
When aops->write_begin() does not initialize fsdata, KMSAN may report
an error passing the latter to aops->write_end().
Fix this by unconditionally initializing fsdata.
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Fixes: 95ae251fe8 ("f2fs: add fs-verity support")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0e8d040bfa ]
Otherwise, last .atomic_write_task will be remained in structure
f2fs_inode_info, resulting in aborting atomic_write accidentally
in race case. Meanwhile, clear original_i_size as well.
Fixes: 7a10f0177e ("f2fs: don't give partially written atomic data from process crash")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d8d45df22 ]
We need to make sure i_size doesn't change until atomic write commit is
successful and restore it when commit is failed.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 0e8d040bfa ("f2fs: clear atomic_write_task in f2fs_abort_atomic_write()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f3a9ae990 ]
Commit 3db1de0e58 ("f2fs: change the current atomic write way")
removed old tracepoints, but it missed to add new one, this patch
fixes to introduce trace_f2fs_replace_atomic_write_block to trace
atomic_write commit flow.
Fixes: 3db1de0e58 ("f2fs: change the current atomic write way")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3066bc2d58 ]
The ARR (auto reload register) and CMP (compare) registers are
successively written. The status bits to check the update of these
registers are polled together with regmap_read_poll_timeout().
The condition to end the loop may become true, even if one of the
register isn't correctly updated.
So ensure both status bits are set before clearing them.
Fixes: e70a540b4e ("pwm: Add STM32 LPTimer PWM driver")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 334c7b13d3 ]
Commit 2cfe9bbec5 added support for the
RGB and green PWM controlled LEDs on the HiFive Unmatched board
managed by the leds-pwm-multicolor and leds-pwm drivers respectively.
All three colours of the RGB LED and the green LED run from different
lines of the same PWM, but with the same period so this works fine when
the LED drivers are loaded one after the other.
Unfortunately it does expose a race in the PWM driver when both LED
drivers are loaded at roughly the same time. Here is an example:
| Thread A | Thread B |
| led_pwm_mc_probe | led_pwm_probe |
| devm_fwnode_pwm_get | |
| pwm_sifive_request | |
| ddata->user_count++ | |
| | devm_fwnode_pwm_get |
| | pwm_sifive_request |
| | ddata->user_count++ |
| ... | ... |
| pwm_state_apply | pwm_state_apply |
| pwm_sifive_apply | pwm_sifive_apply |
Now both calls to pwm_sifive_apply will see that ddata->approx_period,
initially 0, is different from the requested period and the clock needs
to be updated. But since ddata->user_count >= 2 both calls will fail
with -EBUSY, which will then cause both LED drivers to fail to probe.
Fix it by letting the first call to pwm_sifive_apply update the clock
even when ddata->user_count != 1.
Fixes: 9e37a53eb0 ("pwm: sifive: Add a driver for SiFive SoC PWM")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b74952aba6 ]
If the system does not come from reset (like when is booted via
kexec()), the peripheral might triger an IRQ before the data structures
are initialised.
Fixes:
[ 0.227710] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000f08
[ 0.227913] Call trace:
[ 0.227918] svs_isr+0x8c/0x538
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Link: https://lore.kernel.org/r/20221127-mtk-svs-v2-0-145b07663ea8@chromium.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b3580df15 ]
While the acquired resources are tied to the lifetime of the RPC-IF core
device (through the use of managed resource functions), the actual
resource acquisition is triggered from the HyperBus and SPI child
drivers. Due to this mismatch, unbinding and rebinding the child
drivers manually fails with -EBUSY:
# echo rpc-if-hyperflash > /sys/bus/platform/drivers/rpc-if-hyperflash/unbind
# echo rpc-if-hyperflash > /sys/bus/platform/drivers/rpc-if-hyperflash/bind
rpc-if ee200000.spi: can't request region for resource [mem 0xee200000-0xee2001ff]
rpc-if-hyperflash: probe of rpc-if-hyperflash failed with error -16
The same is true for rpc-if-spi.
Fix this by moving all resource acquisition to the core driver's probe
routine.
Fixes: ca7d8b980b ("memory: add Renesas RPC-IF driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/c1012ef1de799e08a70817ab7313794e2d8d7bfb.1669213027.git.geert+renesas@glider.be
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51de3fc9a8 ]
The rpcif structure is used as a common data structure, shared by the
RPC-IF core driver and by the HyperBus and SPI child drivers.
This poses several problems:
- Most structure members describe private core driver state, which
should not be accessible by the child drivers,
- The structure's lifetime is controlled by the child drivers,
complicating use by the core driver.
Fix this by moving the private core driver state to its own structure,
managed by the RPC-IF core driver, and store it in the core driver's
private data field. This requires absorbing the child's platform
device, as that was stored in the driver's private data field before.
Fixes: ca7d8b980b ("memory: add Renesas RPC-IF driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/09fbb6fa67d5a8cd48a08808c9afa2f6a499aa42.1669213027.git.geert+renesas@glider.be
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d3c1fa3fa ]
When testing with a mixed zoned / convention device combination, there
are regular but not 100% reproducible failures in xfstests generic/113
where the __is_valid_data_blkaddr assert hits due to finding a hole.
This seems to be because f2fs_map_blocks can set this flag on a hole
when it was found in the extent cache.
Rework f2fs_iomap_begin to just check the special block numbers directly.
This has the added benefits of the WARN_ON showing which invalid block
address we found, and being properly error out on delalloc blocks that
are confusingly called unwritten but not actually suitable for direct
I/O.
Fixes: 1517c1a7a4 ("f2fs: implement iomap operations")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ddf75a86ab ]
hd44780_probe() allocates a memory chunk for hd with kzalloc() and
makes "lcd->drvdata->hd44780" point to it. When we call hd44780_remove(),
we should release all relevant memory and resource. But "lcd->drvdata
->hd44780" is not released, which will lead to a memory leak.
We should release the "lcd->drvdata->hd44780" in hd44780_remove() to fix
the memory leak bug.
Fixes: 718e05ed92 ("auxdisplay: Introduce hd44780_common.[ch]")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8c710f7525 upstream.
The tcindex classifier has served us well for about a quarter of a century
but has not been getting much TLC due to lack of known users. Most recently
it has become easy prey to syzkaller. For this reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4bf7358816 upstream.
Port silent mode detection to the future (post make-4.4) versions of gnu make.
Makefile contains the following piece of make code to detect if option -s is
specified on the command line.
ifneq ($(findstring s,$(filter-out --%,$(MAKEFLAGS))),)
This code is executed by make at parse time and assumes that MAKEFLAGS
does not contain command line variable definitions.
Currently if the user defines a=s on the command line, then at build only
time MAKEFLAGS contains " -- a=s".
However, starting with commit dc2d963989b96161472b2cd38cef5d1f4851ea34
MAKEFLAGS contains command line definitions at both parse time and
build time.
This '-s' detection code then confuses a command line variable
definition which contains letter 's' with option -s.
$ # old make
$ make net/wireless/ocb.o a=s
CALL scripts/checksyscalls.sh
DESCEND objtool
$ # this a new make which defines makeflags at parse time
$ ~/src/gmake/make/l64/make net/wireless/ocb.o a=s
$
We can see here that the letter 's' from 'a=s' was confused with -s.
This patch checks for presence of -s using a method recommended by the
make manual here
https://www.gnu.org/software/make/manual/make.html#Testing-Flags.
Link: https://lists.gnu.org/archive/html/bug-make/2022-11/msg00190.html
Reported-by: Jan Palus <jpalus+gnu@fastmail.com>
Signed-off-by: Dmitry Goncharov <dgoncharov@users.sf.net>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26edb30dd1 upstream.
Jan reported the new algorithm as merged might be problematic if the
queue being awaken becomes empty between the waitqueue_active inside
sbq_wake_ptr check and the wake up. If that happens, wake_up_nr will
not wake up any waiter and we loose too many wake ups. In order to
guarantee progress, we need to wake up at least one waiter here, if
there are any. This now requires trying to wake up from every queue.
Instead of walking through all the queues with sbq_wake_ptr, this call
moves the wake up inside that function. In a previous version of the
patch, I found that updating wake_index several times when walking
through queues had a measurable overhead. This ensures we only update
it once, at the end.
Fixes: 4f8126bb23 ("sbitmap: Use single per-bitmap counting to wake up queued tags")
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221115224553.23594-4-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 976570b4ec upstream.
When a queue is awaken, the wake_index written by sbq_wake_ptr currently
keeps pointing to the same queue. On the next wake up, it will thus
retry the same queue, which is unfair to other queues, and can lead to
starvation. This patch, moves the index update to happen before the
queue is returned, such that it will now try a different queue first on
the next wake up, improving fairness.
Fixes: 4f8126bb23 ("sbitmap: Use single per-bitmap counting to wake up queued tags")
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20221115224553.23594-2-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0aa2988e4f upstream.
Unconditionally calling radix_tree_preload_end() results in a OOPS
message as the preload is only conditionally called for
gfpflags_allow_blocking().
[ 20.267323] BUG: using smp_processor_id() in preemptible [00000000] code: fio/416
[ 20.267837] caller is brd_insert_page.part.0+0xbe/0x190 [brd]
[ 20.269436] Call Trace:
[ 20.269598] <TASK>
[ 20.269742] dump_stack_lvl+0x32/0x50
[ 20.269982] check_preemption_disabled+0xd1/0xe0
[ 20.270289] brd_insert_page.part.0+0xbe/0x190 [brd]
[ 20.270664] brd_submit_bio+0x33f/0xf40 [brd]
Use radix_tree_maybe_preload() which does preload only if
gfpflags_allow_blocking() is true but also takes the lock. Therefore,
unconditionally calling radix_tree_preload_end() should not create any
issues and the message disappears.
Fixes: 6ded703c56 ("brd: check for REQ_NOWAIT and set correct page allocation mask")
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Link: https://lore.kernel.org/r/20230217121442.33914-1-p.raghav@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aaa3c08ee0 upstream.
Even after commit 908d4bb7c5 ("qede: fix interrupt coalescing
configuration"), some entries of the coal_entry array may theoretically
be used uninitialized:
1. qede_alloc_fp_array() allocates QEDE_MAX_RSS_CNT entries for
coal_entry. The initial allocation uses kcalloc, so everything is
initialized.
2. The user sets a small number of queues (ethtool -L).
coal_entry is reallocated for the actual small number of queues.
3. The user sets a bigger number of queues.
coal_entry is reallocated bigger. The added entries are not
necessarily initialized.
In practice, the reallocations will actually keep using the originally
allocated region of memory, but we should not rely on it.
The reallocation is unnecessary. coal_entry can always have
QEDE_MAX_RSS_CNT entries.
Fixes: 908d4bb7c5 ("qede: fix interrupt coalescing configuration")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Nacked-by: Manish Chopra <manishc@marvell.com>
Acked-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>