commit 50252e4b5e upstream.
signalfd_poll() and binder_poll() are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case. This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution. This solution is for the queue to be cleared
before it is freed, by sending a POLLFREE notification to all waiters.
Unfortunately, only eventpoll handles POLLFREE. A second type of
non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't
handle POLLFREE. This allows a use-after-free to occur if a signalfd or
binder fd is polled with aio poll, and the waitqueue gets freed.
Fix this by making aio poll handle POLLFREE.
A patch by Ramji Jiyani <ramjiyani@google.com>
(https://lore.kernel.org/r/20211027011834.2497484-1-ramjiyani@google.com)
tried to do this by making aio_poll_wake() always complete the request
inline if POLLFREE is seen. However, that solution had two bugs.
First, it introduced a deadlock, as it unconditionally locked the aio
context while holding the waitqueue lock, which inverts the normal
locking order. Second, it didn't consider that POLLFREE notifications
are missed while the request has been temporarily de-queued.
The second problem was solved by my previous patch. This patch then
properly fixes the use-after-free by handling POLLFREE in a
deadlock-free way. It does this by taking advantage of the fact that
freeing of the waitqueue is RCU-delayed, similar to what eventpoll does.
Fixes: 2c14fa838c ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 185125206
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I748544276cf2fe214097751507d9c0ee4e3d3475
commit 363bee27e2 upstream.
Currently, aio_poll_wake() will always remove the poll request from the
waitqueue. Then, if aio_poll_complete_work() sees that none of the
polled events are ready and the request isn't cancelled, it re-adds the
request to the waitqueue. (This can easily happen when polling a file
that doesn't pass an event mask when waking up its waitqueue.)
This is fundamentally broken for two reasons:
1. If a wakeup occurs between vfs_poll() and the request being
re-added to the waitqueue, it will be missed because the request
wasn't on the waitqueue at the time. Therefore, IOCB_CMD_POLL
might never complete even if the polled file is ready.
2. When the request isn't on the waitqueue, there is no way to be
notified that the waitqueue is being freed (which happens when its
lifetime is shorter than the struct file's). This is supposed to
happen via the waitqueue entries being woken up with POLLFREE.
Therefore, leave the requests on the waitqueue until they are actually
completed (or cancelled). To keep track of when aio_poll_complete_work
needs to be scheduled, use new fields in struct poll_iocb. Remove the
'done' field which is now redundant.
Note that this is consistent with how sys_poll() and eventpoll work;
their wakeup functions do *not* remove the waitqueue entries.
Fixes: 2c14fa838c ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 185125206
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic85396773d98ef3ccf48559462557e4faa3289c3
commit 9537bae0da upstream.
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll
and aio poll are fortunately not affected by this, but it's very
fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert signalfd to use wake_up_pollfree().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: d80e731eca ("epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 185125206
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1d97ac9c9fbb28c164bd4b51deeefbbb139205e7
commit a880b28a71 upstream.
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll
and aio poll are fortunately not affected by this, but it's very
fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert binder to use wake_up_pollfree().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: f5cb779ba1 ("ANDROID: binder: remove waitqueue when thread exits.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 185125206
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8354c40ed73a7d88132a74a388704f0eb307a618
commit 42288cb44c upstream.
Several ->poll() implementations are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case. This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution. This solution is for the queue to be cleared
before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.
However, that has a bug: wake_up_poll() calls __wake_up() with
nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters,
and the wakeup function for the first one returns a positive value, only
that one will be called. That's *not* what's needed for POLLFREE;
POLLFREE is special in that it really needs to wake up everyone.
Considering the three non-blocking poll systems:
- io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.
- aio poll is unaffected, since it doesn't support exclusive waits.
However, that's fragile, as someone could add this feature later.
- epoll doesn't appear to be broken by this, since its wakeup function
returns 0 when it sees POLLFREE. But this is fragile.
Although there is a workaround (see epoll), it's better to define a
function which always sends POLLFREE to all waiters. Add such a
function. Also make it verify that the queue really becomes empty after
all waiters have been woken up.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 185125206
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4f69da5bbbad53975024d027fa1bbe22522c6efe
Under some conditions, USB gadget devices can show allocated buffer
contents to a host. Fix this up by zero-allocating them so that any
extra data will all just be zeros.
Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 86ebbc11bb)
Bug: 210292367
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I72b4376cd4296a8b8af0ade2d702cd420146f3aa
Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6ba ("scsi: ufs: core: Improve SCSI abort handling").
Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0dc ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1fbaa02dfd git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
[Stanley: Resolved minor conflict in drivers/scsi/ufshcd.c]
Bug: 204438323
Change-Id: Ifdf7f016c0d1986fe905f13be8abbeb54af4bce5
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
hba->outstanding_tasks, which is read under host_lock spinlock, tells the
interrupt handler what task management tags are in use by the driver. The
doorbell register bits indicate which tags are in use by the hardware. A
doorbell bit that is 0 is because the bit has yet to be set by the driver,
or because the task is complete. It is only possible to disambiguate the 2
cases, if reading/writing the doorbell register is synchronized with
reading/writing hba->outstanding_tasks.
For that reason, reading REG_UTP_TASK_REQ_DOOR_BELL must be done under
spinlock.
Bug: 210094292
(cherry picked from commit 5cb37a2635)
Change-Id: I9a83393fe97682a271ec67834dc2d2888d3fbb60
Link: https://lore.kernel.org/r/20211108064815.569494-3-adrian.hunter@intel.com
Fixes: f5ef336fd2 ("scsi: ufs: core: Fix task management completion")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
__ufshcd_issue_tm_cmd() clears req->end_io_data after timing out, which
races with the completion function ufshcd_tmc_handler() which expects
req->end_io_data to have a value.
Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already
synchronized using hba->tmf_rqs and hba->outstanding_tasks under the
host_lock spinlock.
It is also not necessary (nor typical) to clear req->end_io_data because
the block layer does it before allocating out requests e.g. via
blk_get_request().
So fix by not clearing it.
Bug: 210094292
(cherry picked from commit 886fe2915c)
Change-Id: I2c6f8b81f2aed10a85c167aa97dcbe9496677de5
[Stanley: Resolved minor conflict in drivers/scsi/ufshcd.c]
Link: https://lore.kernel.org/r/20211108064815.569494-2-adrian.hunter@intel.com
Fixes: f5ef336fd2 ("scsi: ufs: core: Fix task management completion")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Sometimes USB hosts can ask for buffers that are too large from endpoint
0, which should not be allowed. If this happens for OUT requests, stall
the endpoint, but for IN requests, trim the request size to the endpoint
buffer size.
Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 153a2d7e33)
Bug: 210292367
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9bbd6154177d7a1fb6c2e3a3dffa96634d85bb7f
(upstream commit 669bc5a188)
As we register two usb buses for each xHC, and systems with several
hosts are more and more common it is getting hard to follow the
flow of debug messages without knowing which bus they belong to
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Bug: 202901721
Signed-off-by: Puma Hsu <pumahsu@google.com>
Change-Id: I55428c864c57e5e10c71ae2e539ca086db31a52d
(upstream commit 0d9b9f533b)
Add more debugging messages to follow what happends to a URB internally
in special cases like URB cancel, halted endpoints and endpoint reset.
Helps tracking issues like URB never given back by host.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Bug: 202901721
Signed-off-by: Puma Hsu <pumahsu@google.com>
Change-Id: Ief6507db231a115f138c78f288929736a631a385
(upstream commit 2847c46c61)
This reverts commit 5d5323a6f3.
That commit effectively disabled Intel host initiated U1/U2 lpm for devices
with periodic endpoints.
Before that commit we disabled host initiated U1/U2 lpm if the exit latency
was larger than any periodic endpoint service interval, this is according
to xhci spec xhci 1.1 specification section 4.23.5.2
After that commit we incorrectly checked that service interval was smaller
than U1/U2 inactivity timeout. This is not relevant, and can't happen for
Intel hosts as previously set U1/U2 timeout = 105% * service interval.
Patch claimed it solved cases where devices can't be enumerated because of
bandwidth issues. This might be true but it's a side effect of accidentally
turning off lpm.
exit latency calculations have been revised since then
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Bug: 202901721
Signed-off-by: Puma Hsu <pumahsu@google.com>
Change-Id: I5d77ab7e34805730c94da9d2a0052fb6096a0b69
(upstream commit 94f339147f)
Only TDs with status TD_CLEARING_CACHE will be given back after
cache is cleared with a set TR deq command.
xhci_invalidate_cached_td() failed to set the TD_CLEARING_CACHE status
for some cancelled TDs as it assumed an endpoint only needs to clear the
TD it stopped on.
This isn't always true. For example with streams enabled an endpoint may
have several stream rings, each stopping on a different TDs.
Note that if an endpoint has several stream rings, the current code
will still only clear the cache of the stream pointed to by the last
cancelled TD in the cancel list.
This patch only focus on making sure all canceled TDs are given back,
avoiding hung task after device removal.
Another fix to solve clearing the caches of all stream rings with
cancelled TDs is needed, but not as urgent.
This issue was simultanously discovered and debugged by
by Tao Wang, with a slightly different fix proposal.
Fixes: 674f8438c1 ("xhci: split handling halted endpoints into two steps")
Cc: <stable@vger.kernel.org> #5.12
Reported-by: Tao Wang <wat@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Bug: 202901721
Signed-off-by: Puma Hsu <pumahsu@google.com>
Change-Id: I0ceff10453a99183d27bc53e64c2c193e0ac429a
Before commit fc0c209c14 ("clk: Allow parents to be specified without
string names") child clks couldn't find their parent until the parent
clk was added to a list in __clk_core_init(). After that commit, child
clks can reference their parent clks directly via a clk_hw pointer, or
they can lookup that clk_hw pointer via DT if the parent clk is
registered with an OF clk provider.
The common clk framework treats hw->core being non-NULL as "the clk is
registered" per the logic within clk_core_fill_parent_index():
parent = entry->hw->core;
/*
* We have a direct reference but it isn't registered yet?
* Orphan it and let clk_reparent() update the orphan status
* when the parent is registered.
*/
if (!parent)
Therefore we need to be extra careful to not set hw->core until the clk
is fully registered with the clk framework. Otherwise we can get into a
situation where a child finds a parent clk and we move the child clk off
the orphan list when the parent isn't actually registered, wrecking our
enable accounting and breaking critical clks.
Consider the following scenario:
CPU0 CPU1
---- ----
struct clk_hw clkBad;
struct clk_hw clkA;
clkA.init.parent_hws = { &clkBad };
clk_hw_register(&clkA) clk_hw_register(&clkBad)
... __clk_register()
hw->core = core
...
__clk_register()
__clk_core_init()
clk_prepare_lock()
__clk_init_parent()
clk_core_get_parent_by_index()
clk_core_fill_parent_index()
if (entry->hw) {
parent = entry->hw->core;
At this point, 'parent' points to clkBad even though clkBad hasn't been
fully registered yet. Ouch! A similar problem can happen if a clk
controller registers orphan clks that are referenced in the DT node of
another clk controller.
Let's fix all this by only setting the hw->core pointer underneath the
clk prepare lock in __clk_core_init(). This way we know that
clk_core_fill_parent_index() can't see hw->core be non-NULL until the
clk is fully registered.
Fixes: fc0c209c14 ("clk: Allow parents to be specified without string names")
Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>
Link: https://lore.kernel.org/r/20211109043438.4639-1-quic_mdtipton@quicinc.com
[sboyd@kernel.org: Reword commit text, update comment]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Bug: 208605820
(cherry picked from commit 54baf56eaahttps://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git clk-next)
Change-Id: Iee7ea8a1ba3a95a4985c2e689bcc4484c33153f1
Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>
Long ago there wasn't a FOLL_LONGTERM flag so this DAX check was done by
post-processing the VMA list.
These days it is trivial to just check each VMA to see if it is DAX before
processing it inside __get_user_pages() and return failure if a DAX VMA is
encountered with FOLL_LONGTERM.
Removing the allocation of the VMA list is a significant speed up for many
call sites.
Add an IS_ENABLED to vma_is_fsdax so that code generation is unchanged
when DAX is compiled out.
Remove the dummy version of __gup_longterm_locked() as !CONFIG_CMA already
makes memalloc_nocma_save(), check_and_migrate_cma_pages(), and
memalloc_nocma_restore() into a NOP.
Bug: 209719897
Link: https://lkml.kernel.org/r/0-v1-5551df3ed12e+b8-gup_dax_speedup_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Minchan Kim <minchan@google.com>
(cherry picked from commit 52650c8b46)
Change-Id: I8be099dc7b617916254c2650ff8a55a6b926a32e
Export clocksource_mmio_init and clocksource_mmio_readl_up
to support building clocksource driver as module.
Bug: 194108974
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
Change-Id: I63bab35efa6ca2c8b0c6283f6d42c13db66568af
clocksource driver may use sched_clock_register
to resigter itself as a sched_clock source.
Export it to support building such driver
as module, like timer-imx-tpm.c
Bug: 194108974
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
Change-Id: Id23f3da624a1e70fc1a44daf6f827c03dc1d053d
This information can be used to check how much time we need to give to issue
all the discard commands.
Bug: 206863097
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit fc4ae5492ca4afd7a8a9d261f4908b09f221d314
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Change-Id: Ibd2f1d6c171f584ec9ca3817d9ea561db98f4693
Export symbol of the function wq_worker_comm() in kernel/workqueue.c for dlkm to get the description of the kworker process.
Bug: 208394207
Signed-off-by: zhengding chen <chenzhengding@oppo.com>
Change-Id: I2e7ddd52a15e22e99e6596f16be08243af1bb473
When binder transaction happened, We want to know it's transaction code
and it is oneway or not.
Bug: 208910215
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: Ic03e5481e96e120a1953a101895714db04ca2bdf
Remove executable bit on abi_gki_aarch64_mtk file.
This file was added in 55d7c4eca6, the mode of
abi_gki_aarch64.xml was fixed in eb02ea0e, this
patch fixes the other file that from the
55d7c4eca6 patch.
Bug: 149040612
Fixes: 55d7c4eca6 ("ANDROID: Update symbol list for mtk")
Signed-off-by: Pat Tjin <pattjin@google.com>
Change-Id: I3800136eb7d7f54a89508660ae40e662ee5ac571
We observed the following deadlock in the stress test under low
memory scenario:
Thread A Thread B
- erofs_shrink_scan
- erofs_try_to_release_workgroup
- erofs_workgroup_try_to_freeze -- A
- z_erofs_do_read_page
- z_erofs_collection_begin
- z_erofs_register_collection
- erofs_insert_workgroup
- xa_lock(&sbi->managed_pslots) -- B
- erofs_workgroup_get
- erofs_wait_on_workgroup_freezed -- A
- xa_erase
- xa_lock(&sbi->managed_pslots) -- B
To fix this, it needs to hold xa_lock before freezing the workgroup
since xarray will be touched then. So let's hold the lock before
accessing each workgroup, just like what we did with the radix tree
before.
[ Gao Xiang: Jianhua Hao also reports this issue at
https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xiaomi.com ]
Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@oppo.com
Fixes: 64094a0441 ("erofs: convert workstn to XArray")
Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Huang Jianan <huangjianan@oppo.com>
Reported-by: Jianhua Hao <haojianhua1@xiaomi.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com>
Change-Id: Icdfb520c073de8773d44cff64fd5f7a8499cdc85
Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com>
Bug:208318427
(cherry picked from commit 57bbeacdbe)
Signed-off-by: Gao Xiang <xiang@kernel.org>
Change-Id: Ie282b4362898fa8e0c829b8c2454887e953a2610
Signed-off-by: haojianhua1 <haojianhua1@xiaomi.com>
Without initialization, it will be random data and hard for
vendor hook to decide.
Bug: 207739506
Change-Id: I278772d87eea38c03a40d4f0bef20ac8644e2ecd
Signed-off-by: Maria Yu <quic_aiquny@quicinc.com>
Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:
CPU1 CPU2
==================================================================
per_cpu(sd_llc_id, CPUX) => 0
partition_sched_domains_locked()
detach_destroy_domains()
cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
per_cpu(sd_llc_id, CPUX) => 0
per_cpu(sd_llc_id, CPUX) = CPUX
per_cpu(sd_llc_id, CPUX) => CPUX
return false
ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
is a warning triggered from ttwu_queue_wakelist().
Avoid a such race in cpus_share_cache() by always returning true when
this_cpu == that_cpu.
Fixes: 518cd62341 ("sched: Only queue remote wakeups when crossing cache boundaries")
Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20211104175120.857087-1-vincent.donnefort@arm.com
Bug: 204726704
Change-Id: Ib6e59187b6d7d7dcabae84e3541d5fbe0dfc400a
(cherry picked from commit 42dc938a59)
Signed-off-by: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com>
There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL
before actual I/O submission. Thus, the decompression chain can be
extended if the following pcluster chain hooks such tail pcluster.
As the related comment mentioned, if some page is made of a hooked
pcluster and another followed pcluster, it can be reused for in-place
I/O (since I/O should be submitted anyway):
_______________________________________________________________
| tail (partial) page | head (partial) page |
|_____PRIMARY_HOOKED___|____________PRIMARY_FOLLOWED____________|
However, it's by no means safe to reuse as pagevec since if such
PRIMARY_HOOKED pclusters finally move into bypass chain without I/O
submission. It's somewhat hard to reproduce with LZ4 and I just found
it (general protection fault) by ro_fsstressing a LZMA image for long
time.
I'm going to actively clean up related code together with multi-page
folio adaption in the next few months. Let's address it directly for
easier backporting for now.
Call trace for reference:
z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs]
z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs]
z_erofs_runqueue+0x5f3/0x840 [erofs]
z_erofs_readahead+0x1e8/0x320 [erofs]
read_pages+0x91/0x270
page_cache_ra_unbounded+0x18b/0x240
filemap_get_pages+0x10a/0x5f0
filemap_read+0xa9/0x330
new_sync_read+0x11b/0x1a0
vfs_read+0xf1/0x190
Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org
Fixes: 3883a79abd ("staging: erofs: introduce VLE decompression support")
Cc: <stable@vger.kernel.org> # 4.19+
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Bug: 206904737
Change-Id: Ib34644bf29f3f6dc0369cc51869c8b259f8d0f0a
(cherry picked from commit 86432a6dca)
Signed-off-by: Huang Jianan <huangjianan@oppo.com>
[ Upstream commit 2628844812 ]
In the endpoint interrupt functions
dwc3_gadget_endpoint_transfer_in_progress() and
dwc3_gadget_endpoint_trbs_complete() will dereference the endpoint
descriptor. But it could be cleared in __dwc3_gadget_ep_disable()
when accessory disconnected. So we need to check whether it is null
or not before dereferencing it.
Bug: 202829886
Bug: 204052921
Fixes: f09ddcfcb8 ("usb: dwc3: gadget: Prevent EP queuing while stopping transfers")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Jack Pham <quic_jackp@quicinc.com>
Signed-off-by: Albert Wang <albertccwang@google.com>
Link: https://lore.kernel.org/r/20211109092642.3507692-1-albertccwang@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change-Id: I767f61f3c8840b7854a467ab6bf18c26c89757c2
Signed-off-by: Albert Wang <albertccwang@google.com>
Allow the following command to be run to make a build of fips140.ko
that has CONFIG_CRYPTO_FIPS140_MOD_EVAL_TESTING enabled:
BUILD_CONFIG=common/build.config.gki.aarch64.fips140_eval_testing ./build/build.sh
Bug: 188620248
Change-Id: I0e0be487974c6ad40f3135fc5fec6aa107aab78c
Signed-off-by: Eric Biggers <ebiggers@google.com>
To hot unplug a CPU, the idle task on that CPU calls a few layers of C
code before finally leaving the kernel. When KASAN is in use, poisoned
shadow is left around for each of the active stack frames, and when
shadow call stacks are in use. When shadow call stacks (SCS) are in use
the task's saved SCS SP is left pointing at an arbitrary point within
the task's shadow call stack.
When a CPU is offlined than onlined back into the kernel, this stale
state can adversely affect execution. Stale KASAN shadow can alias new
stackframes and result in bogus KASAN warnings. A stale SCS SP is
effectively a memory leak, and prevents a portion of the shadow call
stack being used. Across a number of hotplug cycles the idle task's
entire shadow call stack can become unusable.
We previously fixed the KASAN issue in commit:
e1b77c9298 ("sched/kasan: remove stale KASAN poison after hotplug")
... by removing any stale KASAN stack poison immediately prior to
onlining a CPU.
Subsequently in commit:
f1a0a376ca ("sched/core: Initialize the idle task with preemption disabled")
... the refactoring left the KASAN and SCS cleanup in one-time idle
thread initialization code rather than something invoked prior to each
CPU being onlined, breaking both as above.
We fixed SCS (but not KASAN) in commit:
63acd42c0d ("sched/scs: Reset the shadow stack when idle_task_exit")
... but as this runs in the context of the idle task being offlined it's
potentially fragile.
To fix these consistently and more robustly, reset the SCS SP and KASAN
shadow of a CPU's idle task immediately before we online that CPU in
bringup_cpu(). This ensures the idle task always has a consistent state
when it is running, and removes the need to so so when exiting an idle
task.
Whenever any thread is created, dup_task_struct() will give the task a
stack which is free of KASAN shadow, and initialize the task's SCS SP,
so there's no need to specially initialize either for idle thread within
init_idle(), as this was only necessary to handle hotplug cycles.
I've tested this on arm64 with:
* gcc 11.1.0, defconfig +KASAN_INLINE, KASAN_STACK
* clang 12.0.0, defconfig +KASAN_INLINE, KASAN_STACK, SHADOW_CALL_STACK
... offlining and onlining CPUS with:
| while true; do
| for C in /sys/devices/system/cpu/cpu*/online; do
| echo 0 > $C;
| echo 1 > $C;
| done
| done
Fixes: f1a0a376ca ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Qian Cai <quic_qiancai@quicinc.com>
Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/
Bug: 207346964
(cherry picked from commit dce1ca0525https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/urgent)
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Change-Id: Ia2db6477afddcc01fae00c6eb925fb3ec2791130
this critical region should be protected by pool->mutex.
Bug: 207658347
Fixes: e7dac4c323 ("ANDROID: dma-buf: heaps: Add a shrinker controlled page pool")
Signed-off-by: liuhailong <liuhailong@oppo.com>
Signed-off-by: xieliujie <xieliujie@oppo.com>
Change-Id: I6f129926c96176258a965964c24602fc647db61e