Currently, mte_set_mem_tag_range() and mte_zero_clear_page_tags() use
DC {GVA,GZVA} unconditionally. But, they should make sure that
DCZID_EL0.DZP, which indicates whether or not use of those instructions
is prohibited, is zero when using those instructions.
Use ST{G,ZG,Z2G} instead when DCZID_EL0.DZP == 1.
Fixes: 013bb59dbb ("arm64: mte: handle tags zeroing at page allocation time")
Fixes: 3d0cca0b02 ("kasan: speed up mte_set_mem_tag_range")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20211206004736.1520989-3-reijiw@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 685e2564da)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I1e7615a060d86aa743426334ec79d69b4299c7e8
For previous version, it uses 'sg_table.nent's to traverse sg_table in pages
free flow.
However, 'sg_table.nents' is reassigned in 'dma_map_sg', it means the number of
created entries in the DMA adderess space.
So, use 'sg_table.nents' in pages free flow will case some pages can't be freed.
Here we should use sg_table.orig_nents to free pages memory, but use the
sgtable helper 'for each_sgtable_sg'(, instead of the previous rather common
helper 'for_each_sg' which maybe cause memory leak) is much better.
Fixes: d963ab0f15 ("dma-buf: system_heap: Allocate higher order pages if available")
Signed-off-by: Guangming <Guangming.Cao@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org> # 5.11.*
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211126074904.88388-1-guangming.cao@mediatek.com
(cherry picked from commit 679d94cd7d)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I90e1b7e9ea5d1fbabc86a0c1eddbdd2d8887e086
As Vincent reports in:
https://lore.kernel.org/r/20211118163417.21617-1-vincent.whitchurch@axis.com
The put_user() in schedule_tail() can get stuck in a livelock, similar
to a problem recently fixed on riscv in commit:
285a76bb2c ("riscv: evaluate put_user() arg before enabling user access")
In __raw_put_user() we have a critical section between
uaccess_ttbr0_enable() and uaccess_ttbr0_disable() where we cannot
safely call into the scheduler without having taken an exception, as
schedule() and other scheduling functions will not save/restore the
TTBR0 state. If either of the `x` or `ptr` arguments to __raw_put_user()
contain a blocking call, we may call into the scheduler within the
critical section. This can result in two problems:
1) The access within the critical section will occur without the
required TTBR0 tables installed. This will fault, and where the
required tables permit access, the access will be retried without the
required tables, resulting in a livelock.
2) When TTBR0 SW PAN is in use, check_and_switch_context() does not
modify TTBR0, leaving a stale value installed. The mappings of the
blocked task will erroneously be accessible to regular accesses in
the context of the new task. Additionally, if the tables are
subsequently freed, local TLB maintenance required to reuse the ASID
may be lost, potentially resulting in TLB corruption (e.g. in the
presence of CnP).
The same issue exists for __raw_get_user() in the critical section
between uaccess_ttbr0_enable() and uaccess_ttbr0_disable().
A similar issue exists for __get_kernel_nofault() and
__put_kernel_nofault() for the critical section between
__uaccess_enable_tco_async() and __uaccess_disable_tco_async(), as the
TCO state is not context-switched by direct calls into the scheduler.
Here the TCO state may be lost from the context of the current task,
resulting in unexpected asynchronous tag check faults. It may also be
leaked to another task, suppressing expected tag check faults.
To fix all of these cases, we must ensure that we do not directly call
into the scheduler in their respective critical sections. This patch
reworks __raw_put_user(), __raw_get_user(), __get_kernel_nofault(), and
__put_kernel_nofault(), ensuring that parameters are evaluated outside
of the critical sections. To make this requirement clear, comments are
added describing the problem, and line spaces added to separate the
critical sections from other portions of the macros.
For __raw_get_user() and __raw_put_user() the `err` parameter is
conditionally assigned to, and we must currently evaluate this in the
critical section. This behaviour is relied upon by the signal code,
which uses chains of put_user_error() and get_user_error(), checking the
return value at the end. In all cases, the `err` parameter is a plain
int rather than a more complex expression with a blocking call, so this
is safe.
In future we should try to clean up the `err` usage to remove the
potential for this to be a problem.
Aside from the changes to time of evaluation, there should be no
functional change as a result of this patch.
Reported-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20211118163417.21617-1-vincent.whitchurch@axis.com
Fixes: f253d827f3 ("arm64: uaccess: refactor __{get,put}_user")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20211122125820.55286-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 94902d849e)
[connoro: adjust __raw_{get,put}_user comments to reflect 5.10 code]
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I1c7f95b34b3bc53179ddaa7cb1fb38cfd9cc2e77
The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes:
one as the system register encoding (used for the sys_id field of
__ftr_reg_entry), and the other as the register name (stringified
and used for the name field of arm64_ftr_reg), which is debug
information. The id argument is supposed to be a macro that
indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc).
ARM64_FTR_REG(), which also has the same id argument,
uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro.
Since the id argument is completely macro-expanded before it is
substituted into a macro body of ARM64_FTR_REG_OVERRIDE(),
the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not
a human-readable register name, but a string of numeric bitwise
operations.
Fix this so that human-readable register names are available as
debug information.
Fixes: 8f266a5d87 ("arm64: cpufeature: Add global feature override facility")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 9dc232a8ab)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I53c28eabd5a7e5499cf62a17e4d4d826130c2562
While commit 097b9146c0 ("net: fix up truesize of cloned
skb in skb_prepare_for_shift()") fixed immediate issues found
when KFENCE was enabled/tested, there are still similar issues,
when tcp_trim_head() hits KFENCE while the master skb
is cloned.
This happens under heavy networking TX workloads,
when the TX completion might be delayed after incoming ACK.
This patch fixes the WARNING in sk_stream_kill_queues
when sk->sk_mem_queued/sk->sk_forward_alloc are not zero.
Fixes: d3fb45f370 ("mm, kfence: insert KFENCE hooks for SLAB")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20211102004555.1359210-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit c4777efa75)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I5e456705bd01396c05c79009aeba36e00829e037
In RHEL's gating selftests we've encountered memory corruption in the
uffd event test even with upstream kernel:
# ./userfaultfd anon 128 4
nr_pages: 32768, nr_pages_per_cpu: 32768
bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
testing uffd-wp with pagemap (pgsize=4096): done
testing uffd-wp with pagemap (pgsize=2097152): done
testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
ERROR: faulting process failed (errno=0, line=1117)
It can be easily reproduced when global thp enabled, which is the
default for RHEL.
It's also known as a side effect of commit 0db282ba2c ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which
is imho right itself on using mmap() to make sure the addresses will be
untagged even on arm.
The problem is, for each test we allocate buffers using two
allocate_area() calls. We assumed these two buffers won't affect each
other, however they could, because mmap() could have found that the two
buffers are near each other and having the same VMA flags, so they got
merged into one VMA.
It won't be a big problem if thp is not enabled, but when thp is
agressively enabled it means when initializing the src buffer it could
accidentally setup part of the dest buffer too when there's a shared THP
that overlaps the two regions. Then some of the dest buffer won't be
able to be trapped by userfaultfd missing mode, then it'll cause memory
corruption as described.
To fix it, do release_pages() after initializing the src buffer.
Since the previous two release_pages() calls are after
uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
stronger than release pages; as unmap() also tear town pgtables), drop
them as they shouldn't really be anything useful.
We can mark the Fixes tag upon 0db282ba2c as it's reported to only
happen there, however the real "Fixes" IMHO should be 8ba6e86408, as
before that commit we'll always do explicit release_pages() before
registration of uffd, and 8ba6e86408 changed that logic by adding
extra unmap/map and we didn't release the pages at the right place.
Meanwhile I don't have a solid glue anyway on whether posix_memalign()
could always avoid triggering this bug, hence it's safer to attach this
fix to commit 8ba6e86408.
Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e86408 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 8913970c19)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I5f5c06c9e3be4a6e521415edb785ddcc0be81b21
The KVM page-table library refcounts the pages of concatenated stage-2
PGDs individually. However, when running KVM in protected mode, the
host's stage-2 PGD is currently managed by EL2 as a single high-order
compound page, which can cause the refcount of the tail pages to reach 0
when they shouldn't, hence corrupting the page-table.
Fix this by introducing a new hyp_split_page() helper in the EL2 page
allocator (matching the kernel's split_page() function), and make use of
it from host_s2_zalloc_pages_exact().
Fixes: 1025c8c0c6 ("KVM: arm64: Wrap the host with a stage 2")
Acked-by: Will Deacon <will@kernel.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005090155.734578-5-qperret@google.com
(cherry picked from commit 1d58a17ef5)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Idf0d8328710517aee822f77a2097295734c78e7a
tx-fifo-resize is now added by default by the dwc3-qcom driver
to the SNPS DWC3 child node.
So, lets drop the tx-fifo-resize property from dwc3-qcom nodes
as having it there will cause the dwc3-qcom driver to error and
abort probe with:
[ 1.362938] dwc3-qcom 8af8800.usb: unable to add property
[ 1.368405] dwc3-qcom 8af8800.usb: failed to register DWC3 Core, err=-17
Fixes: cefdd52fa0 ("usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default")
Signed-off-by: Robert Marko <robimarko@gmail.com>
Link: https://lore.kernel.org/r/20210902220325.1783567-1-robimarko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit da546d6b74)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I0ffb00d40b43772af3c226141183ca47325e748f
This reverts commit cefdd52fa0.
On sc7180-trogdor class devices with 'fw_devlink=permissive' and KASAN
enabled, you'll see a Use-After-Free reported at bootup.
The root of the problem is that dwc3_qcom_of_register_core() is adding
a devm-allocated "tx-fifo-resize" property to its device tree node
using of_add_property().
The issue is that of_add_property() makes a _permanent_ addition to
the device tree that lasts until reboot. That means allocating memory
for the property using "devm" managed memory is a terrible idea since
that memory will be freed upon probe deferral or device unbinding.
Let's revert the patch since the system is still functional without
it. The fact that of_add_property() makes a permanent change is extra
fodder for those folks who were aruging that the device tree isn't
really the right way to pass information between parts of the
driver. It is an exercise left to the reader to submit a patch
re-adding the new feature in a way that makes everyone happier.
Fixes: cefdd52fa0 ("usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20211207094327.1.Ie3cde3443039342e2963262a4c3ac36dc2c08b30@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6a97cee39d)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I69a18d3518e8cc9c43e93225f7427642b42a931b
USB TCPCI Spec, 4.4.3 Mask Registers:
"A masked register will still indicate in the ALERT register, but shall
not set the Alert# pin low."
Thus, the Extended Status will still indicate in ALERT register if vSafe0V
is detected by TCPC even though being masked. In current code, howerer,
this event will not be handled in detection time. Rather it will be
handled when next ALERT event coming(CC evnet, PD event, etc).
Tcpm might transition to a wrong state in this situation. Thus, the vSafe0V
event should not be handled when it's masked.
Fixes: 766c485b86 ("usb: typec: tcpci: Add support to report vSafe0V")
cc: <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20210926101415.3775058-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 05300871c0)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Iad4f5330bf080f5f00b8599064c23e6d59e0ad51
When we have a dependency of the form:
Device-A -> Device-C
Device-B
Device-C -> Device-B
Where,
* Indentation denotes "child of" parent in previous line.
* X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
We have cyclic dependency: device-A -> device-C -> device-B -> device-A
fw_devlink current treats device-C -> device-B dependency as an invalid
dependency and doesn't enforce it but leaves the rest of the
dependencies as is.
While the current behavior is necessary, it is not sufficient if the
false dependency in this example is actually device-A -> device-C. When
this is the case, device-C will correctly probe defer waiting for
device-B to be added, but device-A will be incorrectly probe deferred by
fw_devlink waiting on device-C to probe successfully. Due to this, none
of the devices in the cycle will end up probing.
To fix this, we need to go relax all the dependencies in the cycle like
we already do in the other instances where fw_devlink detects cycles.
A real world example of this was reported[1] and analyzed[2].
[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
[2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
Fixes: f9aa460672 ("driver core: Refactor fw_devlink feature")
Cc: stable <stable@vger.kernel.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2de9d8e0d2)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: Ib8453b5aa3320bb0da9f775702e65021623ca4d8
Since the commit e5efaeb8a8 ("bootconfig: Support mixing
a value and subkeys under a key") allows to co-exist a value
node and key nodes under a node, xbc_node_for_each_child()
is not only returning key node but also a value node.
In the boot-time tracing using xbc_node_for_each_child() to
iterate the events, groups and instances, but those must be
key nodes. Thus it must use xbc_node_for_each_subkey().
Link: https://lkml.kernel.org/r/163112988361.74896.2267026262061819145.stgit@devnote2
Fixes: e5efaeb8a8 ("bootconfig: Support mixing a value and subkeys under a key")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
(cherry picked from commit cfd799837d)
Bug: 187129171
Signed-off-by: Connor O'Brien <connoro@google.com>
Change-Id: I9c6a525f2a6a09f95298d3cf8b31495d13fec91f
Some UDCs may return an error during pullup disable as part of the
unbind path for a USB configuration. This will lead to a scenario
where the disable() callback is skipped, whereas the unbind() still
occurs. If this happens, the u_serial driver will continue to fail
subsequent binds, due to an already existing entry in the ports array.
Ensure that gserial_disconnect() is called during the f_serial unbind,
so the ports entry is properly cleared.
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Link: https://lore.kernel.org/r/20220111064850.24311-1-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 215650635
(cherry picked from commit d6dd18efd0)
Change-Id: I6d345345f5a86d73def1634d9860c598542e42a1
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
The following deadlock has been observed on a test setup:
- All tags allocated
- The SCSI error handler calls ufshcd_eh_host_reset_handler()
- ufshcd_eh_host_reset_handler() queues work that calls
ufshcd_err_handler()
- ufshcd_err_handler() locks up as follows:
Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
Call trace:
__switch_to+0x298/0x5d8
__schedule+0x6cc/0xa94
schedule+0x12c/0x298
blk_mq_get_tag+0x210/0x480
__blk_mq_alloc_request+0x1c8/0x284
blk_get_request+0x74/0x134
ufshcd_exec_dev_cmd+0x68/0x640
ufshcd_verify_dev_init+0x68/0x35c
ufshcd_probe_hba+0x12c/0x1cb8
ufshcd_host_reset_and_restore+0x88/0x254
ufshcd_reset_and_restore+0xd0/0x354
ufshcd_err_handler+0x408/0xc58
process_one_work+0x24c/0x66c
worker_thread+0x3e8/0xa4c
kthread+0x150/0x1b4
ret_from_fork+0x10/0x30
Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
request.
Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 945c3cca05 git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Bug: 204438323
Bug: 218587336
Change-Id: Ib8027a51cc4b7bec7ddd69719f0f7f4a6e8dfb3a
Signed-off-by: Bart Van Assche <bvanassche@google.com>
8250_core registers 4 ISA uart ports by default, which can cause
problems on some devices which don't have them. This change doesn't
break earlycon=uart8250, but it will cause the 8250_of and 8250_pci sub
drivers to be unable to register ports. Boards that really need the full
8250 driver to take over from earlycon can use the "8250.nr_uarts=X"
kernel command line option to restore the ports allocation.
Bug: 216312411
Signed-off-by: Alistair Delva <adelva@google.com>
Change-Id: I04715394b32bd98544657101de4537df34554ea9
[Lee: Fixed merge conflict]
Signed-off-by: Lee Jones <lee.jones@linaro.org>
commit 9aa422ad32 upstream.
The function tipc_mon_rcv() allows a node to receive and process
domain_record structs from peer nodes to track their views of the
network topology.
This patch verifies that the number of members in a received domain
record does not exceed the limit defined by MAX_MON_DOMAIN, something
that may otherwise lead to a stack overflow.
tipc_mon_rcv() is called from the function tipc_link_proto_rcv(), where
we are reading a 32 bit message data length field into a uint16. To
avert any risk of bit overflow, we add an extra sanity check for this in
that function. We cannot see that happen with the current code, but
future designers being unaware of this risk, may introduce it by
allowing delivery of very large (> 64k) sk buffers from the bearer
layer. This potential problem was identified by Eric Dumazet.
This fixes CVE-2022-0435
Reported-by: Samuel Page <samuel.page@appgate.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 35c55c9877 ("tipc: add neighbor monitoring framework")
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Samuel Page <samuel.page@appgate.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3c7e594355)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5da5bc6880456ec91e6d3f3a283d2c24b6cc269c
This is used for go/bandwidth-limiting.
Test: TreeHugger
Bug: 157552970
Signed-off-by: Patrick Rohr <prohr@google.com>
Change-Id: I89831743c63dff7d62e41e2fd457e6e628d7d0a2
(cherry picked from commit 8843b0a47e)
With thermal frameworks of-thermal interface, thermal zone parameters can
be defined in devicetree. This includes cooling device mitigation levels
for a thermal zone. To take advantage of this, cooling device should use
the thermal_of_cooling_device_register API to register a cooling device.
Use thermal_of_cooling_device_register API to register the power supply
cooling device. This enables power supply cooling device be included in the
thermal zone parameter in devicetree.
Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
Bug: 211709650
Link: https://lore.kernel.org/linux-pm/1640162489-7847-2-git-send-email-quic_manafm@quicinc.com/
Change-Id: Ie0d527543adb8590ec52df96bf3e4d0f1f022d0a
Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
This reverts commit 9c63be2ada.
Reason for revert: android12 still relies on OTH bits being set.
Bug: 218458907
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Change-Id: Idac3d9a515188d718b7bf5c01105531f2e9bacdc
The cgroup release_agent is called with call_usermodehelper. The function
call_usermodehelper starts the release_agent with a full set fo capabilities.
Therefore require capabilities when setting the release_agaent.
Reported-by: Tabitha Sable <tabitha.c.sable@gmail.com>
Tested-by: Tabitha Sable <tabitha.c.sable@gmail.com>
Fixes: 81a6a5cdd2 ("Task Control Groups: automatic userspace notification of idle cgroups")
Cc: stable@vger.kernel.org # v2.6.24+
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry picked from commit 24f6008564)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: If663568d24f781c58c882d69a5b81a9327c0e6fe
f2fs rw_semaphores work better if writers can starve readers,
especially for the checkpoint thread, because writers are strictly
more important than reader threads. This prevents significant priority
inversion between low-priority readers that blocked while trying to
acquire the read lock and a second acquisition of the write lock that
might be blocking high priority work.
Bug: 214413989
Signed-off-by: Tim Murray <timmurray@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit e4544b63a7
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Change-Id: Ia0eb86447488c5ba9845a6b2eb98652200e08281
This patch tries to mitigate lock contention between f2fs_write_checkpoint and
f2fs_get_node_info along with nat_tree_lock.
The idea is, if checkpoint is currently running, other threads that try to grab
nat_tree_lock would be better to wait for checkpoint.
Bug: 214413989
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit a9419b63bf
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master)
Change-Id: I7e253e8bf26012e451330ba6ce00bfc3839bdda3
Now that RCU_BOOST handles CFS threads, enable it.
Bug: 217236054
Test: ensure CFS threads are boosted, TH
Signed-off-by: Tim Murray <timmurray@google.com>
Change-Id: Idd02467f6caad063e14aa5496617b8bbaf0e9ab1
Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx
before reporting the expedited quiescent state. Under heavy real-time
load, this can result in this function being preempted before the
quiescent state is reported, which can in turn prevent the expedited grace
period from completing. Tim Murray reports that the resulting expedited
grace periods can take hundreds of milliseconds and even more than one
second, when they should normally complete in less than a millisecond.
This patch follows Neeraj's suggestion (seconded by Tim and by Uladzislau
Rezki) of simply reversing the two operations.
Reported-by: Tim Murray <timmurray@google.com>
Reported-by: Joel Fernandes <joelaf@google.com>
Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Bug: 217236054
Link: https://lore.kernel.org/lkml/20220204232406.814-9-paulmck@kernel.org/
Signed-off-by: Tim Murray <timmurray@google.com>
Change-Id: Ic0f0330b65a0c776a563e03e4b59bf4ab4fbccf6
Consider a case where ffs_func_eps_disable is called from
ffs_func_disable as part of composition switch and at the
same time ffs_epfile_release get called from userspace.
ffs_epfile_release will free up the read buffer and call
ffs_data_closed which in turn destroys ffs->epfiles and
mark it as NULL. While this was happening the driver has
already initialized the local epfile in ffs_func_eps_disable
which is now freed and waiting to acquire the spinlock. Once
spinlock is acquired the driver proceeds with the stale value
of epfile and tries to free the already freed read buffer
causing use-after-free.
Following is the illustration of the race:
CPU1 CPU2
ffs_func_eps_disable
epfiles (local copy)
ffs_epfile_release
ffs_data_closed
if (last file closed)
ffs_data_reset
ffs_data_clear
ffs_epfiles_destroy
spin_lock
dereference epfiles
Fix this races by taking epfiles local copy & assigning it under
spinlock and if epfiles(local) is null then update it in ffs->epfiles
then finally destroy it.
Extending the scope further from the race, protecting the ep related
structures, and concurrent accesses.
Fixes: a9e6f83c2d (usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable)
Reviewed-by: John Keeping <john@metanate.com>
Signed-off-by: Pratham Pratap <quic_ppratap@quicinc.com>
Co-developed-by: Udipto Goswami <quic_ugoswami@quicinc.com>
Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
Link: https://lore.kernel.org/r/1643256595-10797-1-git-send-email-quic_ugoswami@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ebe2b1add1https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-linus)
BUG: 217829161
Change-Id: Iab64af51aece85df3208afd7b6cd108b955eae45
Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
commit cfd0d84ba2 upstream.
In 4.13, commit 74310e06be ("android: binder: Move buffer out of area shared with user space")
fixed a kernel structure visibility issue. As part of that patch,
sizeof(void *) was used as the buffer size for 0-length data payloads so
the driver could detect abusive clients sending 0-length asynchronous
transactions to a server by enforcing limits on async_free_size.
Unfortunately, on the "free" side, the accounting of async_free_space
did not add the sizeof(void *) back. The result was that up to 8-bytes of
async_free_space were leaked on every async transaction of 8-bytes or
less. These small transactions are uncommon, so this accounting issue
has gone undetected for several years.
The fix is to use "buffer_size" (the allocated buffer size) instead of
"size" (the logical buffer size) when updating the async_free_space
during the free operation. These are the same except for this
corner case of asynchronous transactions with payloads < 8 bytes.
Fixes: 74310e06be ("android: binder: Move buffer out of area shared with user space")
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable@vger.kernel.org # 4.14+
Link: https://lore.kernel.org/r/20211220190150.2107077-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change-Id: I25b00d31a7a7009888ec6e92e4fc43ee1b14e401
Although it is usually safe to invoke synchronize_rcu_expedited() from a
preemption-enabled CPU-hotplug notifier, if it is invoked from a notifier
between CPUHP_AP_RCUTREE_ONLINE and CPUHP_AP_ACTIVE, its attempts to
invoke a workqueue handler will hang due to RCU waiting on a CPU that
the scheduler is not paying attention to. This commit therefore expands
use of the existing workqueue-independent synchronize_rcu_expedited()
from early boot to also include CPUs that are being hotplugged.
Bug: 216238044
Link: https://lore.kernel.org/lkml/7359f994-8aaf-3cea-f5cf-c0d3929689d6@quicinc.com/
Reported-by: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
(cherry picked from commit 710f460c395af6b81df1c81043308aaa60d5e25c
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next)
Change-Id: I3f81dee6deaf6a4504aec31e058785dc8cee6a3f
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Add restricted vendor hook for arch_setup_dma_ops to allow vendor
enhancements. This needs to be restricted vendor hook as it is
doing GFP_KERNEL allocation which are non-atomic in nature.
Bug: 214353193
Change-Id: I45f8f0404a247b67fd07a6831ff813bbc50fbca2
Signed-off-by: Charan Teja Reddy <quic_charante@quicinc.com>