select_fallback_rq() must return a cpu that is valid for the task.
However, when nid is not -1, it skips checking for
task_cpu_possible_mask().
This causes a problem when execve-ing 32 bit apps on an asymmetric
system where not all cpus are 32 bit capable. During execve-ing
the task is marked as 32 bit long before its affinity mask is
restricted.
If the cpu goes offline during this time, select_fallback_rq()
could return a 64 bit only cpu, which __migrate_tasks()/
is_cpu_allowed() rejects.
migrate_tasks() will therefore continue to pick the same task
repeatedly, where __migrate_tasks() rejects the cpu chosen
by select_fallback_rq() every time, leading to an infinite loop.
Correct the issue by updating select_fallback_rq() for the case
where nid is not -1, ensuring that the returned cpu is always
valid for this task.
Bug: 192050156
Change-Id: Ia073a8395a02485f6d1c1daa0f3ce9e2029cb1f4
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
In order to update cpufreq, vendor modules invoke cpufreq_update_util(),
but when we build our modules, report error:
ERROR: modpost: "cpufreq_update_util_data" [xxx.ko] undefined!
Bug: 192218676
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: Ib1da70229f04b08d8d812d065021dec0bf891e0e
This is technically a backwards incompatible change in behaviour, but I'm
going to argue that it is very unlikely to break things, and likely to fix
*far* more then it breaks.
In no particular order, various reasons follow:
(a) I've long had a bug assigned to myself to debug a super rare kernel crash
on Android Pixel phones which can (per stacktrace) be traced back to BPF clat
IPv6 to IPv4 protocol conversion causing some sort of ugly failure much later
on during transmit deep in the GSO engine, AFAICT precisely because of this
change to gso_size, though I've never been able to manually reproduce it. I
believe it may be related to the particular network offload support of attached
USB ethernet dongle being used for tethering off of an IPv6-only cellular
connection. The reason might be we end up with more segments than max permitted,
or with a GSO packet with only one segment... (either way we break some
assumption and hit a BUG_ON)
(b) There is no check that the gso_size is > 20 when reducing it by 20, so we
might end up with a negative (or underflowing) gso_size or a gso_size of 0.
This can't possibly be good. Indeed this is probably somehow exploitable (or
at least can result in a kernel crash) by delivering crafted packets and perhaps
triggering an infinite loop or a divide by zero... As a reminder: gso_size (MSS)
is related to MTU, but not directly derived from it: gso_size/MSS may be
significantly smaller then one would get by deriving from local MTU. And on
some NICs (which do loose MTU checking on receive, it may even potentially be
larger, for example my work pc with 1500 MTU can receive 1520 byte frames [and
sometimes does due to bugs in a vendor plat46 implementation]). Indeed even just
going from 21 to 1 is potentially problematic because it increases the number
of segments by a factor of 21 (think DoS, or some other crash due to too many
segments).
(c) It's always safe to not increase the gso_size, because it doesn't result in
the max packet size increasing. So the skb_increase_gso_size() call was always
unnecessary for correctness (and outright undesirable, see later). As such the
only part which is potentially dangerous (ie. could cause backwards compatibility
issues) is the removal of the skb_decrease_gso_size() call.
(d) If the packets are ultimately destined to the local device, then there is
absolutely no benefit to playing around with gso_size. It only matters if the
packets will egress the device. ie. we're either forwarding, or transmitting
from the device.
(e) This logic only triggers for packets which are GSO. It does not trigger for
skbs which are not GSO. It will not convert a non-GSO MTU sized packet into a
GSO packet (and you don't even know what the MTU is, so you can't even fix it).
As such your transmit path must *already* be able to handle an MTU 20 bytes
larger then your receive path (for IPv4 to IPv6 translation) - and indeed 28
bytes larger due to IPv4 fragments. Thus removing the skb_decrease_gso_size()
call doesn't actually increase the size of the packets your transmit side must
be able to handle. ie. to handle non-GSO max-MTU packets, the IPv4/IPv6 device/
route MTUs must already be set correctly. Since for example with an IPv4 egress
MTU of 1500, IPv4 to IPv6 translation will already build 1520 byte IPv6 frames,
so you need a 1520 byte device MTU. This means if your IPv6 device's egress
MTU is 1280, your IPv4 route must be 1260 (and actually 1252, because of the
need to handle fragments). This is to handle normal non-GSO packets. Thus the
reduction is simply not needed for GSO packets, because when they're correctly
built, they will already be the right size.
(f) TSO/GSO should be able to exactly undo GRO: the number of packets (TCP
segments) should not be modified, so that TCP's MSS counting works correctly
(this matters for congestion control). If protocol conversion changes the
gso_size, then the number of TCP segments may increase or decrease. Packet loss
after protocol conversion can result in partial loss of MSS segments that the
sender sent. How's the sending TCP stack going to react to receiving ACKs/SACKs
in the middle of the segments it sent?
(g) skb_{decrease,increase}_gso_size() are already no-ops for GSO_BY_FRAGS
case (besides triggering WARN_ON_ONCE). This means you already cannot guarantee
that gso_size (and thus resulting packet MTU) is changed. ie. you must assume
it won't be changed.
(h) changing gso_size is outright buggy for UDP GSO packets, where framing
matters (I believe that's also the case for SCTP, but it's already excluded
by [g]). So the only remaining case is TCP, which also doesn't want it
(see [f]).
(i) see also the reasoning on the previous attempt at fixing this
(commit fa7b83bf3b) which shows that the current
behaviour causes TCP packet loss:
In the forwarding path GRO -> BPF 6 to 4 -> GSO for TCP traffic, the
coalesced packet payload can be > MSS, but < MSS + 20.
bpf_skb_proto_6_to_4() will upgrade the MSS and it can be > the payload
length. After then tcp_gso_segment checks for the payload length if it
is <= MSS. The condition is causing the packet to be dropped.
tcp_gso_segment():
[...]
mss = skb_shinfo(skb)->gso_size;
if (unlikely(skb->len <= mss)) goto out;
[...]
Thus changing the gso_size is simply a very bad idea. Increasing is unnecessary
and buggy, and decreasing can go negative.
Fixes: 6578171a7f ("bpf: add bpf_skb_change_proto helper")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dongseok Yi <dseok.yi@samsung.com>
Cc: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/bpf/CANP3RGfjLikQ6dg=YpBU0OeHvyv7JOki7CyOUS9modaXAi-9vQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20210617000953.2787453-2-zenczykowski@gmail.com
(cherry picked from commit 364745fbe9https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=364745fbe981a4370f50274475da4675661104df )
Test: builds, TreeHugger
Bug: 188690383
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I0ef3174cbd3caaa42d5779334a9c0bfdc9ab81f5
This patch avoids KMI, so later should be reverted to sync with upstream in
next KMI update.
Bug: 192095860
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: If861ecde381f23b5d5f18005063ec55356673fdb
This reverts commit ae618c699c.
It causes early device being stuck.
Bug: 192088222
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I5117e7f8aaa9433b74f1531619cdc1b687d02b41
Logging callback symbolic name is generating too many different
messages making Abort analysis miss big trends.
Stick to console reported message providing sufficient information.
Bug: 120445600
Signed-off-by: Thierry Strudel <tstrudel@google.com>
Change-Id: Ic0ea662a60919454060e3a085aeabd8a4099e0b4
Add the rwsem list add vendor hook symbol which is needed for
vendor modules.
Bug: 192041655
Signed-off-by: Huang Yiwei <hyiwei@codeaurora.org>
Change-Id: I838fbb9d067d940e962eff94e8c875c30e153ee1
When watchdog or anr occur, we need to read
dev/binderfs/binder_logs/proc/pid or dev/binderfs/binder_logs/state
node to know the time-consuming information of the binder call.
We need to add the time-consuming information of binder transaction.
Bug: 190413570
Signed-off-by: zhang chuang <zhangchuang3@xiaomi.com>
Change-Id: I0423d4e821d5cd725a848584133dc7245cbc233a
* aosp/upstream-f2fs-stable-linux-5.10.y:
Revert "f2fs: avoid attaching SB_ACTIVE flag during mount/remount"
f2fs: remove false alarm on iget failure during GC
f2fs: enable extent cache for compression files in read-only
f2fs: fix to avoid adding tab before doc section
f2fs: introduce f2fs_casefolded_name slab cache
f2fs: swap: support migrating swapfile in aligned write mode
f2fs: swap: remove dead codes
f2fs: compress: add compress_inode to cache compressed blocks
f2fs: clean up /sys/fs/f2fs/<disk>/features
f2fs: add pin_file in feature list
f2fs: Advertise encrypted casefolding in sysfs
f2fs: Show casefolding support only when supported
f2fs: support RO feature
f2fs: logging neatening
Bug: 186107892
Bug: 190759634
Bug: 190517210
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: I8eb93d8a43304b98166676da52a9c2434b15b942
Iova is identified in the alloc vendor hook by its pfn (iova->pfn_lo)
and in free vendor hook by its address (iova->pfn_lo << iova_shift(iovad)).
Change alloc vendor hook to use address as well for consistency.
Bug: 187861158
Change-Id: I8255f3e5899008b80a9f8ed960e2ba948ba13cc2
Signed-off-by: Guangming Cao <Guangming.Cao@mediatek.com>
Pre and post tracepoints in force_compatible_cpus_allowed_ptr() need
to be restricted hooks so that they can sleep.
The old non-restricted versions need to stay in place temporarily for
KMI stability. They will be removed by aosp/1742588.
Bug: 187917024
Change-Id: If630554b1c8fa2e8ccb79c89945c55e17756e6a8
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
Update ETE yaml file to bring it inline with upstream.
The change is minor and updates the ports property.
Bug: 174685394
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Ia5bb2ddcbdfa171e998f4f023a0a164ac14f5dd3
Mark the buffer as truncated in irq + minor cleanups picked along the
way.
Bug: 174685394
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I36111a14ef066b579a28c39e265f3d601a82e18a
Make the comment in line with the upstream version.
Bug: 174685394
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I8c239d37c96c0c8de1261df6d2a5a2defd43ccfb
At the moment, we check the availability of SPE on the given
CPU (i.e, SPE is implemented and is allowed at the host) during
every guest entry. This can be optimized a bit by moving the
check to vcpu_load time and recording the availability of the
feature on the current CPU via a new flag. This will also be useful
for adding the TRBE support.
Bug: 174685394
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Alexandru Elisei <Alexandru.Elisei@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405164307.1720226-7-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
(cherry picked from commit d2602bb4f5)
[Conflict in: arch/arm64/kvm/hyp/nvhe/debug-sr.c
Trivial conflict because of an additional __debug_save_trace() call
since a previous version of the series was backported already.]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I4ca2bb14f7b647a1e118ea4a8c4313154a482685
Add SPRD_TIMER config into GKI for the same reason as other vendors have
described before.
Bug: 188373235
Change-Id: I70f1296f11759d9010252a52a88f9329744172f6
Signed-off-by: Orson Zhai <orson.zhai@unisoc.com>
Since LLVM commit 3787ee4, the '-stack-alignment' flag has been dropped
[1], leading to the following error message when building a LTO kernel
with Clang-13 and LLD-13:
ld.lld: error: -plugin-opt=-: ld.lld: Unknown command line argument
'-stack-alignment=8'. Try 'ld.lld --help'
ld.lld: Did you mean '--stackrealign=8'?
It also appears that the '-code-model' flag is not necessary anymore
starting with LLVM-9 [2].
Drop '-code-model' and make '-stack-alignment' conditional on LLD < 13.0.0.
These flags were necessary because these flags were not encoded in the
IR properly, so the link would restart optimizations without them. Now
there are properly encoded in the IR, and these flags exposing
implementation details are no longer necessary.
[1] https://reviews.llvm.org/D103048
[2] https://reviews.llvm.org/D52322
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/f2c018ee-5999-741e-58d4-e482d5246067@mailbox.org
(cherry picked from commit 2398ce8015)
Change-Id: Icebcd5669e851e2c26e9f807b9d1aeb86e95dcea
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
longterm_pinner/alloc_contig_failed should be 444
for reading for everyone and threshold/failure_traking
should be 644 to allow read/write root.
Bug: 191827366
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I5493c6ae6d39b692282e5428416da80a7d555aa0
PSI accounts stalls for each cgroup separately and aggregates it at each
level of the hierarchy. This causes additional overhead with psi_avgs_work
being called for each cgroup in the hierarchy. psi_avgs_work has been
highly optimized, however on systems with large number of cgroups the
overhead becomes noticeable.
Systems which use PSI only at the system level could avoid this overhead
if PSI can be configured to skip per-cgroup stall accounting.
Add "cgroup_disable=pressure" kernel command-line option to allow
requesting system-wide only pressure stall accounting. When set, it
keeps system-wide accounting under /proc/pressure/ but skips accounting
for individual cgroups and does not expose PSI nodes in cgroup hierarchy.
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/patchwork/patch/1435705
(cherry picked from commit 3958e2d0c3https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git tj)
Bug: 178872719
Bug: 191734423
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ifc8fbc52f9a1131d7c2668edbb44c525c76c3360
Some interrupts (such as the rescheduling IPI) rely on not going through
the irq_enter()/irq_exit() calls. To distinguish such interrupts, add
a new IRQ flag that allows the low-level handling code to sidestep the
enter()/exit() calls.
Only the architecture code is expected to use this. It will do the wrong
thing on normal interrupts. Note that this is a band-aid until we can
move to some more correct infrastructure (such as kernel/entry/common.c).
Bug: 191808738
Link: https://lore.kernel.org/lkml/20201124141449.572446-3-maz@kernel.org/
Change-Id: I0609a8b689219ba9e769c8b9f7fcf1e77a0ff1ca
Signed-off-by: Marc Zyngier <maz@kernel.org>
[minor port to 5.10]
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Some arch-specific flags need to be set/cleared, but not exposed to
random device drivers. Introduce a new helper (__irq_modify_status())
that takes an arbitrary mask, and rewrite irq_modify_status() to use
this new helper.
No functionnal change.
Bug: 191808738
Link: https://lore.kernel.org/lkml/20201124141449.572446-5-maz@kernel.org/
Change-Id: I2c2c0d6599d0ab39fad22462bf4c87694362fba8
Signed-off-by: Marc Zyngier <maz@kernel.org>
[minor port to 5.10]
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Add the vendor hook to qos.c, because of some special cases related to
our feature. we add the hook at freq_qos_add_request and remove_request
to make sure we can go to our own qos process logic.
Bug: 187458531
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: I1fb8fd6134432ecfb44ad242c66ccd8280ab9b43
Map the permission gating RTM_GETNEIGH/RTM_GETNEIGHTBL messages to a
new permission so that it can be distinguished from the other netlink
route permissions in selinux policy. The new permission is triggered by
a flag set in system images T and up.
This change is intended to be backported to all kernels that a T system
image can run on top of.
Bug: 171572148
Test: atest NetworkInterfaceTest
Test: atest CtsSelinuxTargetSdkCurrentTestCases
Test: atest bionic-unit-tests-static
Test: On Cuttlefish, run combinations of:
- Policy bit set or omitted (see https://r.android.com/1701847)
- This patch applied or omitted
- App having nlmsg_readneigh permission or not
Verify that only the combination of this patch + the policy bit being
set + the app not having the nlmsg_readneigh permission prevents the
app from sending RTM_GETNEIGH messages.
Change-Id: I4bcfce4decb34ea9388eeedfc4be67403de8a980
Signed-off-by: Bram Bonné <brambonne@google.com>
(cherry picked from commit fac07550bd)
This patch removes setting SBI_NEED_FSCK when GC gets an error on f2fs_iget,
since f2fs_iget can give ENOMEM and others by race condition.
If we set this critical fsck flag, we'll get EIO during fsync via the below
code path.
In f2fs_inplace_write_data(),
if (is_sbi_flag_set(sbi, SBI_NEED_FSCK) || f2fs_cp_error(sbi)) {
err = -EIO;
goto drop_bio;
}
Fixes: 9557727876 ("f2fs: drop inplace IO if fs status is abnormal")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When CONFIG_DMABUF_SYSFS_STATS=y, dma_buf_do_mmap() will call a
dma-bufs mmap() callback before overriding the open() and close()
callbacks for the dma-buf's vm_operations_stuct, such that
dma_buf_vma_open() and dma_buf_vma_close() are used instead. Each of
the two aforementioend callbacks assumes that the vma->vm_file pointer
they use points to a dma-buf file.
However, it is possible that during the invocation of the dma-buf's
mmap() callback, that the vma->vm_file pointer changes to store
something other than a dma-buf file (it is permissible to do this so
long as certain conditions are met, see the callsite of call_mmap in
mm/mmap.c as of commit a2e00b4b5d ("FROMGIT: mm/slub: add taint
after the errors are printed"). This means that when
dma_buf_vma_open() and dma_buf_vma_close() run, that their accesses to
vma->vm_file will cease to be semantically valid.
Accordingly, only override the open() and close() vm_operations_struct
callbacks if a dma-buf's mmap() callback preserves the vma->vm_file
across the mmap() callback.
Bug: 191742286
Fixes: 9132fbe545 ("ANDROID: dmabuf: Add mmap_count to struct dmabuf")
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Change-Id: I4f80ade7f0bc85e2cb9219478550dcb6bbb29f3e
Mediatek UFS reset function relies on Reset Control provided by
reset-ti-syscon. To make Mediatek Reset Control work properly, select
reset-ti-syscon to ensure it is being built.
In addition, establish device_link to wait until reset-ti-syscon
initialization is complete during UFS probing.
Bug: 191731858
Change-Id: I14f20e0a5ae5b2fed213db71729180dfe43d0034
(cherry picked from commit de48898d0c
git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Link: https://lore.kernel.org/r/1622601720-22466-1-git-send-email-peter.wang@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>