When an encryption policy has the IV_INO_LBLK_32 flag set, the IV
generation method involves hashing the inode number. This is different
from fscrypt's other IV generation methods, where the inode number is
either not used at all or is included directly in the IVs.
Therefore, in principle IV_INO_LBLK_32 can work with any length inode
number. However, currently fscrypt gets the inode number from
inode::i_ino, which is 'unsigned long'. So currently the implementation
limit is actually 32 bits (like IV_INO_LBLK_64), since longer inode
numbers will have been truncated by the VFS on 32-bit platforms.
Fix fscrypt_supported_v2_policy() to enforce the correct limit.
This doesn't actually matter currently, since only ext4 and f2fs support
IV_INO_LBLK_32, and they both only support 32-bit inode numbers. But we
might as well fix it in case it matters in the future.
Ideally inode::i_ino would instead be made 64-bit, but for now it's not
needed. (Note, this limit does *not* prevent filesystems with 64-bit
inode numbers from adding fscrypt support, since IV_INO_LBLK_* support
is optional and is useful only on certain hardware.)
Fixes: e3b1078bed ("fscrypt: add support for IV_INO_LBLK_32 policies")
Reported-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20200824203841.1707847-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
(cherry picked from commit 5e895bd4d5)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0aa264c7e919cf2055b539953bef63a99d347d8d
Some of the newly added code in the etm4x driver is inside of an #ifdef,
and some other code is outside of it, leading to a harmless warning when
CONFIG_CPU_PM is disabled:
drivers/hwtracing/coresight/coresight-etm4x.c:68:13: error: 'etm4_os_lock' defined but not used [-Werror=unused-function]
static void etm4_os_lock(struct etmv4_drvdata *drvdata)
^~~~~~~~~~~~
To avoid the warning and simplify the the #ifdef checks, use
IS_ENABLED() instead, so the compiler can drop the unused functions
without complaining.
Fixes: f188b5e76a ("coresight: etm4x: Save/restore state across CPU low power states")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[Fixed capital 'f' in title]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20191213223107.1484-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 500589d8bd)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I307db5366f27a8e67e13bba2f8614d21d8152c99
If the specified/hinted sink is not reachable from a subset of the CPUs,
we could end up unable to trace the event on those CPUs. This
is the best effort we could do until we support 1:1 configurations.
Fail gracefully in such cases avoiding a WARN_ON, which can be easily
triggered by the user on certain platforms (Arm N1SDP), with the following
trace paths :
CPU0
\
-- Funnel0 --> ETF0 -->
/ \
CPU1 \
MainFunnel
CPU2 /
\ /
-- Funnel1 --> ETF1 -->
/
CPU1
$ perf record --per-thread -e cs_etm/@ETF0/u -- <app>
could trigger the following WARNING, when the event is scheduled
on CPU2.
[10919.513250] ------------[ cut here ]------------
[10919.517861] WARNING: CPU: 2 PID: 24021 at
drivers/hwtracing/coresight/coresight-etm-perf.c:316 etm_event_start+0xf8/0x100
...
[10919.564403] CPU: 2 PID: 24021 Comm: perf Not tainted 5.8.0+ #24
[10919.570308] pstate: 80400089 (Nzcv daIf +PAN -UAO BTYPE=--)
[10919.575865] pc : etm_event_start+0xf8/0x100
[10919.580034] lr : etm_event_start+0x80/0x100
[10919.584202] sp : fffffe001932f940
[10919.587502] x29: fffffe001932f940 x28: fffffc834995f800
[10919.592799] x27: 0000000000000000 x26: fffffe0011f3ced0
[10919.598095] x25: fffffc837fce244c x24: fffffc837fce2448
[10919.603391] x23: 0000000000000002 x22: fffffc8353529c00
[10919.608688] x21: fffffc835bb31000 x20: 0000000000000000
[10919.613984] x19: fffffc837fcdcc70 x18: 0000000000000000
[10919.619281] x17: 0000000000000000 x16: 0000000000000000
[10919.624577] x15: 0000000000000000 x14: 00000000000009f8
[10919.629874] x13: 00000000000009f8 x12: 0000000000000018
[10919.635170] x11: 0000000000000000 x10: 0000000000000000
[10919.640467] x9 : fffffe00108cd168 x8 : 0000000000000000
[10919.645763] x7 : 0000000000000020 x6 : 0000000000000001
[10919.651059] x5 : 0000000000000002 x4 : 0000000000000001
[10919.656356] x3 : 0000000000000000 x2 : 0000000000000000
[10919.661652] x1 : fffffe836eb40000 x0 : 0000000000000000
[10919.666949] Call trace:
[10919.669382] etm_event_start+0xf8/0x100
[10919.673203] etm_event_add+0x40/0x60
[10919.676765] event_sched_in.isra.134+0xcc/0x210
[10919.681281] merge_sched_in+0xb0/0x2a8
[10919.685017] visit_groups_merge.constprop.140+0x15c/0x4b8
[10919.690400] ctx_sched_in+0x15c/0x170
[10919.694048] perf_event_sched_in+0x6c/0xa0
[10919.698130] ctx_resched+0x60/0xa0
[10919.701517] perf_event_exec+0x288/0x2f0
[10919.705425] begin_new_exec+0x4c8/0xf58
[10919.709247] load_elf_binary+0x66c/0xf30
[10919.713155] exec_binprm+0x15c/0x450
[10919.716716] __do_execve_file+0x508/0x748
[10919.720711] __arm64_sys_execve+0x40/0x50
[10919.724707] do_el0_svc+0xf4/0x1b8
[10919.728095] el0_sync_handler+0xf8/0x124
[10919.732003] el0_sync+0x140/0x180
Even though we don't support using separate sinks for the ETMs yet (e.g,
for 1:1 configurations), we should at least honor the user's choice and
handle the limitations gracefully, by simply skipping the tracing on ETMs
which can't reach the requested sink.
Fixes: f9d81a657b ("coresight: perf: Allow tracing on hotplugged CPUs")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200916191737.4001561-11-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 859d510e58)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I924ae31b1b5f3bd87543527549a504f1607d3a0e
In commit f188b5e76a ("coresight: etm4x: Save/restore state
across CPU low power states"), mistakenly TRCVMIDCCTLR1 register
value was saved in trcvmidcctlr0 state variable which is used to
store TRCVMIDCCTLR0 register value in etm4x_cpu_save() and then
same value is written back to both TRCVMIDCCTLR0 and TRCVMIDCCTLR1
in etm4x_cpu_restore(). There is already a trcvmidcctlr1 state
variable available for TRCVMIDCCTLR1, so use it.
Fixes: f188b5e76a ("coresight: etm4x: Save/restore state across CPU low power states")
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200928163513.70169-26-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3477326277)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1a5951b58f2c412d81e1ee1ec52393726d61c9e5
Changes in 4.19.157
powercap: restrict energy meter to root access
Linux 4.19.157
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie418582e2db8a7f2f1eb4f1034fe530e0132afe3
commit 949dd0104c upstream.
Remove non-privileged user access to power data contained in
/sys/class/powercap/intel-rapl*/*/energy_uj
Non-privileged users currently have read access to power data and can
use this data to form a security attack. Some privileged
drivers/applications need read access to this data, but don't expose it
to non-privileged users.
For example, thermald uses this data to ensure that power management
works correctly. Thus removing non-privileged access is preferred over
completely disabling this power reporting capability with
CONFIG_INTEL_RAPL=n.
Fixes: 95677a9a38 ("PowerCap: Fix mode for energy counter")
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 4c451dba25.
AOSP's distribution of GNU binutils always had a curious target triple
prefix on the binaries. Now that GNU binutils is deprecated for Android
Common Kernels, we can now remove this out of tree workaround. Now
building Android kernels with LLVM matches upstream (see
Documentation/kbuild/llvm.rst).
Bug: 118439987
Bug: 120440614
Bug: 141693040
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Change-Id: Iecaa3264a440f795f2f3a44bdf74fe28ad4ed1cc
Changes in 4.19.156
drm/i915: Break up error capture compression loops with cond_resched()
tipc: fix use-after-free in tipc_bcast_get_mode
ptrace: fix task_join_group_stop() for the case when current is traced
cadence: force nonlinear buffers to be cloned
chelsio/chtls: fix memory leaks caused by a race
chelsio/chtls: fix always leaking ctrl_skb
gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
gianfar: Account for Tx PTP timestamp in the skb headroom
net: usb: qmi_wwan: add Telit LE910Cx 0x1230 composition
sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platforms
sfp: Fix error handing in sfp_probe()
blktrace: fix debugfs use after free
btrfs: extent_io: Kill the forward declaration of flush_write_bio
btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
Revert "btrfs: flush write bio if we loop in extent_write_cache_pages"
btrfs: flush write bio if we loop in extent_write_cache_pages
btrfs: extent_io: Handle errors better in extent_write_full_page()
btrfs: extent_io: Handle errors better in btree_write_cache_pages()
btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()
Btrfs: fix unwritten extent buffers and hangs on future writeback attempts
btrfs: Don't submit any btree write bio if the fs has errors
btrfs: Move btrfs_check_chunk_valid() to tree-check.[ch] and export it
btrfs: tree-checker: Make chunk item checker messages more readable
btrfs: tree-checker: Make btrfs_check_chunk_valid() return EUCLEAN instead of EIO
btrfs: tree-checker: Check chunk item at tree block read time
btrfs: tree-checker: Verify dev item
btrfs: tree-checker: Fix wrong check on max devid
btrfs: tree-checker: Enhance chunk checker to validate chunk profile
btrfs: tree-checker: Verify inode item
btrfs: tree-checker: fix the error message for transid error
Fonts: Replace discarded const qualifier
ALSA: usb-audio: Add implicit feedback quirk for Zoom UAC-2
ALSA: usb-audio: add usb vendor id as DSD-capable for Khadas devices
ALSA: usb-audio: Add implicit feedback quirk for Qu-16
ALSA: usb-audio: Add implicit feedback quirk for MODX
mm: mempolicy: fix potential pte_unmap_unlock pte error
lib/crc32test: remove extra local_irq_disable/enable
kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled
mm: always have io_remap_pfn_range() set pgprot_decrypted()
gfs2: Wake up when sd_glock_disposal becomes zero
ring-buffer: Fix recursion protection transitions between interrupt context
ftrace: Fix recursion check for NMI test
ftrace: Handle tracing when switching between context
tracing: Fix out of bounds write in get_trace_buf
futex: Handle transient "ownerless" rtmutex state correctly
ARM: dts: sun4i-a10: fix cpu_alert temperature
x86/kexec: Use up-to-dated screen_info copy to fill boot params
of: Fix reserved-memory overlap detection
blk-cgroup: Fix memleak on error path
blk-cgroup: Pre-allocate tree node on blkg_conf_prep
scsi: core: Don't start concurrent async scan on same host
vsock: use ns_capable_noaudit() on socket create
drm/vc4: drv: Add error handding for bind
ACPI: NFIT: Fix comparison to '-ENXIO'
vt: Disable KD_FONT_OP_COPY
fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent
serial: 8250_mtk: Fix uart_get_baud_rate warning
serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init
USB: serial: cyberjack: fix write-URB completion race
USB: serial: option: add Quectel EC200T module support
USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
USB: serial: option: add Telit FN980 composition 0x1055
USB: Add NO_LPM quirk for Kingston flash drive
usb: mtu3: fix panic in mtu3_gadget_stop()
ARC: stack unwinding: avoid indefinite looping
Revert "ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE"
PM: runtime: Resume the device earlier in __device_release_driver()
perf/core: Fix a memory leak in perf_event_parse_addr_filter()
tools: perf: Fix build error in v4.19.y
net: dsa: read mac address from DT for slave device
arm64: dts: marvell: espressobin: Add ethernet switch aliases
Linux 4.19.156
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I87af8871465f54de0332fa74bc1f342b7fe99061
commit b64d814257 upstream.
Espressobin boards have 3 ethernet ports and some of them got assigned more
then one MAC address. MAC addresses are stored in U-Boot environment.
Since commit a2c7023f70 ("net: dsa: read mac address from DT for slave
device") kernel can use MAC addresses from DT for particular DSA port.
Currently Espressobin DTS file contains alias just for ethernet0.
This patch defines additional ethernet aliases in Espressobin DTS files, so
bootloader can fill correct MAC address for DSA switch ports if more MAC
addresses were specified.
DT alias ethernet1 is used for wan port, DT aliases ethernet2 and ethernet3
are used for lan ports for both Espressobin revisions (V5 and V7).
Fixes: 5253cb8c00 ("arm64: dts: marvell: espressobin: add ethernet alias")
Cc: <stable@vger.kernel.org> # a2c7023f70: dsa: read mac address
Signed-off-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
[pali: Backported Espressobin rev V5 changes to 5.4 and 4.19 versions]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
perf may fail to build in v4.19.y with the following error.
util/evsel.c: In function ‘perf_evsel__exit’:
util/util.h:25:28: error:
passing argument 1 of ‘free’ discards ‘const’ qualifier from pointer target type
This is observed (at least) with gcc v6.5.0. The underlying problem is
the following statement.
zfree(&evsel->pmu_name);
evsel->pmu_name is decared 'const *'. zfree in turn is defined as
#define zfree(ptr) ({ free(*ptr); *ptr = NULL; })
and thus passes the const * to free(). The problem is not seen
in the upstream kernel since zfree() has been rewritten there.
The problem has been introduced into v4.19.y with the backport of upstream
commit d4953f7ef1 (perf parse-events: Fix 3 use after frees found with
clang ASAN).
One possible fix of this problem would be to not declare pmu_name
as const. This patch chooses to typecast the parameter of zfree()
to void *, following the guidance from the upstream kernel which
does the same since commit 7f7c536f23 ("tools lib: Adopt
zalloc()/zfree() from tools/perf")
Fixes: a0100a3630 ("perf parse-events: Fix 3 use after frees found with clang ASAN")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7bdb157cde upstream.
As shown through runtime testing, the "filename" allocation is not
always freed in perf_event_parse_addr_filter().
There are three possible ways that this could happen:
- It could be allocated twice on subsequent iterations through the loop,
- or leaked on the success path,
- or on the failure path.
Clean up the code flow to make it obvious that 'filename' is always
freed in the reallocation path and in the two return paths as well.
We rely on the fact that kfree(NULL) is NOP and filename is initialized
with NULL.
This fixes the leak. No other side effects expected.
[ Dan Carpenter: cleaned up the code flow & added a changelog. ]
[ Ingo Molnar: updated the changelog some more. ]
Fixes: 375637bc52 ("perf/core: Introduce address range filtering")
Signed-off-by: "kiyin(尹亮)" <kiyin@tencent.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu>
Cc: Anthony Liguori <aliguori@amazon.com>
--
kernel/events/core.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9226c504e3 upstream.
Since the device is resumed from runtime-suspend in
__device_release_driver() anyway, it is better to do that before
looking for busy managed device links from it to consumers, because
if there are any, device_links_unbind_consumers() will be called
and it will cause the consumer devices' drivers to unbind, so the
consumer devices will be runtime-resumed. In turn, resuming each
consumer device will cause the supplier to be resumed and when the
runtime PM references from the given consumer to it are dropped, it
may be suspended. Then, the runtime-resume of the next consumer
will cause the supplier to resume again and so on.
Update the code accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fixes: 9ed9895370 ("driver core: Functional dependencies tracking support")
Cc: All applicable <stable@vger.kernel.org> # All applicable
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 00fdec98d9.
(but only from 5.2 and prior kernels)
The original commit was a preventive fix based on code-review and was
auto-picked for stable back-port (for better or worse).
It was OK for v5.3+ kernels, but turned up needing an implicit change
68e5c6f073 "(ARC: entry: EV_Trap expects r10 (vs. r9) to have
exception cause)" merged in v5.3 which itself was not backported.
So to summarize the stable backport of this patch for v5.2 and prior
kernels is busted and it won't boot.
The obvious solution is backport 68e5c6f073 but that is a pain as
it doesn't revert cleanly and each of affected kernels (so far v4.19,
v4.14, v4.9, v4.4) needs a slightly different massaged varaint.
So the easier fix is to simply revert the backport from 5.2 and prior.
The issue was not a big deal as it would cause strace to sporadically
not work correctly.
Waldemar Brodkorb first reported this when running ARC uClibc regressions
on latest stable kernels (with offending backport). Once he bisected it,
the analysis was trivial, so thx to him for this.
Reported-by: Waldemar Brodkorb <wbx@uclibc-ng.org>
Bisected-by: Waldemar Brodkorb <wbx@uclibc-ng.org>
Cc: stable <stable@vger.kernel.org> # 5.2 and prior
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 985616f045 upstream.
The write-URB busy flag was being cleared before the completion handler
was done with the URB, something which could lead to corrupt transfers
due to a racing write request if the URB is resubmitted.
Fixes: 507ca9bc04 ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.")
Cc: stable <stable@vger.kernel.org> # 2.6.13
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 912ab37c79 upstream.
Mediatek 8250 port supports speed higher than uartclk / 16. If the baud
rates in both the new and the old termios setting are higher than
uartclk / 16, the WARN_ON in uart_get_baud_rate() will be triggered.
Passing NULL as the old termios so uart_get_baud_rate() will use
uartclk / 16 - 1 as the new baud rate which will be replaced by the
original baud rate later by tty_termios_encode_baud_rate() in
mtk8250_set_termios().
Fixes: 551e553f0d ("serial: 8250_mtk: Fix high-speed baud rates clamping")
Signed-off-by: Claire Chang <tientzu@chromium.org>
Link: https://lore.kernel.org/r/20201102120749.374458-1-tientzu@chromium.org
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c4e0dff20 upstream.
It's buggy:
On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote:
> We recently discovered a slab-out-of-bounds read in fbcon in the latest
> kernel ( v5.10-rc2 for now ). The root cause of this vulnerability is that
> "fbcon_do_set_font" did not handle "vc->vc_font.data" and
> "vc->vc_font.height" correctly, and the patch
> <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this
> issue.
>
> Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and
> use KD_FONT_OP_SET again to set a large font.height for tty1. After that,
> we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data
> in "fbcon_do_set_font", while tty1 retains the original larger
> height. Obviously, this will cause an out-of-bounds read, because we can
> access a smaller vc_font.data with a larger vc_font.height.
Further there was only one user ever.
- Android's loadfont, busybox and console-tools only ever use OP_GET
and OP_SET
- fbset documentation only mentions the kernel cmdline font: option,
not anything else.
- systemd used OP_COPY before release 232 published in Nov 2016
Now unfortunately the crucial report seems to have gone down with
gmane, and the commit message doesn't say much. But the pull request
hints at OP_COPY being broken
https://github.com/systemd/systemd/pull/3651
So in other words, this never worked, and the only project which
foolishly every tried to use it, realized that rather quickly too.
Instead of trying to fix security issues here on dead code by adding
missing checks, fix the entire thing by removing the functionality.
Note that systemd code using the OP_COPY function ignored the return
value, so it doesn't matter what we're doing here really - just in
case a lone server somewhere happens to be extremely unlucky and
running an affected old version of systemd. The relevant code from
font_copy_to_all_vcs() in systemd was:
/* copy font from active VT, where the font was uploaded to */
cfo.op = KD_FONT_OP_COPY;
cfo.height = vcs.v_active-1; /* tty1 == index 0 */
(void) ioctl(vcfd, KDFONTOP, &cfo);
Note this just disables the ioctl, garbage collecting the now unused
callbacks is left for -next.
v2: Tetsuo found the old mail, which allowed me to find it on another
archive. Add the link too.
Acked-by: Peilin Ye <yepeilin.cs@gmail.com>
Reported-by: Minh Yuan <yuanmingbuaa@gmail.com>
References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html
References: https://github.com/systemd/systemd/pull/3651
Cc: Greg KH <greg@kroah.com>
Cc: Peilin Ye <yepeilin.cs@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 831e3405c2 ]
The current scanning mechanism is supposed to fall back to a synchronous
host scan if an asynchronous scan is in progress. However, this rule isn't
strictly respected, scsi_prep_async_scan() doesn't hold scan_mutex when
checking shost->async_scan. When scsi_scan_host() is called concurrently,
two async scans on same host can be started and a hang in do_scan_async()
is observed.
Fixes this issue by checking & setting shost->async_scan atomically with
shost->scan_mutex.
Link: https://lore.kernel.org/r/20201010032539.426615-1-ming.lei@redhat.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f255c19b3a ]
Similarly to commit 457e490f2b ("blkcg: allocate struct blkcg_gq
outside request queue spinlock"), blkg_create can also trigger
occasional -ENOMEM failures at the radix insertion because any
allocation inside blkg_create has to be non-blocking, making it more
likely to fail. This causes trouble for userspace tools trying to
configure io weights who need to deal with this condition.
This patch reduces the occurrence of -ENOMEMs on this path by preloading
the radix tree element on a GFP_KERNEL context, such that we guarantee
the later non-blocking insertion won't fail.
A similar solution exists in blkcg_init_queue for the same situation.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca05f33316 ]
The reserved-memory overlap detection code fails to detect overlaps if
either of the regions starts at address 0x0. The code explicitly checks
for and ignores such regions, apparently in order to ignore dynamically
allocated regions which have an address of 0x0 at this point. These
dynamically allocated regions also have a size of 0x0 at this point, so
fix this by removing the check and sorting the dynamically allocated
regions ahead of any static regions at address 0x0.
For example, there are two overlaps in this case but they are not
currently reported:
foo@0 {
reg = <0x0 0x2000>;
};
bar@0 {
reg = <0x0 0x1000>;
};
baz@1000 {
reg = <0x1000 0x1000>;
};
quux {
size = <0x1000>;
};
but they are after this patch:
OF: reserved mem: OVERLAP DETECTED!
bar@0 (0x00000000--0x00001000) overlaps with foo@0 (0x00000000--0x00002000)
OF: reserved mem: OVERLAP DETECTED!
foo@0 (0x00000000--0x00002000) overlaps with baz@1000 (0x00001000--0x00002000)
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/ded6fd6b47b58741aabdcc6967f73eca6a3f311e.1603273666.git-series.vincent.whitchurch@axis.com
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afc18069a2 ]
kexec_file_load() currently reuses the old boot_params.screen_info,
but if drivers have change the hardware state, boot_param.screen_info
could contain invalid info.
For example, the video type might be no longer VGA, or the frame buffer
address might be changed. If the kexec kernel keeps using the old screen_info,
kexec'ed kernel may attempt to write to an invalid framebuffer
memory region.
There are two screen_info instances globally available, boot_params.screen_info
and screen_info. Later one is a copy, and is updated by drivers.
So let kexec_file_load use the updated copy.
[ mingo: Tidied up the changelog. ]
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20201014092429.1415040-2-kasong@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 9f5d1c336a upstream.
Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner().
This is one possible chain of events leading to this:
Task Prio Operation
T1 120 lock(F)
T2 120 lock(F) -> blocks (top waiter)
T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter)
XX timeout/ -> wakes T2
signal
T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set)
T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter
and the lower priority T2 cannot steal the lock.
-> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON()
The comment states that this is invalid and rt_mutex_real_owner() must
return a non NULL owner when the trylock failed, but in case of a queued
and woken up waiter rt_mutex_real_owner() == NULL is a valid transient
state. The higher priority waiter has simply not yet managed to take over
the rtmutex.
The BUG_ON() is therefore wrong and this is just another retry condition in
fixup_pi_state_owner().
Drop the locks, so that T3 can make progress, and then try the fixup again.
Gratian provided a great analysis, traces and a reproducer. The analysis is
to the point, but it confused the hell out of that tglx dude who had to
page in all the futex horrors again. Condensed version is above.
[ tglx: Wrote comment and changelog ]
Fixes: c1e2f0eaf0 ("futex: Avoid violating the 10th rule of futex")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com
Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1acb4ac1a upstream.
The nesting count of trace_printk allows for 4 levels of nesting. The
nesting counter starts at zero and is incremented before being used to
retrieve the current context's buffer. But the index to the buffer uses the
nesting counter after it was incremented, and not its original number,
which in needs to do.
Link: https://lkml.kernel.org/r/20201029161905.4269-1-hqjagain@gmail.com
Cc: stable@vger.kernel.org
Fixes: 3d9622c12c ("tracing: Add barrier to trace_printk() buffer nesting modification")
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 726b3d3f14 upstream.
When an interrupt or NMI comes in and switches the context, there's a delay
from when the preempt_count() shows the update. As the preempt_count() is
used to detect recursion having each context have its own bit get set when
tracing starts, and if that bit is already set, it is considered a recursion
and the function exits. But if this happens in that section where context
has changed but preempt_count() has not been updated, this will be
incorrectly flagged as a recursion.
To handle this case, create another bit call TRANSITION and test it if the
current context bit is already set. Flag the call as a recursion if the
TRANSITION bit is already set, and if not, set it and continue. The
TRANSITION bit will be cleared normally on the return of the function that
set it, or if the current context bit is clear, set it and clear the
TRANSITION bit to allow for another transition between the current context
and an even higher one.
Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee11b93f95 upstream.
The code that checks recursion will work to only do the recursion check once
if there's nested checks. The top one will do the check, the other nested
checks will see recursion was already checked and return zero for its "bit".
On the return side, nothing will be done if the "bit" is zero.
The problem is that zero is returned for the "good" bit when in NMI context.
This will set the bit for NMIs making it look like *all* NMI tracing is
recursing, and prevent tracing of anything in NMI context!
The simple fix is to return "bit + 1" and subtract that bit on the end to
get the real bit.
Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b02414c8f0 upstream.
The recursion protection of the ring buffer depends on preempt_count() to be
correct. But it is possible that the ring buffer gets called after an
interrupt comes in but before it updates the preempt_count(). This will
trigger a false positive in the recursion code.
Use the same trick from the ftrace function callback recursion code which
uses a "transition" bit that gets set, to allow for a single recursion for
to handle transitions between contexts.
Cc: stable@vger.kernel.org
Fixes: 567cd4da54 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da7d554f7c upstream.
Commit fc0e38dae6 ("GFS2: Fix glock deallocation race") fixed a
sd_glock_disposal accounting bug by adding a missing atomic_dec
statement, but it failed to wake up sd_glock_wait when that decrement
causes sd_glock_disposal to reach zero. As a consequence,
gfs2_gl_hash_clear can now run into a 10-minute timeout instead of
being woken up. Add the missing wakeup.
Fixes: fc0e38dae6 ("GFS2: Fix glock deallocation race")
Cc: stable@vger.kernel.org # v2.6.39+
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>