Commit Graph

801800 Commits

Author SHA1 Message Date
Shijie Luo
1b59600cd8 ext4: fix potential error in ext4_do_update_inode
commit 7d8bd3c76d upstream.

If set_large_file = 1 and errors occur in ext4_handle_dirty_metadata(),
the error code will be overridden, go to out_brelse to avoid this
situation.

Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Link: https://lore.kernel.org/r/20210312065051.36314-1-luoshijie1@huawei.com
Cc: stable@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:38 +01:00
zhangyi (F)
a8fb57ec92 ext4: do not try to set xattr into ea_inode if value is empty
commit 6b22489911 upstream.

Syzbot report a warning that ext4 may create an empty ea_inode if set
an empty extent attribute to a file on the file system which is no free
blocks left.

  WARNING: CPU: 6 PID: 10667 at fs/ext4/xattr.c:1640 ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640
  ...
  Call trace:
   ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640
   ext4_xattr_block_set+0x1d0/0x1b1c fs/ext4/xattr.c:1942
   ext4_xattr_set_handle+0x8a0/0xf1c fs/ext4/xattr.c:2390
   ext4_xattr_set+0x120/0x1f0 fs/ext4/xattr.c:2491
   ext4_xattr_trusted_set+0x48/0x5c fs/ext4/xattr_trusted.c:37
   __vfs_setxattr+0x208/0x23c fs/xattr.c:177
  ...

Now, ext4 try to store extent attribute into an external inode if
ext4_xattr_block_set() return -ENOSPC, but for the case of store an
empty extent attribute, store the extent entry into the extent
attribute block is enough. A simple reproduce below.

  fallocate test.img -l 1M
  mkfs.ext4 -F -b 2048 -O ea_inode test.img
  mount test.img /mnt
  dd if=/dev/zero of=/mnt/foo bs=2048 count=500
  setfattr -n "user.test" /mnt/foo

Reported-by: syzbot+98b881fdd8ebf45ab4ae@syzkaller.appspotmail.com
Fixes: 9c6e7853c5 ("ext4: reserve space for xattr entries/names")
Cc: stable@kernel.org
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20210305120508.298465-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:38 +01:00
zhangyi (F)
5cdacb4181 ext4: find old entry again if failed to rename whiteout
commit b7ff91fd03 upstream.

If we failed to add new entry on rename whiteout, we cannot reset the
old->de entry directly, because the old->de could have moved from under
us during make indexed dir. So find the old entry again before reset is
needed, otherwise it may corrupt the filesystem as below.

  /dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED.
  /dev/sda: Unattached inode 75
  /dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

Fixes: 6b4b8e6b4a ("ext4: fix bug for rename with RENAME_WHITEOUT")
Cc: stable@vger.kernel.org
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20210303131703.330415-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:38 +01:00
Oleg Nesterov
1454a47fb9 x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()
commit 8c150ba2fb upstream.

The comment in get_nr_restart_syscall() says:

	 * The problem is that we can get here when ptrace pokes
	 * syscall-like values into regs even if we're not in a syscall
	 * at all.

Yes, but if not in a syscall then the

	status & (TS_COMPAT|TS_I386_REGS_POKED)

check below can't really help:

	- TS_COMPAT can't be set

	- TS_I386_REGS_POKED is only set if regs->orig_ax was changed by
	  32bit debugger; and even in this case get_nr_restart_syscall()
	  is only correct if the tracee is 32bit too.

Suppose that a 64bit debugger plays with a 32bit tracee and

	* Tracee calls sleep(2)	// TS_COMPAT is set
	* User interrupts the tracee by CTRL-C after 1 sec and does
	  "(gdb) call func()"
	* gdb saves the regs by PTRACE_GETREGS
	* does PTRACE_SETREGS to set %rip='func' and %orig_rax=-1
	* PTRACE_CONT		// TS_COMPAT is cleared
	* func() hits int3.
	* Debugger catches SIGTRAP.
	* Restore original regs by PTRACE_SETREGS.
	* PTRACE_CONT

get_nr_restart_syscall() wrongly returns __NR_restart_syscall==219, the
tracee calls ia32_sys_call_table[219] == sys_madvise.

Add the sticky TS_COMPAT_RESTART flag which survives after return to user
mode. It's going to be removed in the next step again by storing the
information in the restart block. As a further cleanup it might be possible
to remove also TS_I386_REGS_POKED with that.

Test-case:

  $ cvs -d :pserver:anoncvs:anoncvs@sourceware.org:/cvs/systemtap co ptrace-tests
  $ gcc -o erestartsys-trap-debuggee ptrace-tests/tests/erestartsys-trap-debuggee.c --m32
  $ gcc -o erestartsys-trap-debugger ptrace-tests/tests/erestartsys-trap-debugger.c -lutil
  $ ./erestartsys-trap-debugger
  Unexpected: retval 1, errno 22
  erestartsys-trap-debugger: ptrace-tests/tests/erestartsys-trap-debugger.c:421

Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174709.GA17895@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:38 +01:00
Oleg Nesterov
2c79ceeec4 x86: Move TS_COMPAT back to asm/thread_info.h
commit 66c1b6d74c upstream.

Move TS_COMPAT back to asm/thread_info.h, close to TS_I386_REGS_POKED.

It was moved to asm/processor.h by b9d989c721 ("x86/asm: Move the
thread_info::status field to thread_struct"), then later 37a8f7c383
("x86/asm: Move 'status' from thread_struct to thread_info") moved the
'status' field back but TS_COMPAT was forgotten.

Preparatory patch to fix the COMPAT case for get_nr_restart_syscall()

Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174649.GA17880@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:37 +01:00
Oleg Nesterov
6cd1e19841 kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()
commit 5abbe51a52 upstream.

Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT.

Add a new helper which sets restart_block->fn and calls a dummy
arch_set_restart_data() helper.

Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:37 +01:00
Thomas Gleixner
2409e66dfe x86/ioapic: Ignore IRQ2 again
commit a501b048a9 upstream.

Vitaly ran into an issue with hotplugging CPU0 on an Amazon instance where
the matrix allocator claimed to be out of vectors. He analyzed it down to
the point that IRQ2, the PIC cascade interrupt, which is supposed to be not
ever routed to the IO/APIC ended up having an interrupt vector assigned
which got moved during unplug of CPU0.

The underlying issue is that IRQ2 for various reasons (see commit
af174783b9 ("x86: I/O APIC: Never configure IRQ2" for details) is treated
as a reserved system vector by the vector core code and is not accounted as
a regular vector. The Amazon BIOS has an routing entry of pin2 to IRQ2
which causes the IO/APIC setup to claim that interrupt which is granted by
the vector domain because there is no sanity check. As a consequence the
allocation counter of CPU0 underflows which causes a subsequent unplug to
fail with:

  [ ... ] CPU 0 has 4294967295 vectors, 589 available. Cannot disable CPU

There is another sanity check missing in the matrix allocator, but the
underlying root cause is that the IO/APIC code lost the IRQ2 ignore logic
during the conversion to irqdomains.

For almost 6 years nobody complained about this wreckage, which might
indicate that this requirement could be lifted, but for any system which
actually has a PIC IRQ2 is unusable by design so any routing entry has no
effect and the interrupt cannot be connected to a device anyway.

Due to that and due to history biased paranoia reasons restore the IRQ2
ignore logic and treat it as non existent despite a routing entry claiming
otherwise.

Fixes: d32932d02e ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210318192819.636943062@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:37 +01:00
Kan Liang
b35214c541 perf/x86/intel: Fix a crash caused by zero PEBS status
commit d88d05a9e0 upstream.

A repeatable crash can be triggered by the perf_fuzzer on some Haswell
system.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/

For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of the defect, the
commit was introduced to try to avoid dropping the PEBS record for some
cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
the local pebs_status accordingly. However, it doesn't correct the PEBS
status in the PEBS record, which may trigger the crash, especially for
the large PEBS.

It's possible that all the PEBS records in a large PEBS have the PEBS
status 0. If so, the first get_next_pebs_record_by_bit() in the
__intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
PEBS, the 'count' parameter must > 1. The second
get_next_pebs_record_by_bit() will crash.

Besides the local pebs_status, correct the PEBS status in the PEBS
record as well.

Fixes: 01330d7288 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:37 +01:00
Tyrel Datwyler
f27a00f0d5 PCI: rpadlpar: Fix potential drc_name corruption in store functions
commit cc7a0bb058 upstream.

Both add_slot_store() and remove_slot_store() try to fix up the
drc_name copied from the store buffer by placing a NUL terminator at
nbyte + 1 or in place of a '\n' if present. However, the static buffer
that we copy the drc_name data into is not zeroed and can contain
anything past the n-th byte.

This is problematic if a '\n' byte appears in that buffer after nbytes
and the string copied into the store buffer was not NUL terminated to
start with as the strchr() search for a '\n' byte will mark this
incorrectly as the end of the drc_name string resulting in a drc_name
string that contains garbage data after the n-th byte.

Additionally it will cause us to overwrite that '\n' byte on the stack
with NUL, potentially corrupting data on the stack.

The following debugging shows an example of the drmgr utility writing
"PHB 4543" to the add_slot sysfs attribute, but add_slot_store()
logging a corrupted string value.

  drmgr: drmgr: -c phb -a -s PHB 4543 -d 1
  add_slot_store: drc_name = PHB 4543°|<82>!, rc = -19

Fix this by using strscpy() instead of memcpy() to ensure the string
is NUL terminated when copied into the static drc_name buffer.
Further, since the string is now NUL terminated the code only needs to
change '\n' to '\0' when present.

Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mpe: Reformat change log and add mention of possible stack corruption]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210315214821.452959-1-tyreld@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:37 +01:00
Ye Xiang
c230ae3574 iio: hid-sensor-temperature: Fix issues of timestamp channel
commit 141e7633aa upstream.

This patch fixes 2 issues of timestamp channel:
1. This patch ensures that there is sufficient space and correct
alignment for the timestamp.
2. Correct the timestamp channel scan index.

Fixes: 59d0f2da35 ("iio: hid: Add temperature sensor support")
Signed-off-by: Ye Xiang <xiang.ye@intel.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210303063615.12130-4-xiang.ye@intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:36 +01:00
Ye Xiang
33cbca9902 iio: hid-sensor-prox: Fix scale not correct issue
commit d68c592e02 upstream.

Currently, the proxy sensor scale is zero because it just return the
exponent directly. To fix this issue, this patch use
hid_sensor_format_scale to process the scale first then return the
output.

Fixes: 39a3a0138f ("iio: hid-sensors: Added Proximity Sensor Driver")
Signed-off-by: Ye Xiang <xiang.ye@intel.com>
Link: https://lore.kernel.org/r/20210130102530.31064-1-xiang.ye@intel.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:36 +01:00
Ye Xiang
42cae14ad2 iio: hid-sensor-humidity: Fix alignment issue of timestamp channel
commit 37e89e574d upstream.

This patch ensures that, there is sufficient space and correct
alignment for the timestamp.

Fixes: d7ed89d5aa ("iio: hid: Add humidity sensor support")
Signed-off-by: Ye Xiang <xiang.ye@intel.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210303063615.12130-2-xiang.ye@intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:36 +01:00
Dinghao Liu
9f275ea15d iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler
commit 6dbbbe4cfd upstream.

There is one regmap_bulk_read() call in mpu3050_trigger_handler
that we have caught its return value bug lack further handling.
Check and terminate the execution flow just like the other three
regmap_bulk_read() calls in this function.

Fixes: 3904b28efb ("iio: gyro: Add driver for the MPU-3050 gyroscope")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20210301080421.13436-1-dinghao.liu@zju.edu.cn
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:36 +01:00
Dan Carpenter
f8ff441921 iio: adis16400: Fix an error code in adis16400_initial_setup()
commit a71266e454 upstream.

This is to silence a new Smatch warning:

    drivers/iio/imu/adis16400.c:492 adis16400_initial_setup()
    warn: sscanf doesn't return error codes

If the condition "if (st->variant->flags & ADIS16400_HAS_SLOW_MODE) {"
is false then we return 1 instead of returning 0 and probe will fail.

Fixes: 72a868b38b ("iio: imu: check sscanf return value")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YCwgFb3JVG6qrlQ+@mwanda
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:36 +01:00
Jonathan Albrieux
ed1323e4d5 iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel
commit 7d200b283a upstream.

Checking at both msm8909-pm8916.dtsi and msm8916.dtsi from downstream
it is indicated that "batt_id" channel has to be scaled with the default
function:

	chan@31 {
		label = "batt_id";
		reg = <0x31>;
		qcom,decimation = <0>;
		qcom,pre-div-channel-scaling = <0>;
		qcom,calibration-type = "ratiometric";
		qcom,scale-function = <0>;
		qcom,hw-settle-time = <0xb>;
		qcom,fast-avg-setup = <0>;
	};

Change LR_MUX2_BAT_ID scaling accordingly.

Signed-off-by: Jonathan Albrieux <jonathan.albrieux@gmail.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Fixes: 7c271eea7b ("iio: adc: spmi-vadc: Changes to support different scaling")
Link: https://lore.kernel.org/r/20210113151808.4628-2-jonathan.albrieux@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:35 +01:00
Jonathan Cameron
bf348faad0 iio:adc:stm32-adc: Add HAS_IOMEM dependency
commit 121875b28e upstream.

Seems that there are config combinations in which this driver gets enabled
and hence selects the MFD, but with out HAS_IOMEM getting pulled in
via some other route.  MFD is entirely contained in an
if HAS_IOMEM block, leading to the build issue in this bugzilla.

https://bugzilla.kernel.org/show_bug.cgi?id=209889

Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210124195034.22576-1-jic23@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:35 +01:00
Jim Lin
c3a4fc9f37 usb: gadget: configfs: Fix KASAN use-after-free
commit 98f153a10d upstream.

When gadget is disconnected, running sequence is like this.
. composite_disconnect
. Call trace:
  usb_string_copy+0xd0/0x128
  gadget_config_name_configuration_store+0x4
  gadget_config_name_attr_store+0x40/0x50
  configfs_write_file+0x198/0x1f4
  vfs_write+0x100/0x220
  SyS_write+0x58/0xa8
. configfs_composite_unbind
. configfs_composite_bind

In configfs_composite_bind, it has
"cn->strings.s = cn->configuration;"

When usb_string_copy is invoked. it would
allocate memory, copy input string, release previous pointed memory space,
and use new allocated memory.

When gadget is connected, host sends down request to get information.
Call trace:
  usb_gadget_get_string+0xec/0x168
  lookup_string+0x64/0x98
  composite_setup+0xa34/0x1ee8

If gadget is disconnected and connected quickly, in the failed case,
cn->configuration memory has been released by usb_string_copy kfree but
configfs_composite_bind hasn't been run in time to assign new allocated
"cn->configuration" pointer to "cn->strings.s".

When "strlen(s->s) of usb_gadget_get_string is being executed, the dangling
memory is accessed, "BUG: KASAN: use-after-free" error occurs.

Cc: stable@vger.kernel.org
Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/1615444961-13376-1-git-send-email-macpaul.lin@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:35 +01:00
Macpaul Lin
eac35ed3f8 USB: replace hardcode maximum usb string length by definition
commit 81c7462883 upstream.

Replace hardcoded maximum USB string length (126 bytes) by definition
"USB_MAX_STRING_LEN".

Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/1592471618-29428-1-git-send-email-macpaul.lin@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:35 +01:00
Colin Ian King
682517dce5 usbip: Fix incorrect double assignment to udc->ud.tcp_rx
commit 9858af27e6 upstream.

Currently udc->ud.tcp_rx is being assigned twice, the second assignment
is incorrect, it should be to udc->ud.tcp_tx instead of rx. Fix this.

Fixes: 46613c9dfa ("usbip: fix vudc usbip_sockfd_store races leading to gpf")
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: stable <stable@vger.kernel.org>
Addresses-Coverity: ("Unused value")
Link: https://lore.kernel.org/r/20210311104445.7811-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:34 +01:00
Alan Stern
b5bb40b336 usb-storage: Add quirk to defeat Kindle's automatic unload
commit 546aa0e4ea upstream.

Matthias reports that the Amazon Kindle automatically removes its
emulated media if it doesn't receive another SCSI command within about
one second after a SYNCHRONIZE CACHE.  It does so even when the host
has sent a PREVENT MEDIUM REMOVAL command.  The reason for this
behavior isn't clear, although it's not hard to make some guesses.

At any rate, the results can be unexpected for anyone who tries to
access the Kindle in an unusual fashion, and in theory they can lead
to data loss (for example, if one file is closed and synchronized
while other files are still in the middle of being written).

To avoid such problems, this patch creates a new usb-storage quirks
flag telling the driver always to issue a REQUEST SENSE following a
SYNCHRONIZE CACHE command, and adds an unusual_devs entry for the
Kindle with the flag set.  This is sufficient to prevent the Kindle
from doing its automatic unload, without interfering with proper
operation.

Another possible way to deal with this would be to increase the
frequency of TEST UNIT READY polling that the kernel normally carries
out for removable-media storage devices.  However that would increase
the overall load on the system and it is not as reliable, because the
user can override the polling interval.  Changing the driver's
behavior is safer and has minimal overhead.

CC: <stable@vger.kernel.org>
Reported-and-tested-by: Matthias Schwarzott <zzam@gentoo.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20210317190654.GA497856@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:34 +01:00
Christophe Leroy
734b262cf7 powerpc: Force inlining of cpu_has_feature() to avoid build failure
commit eed5fae005 upstream.

The code relies on constant folding of cpu_has_feature() based
on possible and always true values as defined per
CPU_FTRS_ALWAYS and CPU_FTRS_POSSIBLE.

Build failure is encountered with for instance
book3e_all_defconfig on kisskb in the AMDGPU driver which uses
cpu_has_feature(CPU_FTR_VSX_COMP) to decide whether calling
kernel_enable_vsx() or not.

The failure is due to cpu_has_feature() not being inlined with
that configuration with gcc 4.9.

In the same way as commit acdad8fb4a ("powerpc: Force inlining of
mmu_has_feature to fix build failure"), for inlining of
cpu_has_feature().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b231dfa040ce4cc37f702f5c3a595fdeabfe0462.1615378209.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:34 +01:00
Sagi Grimberg
5d9873e46c nvme-rdma: fix possible hang when failing to set io queues
[ Upstream commit c4c6df5fc8 ]

We only setup io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues.  If we
happen to fail setting the queue count for any reason, we should not allow
this to be a successful reconnect as I/O has no chance in going through.
Instead just fail and schedule another reconnect.

Reported-by: Chao Leng <lengchao@huawei.com>
Fixes: 7110230719 ("nvme-rdma: add a NVMe over Fabrics RDMA host driver")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-24 11:07:34 +01:00
Dan Carpenter
934aa852f8 scsi: lpfc: Fix some error codes in debugfs
commit 19f1bc7edf upstream.

If copy_from_user() or kstrtoull() fail then the correct behavior is to
return a negative error code.

Link: https://lore.kernel.org/r/YEsbU/UxYypVrC7/@mwanda
Fixes: f9bb2da11d ("[SCSI] lpfc 8.3.27: T10 additions for SLI4")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:34 +01:00
Pavel Skripkin
c5d4f79b3c net/qrtr: fix __netdev_alloc_skb call
commit 093b036aa9 upstream.

syzbot found WARNING in __alloc_pages_nodemask()[1] when order >= MAX_ORDER.
It was caused by a huge length value passed from userspace to qrtr_tun_write_iter(),
which tries to allocate skb. Since the value comes from the untrusted source
there is no need to raise a warning in __alloc_pages_nodemask().

[1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014
Call Trace:
 __alloc_pages include/linux/gfp.h:511 [inline]
 __alloc_pages_node include/linux/gfp.h:524 [inline]
 alloc_pages_node include/linux/gfp.h:538 [inline]
 kmalloc_large_node+0x60/0x110 mm/slub.c:3999
 __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496
 __kmalloc_reserve net/core/skbuff.c:150 [inline]
 __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210
 __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446
 netdev_alloc_skb include/linux/skbuff.h:2832 [inline]
 qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442
 qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98
 call_write_iter include/linux/fs.h:1901 [inline]
 new_sync_write+0x426/0x650 fs/read_write.c:518
 vfs_write+0x791/0xa30 fs/read_write.c:605
 ksys_write+0x12d/0x250 fs/read_write.c:658
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+80dccaee7c6630fa9dcf@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Acked-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:34 +01:00
Daniel Kobras
98982cf799 sunrpc: fix refcount leak for rpc auth modules
commit f1442d6349 upstream.

If an auth module's accept op returns SVC_CLOSE, svc_process_common()
enters a call path that does not call svc_authorise() before leaving the
function, and thus leaks a reference on the auth module's refcount. Hence,
make sure calls to svc_authenticate() and svc_authorise() are paired for
all call paths, to make sure rpc auth modules can be unloaded.

Signed-off-by: Daniel Kobras <kobras@puzzle-itc.de>
Fixes: 4d712ef1db ("svcauth_gss: Close connection when dropping an incoming message")
Link: https://lore.kernel.org/linux-nfs/3F1B347F-B809-478F-A1E9-0BE98E22B0F0@oracle.com/T/#t
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:33 +01:00
Timo Rothenpieler
6ae439d187 svcrdma: disable timeouts on rdma backchannel
commit 6820bf7786 upstream.

This brings it in line with the regular tcp backchannel, which also has
all those timeouts disabled.

Prevents the backchannel from timing out, getting some async operations
like server side copying getting stuck indefinitely on the client side.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
Fixes: 5d252f90a8 ("svcrdma: Add class for RDMA backwards direction transport")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:33 +01:00
Joe Korty
ca2cc2ee95 NFSD: Repair misuse of sv_lock in 5.10.16-rt30.
commit c7de87ff9d upstream.

[ This problem is in mainline, but only rt has the chops to be
able to detect it. ]

Lockdep reports a circular lock dependency between serv->sv_lock and
softirq_ctl.lock on system shutdown, when using a kernel built with
CONFIG_PREEMPT_RT=y, and a nfs mount exists.

This is due to the definition of spin_lock_bh on rt:

	local_bh_disable();
	rt_spin_lock(lock);

which forces a softirq_ctl.lock -> serv->sv_lock dependency.  This is
not a problem as long as _every_ lock of serv->sv_lock is a:

	spin_lock_bh(&serv->sv_lock);

but there is one of the form:

	spin_lock(&serv->sv_lock);

This is what is causing the circular dependency splat.  The spin_lock()
grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable.
If later on in the critical region,  someone does a local_bh_disable, we
get a serv->sv_lock -> softirq_ctrl.lock dependency established.  Deadlock.

Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no
exceptions.

[  OK  ] Stopped target NFS client services.
         Stopping Logout off all iSCSI sessions on shutdown...
         Stopping NFS server and services...
[  109.442380]
[  109.442385] ======================================================
[  109.442386] WARNING: possible circular locking dependency detected
[  109.442387] 5.10.16-rt30 #1 Not tainted
[  109.442389] ------------------------------------------------------
[  109.442390] nfsd/1032 is trying to acquire lock:
[  109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270
[  109.442405]
[  109.442405] but task is already holding lock:
[  109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[  109.442415]
[  109.442415] which lock already depends on the new lock.
[  109.442415]
[  109.442416]
[  109.442416] the existing dependency chain (in reverse order) is:
[  109.442417]
[  109.442417] -> #1 (&serv->sv_lock){+.+.}-{0:0}:
[  109.442421]        rt_spin_lock+0x2b/0xc0
[  109.442428]        svc_add_new_perm_xprt+0x42/0xa0
[  109.442430]        svc_addsock+0x135/0x220
[  109.442434]        write_ports+0x4b3/0x620
[  109.442438]        nfsctl_transaction_write+0x45/0x80
[  109.442440]        vfs_write+0xff/0x420
[  109.442444]        ksys_write+0x4f/0xc0
[  109.442446]        do_syscall_64+0x33/0x40
[  109.442450]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  109.442454]
[  109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}:
[  109.442457]        __lock_acquire+0x1264/0x20b0
[  109.442463]        lock_acquire+0xc2/0x400
[  109.442466]        rt_spin_lock+0x2b/0xc0
[  109.442469]        __local_bh_disable_ip+0xd9/0x270
[  109.442471]        svc_xprt_do_enqueue+0xc0/0x4d0
[  109.442474]        svc_close_list+0x60/0x90
[  109.442476]        svc_close_net+0x49/0x1a0
[  109.442478]        svc_shutdown_net+0x12/0x40
[  109.442480]        nfsd_destroy+0xc5/0x180
[  109.442482]        nfsd+0x1bc/0x270
[  109.442483]        kthread+0x194/0x1b0
[  109.442487]        ret_from_fork+0x22/0x30
[  109.442492]
[  109.442492] other info that might help us debug this:
[  109.442492]
[  109.442493]  Possible unsafe locking scenario:
[  109.442493]
[  109.442493]        CPU0                    CPU1
[  109.442494]        ----                    ----
[  109.442495]   lock(&serv->sv_lock);
[  109.442496]                                lock((softirq_ctrl.lock).lock);
[  109.442498]                                lock(&serv->sv_lock);
[  109.442499]   lock((softirq_ctrl.lock).lock);
[  109.442501]
[  109.442501]  *** DEADLOCK ***
[  109.442501]
[  109.442501] 3 locks held by nfsd/1032:
[  109.442503]  #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270
[  109.442508]  #1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90
[  109.442512]  #2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[  109.442518]
[  109.442518] stack backtrace:
[  109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 #1
[  109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015
[  109.442524] Call Trace:
[  109.442527]  dump_stack+0x77/0x97
[  109.442533]  check_noncircular+0xdc/0xf0
[  109.442546]  __lock_acquire+0x1264/0x20b0
[  109.442553]  lock_acquire+0xc2/0x400
[  109.442564]  rt_spin_lock+0x2b/0xc0
[  109.442570]  __local_bh_disable_ip+0xd9/0x270
[  109.442573]  svc_xprt_do_enqueue+0xc0/0x4d0
[  109.442577]  svc_close_list+0x60/0x90
[  109.442581]  svc_close_net+0x49/0x1a0
[  109.442585]  svc_shutdown_net+0x12/0x40
[  109.442588]  nfsd_destroy+0xc5/0x180
[  109.442590]  nfsd+0x1bc/0x270
[  109.442595]  kthread+0x194/0x1b0
[  109.442600]  ret_from_fork+0x22/0x30
[  109.518225] nfsd: last server has exited, flushing export cache
[  OK  ] Stopped NFSv4 ID-name mapping service.
[  OK  ] Stopped GSSAPI Proxy Daemon.
[  OK  ] Stopped NFS Mount Daemon.
[  OK  ] Stopped NFS status monitor for NFSv2/3 locking..

Fixes: 719f8bcc88 ("svcrpc: fix xpt_list traversal locking on shutdown")
Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:33 +01:00
Sagi Grimberg
268b3c6089 nvmet: don't check iosqes,iocqes for discovery controllers
commit d218a8a300 upstream.

From the base spec, Figure 78:

  "Controller Configuration, these fields are defined as parameters to
   configure an "I/O Controller (IOC)" and not to configure a "Discovery
   Controller (DC).

   ...
   If the controller does not support I/O queues, then this field shall
   be read-only with a value of 0h

Just perform this check for I/O controllers.

Fixes: a07b4970f4 ("nvmet: add a generic NVMe target")
Reported-by: Belanger, Martin <Martin.Belanger@dell.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:33 +01:00
Alexander Shiyan
fac089ce7b ASoC: fsl_ssi: Fix TDM slot setup for I2S mode
commit 8726396851 upstream.

When using the driver in I2S TDM mode, the _fsl_ssi_set_dai_fmt()
function rewrites the number of slots previously set by the
fsl_ssi_set_dai_tdm_slot() function to 2 by default.
To fix this, let's use the saved slot count value or, if TDM
is not used and the slot count is not set, proceed as before.

Fixes: 4f14f5c11d ("ASoC: fsl_ssi: Fix number of words per frame for I2S-slave mode")
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Link: https://lore.kernel.org/r/20210216114221.26635-1-shc_work@mail.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:33 +01:00
David Sterba
412795cfdd btrfs: fix slab cache flags for free space tree bitmap
commit 34e49994d0 upstream.

The free space tree bitmap slab cache is created with SLAB_RED_ZONE but
that's a debugging flag and not always enabled. Also the other slabs are
created with at least SLAB_MEM_SPREAD that we want as well to average
the memory placement cost.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 3acd48507d ("btrfs: fix allocation of free space cache v1 bitmap pages")
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:32 +01:00
Filipe Manana
12dc6889bc btrfs: fix race when cloning extent buffer during rewind of an old root
commit dbcc7d57bf upstream.

While resolving backreferences, as part of a logical ino ioctl call or
fiemap, we can end up hitting a BUG_ON() when replaying tree mod log
operations of a root, triggering a stack trace like the following:

  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/ctree.c:1210!
  invalid opcode: 0000 [#1] SMP KASAN PTI
  CPU: 1 PID: 19054 Comm: crawl_335 Tainted: G        W         5.11.0-2d11c0084b02-misc-next+ #89
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  RIP: 0010:__tree_mod_log_rewind+0x3b1/0x3c0
  Code: 05 48 8d 74 10 (...)
  RSP: 0018:ffffc90001eb70b8 EFLAGS: 00010297
  RAX: 0000000000000000 RBX: ffff88812344e400 RCX: ffffffffb28933b6
  RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88812344e42c
  RBP: ffffc90001eb7108 R08: 1ffff11020b60a20 R09: ffffed1020b60a20
  R10: ffff888105b050f9 R11: ffffed1020b60a1f R12: 00000000000000ee
  R13: ffff8880195520c0 R14: ffff8881bc958500 R15: ffff88812344e42c
  FS:  00007fd1955e8700(0000) GS:ffff8881f5600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007efdb7928718 CR3: 000000010103a006 CR4: 0000000000170ee0
  Call Trace:
   btrfs_search_old_slot+0x265/0x10d0
   ? lock_acquired+0xbb/0x600
   ? btrfs_search_slot+0x1090/0x1090
   ? free_extent_buffer.part.61+0xd7/0x140
   ? free_extent_buffer+0x13/0x20
   resolve_indirect_refs+0x3e9/0xfc0
   ? lock_downgrade+0x3d0/0x3d0
   ? __kasan_check_read+0x11/0x20
   ? add_prelim_ref.part.11+0x150/0x150
   ? lock_downgrade+0x3d0/0x3d0
   ? __kasan_check_read+0x11/0x20
   ? lock_acquired+0xbb/0x600
   ? __kasan_check_write+0x14/0x20
   ? do_raw_spin_unlock+0xa8/0x140
   ? rb_insert_color+0x30/0x360
   ? prelim_ref_insert+0x12d/0x430
   find_parent_nodes+0x5c3/0x1830
   ? resolve_indirect_refs+0xfc0/0xfc0
   ? lock_release+0xc8/0x620
   ? fs_reclaim_acquire+0x67/0xf0
   ? lock_acquire+0xc7/0x510
   ? lock_downgrade+0x3d0/0x3d0
   ? lockdep_hardirqs_on_prepare+0x160/0x210
   ? lock_release+0xc8/0x620
   ? fs_reclaim_acquire+0x67/0xf0
   ? lock_acquire+0xc7/0x510
   ? poison_range+0x38/0x40
   ? unpoison_range+0x14/0x40
   ? trace_hardirqs_on+0x55/0x120
   btrfs_find_all_roots_safe+0x142/0x1e0
   ? find_parent_nodes+0x1830/0x1830
   ? btrfs_inode_flags_to_xflags+0x50/0x50
   iterate_extent_inodes+0x20e/0x580
   ? tree_backref_for_extent+0x230/0x230
   ? lock_downgrade+0x3d0/0x3d0
   ? read_extent_buffer+0xdd/0x110
   ? lock_downgrade+0x3d0/0x3d0
   ? __kasan_check_read+0x11/0x20
   ? lock_acquired+0xbb/0x600
   ? __kasan_check_write+0x14/0x20
   ? _raw_spin_unlock+0x22/0x30
   ? __kasan_check_write+0x14/0x20
   iterate_inodes_from_logical+0x129/0x170
   ? iterate_inodes_from_logical+0x129/0x170
   ? btrfs_inode_flags_to_xflags+0x50/0x50
   ? iterate_extent_inodes+0x580/0x580
   ? __vmalloc_node+0x92/0xb0
   ? init_data_container+0x34/0xb0
   ? init_data_container+0x34/0xb0
   ? kvmalloc_node+0x60/0x80
   btrfs_ioctl_logical_to_ino+0x158/0x230
   btrfs_ioctl+0x205e/0x4040
   ? __might_sleep+0x71/0xe0
   ? btrfs_ioctl_get_supported_features+0x30/0x30
   ? getrusage+0x4b6/0x9c0
   ? __kasan_check_read+0x11/0x20
   ? lock_release+0xc8/0x620
   ? __might_fault+0x64/0xd0
   ? lock_acquire+0xc7/0x510
   ? lock_downgrade+0x3d0/0x3d0
   ? lockdep_hardirqs_on_prepare+0x210/0x210
   ? lockdep_hardirqs_on_prepare+0x210/0x210
   ? __kasan_check_read+0x11/0x20
   ? do_vfs_ioctl+0xfc/0x9d0
   ? ioctl_file_clone+0xe0/0xe0
   ? lock_downgrade+0x3d0/0x3d0
   ? lockdep_hardirqs_on_prepare+0x210/0x210
   ? __kasan_check_read+0x11/0x20
   ? lock_release+0xc8/0x620
   ? __task_pid_nr_ns+0xd3/0x250
   ? lock_acquire+0xc7/0x510
   ? __fget_files+0x160/0x230
   ? __fget_light+0xf2/0x110
   __x64_sys_ioctl+0xc3/0x100
   do_syscall_64+0x37/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7fd1976e2427
  Code: 00 00 90 48 8b 05 (...)
  RSP: 002b:00007fd1955e5cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007fd1955e5f40 RCX: 00007fd1976e2427
  RDX: 00007fd1955e5f48 RSI: 00000000c038943b RDI: 0000000000000004
  RBP: 0000000001000000 R08: 0000000000000000 R09: 00007fd1955e6120
  R10: 0000557835366b00 R11: 0000000000000246 R12: 0000000000000004
  R13: 00007fd1955e5f48 R14: 00007fd1955e5f40 R15: 00007fd1955e5ef8
  Modules linked in:
  ---[ end trace ec8931a1c36e57be ]---

  (gdb) l *(__tree_mod_log_rewind+0x3b1)
  0xffffffff81893521 is in __tree_mod_log_rewind (fs/btrfs/ctree.c:1210).
  1205                     * the modification. as we're going backwards, we do the
  1206                     * opposite of each operation here.
  1207                     */
  1208                    switch (tm->op) {
  1209                    case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
  1210                            BUG_ON(tm->slot < n);
  1211                            fallthrough;
  1212                    case MOD_LOG_KEY_REMOVE_WHILE_MOVING:
  1213                    case MOD_LOG_KEY_REMOVE:
  1214                            btrfs_set_node_key(eb, &tm->key, tm->slot);

Here's what happens to hit that BUG_ON():

1) We have one tree mod log user (through fiemap or the logical ino ioctl),
   with a sequence number of 1, so we have fs_info->tree_mod_seq == 1;

2) Another task is at ctree.c:balance_level() and we have eb X currently as
   the root of the tree, and we promote its single child, eb Y, as the new
   root.

   Then, at ctree.c:balance_level(), we call:

      tree_mod_log_insert_root(eb X, eb Y, 1);

3) At tree_mod_log_insert_root() we create tree mod log elements for each
   slot of eb X, of operation type MOD_LOG_KEY_REMOVE_WHILE_FREEING each
   with a ->logical pointing to ebX->start. These are placed in an array
   named tm_list.
   Lets assume there are N elements (N pointers in eb X);

4) Then, still at tree_mod_log_insert_root(), we create a tree mod log
   element of operation type MOD_LOG_ROOT_REPLACE, ->logical set to
   ebY->start, ->old_root.logical set to ebX->start, ->old_root.level set
   to the level of eb X and ->generation set to the generation of eb X;

5) Then tree_mod_log_insert_root() calls tree_mod_log_free_eb() with
   tm_list as argument. After that, tree_mod_log_free_eb() calls
   __tree_mod_log_insert() for each member of tm_list in reverse order,
   from highest slot in eb X, slot N - 1, to slot 0 of eb X;

6) __tree_mod_log_insert() sets the sequence number of each given tree mod
   log operation - it increments fs_info->tree_mod_seq and sets
   fs_info->tree_mod_seq as the sequence number of the given tree mod log
   operation.

   This means that for the tm_list created at tree_mod_log_insert_root(),
   the element corresponding to slot 0 of eb X has the highest sequence
   number (1 + N), and the element corresponding to the last slot has the
   lowest sequence number (2);

7) Then, after inserting tm_list's elements into the tree mod log rbtree,
   the MOD_LOG_ROOT_REPLACE element is inserted, which gets the highest
   sequence number, which is N + 2;

8) Back to ctree.c:balance_level(), we free eb X by calling
   btrfs_free_tree_block() on it. Because eb X was created in the current
   transaction, has no other references and writeback did not happen for
   it, we add it back to the free space cache/tree;

9) Later some other task T allocates the metadata extent from eb X, since
   it is marked as free space in the space cache/tree, and uses it as a
   node for some other btree;

10) The tree mod log user task calls btrfs_search_old_slot(), which calls
    get_old_root(), and finally that calls __tree_mod_log_oldest_root()
    with time_seq == 1 and eb_root == eb Y;

11) First iteration of the while loop finds the tree mod log element with
    sequence number N + 2, for the logical address of eb Y and of type
    MOD_LOG_ROOT_REPLACE;

12) Because the operation type is MOD_LOG_ROOT_REPLACE, we don't break out
    of the loop, and set root_logical to point to tm->old_root.logical
    which corresponds to the logical address of eb X;

13) On the next iteration of the while loop, the call to
    tree_mod_log_search_oldest() returns the smallest tree mod log element
    for the logical address of eb X, which has a sequence number of 2, an
    operation type of MOD_LOG_KEY_REMOVE_WHILE_FREEING and corresponds to
    the old slot N - 1 of eb X (eb X had N items in it before being freed);

14) We then break out of the while loop and return the tree mod log operation
    of type MOD_LOG_ROOT_REPLACE (eb Y), and not the one for slot N - 1 of
    eb X, to get_old_root();

15) At get_old_root(), we process the MOD_LOG_ROOT_REPLACE operation
    and set "logical" to the logical address of eb X, which was the old
    root. We then call tree_mod_log_search() passing it the logical
    address of eb X and time_seq == 1;

16) Then before calling tree_mod_log_search(), task T adds a key to eb X,
    which results in adding a tree mod log operation of type
    MOD_LOG_KEY_ADD to the tree mod log - this is done at
    ctree.c:insert_ptr() - but after adding the tree mod log operation
    and before updating the number of items in eb X from 0 to 1...

17) The task at get_old_root() calls tree_mod_log_search() and gets the
    tree mod log operation of type MOD_LOG_KEY_ADD just added by task T.
    Then it enters the following if branch:

    if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) {
       (...)
    } (...)

    Calls read_tree_block() for eb X, which gets a reference on eb X but
    does not lock it - task T has it locked.
    Then it clones eb X while it has nritems set to 0 in its header, before
    task T sets nritems to 1 in eb X's header. From hereupon we use the
    clone of eb X which no other task has access to;

18) Then we call __tree_mod_log_rewind(), passing it the MOD_LOG_KEY_ADD
    mod log operation we just got from tree_mod_log_search() in the
    previous step and the cloned version of eb X;

19) At __tree_mod_log_rewind(), we set the local variable "n" to the number
    of items set in eb X's clone, which is 0. Then we enter the while loop,
    and in its first iteration we process the MOD_LOG_KEY_ADD operation,
    which just decrements "n" from 0 to (u32)-1, since "n" is declared with
    a type of u32. At the end of this iteration we call rb_next() to find the
    next tree mod log operation for eb X, that gives us the mod log operation
    of type MOD_LOG_KEY_REMOVE_WHILE_FREEING, for slot 0, with a sequence
    number of N + 1 (steps 3 to 6);

20) Then we go back to the top of the while loop and trigger the following
    BUG_ON():

        (...)
        switch (tm->op) {
        case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
                 BUG_ON(tm->slot < n);
                 fallthrough;
        (...)

    Because "n" has a value of (u32)-1 (4294967295) and tm->slot is 0.

Fix this by taking a read lock on the extent buffer before cloning it at
ctree.c:get_old_root(). This should be done regardless of the extent
buffer having been freed and reused, as a concurrent task might be
modifying it (while holding a write lock on it).

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Link: https://lore.kernel.org/linux-btrfs/20210227155037.GN28049@hungrycats.org/
Fixes: 834328a849 ("Btrfs: tree mod log's old roots could still be part of the tree")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:32 +01:00
Arnaldo Carvalho de Melo
02aa257435 tools build: Check if gettid() is available before providing helper
commit 4541a8bb13 upstream.

Laura reported that the perf build failed in fedora when we got a glibc
that provides gettid(), which I reproduced using fedora rawhide with the
glibc-devel-2.29.9000-26.fc31.x86_64 package.

Add a feature check to avoid providing a gettid() helper in such
systems.

On a fedora rawhide system with this patch applied we now get:

  [root@7a5f55352234 perf]# grep gettid /tmp/build/perf/FEATURE-DUMP
  feature-gettid=1
  [root@7a5f55352234 perf]# cat /tmp/build/perf/feature/test-gettid.make.output
  [root@7a5f55352234 perf]# ldd /tmp/build/perf/feature/test-gettid.bin
          linux-vdso.so.1 (0x00007ffc6b1f6000)
          libc.so.6 => /lib64/libc.so.6 (0x00007f04e0a74000)
          /lib64/ld-linux-x86-64.so.2 (0x00007f04e0c47000)
  [root@7a5f55352234 perf]# nm /tmp/build/perf/feature/test-gettid.bin | grep -w gettid
                   U gettid@@GLIBC_2.30
  [root@7a5f55352234 perf]#

While on a fedora:29 system:

  [acme@quaco perf]$ grep gettid /tmp/build/perf/FEATURE-DUMP
  feature-gettid=0
  [acme@quaco perf]$ cat /tmp/build/perf/feature/test-gettid.make.output
  test-gettid.c: In function ‘main’:
  test-gettid.c:8:9: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration]
    return gettid();
           ^~~~~~
           getgid
  cc1: all warnings being treated as errors
  [acme@quaco perf]$

Reported-by: Laura Abbott <labbott@redhat.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lkml.kernel.org/n/tip-yfy3ch53agmklwu9o7rlgf9c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:32 +01:00
Arnaldo Carvalho de Melo
65e8604da2 tools build feature: Check if eventfd() is available
commit 11c6cbe706 upstream.

A new 'perf bench epoll' will use this, and to disable it for older
systems, add a feature test for this API.

This is just a simple program that if successfully compiled, means that
the feature is present, at least at the library level, in a build that
sets the output directory to /tmp/build/perf (using O=/tmp/build/perf),
we end up with:

  $ ls -la /tmp/build/perf/feature/test-eventfd*
  -rwxrwxr-x. 1 acme acme 8176 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.bin
  -rw-rw-r--. 1 acme acme  588 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.d
  -rw-rw-r--. 1 acme acme    0 Nov 21 15:58 /tmp/build/perf/feature/test-eventfd.make.output
  $ ldd /tmp/build/perf/feature/test-eventfd.bin
	  linux-vdso.so.1 (0x00007fff3bf3f000)
	  libc.so.6 => /lib64/libc.so.6 (0x00007fa984061000)
	  /lib64/ld-linux-x86-64.so.2 (0x00007fa984417000)
  $ grep eventfd -A 2 -B 2 /tmp/build/perf/FEATURE-DUMP
  feature-dwarf=1
  feature-dwarf_getlocations=1
  feature-eventfd=1
  feature-fortify-source=1
  feature-sync-compare-and-swap=1
  $

The main thing here is that in the end we'll have -DHAVE_EVENTFD in
CFLAGS, and then the 'perf bench' entry needing that API can be
selectively pruned.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-wkeldwob7dpx6jvtuzl8164k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:32 +01:00
Arnaldo Carvalho de Melo
b1e75d8ec5 tools build feature: Check if get_current_dir_name() is available
commit 8feb8efef9 upstream.

As the namespace support code will use this, which is not available in
some non _GNU_SOURCE libraries such as Android's bionic used in my
container build tests (r12b and r15c at the moment).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-x56ypm940pwclwu45d7jfj47@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:32 +01:00
Jiri Olsa
89a0a5d6ff perf tools: Use %define api.pure full instead of %pure-parser
commit fc8c0a9922 upstream.

bison deprecated the "%pure-parser" directive in favor of "%define
api.pure full".

The api.pure got introduced in bison 2.3 (Oct 2007), so it seems safe to
use it without any version check.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200112192259.GA35080@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:32 +01:00
Nicolas Boichat
ce1ad429d5 lkdtm: don't move ctors to .rodata
From: Mark Rutland <mark.rutland@arm.com>

commit 3f618ab332 upstream.

When building with KASAN and LKDTM, clang may implictly generate an
asan.module_ctor function in the LKDTM rodata object. The Makefile moves
the lkdtm_rodata_do_nothing() function into .rodata by renaming the
file's .text section to .rodata, and consequently also moves the ctor
function into .rodata, leading to a boot time crash (splat below) when
the ctor is invoked by do_ctors().

Let's prevent this by marking the function as noinstr rather than
notrace, and renaming the file's .noinstr.text to .rodata. Marking the
function as noinstr will prevent tracing and kprobes, and will inhibit
any undesireable compiler instrumentation.

The ctor function (if any) will be placed in .text and will work
correctly.

Example splat before this patch is applied:

[    0.916359] Unable to handle kernel execute from non-executable memory at virtual address ffffa0006b60f5ac
[    0.922088] Mem abort info:
[    0.922828]   ESR = 0x8600000e
[    0.923635]   EC = 0x21: IABT (current EL), IL = 32 bits
[    0.925036]   SET = 0, FnV = 0
[    0.925838]   EA = 0, S1PTW = 0
[    0.926714] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000427b3000
[    0.928489] [ffffa0006b60f5ac] pgd=000000023ffff003, p4d=000000023ffff003, pud=000000023fffe003, pmd=0068000042000f01
[    0.931330] Internal error: Oops: 8600000e [#1] PREEMPT SMP
[    0.932806] Modules linked in:
[    0.933617] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc7 #2
[    0.935620] Hardware name: linux,dummy-virt (DT)
[    0.936924] pstate: 40400005 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[    0.938609] pc : asan.module_ctor+0x0/0x14
[    0.939759] lr : do_basic_setup+0x4c/0x70
[    0.940889] sp : ffff27b600177e30
[    0.941815] x29: ffff27b600177e30 x28: 0000000000000000
[    0.943306] x27: 0000000000000000 x26: 0000000000000000
[    0.944803] x25: 0000000000000000 x24: 0000000000000000
[    0.946289] x23: 0000000000000001 x22: 0000000000000000
[    0.947777] x21: ffffa0006bf4a890 x20: ffffa0006befb6c0
[    0.949271] x19: ffffa0006bef9358 x18: 0000000000000068
[    0.950756] x17: fffffffffffffff8 x16: 0000000000000000
[    0.952246] x15: 0000000000000000 x14: 0000000000000000
[    0.953734] x13: 00000000838a16d5 x12: 0000000000000001
[    0.955223] x11: ffff94000da74041 x10: dfffa00000000000
[    0.956715] x9 : 0000000000000000 x8 : ffffa0006b60f5ac
[    0.958199] x7 : f9f9f9f9f9f9f9f9 x6 : 000000000000003f
[    0.959683] x5 : 0000000000000040 x4 : 0000000000000000
[    0.961178] x3 : ffffa0006bdc15a0 x2 : 0000000000000005
[    0.962662] x1 : 00000000000000f9 x0 : ffffa0006bef9350
[    0.964155] Call trace:
[    0.964844]  asan.module_ctor+0x0/0x14
[    0.965895]  kernel_init_freeable+0x158/0x198
[    0.967115]  kernel_init+0x14/0x19c
[    0.968104]  ret_from_fork+0x10/0x30
[    0.969110] Code: 00000003 00000000 00000000 00000000 (00000000)
[    0.970815] ---[ end trace b5339784e20d015c ]---

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20201207170533.10738-1-mark.rutland@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:31 +01:00
Nicolas Boichat
c0387536ed vmlinux.lds.h: Create section for protection against instrumentation
From: Thomas Gleixner <tglx@linutronix.de>

commit 6553896666 upstream.

Some code pathes, especially the low level entry code, must be protected
against instrumentation for various reasons:

 - Low level entry code can be a fragile beast, especially on x86.

 - With NO_HZ_FULL RCU state needs to be established before using it.

Having a dedicated section for such code allows to validate with tooling
that no unsafe functions are invoked.

Add the .noinstr.text section and the noinstr attribute to mark
functions. noinstr implies notrace. Kprobes will gain a section check
later.

Provide also a set of markers: instrumentation_begin()/end()

These are used to mark code inside a noinstr function which calls
into regular instrumentable text section as safe.

The instrumentation markers are only active when CONFIG_DEBUG_ENTRY is
enabled as the end marker emits a NOP to prevent the compiler from merging
the annotation points. This means the objtool verification requires a
kernel compiled with this option.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134100.075416272@linutronix.de

[Nicolas:
Guard noinstr macro in include/linux/compiler_types.h in __KERNEL__
&& !__ASSEMBLY__, otherwise noinstr is expanded in the linker
script construct.

Upstream does not have this problem as many macros were moved by
commit 71391bdd2e ("include/linux/compiler_types.h: don't pollute
userspace with macro definitions"). We take the minimal approach here
and just guard the new macro.

Minor context conflicts in:
	arch/powerpc/kernel/vmlinux.lds.S
	include/asm-generic/vmlinux.lds.h
	include/linux/compiler.h]
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:31 +01:00
Rafael J. Wysocki
f68e008067 Revert "PM: runtime: Update device status before letting suppliers suspend"
commit 0cab893f40 upstream.

Revert commit 44cc89f764 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f764 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:31 +01:00
Hui Wang
5812307cce ALSA: hda: generic: Fix the micmute led init state
commit 2bf44e0ee9 upstream.

Recently we found the micmute led init state is not correct after
freshly installing the ubuntu linux on a Lenovo AIO machine. The
internal mic is not muted, but the micmute led is on and led mode is
'follow mute'. If we mute internal mic, the led is keeping on, then
unmute the internal mic, the led is off. And from then on, the
micmute led will work correctly.

So the micmute led init state is not correct. The led is controlled
by codec gpio (ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), in the
patch_realtek, the gpio data is set to 0x4 initially and the led is
on with this data. In the hda_generic, the led_value is set to
0 initially, suppose users set the 'capture switch' to on from
user space and the micmute led should change to be off with this
operation, but the check "if (val == spec->micmute_led.led_value)" in
the call_micmute_led_update() will skip the led setting.

To guarantee the led state will be set by the 1st time of changing
"Capture Switch", set -1 to the init led_value.

Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210312041408.3776-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:31 +01:00
Shengjiu Wang
04bb225a48 ASoC: ak5558: Add MODULE_DEVICE_TABLE
commit 80cffd2468 upstream.

Add missed MODULE_DEVICE_TABLE for the driver can be loaded
automatically at boot.

Fixes: 9208847774 ("ASoC: ak5558: Add support for AK5558 ADC driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1614149872-25510-2-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:31 +01:00
Shengjiu Wang
25a09f4aad ASoC: ak4458: Add MODULE_DEVICE_TABLE
commit 4ec5b96775 upstream.

Add missed MODULE_DEVICE_TABLE for the driver can be loaded
automatically at boot.

Fixes: 08660086ef ("ASoC: ak4458: Add support for AK4458 DAC driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1614149872-25510-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:07:31 +01:00
Greg Kroah-Hartman
125222814e Linux 4.19.182
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Jason Self <jason@bluehome.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/r/20210319121744.114946147@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:35 +01:00
Florian Fainelli
1270080b64 net: dsa: b53: Support setting learning on port
commit f9b3827ee6 upstream.

Add support for being able to set the learning attribute on port, and
make sure that the standalone ports start up with learning disabled.

We can remove the code in bcm_sf2 that configured the ports learning
attribute because we want the standalone ports to have learning disabled
by default and port 7 cannot be bridged, so its learning attribute will
not change past its initial configuration.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:35 +01:00
DENG Qingfang
cfbfa9c721 net: dsa: tag_mtk: fix 802.1ad VLAN egress
commit 9200f515c4 upstream.

A different TPID bit is used for 802.1ad VLAN frames.

Reported-by: Ilario Gelmetti <iochesonome@gmail.com>
Fixes: f0af34317f ("net: dsa: mediatek: combine MediaTek tag with VLAN tag")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:35 +01:00
Piotr Krysiuk
86eddd693d bpf: Add sanity check for upper ptr_limit
commit 1b1597e64e upstream.

Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.

Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:35 +01:00
Piotr Krysiuk
087299234e bpf: Simplify alu_limit masking for pointer arithmetic
commit b5871dca25 upstream.

Instead of having the mov32 with aux->alu_limit - 1 immediate, move this
operation to retrieve_ptr_limit() instead to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids in future that at the time of the verifier masking rewrite
we'd run into an underflow which would not sign extend due to the nature
of mov32 instruction.

Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:35 +01:00
Piotr Krysiuk
ec5307f2ed bpf: Fix off-by-one for area size in creating mask to left
commit 10d2bb2e6b upstream.

retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the point of the current position and direction
of the operation (add / sub). This size will later be used for masking
the operation such that attempting out-of-bounds access in the speculative
domain is redirected to remain within the bounds of the current map value.

When masking to the right the size is correct, however, when masking to
the left, the size is off-by-one which would lead to an incorrect mask
and thus incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, then the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading
0xffffffff into AX instead of sign-extending to the full 64 bit range,
and as a result, this allows abuse for executing speculatively out-of-
bounds loads against 4GB window of address space and thus extracting the
contents of kernel memory via side-channel.

Fixes: 979d63d50c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:35 +01:00
Piotr Krysiuk
bc49612a0e bpf: Prohibit alu ops for pointer types not defining ptr_limit
commit f232326f69 upstream.

The purpose of this patch is to streamline error propagation and in particular
to propagate retrieve_ptr_limit() errors for pointer types that are not defining
a ptr_limit such that register-based alu ops against these types can be rejected.

The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.

Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access to a modified ctx' pointer is rejected at a
later point in time in the verifier, and 7c69673262 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") only relevant for root-only use cases.
Risk of unprivileged program breakage is considered very low.

Fixes: 7c69673262 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:34 +01:00
Suzuki K Poulose
23ce6b7d40 KVM: arm64: nvhe: Save the SPE context early
commit b96b0c5de6 upstream

The nVHE KVM hyp drains and disables the SPE buffer, before
entering the guest, as the EL1&0 translation regime
is going to be loaded with that of the guest.

But this operation is performed way too late, because :
 - The owning translation regime of the SPE buffer
   is transferred to EL2. (MDCR_EL2_E2PB == 0)
 - The guest Stage1 is loaded.

Thus the flush could use the host EL1 virtual address,
but use the EL2 translations instead of host EL1, for writing
out any cached data.

Fix this by moving the SPE buffer handling early enough.
The restore path is doing the right thing.

Cc: stable@vger.kernel.org # v4.19
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-20 10:38:34 +01:00
Jan Kara
a75a5d1638 ext4: check journal inode extents more carefully
commit ce9f24cccd upstream.

Currently, system zones just track ranges of block, that are "important"
fs metadata (bitmaps, group descriptors, journal blocks, etc.). This
however complicates how extent tree (or indirect blocks) can be checked
for inodes that actually track such metadata - currently the journal
inode but arguably we should be treating quota files or resize inode
similarly. We cannot run __ext4_ext_check() on such metadata inodes when
loading their extents as that would immediately trigger the validity
checks and so we just hack around that and special-case the journal
inode. This however leads to a situation that a journal inode which has
extent tree of depth at least one can have invalid extent tree that gets
unnoticed until ext4_cache_extents() crashes.

To overcome this limitation, track inode number each system zone belongs
to (0 is used for zones not belonging to any inode). We can then verify
inode number matches the expected one when verifying extent tree and
thus avoid the false errors. With this there's no need to to
special-case journal inode during extent tree checking anymore so remove
it.

Fixes: 0a944e8a6c ("ext4: don't perform block validity checks on the journal inode")
Reported-by: Wolfgang Frisch <wolfgang.frisch@suse.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20200728130437.7804-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:38:34 +01:00