We no longer rely on the module loader treating a module called
'fips140.ko' in a special way, so let's remove this terrible code.
Bug: 153614920
Bug: 188620248
Change-Id: I05da5fd6308eafae641bb403e3522f9fa58333be
Signed-off-by: Ard Biesheuvel <ardb@google.com>
Instead of having a special case in the core kernel's module loader that
treats a module called 'fips140.ko' in a special way, use a host tool to
tweak the ELF metadata of this module so that the RELA data is preserved
and accessible to the module init code.
This is done in the following way:
- each RELA section that we care about (the ones for .text and .rodata
at the moment) is copied into a new section called .init.rela.<name>
with the SHF_ALLOC attribute, so that the module loader will copy it
into __init memory at load time;
- for each such section, an offset/count tuple is added as a global
variable to the module;
- the count field of those tuples is populated directly by the host tool
based on the actual size of the RELA section in question;
- the offset field is decorated with a place-relative relocation against
the start of the copied RELA section via a weak symbol reference,
which causes an entry to be emitted into the ELF symbol table;
- these ELF symbol table entries are updated by the host tool and turned
into STT_SECTION type symbols with STB_GLOBAL linkage, carrying the
correct section index.
With these changes in place, the unmodified module loader will load all
required information into memory in a way that permits the module init
code to locate the relocations, and apply them in reverse.
Bug: 153614920
Bug: 188620248
Change-Id: I07d9704febdf913834502dd09c19aa4a04d983b1
Signed-off-by: Ard Biesheuvel <ardb@google.com>
We currently only emit directives for handling the .text section into
the module linker script if both LTO and CFI are enabled, while for
other sections, we do this even if CFI is not enabled. This is
inconsistent at best, but as it also interferes with the assumption in
the fips140.ko module that the .text._start and .text._end input
sections are placed at the very start and end of the .text section,
which currently can only be relied upon if CFI is enabled.
So rearrange the #ifdef so that it only covers the .text.__cfi_check
input section. Note that aligning to page size is likely to be redundant
in any case, given that the .text section is laid out first, and module
allocations are page aligned to begin with, so making that part
unconditional is unlikely to make an observeable difference in the
output.
Bug: 153614920
Bug: 188620248
Fixes: 6be141eb36 ("ANDROID: crypto: fips140 - perform load time integrity check")
Change-Id: I3f9ed0ae8fa8fe5693c8d2964566cbb42c101aa7
Signed-off-by: Ard Biesheuvel <ardb@google.com>
These flags are supposed to be used when building all source files for
the module.
Bug: 188620248
Fixes: b7397e89db ("ANDROID: fips140: add power-up cryptographic self-tests")
Change-Id: I41cacff040c8a8a0065dd3cfc537303f1ff18335
Signed-off-by: Eric Biggers <ebiggers@google.com>
With CONFIG_LTO_CLANG, we currently link modules into native
code just before modpost, which means with TRIM_UNUSED_KSYMS
enabled, we still look at the LLVM bitcode in the .o files when
generating the list of used symbols. As the bitcode doesn't
yet have calls to compiler intrinsics and llvm-nm doesn't see
function references that only exist in function-level inline
assembly, we currently need a whitelist for TRIM_UNUSED_KSYMS to
work with LTO.
This change moves module LTO linking to happen earlier, and
thus avoids the issue with LLVM bitcode and TRIM_UNUSED_KSYMS
entirely, allowing us to also drop the whitelist from
gen_autoksyms.sh.
Link: https://github.com/ClangBuiltLinux/linux/issues/1369
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Tested-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
(cherry picked from commit 850ded46c6)
Conflicts:
scripts/Makefile.modfinal
scripts/Makefile.modpost
Bug: 189327973
Change-Id: I61c3782619e6475cfff74c270b70e01e229dc8df
Signed-off-by: Eric Biggers <ebiggers@google.com>
Changes in 5.10.72
spi: rockchip: handle zero length transfers without timing out
platform/x86: touchscreen_dmi: Add info for the Chuwi HiBook (CWI514) tablet
platform/x86: touchscreen_dmi: Update info for the Chuwi Hi10 Plus (CWI527) tablet
nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN
btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
btrfs: fix mount failure due to past and transient device flush error
net: mdio: introduce a shutdown method to mdio device drivers
xen-netback: correct success/error reporting for the SKB-with-fraglist case
sparc64: fix pci_iounmap() when CONFIG_PCI is not set
ext2: fix sleeping in atomic bugs on error
scsi: sd: Free scsi_disk device via put_device()
usb: testusb: Fix for showing the connection speed
usb: dwc2: check return value after calling platform_get_resource()
habanalabs/gaudi: fix LBW RR configuration
selftests: be sure to make khdr before other targets
selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
nvme-fc: update hardware queues before using them
nvme-fc: avoid race between time out and tear down
thermal/drivers/tsens: Fix wrong check for tzd in irq handlers
scsi: ses: Retry failed Send/Receive Diagnostic commands
irqchip/gic: Work around broken Renesas integration
smb3: correct smb3 ACL security descriptor
tools/vm/page-types: remove dependency on opt_file for idle page tracking
selftests: KVM: Align SMCCC call with the spec in steal_time
KVM: do not shrink halt_poll_ns below grow_start
kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
KVM: x86: nSVM: restore int_vector in svm_clear_vintr
perf/x86: Reset destroy callback on event init failure
libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI for Samsung 860 and 870 SSD.
Linux 5.10.72
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I842a64f2a02b26c46b6d1151fe343497854c6300
commit 7a8526a5cd upstream.
Many users are reporting that the Samsung 860 and 870 SSD are having
various issues when combined with AMD/ATI (vendor ID 0x1002) SATA
controllers and only completely disabling NCQ helps to avoid these
issues.
Always disabling NCQ for Samsung 860/870 SSDs regardless of the host
SATA adapter vendor will cause I/O performance degradation with well
behaved adapters. To limit the performance impact to ATI adapters,
introduce the ATA_HORKAGE_NO_NCQ_ON_ATI flag to force disable NCQ
only for these adapters.
Also, two libata.force parameters (noncqati and ncqati) are introduced
to disable and enable the NCQ for the system which equipped with ATI
SATA adapter and Samsung 860 and 870 SSDs. The user can determine NCQ
function to be enabled or disabled according to the demand.
After verifying the chipset from the user reports, the issue appears
on AMD/ATI SB7x0/SB8x0/SB9x0 SATA Controllers and does not appear on
recent AMD SATA adapters. The vendor ID of ATI should be 0x1002.
Therefore, ATA_HORKAGE_NO_NCQ_ON_AMD was modified to
ATA_HORKAGE_NO_NCQ_ON_ATI.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201693
Signed-off-by: Kate Hsuan <hpa@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210903094411.58749-1-hpa@redhat.com
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Krzysztof Olędzki <ole@ans.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02d029a41d upstream.
perf_init_event tries multiple init callbacks and does not reset the
event state between tries. When x86_pmu_event_init runs, it
unconditionally sets the destroy callback to hw_perf_event_destroy. On
the next init attempt after x86_pmu_event_init, in perf_try_init_event,
if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy
callback will be run. However, if the next init didn't set the destroy
callback, hw_perf_event_destroy will be run (since the callback wasn't
reset).
Looking at other pmu init functions, the common pattern is to only set
the destroy callback on a successful init. Resetting the callback on
failure tries to replicate that pattern.
This was discovered after commit f11dd0d805 ("perf/x86/amd/ibs: Extend
PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second)
run of the perf tool after a reboot results in 0 samples being
generated. The extra run of hw_perf_event_destroy results in
active_events having an extra decrement on each perf run. The second run
has active_events == 0 and every subsequent run has active_events < 0.
When active_events == 0, the NMI handler will early-out and not record
any samples.
Signed-off-by: Anand K Mistry <amistry@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e325e40faf0@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1fc1553cd ]
Intel PMU MSRs is in msrs_to_save_all[], so add AMD PMU MSRs to have a
consistent behavior between Intel and AMD when using KVM_GET_MSRS,
KVM_SET_MSRS or KVM_GET_MSR_INDEX_LIST.
We have to add legacy and new MSRs to handle guests running without
X86_FEATURE_PERFCTR_CORE.
Signed-off-by: Fares Mehanna <faresx@amazon.de>
Message-Id: <20210915133951.22389-1-faresx@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae232ea460 ]
grow_halt_poll_ns() ignores values between 0 and
halt_poll_ns_grow_start (10000 by default). However,
when we shrink halt_poll_ns we may fall way below
halt_poll_ns_grow_start and endup with halt_poll_ns
values that don't make a lot of sense: like 1 or 9,
or 19.
VCPU1 trace (halt_poll_ns_shrink equals 2):
VCPU1 grow 10000
VCPU1 shrink 5000
VCPU1 shrink 2500
VCPU1 shrink 1250
VCPU1 shrink 625
VCPU1 shrink 312
VCPU1 shrink 156
VCPU1 shrink 78
VCPU1 shrink 39
VCPU1 shrink 19
VCPU1 shrink 9
VCPU1 shrink 4
Mirror what grow_halt_poll_ns() does and set halt_poll_ns
to 0 as soon as new shrink-ed halt_poll_ns value falls
below halt_poll_ns_grow_start.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210902031100.252080-1-senozhatsky@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01f91acb55 ]
The SMC64 calling convention passes a function identifier in w0 and its
parameters in x1-x17. Given this, there are two deviations in the
SMC64 call performed by the steal_time test: the function identifier is
assigned to a 64 bit register and the parameter is only 32 bits wide.
Align the call with the SMCCC by using a 32 bit register to handle the
function identifier and increasing the parameter width to 64 bits.
Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210921171121.2148982-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ebaeab2fe8 ]
Idle page tracking can also be used for process address space, not only
file mappings.
Without this change, using with '-i' option for process address space
encounters below errors reported.
$ sudo ./page-types -p $(pidof bash) -i
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
...
Link: https://lkml.kernel.org/r/20210917032826.10669-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b06d893ef2 ]
Address warning:
fs/smbfs_client/smb2pdu.c:2425 create_sd_buf()
warn: struct type mismatch 'smb3_acl vs cifs_acl'
Pointed out by Dan Carpenter via smatch code analysis tool
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b78f26926b ]
Geert reported that the GIC driver locks up on a Renesas system
since 005c34ae4b ("irqchip/gic: Atomically update affinity")
fixed the driver to use writeb_relaxed() instead of writel_relaxed().
As it turns out, the interconnect used on this system mandates
32bit wide accesses for all MMIO transactions, even if the GIC
architecture specifically mandates for some registers to be byte
accessible. Gahhh...
Work around the issue by crudly detecting the offending system,
and falling back to an inefficient RMW+lock implementation.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/CAMuHMdV+Ev47K5NO8XHsanSq5YRMCHn2gWAQyV-q2LpJVy9HiQ@mail.gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fbdac19e64 ]
Setting SCSI logging level with error=3, we saw some errors from enclosues:
[108017.360833] ses 0:0:9:0: tag#641 Done: NEEDS_RETRY Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
[108017.360838] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00
[108017.427778] ses 0:0:9:0: Power-on or device reset occurred
[108017.427784] ses 0:0:9:0: tag#641 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[108017.427788] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00
[108017.427791] ses 0:0:9:0: tag#641 Sense Key : Unit Attention [current]
[108017.427793] ses 0:0:9:0: tag#641 Add. Sense: Bus device reset function occurred
[108017.427801] ses 0:0:9:0: Failed to get diagnostic page 0x1
[108017.427804] ses 0:0:9:0: Failed to bind enclosure -19
[108017.427895] ses 0:0:10:0: Attached Enclosure device
[108017.427942] ses 0:0:10:0: Attached scsi generic sg18 type 13
Retry if the Send/Receive Diagnostic commands complete with a transient
error status (NOT_READY or UNIT_ATTENTION with ASC 0x29).
Link: https://lore.kernel.org/r/1631849061-10210-2-git-send-email-wenxiong@linux.ibm.com
Reviewed-by: Brian King <brking@linux.ibm.com>
Reviewed-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5445dae29 ]
To avoid race between time out and tear down, in tear down process,
first we quiesce the queue, and then delete the timer and cancel
the time out work for the queue.
This patch merges the admin and io sync ops into the queue teardown logic
as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
out and tear down". There is no teardown_lock in nvme-fc.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 555f66d0f8 ]
In case the number of hardware queues changes, we need to update the
tagset and the mapping of ctx to hctx first.
If we try to create and connect the I/O queues first, this operation
will fail (target will reject the connect call due to the wrong number
of queues) and hence we bail out of the recreate function. Then we
will to try the very same operation again, thus we don't make any
progress.
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39a71f712d ]
Fix get_warnings_count() to check fscanf() return value to get rid
of the following warning:
x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
85 | fscanf(f, "%d", &warnings);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8914a7a247 ]
LKP/0Day reported some building errors about kvm, and errors message
are not always same:
- lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared
(first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’?
- lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use
in this function); did you mean ‘MAP_HUGE_16GB’?
Although kvm relies on the khdr, they still be built in parallel when -j
is specified. In this case, it will cause compiling errors.
Here we mark target khdr as NOTPARALLEL to make it be always built
first.
CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a5ff77bf0 ]
Couple of fixes to the LBW RR configuration:
1. Add missing configuration of the SM RR registers in the DMA_IF.
2. Remove HBW range that doesn't belong.
3. Add entire gap + DBG area, from end of TPC7 to end of entire
DBG space.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f81c08f897 ]
testusb' application which uses 'usbtest' driver reports 'unknown speed'
from the function 'find_testdev'. The variable 'entry->speed' was not
updated from the application. The IOCTL mentioned in the FIXME comment can
only report whether the connection is low speed or not. Speed is read using
the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade. The
call is implemented in the function 'handle_testdev' where the file
descriptor was availble locally. Sample output is given below where 'high
speed' is printed as the connected speed.
sudo ./testusb -a
high speed /dev/bus/usb/001/011 0
/dev/bus/usb/001/011 test 0, 0.000015 secs
/dev/bus/usb/001/011 test 1, 0.194208 secs
/dev/bus/usb/001/011 test 2, 0.077289 secs
/dev/bus/usb/001/011 test 3, 0.170604 secs
/dev/bus/usb/001/011 test 4, 0.108335 secs
/dev/bus/usb/001/011 test 5, 2.788076 secs
/dev/bus/usb/001/011 test 6, 2.594610 secs
/dev/bus/usb/001/011 test 7, 2.905459 secs
/dev/bus/usb/001/011 test 8, 2.795193 secs
/dev/bus/usb/001/011 test 9, 8.372651 secs
/dev/bus/usb/001/011 test 10, 6.919731 secs
/dev/bus/usb/001/011 test 11, 16.372687 secs
/dev/bus/usb/001/011 test 12, 16.375233 secs
/dev/bus/usb/001/011 test 13, 2.977457 secs
/dev/bus/usb/001/011 test 14 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 17, 0.148826 secs
/dev/bus/usb/001/011 test 18, 0.068718 secs
/dev/bus/usb/001/011 test 19, 0.125992 secs
/dev/bus/usb/001/011 test 20, 0.127477 secs
/dev/bus/usb/001/011 test 21 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 24, 4.133763 secs
/dev/bus/usb/001/011 test 27, 2.140066 secs
/dev/bus/usb/001/011 test 28, 2.120713 secs
/dev/bus/usb/001/011 test 29, 0.507762 secs
Signed-off-by: Faizel K B <faizel.kb@dicortech.com>
Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 372d1f3e1b ]
The ext2_error() function syncs the filesystem so it sleeps. The caller
is holding a spinlock so it's not allowed to sleep.
ext2_statfs() <- disables preempt
-> ext2_count_free_blocks()
-> ext2_get_group_desc()
Fix this by using WARN() to print an error message and a stack trace
instead of using ext2_error().
Link: https://lore.kernel.org/r/20210921203233.GA16529@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ede7f84c7 ]
When re-entering the main loop of xenvif_tx_check_gop() a 2nd time, the
special considerations for the head of the SKB no longer apply. Don't
mistakenly report ERROR to the frontend for the first entry in the list,
even if - from all I can tell - this shouldn't matter much as the overall
transmit will need to be considered failed anyway.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf9579976f ]
MDIO-attached devices might have interrupts and other things that might
need quiesced when we kexec into a new kernel. Things are even more
creepy when those interrupt lines are shared, and in that case it is
absolutely mandatory to disable all interrupt sources.
Moreover, MDIO devices might be DSA switches, and DSA needs its own
shutdown method to unlink from the DSA master, which is a new
requirement that appeared after commit 2f1e8ea726 ("net: dsa: link
interfaces with the DSA master to get rid of lockdep warnings").
So introduce a ->shutdown method in the MDIO device driver structure.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b225baaba ]
When we get an error flushing one device, during a super block commit, we
record the error in the device structure, in the field 'last_flush_error'.
This is used to later check if we should error out the super block commit,
depending on whether the number of flush errors is greater than or equals
to the maximum tolerated device failures for a raid profile.
However if we get a transient device flush error, unmount the filesystem
and later try to mount it, we can fail the mount because we treat that
past error as critical and consider the device is missing. Even if it's
very likely that the error will happen again, as it's probably due to a
hardware related problem, there may be cases where the error might not
happen again. One example is during testing, and a test case like the
new generic/648 from fstests always triggers this. The test cases
generic/019 and generic/475 also trigger this scenario, but very
sporadically.
When this happens we get an error like this:
$ mount /dev/sdc /mnt
mount: /mnt wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error.
$ dmesg
(...)
[12918.886926] BTRFS warning (device sdc): chunk 13631488 missing 1 devices, max tolerance is 0 for writable mount
[12918.888293] BTRFS warning (device sdc): writable mount is not allowed due to too many missing devices
[12918.890853] BTRFS error (device sdc): open_ctree failed
The failure happens because when btrfs_check_rw_degradable() is called at
mount time, or at remount from RO to RW time, is sees a non zero value in
a device's ->last_flush_error attribute, and therefore considers that the
device is 'missing'.
Fix this by setting a device's ->last_flush_error to zero when we close a
device, making sure the error is not seen on the next mount attempt. We
only need to track flush errors during the current mount, so that we never
commit a super block if such errors happened.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bbc9a6eb5e ]
There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.
It has indeed caught several bugs during subpage development.
But the BUG_ON() itself will bring down the whole system which is
an overkill.
Replace it with a WARN() and exit gracefully, so that it won't crash the
whole system while we can still catch the code logic error.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 02579b2ff8 ]
When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client
recovers by sending BIND_CONN_TO_SESSION but the server fails to recover
the back channel and leaves it as NFSD4_CB_DOWN.
Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel
by calling nfsd4_probe_callback.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 196159d278 ]
Add info for getting the firmware directly from the UEFI for the Chuwi Hi10
Plus (CWI527), so that the user does not need to manually install the
firmware in /lib/firmware/silead.
This change will make the touchscreen on these devices work OOTB,
without requiring any manual setup.
Also tweak the min and width/height values a bit for more accurate position
reporting.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210905130210.32810-2-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3bf1669b0e ]
Add touchscreen info for the Chuwi HiBook (CWI514) tablet. This includes
info for getting the firmware directly from the UEFI, so that the user does
not need to manually install the firmware in /lib/firmware/silead.
This change will make the touchscreen on these devices work OOTB,
without requiring any manual setup.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210905130210.32810-1-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5457773ef9 ]
Previously zero length transfers submitted to the Rokchip SPI driver would
time out in the SPI layer. This happens because the SPI peripheral does
not trigger a transfer completion interrupt for zero length transfers.
Fix that by completing zero length transfers immediately at start of
transfer.
Signed-off-by: Tobias Schramm <t.schramm@manjaro.org>
Link: https://lore.kernel.org/r/20210827050357.165409-1-t.schramm@manjaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit aa53f580e6 ("scsi: ufs: Minor adjustments to error handling")
introduced a ufshcd_clear_ua_wluns() call in
ufshcd_err_handling_unprepare(). As explained in detail by Adrian Hunter,
this can trigger a deadlock. Avoid that deadlock by removing the code that
clears the unit attention. This is safe because the only software that
relies on clearing unit attentions is the Android Trusty software and
because support for handling unit attentions has been added in the Trusty software.
See also https://lore.kernel.org/linux-scsi/20210930124224.114031-2-adrian.hunter@intel.com/
Note that, should apply "scsi: ufs: retry START_STOP on UNIT_ATTENTION" before
this patch, since this patch gives UNIT ATTENTION to scsi_execute(START_STOP).
Bug: 194712579
(cherry picked from commit edc0596cc0
git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Link: https://lore.kernel.org/r/20211001182015.1347587-3-jaegeuk@kernel.org
Fixes: aa53f580e6 ("scsi: ufs: Minor adjustments to error handling")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change-Id: I73656b8b6773558dc7a552700d283c1ae6dc25f7
Commit 57d104c153 ("ufs: add UFS power management support") made the UFS
driver submit a REQUEST SENSE command before submitting a power management
command to a WLUN to clear the POWER ON unit attention. Instead of
submitting a REQUEST SENSE command before submitting a power management
command, retry the power management command until it succeeds.
This is the preparation to get rid of all UNIT ATTENTION code which should
be handled by users.
Bug: 194712579
Link: https://lore.kernel.org/r/20211001182015.1347587-2-jaegeuk@kernel.org
(cherry picked from commit af21c3fd5b
git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change-Id: I7e639d89ae9fbd5ff0f1b3a6e5cbe77682ebefc0
Expose param_get_int/param_set_int for module use
Bug: 202230809
Change-Id: I5b1daa38a04024783c52881d14ef460b18da3dbc
Signed-off-by: David Kimmel <davidkimmel@google.com>
Changes in 5.10.71
tty: Fix out-of-bound vmalloc access in imageblit
cpufreq: schedutil: Use kobject release() method to free sugov_tunables
scsi: qla2xxx: Changes to support kdump kernel for NVMe BFS
cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
usb: cdns3: fix race condition before setting doorbell
ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect
fs-verity: fix signed integer overflow with i_size near S64_MAX
hwmon: (tmp421) handle I2C errors
hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field
hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field
hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field
gpio: pca953x: do not ignore i2c errors
scsi: ufs: Fix illegal offset in UPIU event trace
mac80211: fix use-after-free in CCMP/GCMP RX
x86/kvmclock: Move this_cpu_pvti into kvmclock.h
KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
KVM: x86: nSVM: don't copy virt_ext from vmcb12
KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
media: ir_toy: prevent device from hanging during transmit
RDMA/cma: Do not change route.addr.src_addr.ss_family
drm/amd/display: Pass PCI deviceid into DC
drm/amdgpu: correct initial cp_hqd_quantum for gfx9
ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
bpf: Handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
IB/cma: Do not send IGMP leaves for sendonly Multicast groups
RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure
bpf, mips: Validate conditional branch offsets
hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
mac80211: Fix ieee80211_amsdu_aggregate frag_tail bug
mac80211: limit injected vht mcs/nss in ieee80211_parse_tx_radiotap
mac80211: mesh: fix potentially unaligned access
mac80211-hwsim: fix late beacon hrtimer handling
sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
mptcp: don't return sockets in foreign netns
hwmon: (tmp421) report /PVLD condition as fault
hwmon: (tmp421) fix rounding for negative values
net: enetc: fix the incorrect clearing of IF_MODE bits
net: ipv4: Fix rtnexthop len when RTA_FLOW is present
smsc95xx: fix stalled rx after link change
drm/i915/request: fix early tracepoints
dsa: mv88e6xxx: 6161: Use chip wide MAX MTU
dsa: mv88e6xxx: Fix MTU definition
dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports
e100: fix length calculation in e100_get_regs_len
e100: fix buffer overrun in e100_get_regs
RDMA/hns: Fix inaccurate prints
bpf: Exempt CAP_BPF from checks against bpf_jit_limit
selftests, bpf: Fix makefile dependencies on libbpf
selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
net: ks8851: fix link error
Revert "block, bfq: honor already-setup queue merges"
scsi: csiostor: Add module softdep on cxgb4
ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
net: hns3: do not allow call hns3_nic_net_open repeatedly
net: hns3: keep MAC pause mode when multiple TCs are enabled
net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
net: hns3: fix show wrong state when add existing uc mac address
net: hns3: fix prototype warning
net: hns3: reconstruct function hns3_self_test
net: hns3: fix always enable rx vlan filter problem after selftest
net: phy: bcm7xxx: Fixed indirect MMD operations
net: sched: flower: protect fl_walk() with rcu
af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
perf/x86/intel: Update event constraints for ICX
hwmon: (pmbus/mp2975) Add missed POUT attribute for page 1 mp2975 controller
nvme: add command id quirk for apple controllers
elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings
debugfs: debugfs_create_file_size(): use IS_ERR to check for error
ipack: ipoctal: fix stack information leak
ipack: ipoctal: fix tty registration race
ipack: ipoctal: fix tty-registration error handling
ipack: ipoctal: fix missing allocation-failure check
ipack: ipoctal: fix module reference leak
ext4: fix loff_t overflow in ext4_max_bitmap_size()
ext4: limit the number of blocks in one ADD_RANGE TLV
ext4: fix reserved space counter leakage
ext4: add error checking to ext4_ext_replay_set_iblocks()
ext4: fix potential infinite loop in ext4_dx_readdir()
HID: u2fzero: ignore incomplete packets without data
net: udp: annotate data race around udp_sk(sk)->corkflag
ASoC: dapm: use component prefix when checking widget names
usb: hso: remove the bailout parameter
crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd()
HID: betop: fix slab-out-of-bounds Write in betop_probe
netfilter: ipset: Fix oversized kvmalloc() calls
mm: don't allow oversized kvmalloc() calls
HID: usbhid: free raw_report buffers in usbhid_stop
KVM: x86: Handle SRCU initialization failure during page track init
netfilter: conntrack: serialize hash resizes and cleanups
netfilter: nf_tables: Fix oversized kvmalloc() calls
Linux 5.10.71
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1240e641c378df6c8da41fb8ce5894db5e34a5be
commit e9edc188fc upstream.
Syzbot was able to trigger the following warning [1]
No repro found by syzbot yet but I was able to trigger similar issue
by having 2 scripts running in parallel, changing conntrack hash sizes,
and:
for j in `seq 1 1000` ; do unshare -n /bin/true >/dev/null ; done
It would take more than 5 minutes for net_namespace structures
to be cleaned up.
This is because nf_ct_iterate_cleanup() has to restart everytime
a resize happened.
By adding a mutex, we can serialize hash resizes and cleanups
and also make get_next_corpse() faster by skipping over empty
buckets.
Even without resizes in the picture, this patch considerably
speeds up network namespace dismantles.
[1]
INFO: task syz-executor.0:8312 can't die for more than 144 seconds.
task:syz-executor.0 state:R running task stack:25672 pid: 8312 ppid: 6573 flags:0x00004006
Call Trace:
context_switch kernel/sched/core.c:4955 [inline]
__schedule+0x940/0x26f0 kernel/sched/core.c:6236
preempt_schedule_common+0x45/0xc0 kernel/sched/core.c:6408
preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:35
__local_bh_enable_ip+0x109/0x120 kernel/softirq.c:390
local_bh_enable include/linux/bottom_half.h:32 [inline]
get_next_corpse net/netfilter/nf_conntrack_core.c:2252 [inline]
nf_ct_iterate_cleanup+0x15a/0x450 net/netfilter/nf_conntrack_core.c:2275
nf_conntrack_cleanup_net_list+0x14c/0x4f0 net/netfilter/nf_conntrack_core.c:2469
ops_exit_list+0x10d/0x160 net/core/net_namespace.c:171
setup_net+0x639/0xa30 net/core/net_namespace.c:349
copy_net_ns+0x319/0x760 net/core/net_namespace.c:470
create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc1/0x1f0 kernel/nsproxy.c:226
ksys_unshare+0x445/0x920 kernel/fork.c:3128
__do_sys_unshare kernel/fork.c:3202 [inline]
__se_sys_unshare kernel/fork.c:3200 [inline]
__x64_sys_unshare+0x2d/0x40 kernel/fork.c:3200
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f63da68e739
RSP: 002b:00007f63d7c05188 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f63da792f80 RCX: 00007f63da68e739
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000000
RBP: 00007f63da6e8cc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f63da792f80
R13: 00007fff50b75d3f R14: 00007f63d7c05300 R15: 0000000000022000
Showing all locks held in the system:
1 lock held by khungtaskd/27:
#0: ffffffff8b980020 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446
2 locks held by kworker/u4:2/153:
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1198 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:634 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x896/0x1690 kernel/workqueue.c:2268
#1: ffffc9000140fdb0 ((kfence_timer).work){+.+.}-{0:0}, at: process_one_work+0x8ca/0x1690 kernel/workqueue.c:2272
1 lock held by systemd-udevd/2970:
1 lock held by in:imklog/6258:
#0: ffff88807f970ff0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990
3 locks held by kworker/1:6/8158:
1 lock held by syz-executor.0/8312:
2 locks held by kworker/u4:13/9320:
1 lock held by syz-executor.5/10178:
1 lock held by syz-executor.4/10217:
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7661809d49 upstream.
'kvmalloc()' is a convenience function for people who want to do a
kmalloc() but fall back on vmalloc() if there aren't enough physically
contiguous pages, or if the allocation is larger than what kmalloc()
supports.
However, let's make sure it doesn't get _too_ easy to do crazy things
with it. In particular, don't allow big allocations that could be due
to integer overflow or underflow. So make sure the allocation size fits
in an 'int', to protect against trivial integer conversion issues.
Acked-by: Willy Tarreau <w@1wt.eu>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>