Like all lz77-based algrithms, lz4 has a dynamically populated
("sliding window") dictionary and the maximum lookback distance
is 65535. Therefore the number of bounced pages could be limited
by erofs based on this property.
However, just now we observed some lz4 sequences in the extreme
case cannot be decompressed correctly after this feature is enabled,
the root causes after analysis are clear as follows:
1) max bounced pages should be 17 rather than 16 pages;
2) considering the following case, the broken implementation
could reuse unsafely in advance (in other words, reuse it
less than a safe distance),
0 1 2 ... 16 17 18 ... 33 34
b p b b
note that the bounce page that we are concerned was allocated
at 0, and it reused at 18 since page 17 exists, but it mis-reused
at 34 in advance again, which causes decompress failure.
This patch resolves the issue by introducing a bitmap to mark
whether the page in the same position of last round is a bounced
page or not, and a micro stack data structure to store all
available bounced pages.
Fixes: 7fc45dbc93 ("staging: erofs: introduce generic decompression backend")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In multi-window mode, the read iterator is supposed to start from the
window with the oldest data, which is, chronologically, the next window
after the one with the newest data. This, however, fails to take into
account the potentially empty windows, so in short trace sessions it's
possible to have a lot of zeroes read from the character device first.
Fix this by skipping over the empty windows in initialization of the
read iterator.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20190627125152.54905-5-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit ba39bd8306 ("intel_th: msu: Switch over to scatterlist")
introduced the following warnings on non-x86 architectures, as a result
of reordering the multi mode buffer allocation sequence:
> drivers/hwtracing/intel_th/msu.c: In function ‘msc_buffer_win_alloc’:
> drivers/hwtracing/intel_th/msu.c:783:21: warning: unused variable ‘i’
> [-Wunused-variable]
> int ret = -ENOMEM, i;
> ^
> drivers/hwtracing/intel_th/msu.c: In function ‘msc_buffer_win_free’:
> drivers/hwtracing/intel_th/msu.c:863:6: warning: unused variable ‘i’
> [-Wunused-variable]
> int i;
> ^
Fix this compiler warning by factoring out set_memory sequences and making
them x86-only.
Suggested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Fixes: ba39bd8306 ("intel_th: msu: Switch over to scatterlist")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190621161930.60785-2-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change return types of below functions as they never fails
xfs_log_mount_cancel
xlog_recover_cancel
xlog_recover_cancel_intents
fix below issue reported by coccicheck
fs/xfs/xfs_log_recover.c:4886:7-12: Unneeded variable: "error". Return
"0" on line 4926
Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Create a pwork destroy function that uses polling instead of
uninterruptible sleep to wait for work items to finish so that we can
touch the softlockup watchdog. IOWs, gross hack.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Stanislav Fomichev says:
====================
Congestion control team would like to have a periodic callback to
track some TCP statistics. Let's add a sock_ops callback that can be
selectively enabled on a socket by socket basis and is executed for
every RTT. BPF program frequency can be further controlled by calling
bpf_ktime_get_ns and bailing out early.
I run neper tcp_stream and tcp_rr tests with the sample program
from the last patch and didn't observe any noticeable performance
difference.
v2:
* add a comment about second accept() in selftest (Yonghong Song)
* refer to tcp_bpf.readme in sample program (Yonghong Song)
====================
Suggested-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Priyaranjan Jha <priyarjha@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The FSGSBASE series turned out to have serious bugs and there is still an
open issue which is not fully understood yet.
The confidence in those changes has become close to zero especially as the
test cases which have been shipped with that series were obviously never
run before sending the final series out to LKML.
./fsgsbase_64 >/dev/null
Segmentation fault
As the merge window is close, the only sane decision is to revert FSGSBASE
support. The revert is necessary as this branch has been merged into
perf/core already and rebasing all of that a few days before the merge
window is not the most brilliant idea.
I could definitely slap myself for not noticing the test case fail when
merging that series, but TBH my expectations weren't that low back
then. Won't happen again.
Revert the following commits:
539bca535d ("x86/entry/64: Fix and clean up paranoid_exit")
2c7b5ac5d5 ("Documentation/x86/64: Add documentation for GS/FS addressing mode")
f987c955c7 ("x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2")
2032f1f96e ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
5bf0cab60e ("x86/entry/64: Document GSBASE handling in the paranoid path")
708078f657 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
79e1932fa3 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
1d07316b13 ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry")
f60a83df45 ("x86/process/64: Use FSGSBASE instructions on thread copy and ptrace")
1ab5f3f7fe ("x86/process/64: Use FSBSBASE in switch_to() if available")
a86b462513 ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions")
8b71340d70 ("x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions")
b64ed19b93 ("x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Create a parallel iwalk implementation and switch quotacheck to use it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
kmemdup is introduced to duplicate a region of memory in a neat way.
Rather than kmalloc/kzalloc + memset, which the programmer needs to
write the size twice (sometimes lead to mistakes), kmemdup improves
readability, leads to smaller code and also reduce the chances of mistakes.
Suggestion to use kmemdup rather than using kmalloc/kzalloc + memset.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Link: https://lore.kernel.org/r/20190703131727.25735-1-huangfq.daxian@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
When DMA is used, waiting for completion must not be
interruptible as it can generate an error that is not handle
by the driver. There is no need to put the completion
interruptible in this driver.
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Correct condition for the second hmac loop. Key must be only
set in the first loop. Initial condition was wrong,
HMAC_KEY flag was not properly checked.
Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The trailing newlines will lead to extra newlines in the trace file
which looks like the following output, so remove it.
qemu-system-x86-15695 [002] ...1 15774.839240: kvm_hv_timer_state: vcpu_id 0 hv_timer 1
qemu-system-x86-15695 [002] ...1 15774.839309: kvm_hv_timer_state: vcpu_id 0 hv_timer 1
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow testing code for old processors that lack the next RIP save
feature, by disabling usage of the next_rip field.
Nested hypervisors however get the feature unconditionally.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In commit af7ddd8a62
("Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping"),
dma_alloc_coherent has already zeroed the memory.
So memset is not needed.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The error code read from the queue status register is only 6 bits wide,
but we need to verify its value is within range before indexing the error
messages.
Fixes: 81422badb3 ("crypto: ccp - Make syslog errors human-readable")
Cc: <stable@vger.kernel.org>
Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tidy up the formatting/grammar in crypto_engine.rst. Use bulleted lists
where appropriate.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fill in missing parameter descriptions for the compression algorithm,
then pick them up to document for the compression_alg structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This implements 5-way interleaving for ECB, CBC decryption and CTR,
resulting in a speedup of ~11% on Marvell ThunderX2, which has a
very deep pipeline and therefore a high issue latency for NEON
instructions operating on the same registers.
Note that XTS is left alone: implementing 5-way interleave there
would either involve spilling of the calculated tweaks to the
stack, or recalculating them after the encryption operation, and
doing either of those would most likely penalize low end cores.
For ECB, this is not a concern at all, given that we have plenty
of spare registers. For CTR and CBC decryption, we take advantage
of the fact that v16 is not used by the CE version of the code
(which is the only one targeted by the optimization), and so we
can reshuffle the code a bit and avoid having to spill to memory
(with the exception of one extra reload in the CBC routine)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation of tweaking the accelerated AES chaining mode routines
to be able to use a 5-way stride, implement the core routines to
support processing 5 blocks of input at a time. While at it, drop
the 2 way versions, which have been unused for a while now.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On SEC1, hash provides wrong result when performing hashing in several
steps with input data SG list has more than one element. This was
detected with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:
[ 44.185947] alg: hash: md5-talitos test failed (wrong result) on test vector 6, cfg="random: may_sleep use_finup src_divs=[<reimport>25.88%@+8063, <flush>24.19%@+9588, 28.63%@+16333, <reimport>4.60%@+6756, 16.70%@+16281] dst_divs=[71.61%@alignmask+16361, 14.36%@+7756, 14.3%@+"
[ 44.325122] alg: hash: sha1-talitos test failed (wrong result) on test vector 3, cfg="random: inplace use_final src_divs=[<flush,nosimd>16.56%@+16378, <reimport>52.0%@+16329, 21.42%@alignmask+16380, 10.2%@alignmask+16380] iv_offset=39"
[ 44.493500] alg: hash: sha224-talitos test failed (wrong result) on test vector 4, cfg="random: use_final nosimd src_divs=[<reimport>52.27%@+7401, <reimport>17.34%@+16285, <flush>17.71%@+26, 12.68%@+10644] iv_offset=43"
[ 44.673262] alg: hash: sha256-talitos test failed (wrong result) on test vector 4, cfg="random: may_sleep use_finup src_divs=[<reimport>60.6%@+12790, 17.86%@+1329, <reimport>12.64%@alignmask+16300, 8.29%@+15, 0.40%@+13506, <reimport>0.51%@+16322, <reimport>0.24%@+16339] dst_divs"
This is due to two issues:
- We have an overlap between the buffer used for copying the input
data (SEC1 doesn't do scatter/gather) and the chained descriptor.
- Data copy is wrong when the previous hash left less than one
blocksize of data to hash, implying a complement of the previous
block with a few bytes from the new request.
Fix it by:
- Moving the second descriptor after the buffer, as moving the buffer
after the descriptor would make it more complex for other cipher
operations (AEAD, ABLKCIPHER)
- Skip the bytes taken from the new request to complete the previous
one by moving the SG list forward.
Fixes: 37b5e8897e ("crypto: talitos - chain in buffered data for ahash on SEC1")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Moves struct talitos_edesc into talitos.h so that it can be used
from any place in talitos.c
It will be required for next patch ("crypto: talitos - fix hash
on SEC1")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All mapping iterator logic is based on the assumption that sg->offset
is always lower than PAGE_SIZE.
But there are situations where sg->offset is such that the SG item
is on the second page. In that case sg_copy_to_buffer() fails
properly copying the data into the buffer. One of the reason is
that the data will be outside the kmapped area used to access that
data.
This patch fixes the issue by adjusting the mapping iterator
offset and pgoffset fields such that offset is always lower than
PAGE_SIZE.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Fixes: 4225fc8555 ("lib/scatterlist: use page iterator in the mapping iterator")
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>