commit 701d678599 upstream.
In zs_destroy_pool() we call flush_work(&pool->free_work). However, we
have no guarantee that migration isn't happening in the background at
that time.
Since migration can't directly free pages, it relies on free_work being
scheduled to free the pages. But there's nothing preventing an
in-progress migrate from queuing the work *after*
zs_unregister_migration() has called flush_work(). Which would mean
pages still pointing at the inode when we free it.
Since we know at destroy time all objects should be free, no new
migrations can come in (since zs_page_isolate() fails for fully-free
zspages). This means it is sufficient to track a "# isolated zspages"
count by class, and have the destroy logic ensure all such pages have
drained before proceeding. Keeping that state under the class spinlock
keeps the logic straightforward.
In this case a memory leak could lead to an eventual crash if compaction
hits the leaked page. This crash would only occur if people are
changing their zswap backend at runtime (which eventually starts
destruction).
Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
Fixes: 48b4800a1c ("zsmalloc: page migration support")
Signed-off-by: Henry Burns <henryburns@google.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Jonathan Adams <jwadams@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0ff14fdc9 upstream.
If alloc_descs() fails before irq_sysfs_init() has run, free_desc() in the
cleanup path will call kobject_del() even though the kobject has not been
added with kobject_add().
Fix this by making the call to kobject_del() conditional on whether
irq_sysfs_init() has run.
This problem surfaced because commit aa30f47cf6 ("kobject: Add support
for default attribute groups to kobj_type") makes kobject_del() stricter
about pairing with kobject_add(). If the pairing is incorrrect, a WARNING
and backtrace occur in sysfs_remove_group() because there is no parent.
[ tglx: Add a comment to the code and make it work with CONFIG_SYSFS=n ]
Fixes: ecb3f394c5 ("genirq: Expose interrupt information through sysfs")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564703564-4116-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75d66ffb48 upstream.
dm-zoned is observed to lock up or livelock in case of hardware
failure or some misconfiguration of the backing zoned device.
This patch adds a new dm-zoned target function that checks the status of
the backing device. If the request queue of the backing device is found
to be in dying state or the SCSI backing device enters offline state,
the health check code sets a dm-zoned target flag prompting all further
incoming I/O to be rejected. In order to detect backing device failures
timely, this new function is called in the request mapping path, at the
beginning of every reclaim run and before performing any metadata I/O.
The proper way out of this situation is to do
dmsetup remove <dm-zoned target>
and recreate the target when the problem with the backing device
is resolved.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7428c5011 upstream.
Some errors are ignored in the I/O path during queueing chunks
for processing by chunk works. Since at least these errors are
transient in nature, it should be possible to retry the failed
incoming commands.
The fix -
Errors that can happen while queueing chunks are carried upwards
to the main mapping function and it now returns DM_MAPIO_REQUEUE
for any incoming requests that can not be properly queued.
Error logging/debug messages are added where needed.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b234c6d7a7 upstream.
There are several places in reclaim code where errors are not
propagated to the main function, dmz_reclaim(). This function
is responsible for unlocking zones that might be still locked
at the end of any failed reclaim iterations. As the result,
some device zones may be left permanently locked for reclaim,
degrading target's capability to reclaim zones.
This patch fixes these issues as follows -
Make sure that dmz_reclaim_buf(), dmz_reclaim_seq_data() and
dmz_reclaim_rnd_data() return error codes to the caller.
dmz_reclaim() function is renamed to dmz_do_reclaim() to avoid
clashing with "struct dmz_reclaim" and is modified to return the
error to the caller.
dmz_get_zone_for_reclaim() now returns an error instead of NULL
pointer and reclaim code checks for that error.
Error logging/debug messages are added where necessary.
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cfd5d3399 upstream.
If the sector number is too high, dm_table_find_target() should return a
pointer to a zeroed dm_target structure (the caller should test it with
dm_target_is_valid).
However, for some table sizes, the code in dm_table_find_target() that
performs btree lookup will access out of bound memory structures.
Fix this bug by testing the sector number at the beginning of
dm_table_find_target(). Also, add an "inline" keyword to the function
dm_table_get_size() because this is a hot path.
Fixes: 512875bd96 ("dm: table detect io beyond device")
Cc: stable@vger.kernel.org
Reported-by: Zhang Tao <kontais@zoho.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae148243d3 upstream.
In commit 6096d91af0 ("dm space map metadata: fix occasional leak
of a metadata block on resize"), we refactor the commit logic to a new
function 'apply_bops'. But when that logic was replaced in out() the
return value was not stored. This may lead out() returning a wrong
value to the caller.
Fixes: 6096d91af0 ("dm space map metadata: fix occasional leak of a metadata block on resize")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc1a3e8e0c upstream.
If rs_prepare_reshape() fails, no cleanup is executed, leading to
leak of the raid_set structure allocated at the beginning of
raid_ctr(). To fix this issue, go to the label 'bad' if the error
occurs.
Fixes: 11e4723206 ("dm raid: stop keeping raid set frozen altogether")
Cc: stable@vger.kernel.org
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5729b6e5a1 upstream.
Fix a crash that was introduced by the commit 724376a04d. The crash is
reported here: https://gitlab.com/cryptsetup/cryptsetup/issues/468
When reading from the integrity device, the function
dm_integrity_map_continue calls find_journal_node to find out if the
location to read is present in the journal. Then, it calculates how many
sectors are consecutively stored in the journal. Then, it locks the range
with add_new_range and wait_and_add_new_range.
The problem is that during wait_and_add_new_range, we hold no locks (we
don't hold ic->endio_wait.lock and we don't hold a range lock), so the
journal may change arbitrarily while wait_and_add_new_range sleeps.
The code then goes to __journal_read_write and hits
BUG_ON(journal_entry_get_sector(je) != logical_sector); because the
journal has changed.
In order to fix this bug, we need to re-check the journal location after
wait_and_add_new_range. We restrict the length to one block in order to
not complicate the code too much.
Fixes: 724376a04d ("dm integrity: implement fair range locks")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4f9d60138 upstream.
When btree_split_beneath() splits a node to two new children, it will
allocate two blocks: left and right. If right block's allocation
failed, the left block will be unlocked and marked dirty. If this
happened, the left block'ss content is zero, because it wasn't
initialized with the btree struct before the attempot to allocate the
right block. Upon return, when flushing the left block to disk, the
validator will fail when check this block. Then a BUG_ON is raised.
Fix this by completely initializing the left block before allocating and
initializing the right block.
Fixes: 4dcb8b57df ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path")
Cc: stable@vger.kernel.org
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1fef41465 upstream.
This patch fixes a problem in dm-kcopyd that may leave jobs in
complete queue indefinitely in the event of backing storage failure.
This behavior has been observed while running 100% write file fio
workload against an XFS volume created on top of a dm-zoned target
device. If the underlying storage of dm-zoned goes to offline state
under I/O, kcopyd sometimes never issues the end copy callback and
dm-zoned reclaim work hangs indefinitely waiting for that completion.
This behavior was traced down to the error handling code in
process_jobs() function that places the failed job to complete_jobs
queue, but doesn't wake up the job handler. In case of backing device
failure, all outstanding jobs may end up going to complete_jobs queue
via this code path and then stay there forever because there are no
more successful I/O jobs to wake up the job handler.
This patch adds a wake() call to always wake up kcopyd job wait queue
for all I/O jobs that fail before dm_io() gets called for that job.
The patch also sets the write error status in all sub jobs that are
failed because their master job has failed.
Fixes: b73c67c2cb ("dm kcopyd: add sequential write feature")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a90118c445 upstream.
Recent gcc compilers (gcc 9.1) generate warnings about an out of bounds
memset, if the memset goes accross several fields of a struct. This
generated a couple of warnings on x86_64 builds in sanitize_boot_params().
Fix this by explicitly saving the fields in struct boot_params
that are intended to be preserved, and zeroing all the rest.
[ tglx: Tagged for stable as it breaks the warning free build there as well ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190731054627.5627-2-jhubbard@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f897e60a12 upstream.
Some newer machines do not advertise legacy timers. The kernel can handle
that situation if the TSC and the CPU frequency are enumerated by CPUID or
MSRs and the CPU supports TSC deadline timer. If the CPU does not support
TSC deadline timer the local APIC timer frequency has to be known as well.
Some Ryzens machines do not advertize legacy timers, but there is no
reliable way to determine the bus frequency which feeds the local APIC
timer when the machine allows overclocking of that frequency.
As there is no legacy timer the local APIC timer calibration crashes due to
a NULL pointer dereference when accessing the not installed global clock
event device.
Switch the calibration loop to a non interrupt based one, which polls
either TSC (if frequency is known) or jiffies. The latter requires a global
clockevent. As the machines which do not have a global clockevent installed
have a known TSC frequency this is a non issue. For older machines where
TSC frequency is not known, there is no known case where the legacy timers
do not exist as that would have been reported long ago.
Reported-by: Daniel Drake <drake@endlessm.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b63f20a778 upstream.
Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to
avoid clobbering flags.
KVM's emulator makes indirect calls into a jump table of sorts, where
the destination of the CALL_NOSPEC is a small blob of code that performs
fast emulation by executing the target instruction with fixed operands.
adcb_al_dl:
0x000339f8 <+0>: adc %dl,%al
0x000339fa <+2>: ret
A major motiviation for doing fast emulation is to leverage the CPU to
handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
both an input and output to the target of CALL_NOSPEC. Clobbering flags
results in all sorts of incorrect emulation, e.g. Jcc instructions often
take the wrong path. Sans the nops...
asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
0x0003595a <+58>: mov 0xc0(%ebx),%eax
0x00035960 <+64>: mov 0x60(%ebx),%edx
0x00035963 <+67>: mov 0x90(%ebx),%ecx
0x00035969 <+73>: push %edi
0x0003596a <+74>: popf
0x0003596b <+75>: call *%esi
0x000359a0 <+128>: pushf
0x000359a1 <+129>: pop %edi
0x000359a2 <+130>: mov %eax,0xc0(%ebx)
0x000359b1 <+145>: mov %edx,0x60(%ebx)
ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
0x000359a8 <+136>: mov -0x10(%ebp),%eax
0x000359ab <+139>: and $0x8d5,%edi
0x000359b4 <+148>: and $0xfffff72a,%eax
0x000359b9 <+153>: or %eax,%edi
0x000359bd <+157>: mov %edi,0x4(%ebx)
For the most part this has gone unnoticed as emulation of guest code
that can trigger fast emulation is effectively limited to MMIO when
running on modern hardware, and MMIO is rarely, if ever, accessed by
instructions that affect or consume flags.
Breakage is almost instantaneous when running with unrestricted guest
disabled, in which case KVM must emulate all instructions when the guest
has invalid state, e.g. when the guest is in Big Real Mode during early
BIOS.
Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
Fixes: 1a29b5b7f3 ("KVM: x86: Make indirect calls in emulator speculation safe")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9fc4340ae upstream.
In the case of X86_PAE, unsigned long is u32, but the physical address type
should be u64. Due to the bug here, the netvsc driver can not load
successfully, and sometimes the VM can panic due to memory corruption (the
hypervisor writes data to the wrong location).
Fixes: 6ba34171bc ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
Cc: stable@vger.kernel.org
Cc: Michael Kelley <mikelley@microsoft.com>
Reported-and-tested-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c60e6b5c9 upstream.
If the driver doesn't support open-drain/source config options, we
emulate this behavior when setting the direction by calling
gpiod_direction_input() if the default value is 0 (open-source) or
1 (open-drain), thus not actively driving the line in those cases.
This however clears the FLAG_IS_OUT bit for the GPIO line descriptor
and makes the LINEINFO ioctl() incorrectly report this line's mode as
'input' to user-space.
This commit modifies the ioctl() to always set the GPIOLINE_FLAG_IS_OUT
bit in the lineinfo structure's flags field. Since it's impossible to
use the input mode and open-drain/source options at the same time, we
can be sure the reported information will be correct.
Fixes: 521a2ad6f8 ("gpio: add userspace ABI for GPIO line information")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Link: https://lore.kernel.org/r/20190806114151.17652-1-brgl@bgdev.pl
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c358ebf596 upstream.
While I had thought I had fixed this issue in:
commit 342406e4fb ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")
It turns out that while I did fix the error messages I was seeing on my
P50 when trying to access i2c busses with the GPU in runtime suspend, I
accidentally had missed one important detail that was mentioned on the
bug report this commit was supposed to fix: that the CPU would only lock
up when trying to access i2c busses _on connected devices_ _while the
GPU is not in runtime suspend_. Whoops. That definitely explains why I
was not able to get my machine to hang with i2c bus interactions until
now, as plugging my P50 into it's dock with an HDMI monitor connected
allowed me to finally reproduce this locally.
Now that I have managed to reproduce this issue properly, it looks like
the problem is much simpler then it looks. It turns out that some
connected devices, such as MST laptop docks, will actually ACK i2c reads
even if no data was actually read:
[ 275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
[ 275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
[ 275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
[ 275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[ 275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[ 275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
Because we don't handle the situation of i2c ack without any data, we
end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
value of cnt always remains at 0. This finally properly explains how
this could result in a CPU hang like the ones observed in the
aforementioned commit.
So, fix this by retrying transactions if no data is written or received,
and give up and fail the transaction if we continue to not write or
receive any data after 32 retries.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a561372405 upstream.
We can't rely on ->peer_features in calc_target() because it may be
called both when the OSD session is established and open and when it's
not. ->peer_features is not valid unless the OSD session is open. If
this happens on a PG split (pg_num increase), that could mean we don't
resend a request that should have been resent, hanging the client
indefinitely.
In userspace this was fixed by looking at require_osd_release and
get_xinfo[osd].features fields of the osdmap. However these fields
belong to the OSD section of the osdmap, which the kernel doesn't
decode (only the client section is decoded).
Instead, let's drop this feature check. It effectively checks for
luminous, so only pre-luminous OSDs would be affected in that on a PG
split the kernel might resend a request that should not have been
resent. Duplicates can occur in other scenarios, so both sides should
already be prepared for them: see dup/replay logic on the OSD side and
retry_attempt check on the client side.
Cc: stable@vger.kernel.org
Fixes: 7de030d6b1 ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
Link: https://tracker.ceph.com/issues/41162
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Jerry Lee <leisurelysw24@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf3591ef83 upstream.
Revert the commit bd293d071f. The proper
fix has been made available with commit d0a255e795 ("loop: set
PF_MEMALLOC_NOIO for the worker thread").
Note that the fix offered by commit bd293d071f doesn't really prevent
the deadlock from occuring - if we look at the stacktrace reported by
Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex -
i.e. it has already successfully taken the mutex. Changing the mutex
from mutex_lock to mutex_trylock won't help with deadlocks that happen
afterwards.
PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0"
#0 [ffff8813dedfb938] __schedule at ffffffff8173f405
#1 [ffff8813dedfb990] schedule at ffffffff8173fa27
#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
#13 [ffff8813dedfbec0] kthread at ffffffff810a8428
#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd293d071f ("dm bufio: fix deadlock with loop device")
Depends-on: d0a255e795 ("loop: set PF_MEMALLOC_NOIO for the worker thread")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c69509c70a ]
At the moment, the way we reset CP15 registers is mildly insane:
We write junk to them, call the reset functions, and then check that
we have something else in them.
The "fun" thing is that this can happen while the guest is running
(PSCI, for example). If anything in KVM has to evaluate the state
of a CP15 register while junk is in there, bad thing may happen.
Let's stop doing that. Instead, we track that we have called a
reset function for that register, and assume that the reset
function has done something.
In the end, the very need of this reset check is pretty dubious,
as it doesn't check everything (a lot of the CP15 reg leave outside
of the cp15_regs[] array). It may well be axed in the near future.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03fdfb2690 ]
At the moment, the way we reset system registers is mildly insane:
We write junk to them, call the reset functions, and then check that
we have something else in them.
The "fun" thing is that this can happen while the guest is running
(PSCI, for example). If anything in KVM has to evaluate the state
of a system register while junk is in there, bad thing may happen.
Let's stop doing that. Instead, we track that we have called a
reset function for that register, and assume that the reset
function has done something. This requires fixing a couple of
sysreg refinition in the trap table.
In the end, the very need of this reset check is pretty dubious,
as it doesn't check everything (a lot of the sysregs leave outside of
the sys_regs[] array). It may well be axed in the near future.
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e6e5bea2e ]
The events defined in pmu-events JSON are parsed and added into perf
tool. For fixed counters, we handle the encodings between JSON and perf
by using a static array fixed[].
But the fixed[] has missed an important event "cpu_clk_unhalted.core".
For example, on the Tremont platform,
[root@localhost ~]# perf stat -e cpu_clk_unhalted.core -a
event syntax error: 'cpu_clk_unhalted.core'
\___ parser error
With this patch, the event cpu_clk_unhalted.core can be parsed.
[root@localhost perf]# ./perf stat -e cpu_clk_unhalted.core -a -vvv
------------------------------------------------------------
perf_event_attr:
type 4
size 112
config 0x3c
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
...
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190729072755.2166-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b7c3b86f0 ]
Currently when too many retries have occurred there is a memory
leak on the allocation for reply on the error return path. Fix
this by kfree'ing reply before returning.
Addresses-Coverity: ("Resource leak")
Fixes: a9cd9c044a ("drm/vmwgfx: Add a check to handle host message failure")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 04f5bda84b ]
When building with W=1, warnings about missing prototypes are emitted:
CC arch/x86/lib/cpu.o
arch/x86/lib/cpu.c:5:14: warning: no previous prototype for 'x86_family' [-Wmissing-prototypes]
5 | unsigned int x86_family(unsigned int sig)
| ^~~~~~~~~~
arch/x86/lib/cpu.c:18:14: warning: no previous prototype for 'x86_model' [-Wmissing-prototypes]
18 | unsigned int x86_model(unsigned int sig)
| ^~~~~~~~~
arch/x86/lib/cpu.c:33:14: warning: no previous prototype for 'x86_stepping' [-Wmissing-prototypes]
33 | unsigned int x86_stepping(unsigned int sig)
| ^~~~~~~~~~~~
Add the proper include file so the prototypes are there.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/42513.1565234837@turing-police
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 752ead4449 ]
Abort processing of a command if we run out of mapped data in the
SG list. This should never happen, but a previous bug caused it to
be possible. Play it safe and attempt to abort nicely if we don't
have more SG segments left.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d72715017 ]
For passthrough requests, libata-scsi takes what the user passes in
as gospel. This can be problematic if the user fills in the CDB
incorrectly. One example of that is in request sizes. For read/write
commands, the CDB contains fields describing the transfer length of
the request. These should match with the SG_IO header fields, but
libata-scsi currently does no validation of that.
Check that the number of blocks in the CDB for passthrough requests
matches what was mapped into the request. If the CDB asks for more
data then the validated SG_IO header fields, error it.
Reported-by: Krishna Ram Prakash R <krp@gtux.in>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 96a50c0d90 ]
On the arm64 platform, executing "ifconfig eth0 up" will fail,
returning "ifconfig: SIOCSIFFLAGS: Input/output error."
ndev->dev is not initialized, dma_map_single->get_dma_ops->
dummy_dma_ops->__dummy_map_page will return DMA_ERROR_CODE
directly, so when we use dma_map_single, the first parameter
is to use the device of platform_device.
Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2243b8278 ]
TX_DESC_NUM is 256, in tx_count, the maximum value of
mod(TX_DESC_NUM - 1) is 254, the variable "count" in
the hip04_mac_start_xmit function is never equal to
(TX_DESC_NUM - 1), so hip04_mac_start_xmit never
return NETDEV_TX_BUSY.
tx_count is modified to mod(TX_DESC_NUM) so that
the maximum value of tx_count can reach
(TX_DESC_NUM - 1), then hip04_mac_start_xmit can reurn
NETDEV_TX_BUSY.
Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a6a1385a4 ]
Do not try to return a fragment entry from TC list. Otherwise we may not
clean properly allocated entries.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8df7e8c23 ]
When queues >= 4 we use different registers but we were not subtracting
the offset of 4. Fix this.
Found out by Coverity.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit debea2cd31 ]
A call to 'kfree_skb()' is missing in the error handling path of
'init_one()'.
This is already present in 'remove_one()' but is missing here.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24350fdadb ]
Perf relies on _etext and _stext symbols being one of 't', 'T', 'v' or
'V'. Put them into .text section to guarantee that.
Also moves padding to page boundary inside .text which has an effect that
.text section is now padded with nops rather than 0's, which apparently
has been the initial intention for specifying 0x0700 fill expression.
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3edeb4a414 ]
When a reconnect happens in the middle of processing a compound chain
the code leaks a buffer from the memory pool. Fix this by properly
checking for a return code and freeing buffers in case of error.
Also maintain a buf variable to be equal to either smallbuf or bigbuf
depending on a response buffer size while parsing a chain and when
returning to the caller.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f7ccbed656 ]
In commit fe64ba5c63 ("drm/rockchip: Resume DP early") we moved
resume to be early but left suspend at its normal time. This seems
like it could be OK, but casues problems if a suspend gets interrupted
partway through. The OS only balances matching suspend/resume levels.
...so if suspend was called then resume will be called. If suspend
late was called then resume early will be called. ...but if suspend
was called resume early might not get called. This leads to an
unbalance in the clock enables / disables.
Lets take the simple fix and just move suspend to be late to match.
This makes the PM core take proper care in keeping things balanced.
Fixes: fe64ba5c63 ("drm/rockchip: Resume DP early")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184616.44822-1-dianders@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>