Commit Graph

677432 Commits

Author SHA1 Message Date
Helen Koike
f9f38e3338 nvme: improve performance for virtual NVMe devices
This change provides a mechanism to reduce the number of MMIO doorbell
writes for the NVMe driver. When running in a virtualized environment
like QEMU, the cost of an MMIO is quite hefy here. The main idea for
the patch is provide the device two memory location locations:
 1) to store the doorbell values so they can be lookup without the doorbell
    MMIO write
 2) to store an event index.
I believe the doorbell value is obvious, the event index not so much.
Similar to the virtio specification, the virtual device can tell the
driver (guest OS) not to write MMIO unless you are writing past this
value.

FYI: doorbell values are written by the nvme driver (guest OS) and the
event index is written by the virtual device (host OS).

The patch implements a new admin command that will communicate where
these two memory locations reside. If the command fails, the nvme
driver will work as before without any optimizations.

Contributions:
  Eric Northup <digitaleric@google.com>
  Frank Swiderski <fes@google.com>
  Ted Tso <tytso@mit.edu>
  Keith Busch <keith.busch@intel.com>

Just to give an idea on the performance boost with the vendor
extension: Running fio [1], a stock NVMe driver I get about 200K read
IOPs with my vendor patch I get about 1000K read IOPs. This was
running with a null device i.e. the backing device simply returned
success on every read IO request.

[1] Running on a 4 core machine:
  fio --time_based --name=benchmark --runtime=30
  --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32
  --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4
  --rw=randread --blocksize=4k --randrepeat=false

Signed-off-by: Rob Nelson <rlnelson@google.com>
[mlin: port for upstream]
Signed-off-by: Ming Lin <mlin@kernel.org>
[koike: updated for upstream]
Signed-off-by: Helen Koike <helen.koike@collabora.co.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
2017-04-21 16:41:47 +02:00
Keith Busch
81c1cd9835 nvme/pci: Don't set reserved SQ create flags
The QPRIO field is only valid if weighted round robin arbitration is used,
and this driver doesn't enable that controller configuration option.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2017-04-21 16:41:46 +02:00
Florian Fainelli
8e6c1812e6 net: dsa: Remove redundant NULL dst check
tag_lan9303.c does check for a NULL dst but that's already checked by
dsa_switch_rcv() one layer above.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Juergen Borleis <jbe@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 10:41:24 -04:00
Johannes Thumshirn
c5ce0abeb6 scsi: sas: move scsi_remove_host call into sas_remove_host
Move scsi_remove_host call into sas_remove_host and remove it from SAS
HBA drivers, so we don't mess up the ordering. This solves an issue with
double deleting sysfs entries that was introduced by the change of sysfs
behaviour from commit bcdde7e221 ("sysfs: make __sysfs_remove_dir()
recursive").

[mkp: addressed checkpatch complaints]

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Suggested-by: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: James Bottomley <jejb@linux.vnet.ibm.com>
Cc: Jinpu Wang <jinpu.wang@profitbricks.com>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jinpu Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-21 10:34:29 -04:00
Colin Ian King
20961065a8 scsi: BusLogic: fix incorrect spelling of adatper_reset_req
Trivial fix to spelling mistake, adatper_reset_req should be
adapter_reset_req.  Also break up very long seq_printf statement into
multiple lines.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-21 10:31:33 -04:00
Kees Cook
b34b10a725 scsi: bfa: use designated initializers
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. This also initializes the
array members using the enum used to look up __port_action entries.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-21 10:11:34 -04:00
Jens Axboe
99c749a4c4 blk-stat: kill blk_stat_rq_ddir()
No point in providing and exporting this helper. There's just
one (real) user of it, just use rq_data_dir().

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-21 07:56:23 -06:00
Paul E. McKenney
f2094107ac Merge branches 'doc.2017.04.12a', 'fixes.2017.04.19a' and 'srcu.2017.04.21a' into HEAD
doc.2017.04.12a: Documentation updates
fixes.2017.04.19a: Miscellaneous fixes
srcu.2017.04.21a: Parallelize SRCU callback handling
2017-04-21 06:00:13 -07:00
Paul E. McKenney
bcbfdd01dc rcu: Make non-preemptive schedule be Tasks RCU quiescent state
Currently, a call to schedule() acts as a Tasks RCU quiescent state
only if a context switch actually takes place.  However, just the
call to schedule() guarantees that the calling task has moved off of
whatever tracing trampoline that it might have been one previously.
This commit therefore plumbs schedule()'s "preempt" parameter into
rcu_note_context_switch(), which then records the Tasks RCU quiescent
state, but only if this call to schedule() was -not- due to a preemption.

To avoid adding overhead to the common-case context-switch path,
this commit hides the rcu_note_context_switch() check under an existing
non-common-case check.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-21 05:59:27 -07:00
Paul E. McKenney
0497b489b8 srcu: Expedite srcu_schedule_cbs_snp() callback invocation
Although Tree SRCU does reduce delays when there is at least one
synchronize_srcu_expedited() invocation pending, srcu_schedule_cbs_snp()
still waits for SRCU_INTERVAL before invoking callbacks.  Since
synchronize_srcu_expedited() now posts a callback and waits for
that callback to do a wakeup, this destroys the expedited nature of
synchronize_srcu_expedited().  This destruction became apparent to
Marc Zyngier in the guise of a guest-OS bootup slowdown from five
seconds to no fewer than forty seconds.

This commit therefore invokes callbacks immediately at the end of the
grace period when there is at least one synchronize_srcu_expedited()
invocation pending.  This brought Marc's guest-OS bootup times back
into the realm of reason.

Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-21 05:59:27 -07:00
Paul E. McKenney
da915ad5cf srcu: Parallelize callback handling
Peter Zijlstra proposed using SRCU to reduce mmap_sem contention [1,2],
however, there are workloads that could result in a high volume of
concurrent invocations of call_srcu(), which with current SRCU would
result in excessive lock contention on the srcu_struct structure's
->queue_lock, which protects SRCU's callback lists.  This commit therefore
moves SRCU to per-CPU callback lists, thus greatly reducing contention.

Because a given SRCU instance no longer has a single centralized callback
list, starting grace periods and invoking callbacks are both more complex
than in the single-list Classic SRCU implementation.  Starting grace
periods and handling callbacks are now handled using an srcu_node tree
that is in some ways similar to the rcu_node trees used by RCU-bh,
RCU-preempt, and RCU-sched (for example, the srcu_node tree shape is
controlled by exactly the same Kconfig options and boot parameters that
control the shape of the rcu_node tree).

In addition, the old per-CPU srcu_array structure is now named srcu_data
and contains an rcu_segcblist structure named ->srcu_cblist for its
callbacks (and a spinlock to protect this).  The srcu_struct gets
an srcu_gp_seq that is used to associate callback segments with the
corresponding completion-time grace-period number.  These completion-time
grace-period numbers are propagated up the srcu_node tree so that the
grace-period workqueue handler can determine whether additional grace
periods are needed on the one hand and where to look for callbacks that
are ready to be invoked.

The srcu_barrier() function must now wait on all instances of the per-CPU
->srcu_cblist.  Because each ->srcu_cblist is protected by ->lock,
srcu_barrier() can remotely add the needed callbacks.  In theory,
it could also remotely start grace periods, but in practice doing so
is complex and racy.  And interestingly enough, it is never necessary
for srcu_barrier() to start a grace period because srcu_barrier() only
enqueues a callback when a callback is already present--and it turns out
that a grace period has to have already been started for this pre-existing
callback.  Furthermore, it is only the callback that srcu_barrier()
needs to wait on, not any particular grace period.  Therefore, a new
rcu_segcblist_entrain() function enqueues the srcu_barrier() function's
callback into the same segment occupied by the last pre-existing callback
in the list.  The special case where all the pre-existing callbacks are
on a different list (because they are in the process of being invoked)
is handled by enqueuing srcu_barrier()'s callback into the RCU_DONE_TAIL
segment, relying on the done-callbacks check that takes place after all
callbacks are inovked.

Note that the readers use the same algorithm as before.  Note that there
is a separate srcu_idx that tells the readers what counter to increment.
This unfortunately cannot be combined with srcu_gp_seq because they
need to be incremented at different times.

This commit introduces some ugly #ifdefs in rcutorture.  These will go
away when I feel good enough about Tree SRCU to ditch Classic SRCU.

Some crude performance comparisons, courtesy of a quickly hacked rcuperf
asynchronous-grace-period capability:

			Callback Queuing Overhead
			-------------------------
	# CPUS		Classic SRCU	Tree SRCU
	------          ------------    ---------
	     2              0.349 us     0.342 us
	    16             31.66  us     0.4   us
	    41             ---------     0.417 us

The times are the 90th percentiles, a statistic that was chosen to reject
the overheads of the occasional srcu_barrier() call needed to avoid OOMing
the test machine.  The rcuperf test hangs when running Classic SRCU at 41
CPUs, hence the line of dashes.  Despite the hacks to both the rcuperf code
and that statistics, this is a convincing demonstration of Tree SRCU's
performance and scalability advantages.

[1] https://lwn.net/Articles/309030/
[2] https://patchwork.kernel.org/patch/5108281/

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix initialization if synchronize_srcu_expedited() called first. ]
2017-04-21 05:59:26 -07:00
Michael Ellerman
9fea59bd7c powerpc/mm: Add support for runtime configuration of ASLR limits
Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two
sysctls that allow a user to configure the number of bits of randomness used for
ASLR.

Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to
construct at least the MIN value in Kconfig, vs in a header which would be more
natural. Given that we just go ahead and do it all in Kconfig.

At least according to the code (the documentation makes no mention of it), the
value is defined as the number of bits of randomisation *of the page*, not the
address. This makes some sense, with larger page sizes more of the low bits are
forced to zero, which would reduce the randomisation if we didn't take the
PAGE_SIZE into account. However it does mean the min/max values have to change
depending on the PAGE_SIZE in order to actually limit the amount of address
space consumed by the randomisation.

The result of that is that we have to define the default values based on both
32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we
have 128TB address space support on Book3S, we also have to take that into
account.

Finally we can wire up the value in arch_mmap_rnd().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2017-04-21 22:57:55 +10:00
Paul E. McKenney
6ade8694f4 kvm: Move srcu_struct fields to end of struct kvm
Parallelizing SRCU callback handling will increase the size of
srcu_struct, which will move the kvm structure's kvm_arch field out
of reach of powerpc's current assembly code, which will result in the
following sort of build error:

arch/powerpc/kvm/book3s_hv_rmhandlers.S:617: Error: operand out of range (0x000000000000b328 is not between 0xffffffffffff8000 and 0x0000000000007fff)

This commit moves the srcu_struct fields in the kvm structure to follow
the kvm_arch field, which will allow powerpc's assembly code to continue
to be able to reach the kvm_arch field.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Michael Ellerman <michaele@au1.ibm.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[ paulmck: Moved this commit to precede SRCU callback parallelization,
  and reworded the commit log into future tense, all in the name of
  bisectability. ]
2017-04-21 05:56:31 -07:00
Gary R Hook
116591fe3e crypto: ccp - Disable interrupts early on unload
Ensure that we disable interrupts first when shutting down
the driver.

Cc: <stable@vger.kernel.org> # 4.9.x+

Signed-off-by: Gary R Hook <ghook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:57 +08:00
Gary R Hook
56467cb11c crypto: ccp - Use only the relevant interrupt bits
Each CCP queue can product interrupts for 4 conditions:
operation complete, queue empty, error, and queue stopped.
This driver only works with completion and error events.

Cc: <stable@vger.kernel.org> # 4.9.x+

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:56 +08:00
Sean Wang
7701d1ff8e hwrng: mtk - Add driver for hardware random generator on MT7623 SoC
This patch adds support for hardware random generator on MT7623 SoC
and should also work on other similar Mediatek SoCs. Currently,
the driver is already tested successfully with rng-tools.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Reviewed-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:56 +08:00
Sean Wang
c2b06bcd4b dt-bindings: hwrng: Add Mediatek hardware random generator bindings
Document the devicetree bindings for Mediatek random number
generator which could be found on MT7623 SoC or other similar
Mediatek SoCs.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:55 +08:00
Michael Ellerman
0f89f6e188 crypto: crct10dif-vpmsum - Fix missing preempt_disable()
In crct10dif_vpmsum() we call enable_kernel_altivec() without first
disabling preemption, which is not allowed.

It used to be sufficient just to call pagefault_disable(), because that
also disabled preemption. But the two were decoupled in commit 8222dbe21e
("sched/preempt, mm/fault: Decouple preemption from the page fault
logic") in mid 2015.

The crct10dif-vpmsum code inherited this bug from the crc32c-vpmsum code
on which it was modelled.

So add the missing preempt_disable/enable(). We should also call
disable_kernel_fp(), although it does nothing by default, there is a
debug switch to make it active and all enables should be paired with
disables.

Fixes: b01df1c16c ("crypto: powerpc - Add CRC-T10DIF acceleration")
Acked-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:51 +08:00
Giovanni Cabiddu
a9943a0ad1 crypto: testmgr - replace compression known answer test
Compression implementations might return valid outputs that
do not match what specified in the test vectors.
For this reason, the testmgr might report that a compression
implementation failed the test even if the data produced
by the compressor is correct.
This implements a decompress-and-verify test for acomp
compression tests rather than a known answer test.

Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:50 +08:00
Giovanni Cabiddu
3ce5bc72eb crypto: acomp - allow registration of multiple acomps
Add crypto_register_acomps and crypto_unregister_acomps to allow
the registration of multiple implementations with one call.

Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:50 +08:00
Markus Elfring
b2a1d27179 hwrng: n2 - Use devm_kcalloc() in n2rng_probe()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "devm_kcalloc".

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:47 +08:00
Christophe Jaillet
ec1bca941a crypto: chcr - Fix error handling related to 'chcr_alloc_shash'
Up to now, 'crypto_alloc_shash()' may return a valid pointer, an error
pointer or NULL (in case of invalid parameter)
Update it to always return an error pointer in case of error. It now
returns ERR_PTR(-EINVAL) instead of NULL in case of invalid parameter.

This simplifies error handling.

Also fix a crash in 'chcr_authenc_setkey()' if 'chcr_alloc_shash()'
returns an error pointer and the "goto out" path is taken.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:47 +08:00
Jason A. Donenfeld
69b348449b padata: get_next is never NULL
Per Dan's static checker warning, the code that returns NULL was removed
in 2010, so this patch updates the comments and fixes the code
assumptions.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:46 +08:00
Krzysztof Kozlowski
c46ea13f55 crypto: exynos - Add new Exynos RNG driver
Replace existing hw_ranndom/exynos-rng driver with a new, reworked one.
This is a driver for pseudo random number generator block which on
Exynos4 chipsets must be seeded with some value.  On newer Exynos5420
chipsets it might seed itself from true random number generator block
but this is not implemented yet.

New driver is a complete rework to use the crypto ALGAPI instead of
hw_random API.  Rationale for the change:
1. hw_random interface is for true RNG devices.
2. The old driver was seeding itself with jiffies which is not a
   reliable source for randomness.
3. Device generates five random 32-bit numbers in each pass but old
   driver was returning only one 32-bit number thus its performance was
   reduced.

Compatibility with DeviceTree bindings is preserved.

New driver does not use runtime power management but manually enables
and disables the clock when needed.  This is preferred approach because
using runtime PM just to toggle clock is huge overhead.

Another difference is reseeding itself with generated random data
periodically and during resuming from system suspend (previously driver
was re-seeding itself again with jiffies).

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Reviewed-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:46 +08:00
Krzysztof Kozlowski
ed067d4a85 linux/kernel.h: Add ALIGN_DOWN macro
Few parts of kernel define their own macro for aligning down so provide
a common define for this, with the same usage and assumptions as existing
ALIGN.

Convert also three existing implementations to this one.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:44 +08:00
Wei Yongjun
7e207d8550 crypto: caam - fix error return code in caam_qi_init()
Fix to return error code -ENOMEM from the kmem_cache_create() error
handling case instead of 0(err is 0 here), as done elsewhere in this
function.

Fixes: 67c2315def ("crypto: caam - add Queue Interface (QI) backend support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:37 +08:00
Harsh Jain
0e93708dab crypto: chcr - Add fallback for AEAD algos
Fallback to sw when
    I AAD length greater than 511
    II Zero length payload
    II No of sg entries exceeds Request size.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:35 +08:00
Harsh Jain
72a56ca97d crypto: chcr - Fix txq ids.
The patch fixes a critical issue to map txqid with flows on the hardware appropriately,
if tx queues created are more than flows configured then  txqid shall map within
the range of hardware flows configured. This ensure that un-mapped txqid does not remain un-handled.
The patch also segregated the rxqid and txqid for clarity.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Reviewed-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:34 +08:00
Harsh Jain
0a7bd30c1b crypto: chcr - Set hmac_ctrl bit to use HW register HMAC_CFG[456]
Use hmac_ctrl bit value saved in setauthsize callback.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:32 +08:00
Harsh Jain
e29abda591 crypto: chcr - Increase priority of AEAD algos.
templates(gcm,ccm etc) inherit priority value of driver to
calculate its priority. In some cases template priority becomes
 more than driver priority for same algo.
Without this patch we will not be able to use driver authenc algos. It will
be good if it pushed in stable kernel.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-04-21 20:30:31 +08:00
Wolfram Sang
63a761eef5 i2c: rcar: clarify PM handling with more comments
PM handling is correct but might be a bit subtle. Add some comments for
clarification.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:08:57 +02:00
Wolfram Sang
ae481cc139 i2c: rcar: fix resume by always initializing registers before transfer
Resume failed because of uninitialized registers. Instead of adding a
resume callback, we simply initialize registers before every transfer.
This lightweight change is more robust and will keep us safe if we ever
need support for power domains or dynamic frequency changes.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:08:52 +02:00
Colin Ian King
8c91fd5ee6 i2c: tegra: fix spelling mistake: "contoller" -> "controller"
trivial fix to spelling mistake in MODULE_DESCRIPTION text

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:04:57 +02:00
Andrzej Hajda
b371f866d9 i2c: exynos5: use core helper to get driver data
Driver core provides of_device_get_match_data which can be used
to get driver data instead of custom helper.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:02:41 +02:00
Andrzej Hajda
70c8c4e9bf i2c: exynos5: de-duplicate error logs on clock setup
In case of clock setup error it is enough to log it once.
Moreover patch simplifies clock setup routines.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:02:11 +02:00
Andrzej Hajda
b9d5b31a0d i2c: exynos5: simplify clock frequency handling
There is no need to keep separate settings for high and fast speed clock.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:01:16 +02:00
Andrzej Hajda
b917d4fd50 i2c: exynos5: simplify timings calculation
Instead of using cryptic loop direct calculation of timings
can be used.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-04-21 14:00:16 +02:00
Oliver O'Halloran
f855b2f544 powerpc/mm: Wire up ioremap_cache()
The default implementation of ioremap_cache() is aliased to ioremap().
On powerpc ioremap() creates cache-inhibited mappings by default which
is almost certainly not what you wanted.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-21 21:08:47 +10:00
Marcelo Tosatti
e891a32e7a KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK
The disablement of interrupts at KVM_SET_CLOCK/KVM_GET_CLOCK
attempts to disable software suspend from causing "non atomic behaviour" of
the operation:

    Add a helper function to compute the kernel time and convert nanoseconds
    back to CPU specific cycles.  Note that these must not be called in preemptible
    context, as that would mean the kernel could enter software suspend state,
    which would cause non-atomic operation.

However, assume the kernel can enter software suspend at the following 2 points:

        ktime_get_ts(&ts);
1.
						hypothetical_ktime_get_ts(&ts)
        monotonic_to_bootbased(&ts);
2.

monotonic_to_bootbased() should be correct relative to a ktime_get_ts(&ts)
performed after point 1 (that is after resuming from software suspend),
hypothetical_ktime_get_ts()

Therefore it is also correct for the ktime_get_ts(&ts) before point 1,
which is

	ktime_get_ts(&ts) = hypothetical_ktime_get_ts(&ts) + time-to-execute-suspend-code

Note CLOCK_MONOTONIC does not count during suspension.

So remove the irq disablement, which causes the following warning on
-RT kernels:

 With this reasoning, and the -RT bug that the irq disablement causes
 (because spin_lock is now a sleeping lock), remove the IRQ protection as it
 causes:

 [ 1064.668109] in_atomic(): 0, irqs_disabled(): 1, pid: 15296, name:m
 [ 1064.668110] INFO: lockdep is turned off.
 [ 1064.668110] irq event stamp: 0
 [ 1064.668112] hardirqs last  enabled at (0): [<          (null)>]  )
 [ 1064.668116] hardirqs last disabled at (0): [] c0
 [ 1064.668118] softirqs last  enabled at (0): [] c0
 [ 1064.668118] softirqs last disabled at (0): [<          (null)>]  )
 [ 1064.668121] CPU: 13 PID: 15296 Comm: qemu-kvm Not tainted 3.10.0-1
 [ 1064.668121] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 5
 [ 1064.668123]  ffff8c1796b88000 00000000afe7344c ffff8c179abf3c68 f3
 [ 1064.668125]  ffff8c179abf3c90 ffffffff930ccb3d ffff8c1b992b3610 f0
 [ 1064.668126]  00007ffc1a26fbc0 ffff8c179abf3cb0 ffffffff9375f694 f0
 [ 1064.668126] Call Trace:
 [ 1064.668132]  [] dump_stack+0x19/0x1b
 [ 1064.668135]  [] __might_sleep+0x12d/0x1f0
 [ 1064.668138]  [] rt_spin_lock+0x24/0x60
 [ 1064.668155]  [] __get_kvmclock_ns+0x36/0x110 [k]
 [ 1064.668159]  [] ? futex_wait_queue_me+0x103/0x10
 [ 1064.668171]  [] kvm_arch_vm_ioctl+0xa2/0xd70 [k]
 [ 1064.668173]  [] ? futex_wait+0x1ac/0x2a0

v2: notice get_kvmclock_ns with the same problem (Pankaj).
v3: remove useless helper function (Pankaj).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 12:50:28 +02:00
Michael S. Tsirkin
668fffa3f8 kvm: better MWAIT emulation for guests
Guests that are heavy on futexes end up IPI'ing each other a lot. That
can lead to significant slowdowns and latency increase for those guests
when running within KVM.

If only a single guest is needed on a host, we have a lot of spare host
CPU time we can throw at the problem. Modern CPUs implement a feature
called "MWAIT" which allows guests to wake up sleeping remote CPUs without
an IPI - thus without an exit - at the expense of never going out of guest
context.

The decision whether this is something sensible to use should be up to the
VM admin, so to user space. We can however allow MWAIT execution on systems
that support it properly hardware wise.

This patch adds a CAP to user space and a KVM cpuid leaf to indicate
availability of native MWAIT execution. With that enabled, the worst a
guest can do is waste as many cycles as a "jmp ." would do, so it's not
a privilege problem.

We consciously do *not* expose the feature in our CPUID bitmap, as most
people will want to benefit from sleeping vCPUs to allow for over commit.

Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[agraf: fix amd, change commit message]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 12:50:28 +02:00
Kyle Huey
db2336a804 KVM: x86: virtualize cpuid faulting
Hardware support for faulting on the cpuid instruction is not required to
emulate it, because cpuid triggers a VM exit anyways. KVM handles the relevant
MSRs (MSR_PLATFORM_INFO and MSR_MISC_FEATURES_ENABLE) and upon a
cpuid-induced VM exit checks the cpuid faulting state and the CPL.
kvm_require_cpl is even kind enough to inject the GP fault for us.

Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[Return "1" from kvm_emulate_cpuid, it's not void. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 12:50:06 +02:00
Paolo Bonzini
bd17117bb2 Merge tag 'kvm-s390-next-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Guarded storage fixup and keyless subset mode

- detect and use the keyless subset mode (guests without
  storage keys)
- fix vSIE support for sdnxc
- fix machine check data for guarded storage
2017-04-21 12:47:47 +02:00
Paolo Bonzini
ec594c4710 Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-04-21 12:47:31 +02:00
Martin Schwidefsky
e525f8a6e6 s390/gs: add regset for the guarded storage broadcast control block
The guarded storage interface allows to register a control block for
each thread that is activated with the guarded storage broadcast event.
To retrieve the complete state of a process from the kernel a register
set for the stored broadcast control block is required.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-21 12:38:56 +02:00
Paolo Bonzini
8afd74c296 Merge branch 'x86/process' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD
Required for KVM support of the CPUID faulting feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 11:55:06 +02:00
David Hildenbrand
fe0e80befd KVM: VMX: drop vmm_exclusive module parameter
vmm_exclusive=0 leads to KVM setting X86_CR4_VMXE always and calling
VMXON only when the vcpu is loaded. X86_CR4_VMXE is used as an
indication in cpu_emergency_vmxoff() (called on kdump) if VMXOFF has to be
called. This is obviously not the case if both are used independtly.
Calling VMXOFF without a previous VMXON will result in an exception.

In addition, X86_CR4_VMXE is used as a mean to test if VMX is already in
use by another VMM in hardware_enable(). So there can't really be
co-existance. If the other VMM is prepared for co-existance and does a
similar check, only one VMM can exist. If the other VMM is not prepared
and blindly sets/clears X86_CR4_VMXE, we will get inconsistencies with
X86_CR4_VMXE.

As we also had bug reports related to clearing of vmcs with vmm_exclusive=0
this seems to be pretty much untested. So let's better drop it.

While at it, directly move setting/clearing X86_CR4_VMXE into
kvm_cpu_vmxon/off.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-21 11:42:49 +02:00
Farhan Ali
730cd632c4 KVM: s390: Support keyless subset guest mode
If the KSS facility is available on the machine, we also make it
available for our KVM guests.

The KSS facility bypasses storage key management as long as the guest
does not issue a related instruction. When that happens, the control is
returned to the host, which has to turn off KSS for a guest vcpu
before retrying the instruction.

Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-21 11:08:11 +02:00
Farhan Ali
71cb1bf66e s390/sclp: Detect KSS facility
Let's detect the keyless subset facility.

Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-04-21 11:08:04 +02:00
Takashi Iwai
56798e6b3a ALSA: hda - Use a helper function for renaming kctl names
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-04-21 10:52:59 +02:00
Takashi Iwai
7beb3a6e93 ALSA: hda - Support Gigabyte Gaming board with dual Realtek codecs
This patch adds some workarounds to make Gigabyte GA-AX370 Gaming 5
board working without the conflicts of kctls, etc.  In general, the
dual codec configs result in the conflicts of the following stuff:
- Master controls
- Capture controls
- Analog loopback controls
In addition, the auto-mute and the auto-mic can't work well among
multiple codecs.

The current "solution" is to disable all these features, and use UCM
for a better PulseAudio management.  For a dedicated UCM profile, the
patch overrides the card longname so that the system an get a unique
profile path.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195305
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-04-21 10:52:46 +02:00