Commit Graph

109009 Commits

Author SHA1 Message Date
NeilBrown
f06bc03339 cred: allow get_cred() and put_cred() to be given NULL.
It is common practice for helpers like this to silently,
accept a NULL pointer.
get_rpccred() and put_rpccred() used by NFS act this way
and using the same interface will ease the conversion
for NFS, and simplify the resulting code.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:44 -05:00
NeilBrown
97d0fb239c cred: add get_cred_rcu()
Sometimes we want to opportunistically get a
ref to a cred in an rcu_read_lock protected section.
get_task_cred() does this, and NFS does as similar thing
with its own credential structures.
To prepare for NFS converting to use 'struct cred' more
uniformly, define get_cred_rcu(), and use it in
get_task_cred().

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:44 -05:00
NeilBrown
d89b22d46a cred: add cred_fscmp() for comparing creds.
NFS needs to compare to credentials, to see if they can
be treated the same w.r.t. filesystem access.  Sometimes
an ordering is needed when credentials are used as a key
to an rbtree.
NFS currently has its own private credential management from
before 'struct cred' existed.  To move it over to more consistent
use of 'struct cred' we need a comparison function.
This patch adds that function.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:44 -05:00
Mark Brown
58331d618b Merge remote-tracking branch 'regmap/topic/irq' into regmap-next 2018-12-19 18:38:33 +00:00
Bartosz Golaszewski
c82ea33ead regmap: irq: add an option to clear status registers on unmask
Some interrupt controllers whose interrupts are acked on read will set
the status bits for masked interrupts without changing the state of
the IRQ line.

Some chips have an additional "feature" where if those set bits are
not cleared before unmasking their respective interrupts, the IRQ
line will change the state and we'll interpret this as an interrupt
although it actually fired when it was masked.

Add a new field to the irq chip struct that tells the regmap irq chip
code to always clear the status registers before actually changing the
irq mask values.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-19 18:38:13 +00:00
Roy Pledge
e80081c34b soc: fsl: dpio: Add BP and FQ query APIs
Add FQ (Frame Queue) and BP (Buffer Pool) query APIs that
users of QBMan can invoke to see the status of the queues
and pools that they are using.

Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 10:37:22 -08:00
Matti Vaittinen
1c2928e3e3 regmap: regmap-irq/gpio-max77620: add level-irq support
Add level active IRQ support to regmap-irq irqchip. Change breaks
existing regmap-irq type setting. Convert the existing drivers which
use regmap-irq with trigger type setting (gpio-max77620) to work
with this new approach. So we do not magically support level-active
IRQs on gpio-max77620 - but add support to the regmap-irq for chips
which support them =)

We do not support distinguishing situation where HW supports rising
and falling edge detection but not both. Separating this would require
inventing yet another flags for IRQ types.

Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-19 18:35:45 +00:00
Christoffer Dall
8a411b060f KVM: arm/arm64: Remove arch timer workqueue
The use of a work queue in the hrtimer expire function for the bg_timer
is a leftover from the time when we would inject interrupts when the
bg_timer expired.

Since we are no longer doing that, we can instead call
kvm_vcpu_wake_up() directly from the hrtimer function and remove all
workqueue functionality from the arch timer code.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-19 17:47:07 +00:00
Pierre-Louis Bossart
d82b51c855 ALSA: HD-Audio: SKL+: force HDaudio legacy or SKL+ driver selection
For HDaudio and Skylake drivers, add module parameter "pci_binding"

When pci_binding == 0 (AUTO), the PCI class/subclass info is used to
select drivers based on the presence of the DSP.

pci_binding == 1 (LEGACY) forces the use of the HDAudio legacy driver,
even if the DSP is present.

pci_binding == 2 (ASOC) forces the use of the ASOC driver. The
information on the DSP presence is bypassed.

The value for the module parameter needs to be identical for both
drivers. This parameter is intended as a back-up solution if the
automatic detection fails or when the DSP usage fails. Such cases
should be reported on the alsa-devel mailing list for analysis.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-12-19 18:07:23 +01:00
Keyon Jie
18d43c9b88 ALSA: HDA: export process_unsol_events()
The SOF implementation does not rely on the hdac_bus library, however
for HDMI and HDaudio codec support it does need to deal with
unsolicited events. Instead of re-inventing the wheel, export this
symbol to reuse this part of the library directly.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-12-19 18:07:18 +01:00
David S. Miller
5a862f86b8 Merge tag 'mac80211-next-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
This time we have too many changes to list, highlights:
 * virt_wifi - wireless control simulation on top of
   another network interface
 * hwsim configurability to test capabilities similar
   to real hardware
 * various mesh improvements
 * various radiotap vendor data fixes in mac80211
 * finally the nl_set_extack_cookie_u64() we talked
   about previously, used for
 * peer measurement APIs, right now only with FTM
   (flight time measurement) for location
 * made nl80211 radio/interface announcements more complete
 * various new HE (802.11ax) things:
   updates, TWT support, ...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 08:36:18 -08:00
Boris Brezillon
6c4f52dca3 drm/connector: Allow creation of margin props alone
TV margins properties can only be added as part of the SDTV TV
connector properties creation, but we might need those props for HDMI
TVs too, so let's move the margins props creation in a separate
function and expose it to drivers.

We also add an helper to attach margins props to a connector.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181206142439.10441-4-boris.brezillon@bootlin.com
2018-12-19 14:47:58 +01:00
Boris Brezillon
56406e15b5 drm/connector: Clarify the unit of TV margins
All margins are expressed in pixels. Clarify that in the doc.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181206142439.10441-3-boris.brezillon@bootlin.com
2018-12-19 14:38:35 +01:00
Ingo Molnar
96af6cd02a Revert "x86/objtool: Use asm macros to work around GCC inlining bugs"
This reverts commit c06c4d8090.

See this commit for details about the revert:

  e769742d35 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")

Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Richard Biener <rguenther@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-19 12:00:23 +01:00
Ingo Molnar
ffb61c6346 Revert "x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs"
This reverts commit f81f8ad56f.

See this commit for details about the revert:

  e769742d35 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")

Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Richard Biener <rguenther@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-19 12:00:00 +01:00
Dou Liyang
c410abbbac genirq/affinity: Add is_managed to struct irq_affinity_desc
Devices which use managed interrupts usually have two classes of
interrupts:

  - Interrupts for multiple device queues
  - Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

 default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled. That limitation was reported by Kashyap and Sumit.

Expand struct irq_affinity_desc with a new bit 'is_managed' which is set
for truly managed interrupts (queue interrupts) and cleared for the general
device interrupts.

[ tglx: Simplify code and massage changelog ]

Reported-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reported-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: bhelgaas@google.com
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-3-douliyangs@gmail.com
2018-12-19 11:32:08 +01:00
Dou Liyang
bec04037e4 genirq/core: Introduce struct irq_affinity_desc
The interrupt affinity management uses straight cpumask pointers to convey
the automatically assigned affinity masks for managed interrupts. The core
interrupt descriptor allocation also decides based on the pointer being non
NULL whether an interrupt is managed or not.

Devices which use managed interrupts usually have two classes of
interrupts:

  - Interrupts for multiple device queues
  - Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

 default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled.

To remedy that situation it's required to convey more information than the
cpumasks through various interfaces related to interrupt descriptor
allocation.

Instead of adding yet another argument, create a new data structure
'irq_affinity_desc' which for now just contains the cpumask. This struct
can be expanded to convey auxilliary information in the next step.

No functional change, just preparatory work.

[ tglx: Simplified logic and clarified changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: kashyap.desai@broadcom.com
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: sumit.saxena@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-2-douliyangs@gmail.com
2018-12-19 11:32:08 +01:00
Amanoel Dawod
ac8b6f148f Fonts: New Terminus large console font
This patch adds an option to compile-in a high resolution
and large Terminus (ter16x32) bitmap console font for use with
HiDPI and Retina screens.

The font was convereted from standard Terminus ter-i32b.psf
(size 16x32) with the help of psftools and minor hand editing
deleting useless characters.

This patch is non-intrusive, no options are enabled by default so most
users won't notice a thing.

I am placing my changes under the GPL 2.0 just as source Terminus font.

Signed-off-by: Amanoel Dawod <amanoeladawod@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-19 10:42:08 +01:00
Vincent Guittot
8234f6734c PM-runtime: Switch autosuspend over to using hrtimers
PM-runtime uses the timer infrastructure for autosuspend. This implies
that the minimum time before autosuspending a device is in the range
of 1 tick included to 2 ticks excluded
 -On arm64 this means between 4ms and 8ms with default jiffies
  configuration
 -And on arm, it is between 10ms and 20ms

These values are quite high for embedded systems which sometimes want
the duration to be in the range of 1 ms.

It is possible to switch autosuspend over to using hrtimers to get
finer granularity for short durations and take advantage of slack to
retain some margins and get long timeouts with minimum wakeups.

On an arm64 platform that uses 1ms for autosuspending timeout of its
GPU, idle power is reduced by 10% with hrtimer.

The latency impact on arm64 hikey octo cores is:
 - mark_last_busy: from 1.11 us to 1.25 us
 - rpm_suspend: from 15.54 us to 15.38 us
[Only the code path of rpm_suspend() that starts hrtimer has been
measured.]

arm64 image (arm64 default defconfig) decreases by around 3KB
with following details:

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
12034646	6869268	 386840	19290754	1265a82	vmlinux

$ size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
12030550	6870164	 387032	19287746	1264ec2	vmlinux

The latency impact on arm 32bits snowball dual cores is :
 - mark_last_busy: from 0.31 us usec to 0.77 us
 - rpm_suspend: from 6.83 us to 6.67 usec

The increase of the image for snowball platform that I used for
testing performance impact, is neglictable (244B).

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
7157961	2119580	 264120	9541661	 91981d	build-ux500/vmlinux

size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
7157773	2119884	 264248	9541905	 919911	vmlinux-hrtimer

And arm 32bits image (multi_v7_defconfig) increases by around 1.7KB
with following details:

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
13304443	6803420	 402768	20510631	138f7a7	vmlinux

$ size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
13304299	6805276	 402768	20512343	138fe57	vmlinux

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-19 10:31:50 +01:00
Christian Brauner
3ad20fe393 binder: implement binderfs
As discussed at Linux Plumbers Conference 2018 in Vancouver [1] this is the
implementation of binderfs.

/* Abstract */
binderfs is a backwards-compatible filesystem for Android's binder ipc
mechanism. Each ipc namespace will mount a new binderfs instance. Mounting
binderfs multiple times at different locations in the same ipc namespace
will not cause a new super block to be allocated and hence it will be the
same filesystem instance.
Each new binderfs mount will have its own set of binder devices only
visible in the ipc namespace it has been mounted in. All devices in a new
binderfs mount will follow the scheme binder%d and numbering will always
start at 0.

/* Backwards compatibility */
Devices requested in the Kconfig via CONFIG_ANDROID_BINDER_DEVICES for the
initial ipc namespace will work as before. They will be registered via
misc_register() and appear in the devtmpfs mount. Specifically, the
standard devices binder, hwbinder, and vndbinder will all appear in their
standard locations in /dev. Mounting or unmounting the binderfs mount in
the initial ipc namespace will have no effect on these devices, i.e. they
will neither show up in the binderfs mount nor will they disappear when the
binderfs mount is gone.

/* binder-control */
Each new binderfs instance comes with a binder-control device. No other
devices will be present at first. The binder-control device can be used to
dynamically allocate binder devices. All requests operate on the binderfs
mount the binder-control device resides in.
Assuming a new instance of binderfs has been mounted at /dev/binderfs
via mount -t binderfs binderfs /dev/binderfs. Then a request to create a
new binder device can be made as illustrated in [2].
Binderfs devices can simply be removed via unlink().

/* Implementation details */
- dynamic major number allocation:
  When binderfs is registered as a new filesystem it will dynamically
  allocate a new major number. The allocated major number will be returned
  in struct binderfs_device when a new binder device is allocated.
- global minor number tracking:
  Minor are tracked in a global idr struct that is capped at
  BINDERFS_MAX_MINOR. The minor number tracker is protected by a global
  mutex. This is the only point of contention between binderfs mounts.
- struct binderfs_info:
  Each binderfs super block has its own struct binderfs_info that tracks
  specific details about a binderfs instance:
  - ipc namespace
  - dentry of the binder-control device
  - root uid and root gid of the user namespace the binderfs instance
    was mounted in
- mountable by user namespace root:
  binderfs can be mounted by user namespace root in a non-initial user
  namespace. The devices will be owned by user namespace root.
- binderfs binder devices without misc infrastructure:
  New binder devices associated with a binderfs mount do not use the
  full misc_register() infrastructure.
  The misc_register() infrastructure can only create new devices in the
  host's devtmpfs mount. binderfs does however only make devices appear
  under its own mountpoint and thus allocates new character device nodes
  from the inode of the root dentry of the super block. This will have
  the side-effect that binderfs specific device nodes do not appear in
  sysfs. This behavior is similar to devpts allocated pts devices and
  has no effect on the functionality of the ipc mechanism itself.

[1]: https://goo.gl/JL2tfX
[2]: program to allocate a new binderfs binder device:

     #define _GNU_SOURCE
     #include <errno.h>
     #include <fcntl.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <string.h>
     #include <sys/ioctl.h>
     #include <sys/stat.h>
     #include <sys/types.h>
     #include <unistd.h>
     #include <linux/android/binder_ctl.h>

     int main(int argc, char *argv[])
     {
             int fd, ret, saved_errno;
             size_t len;
             struct binderfs_device device = { 0 };

             if (argc < 2)
                     exit(EXIT_FAILURE);

             len = strlen(argv[1]);
             if (len > BINDERFS_MAX_NAME)
                     exit(EXIT_FAILURE);

             memcpy(device.name, argv[1], len);

             fd = open("/dev/binderfs/binder-control", O_RDONLY | O_CLOEXEC);
             if (fd < 0) {
                     printf("%s - Failed to open binder-control device\n",
                            strerror(errno));
                     exit(EXIT_FAILURE);
             }

             ret = ioctl(fd, BINDER_CTL_ADD, &device);
             saved_errno = errno;
             close(fd);
             errno = saved_errno;
             if (ret < 0) {
                     printf("%s - Failed to allocate new binder device\n",
                            strerror(errno));
                     exit(EXIT_FAILURE);
             }

             printf("Allocated new binder device with major %d, minor %d, and "
                    "name %s\n", device.major, device.minor,
                    device.name);

             exit(EXIT_SUCCESS);
     }

Cc: Martijn Coenen <maco@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-19 09:40:13 +01:00
Todd Kjos
80cd795630 binder: fix use-after-free due to ksys_close() during fdget()
44d8047f1d ("binder: use standard functions to allocate fds")
exposed a pre-existing issue in the binder driver.

fdget() is used in ksys_ioctl() as a performance optimization.
One of the rules associated with fdget() is that ksys_close() must
not be called between the fdget() and the fdput(). There is a case
where this requirement is not met in the binder driver which results
in the reference count dropping to 0 when the device is still in
use. This can result in use-after-free or other issues.

If userpace has passed a file-descriptor for the binder driver using
a BINDER_TYPE_FDA object, then kys_close() is called on it when
handling a binder_ioctl(BC_FREE_BUFFER) command. This violates
the assumptions for using fdget().

The problem is fixed by deferring the close using task_work_add(). A
new variant of __close_fd() was created that returns a struct file
with a reference. The fput() is deferred instead of using ksys_close().

Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-19 09:40:13 +01:00
David S. Miller
3061169a47 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2018-12-18

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) promote bpf_perf_event.h to mandatory UAPI header, from Masahiro.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:30:33 -08:00
Leon Romanovsky
2acc7957db net/mlx5: Add shared Q counter bits
Updated HW specification file with needed bits to allow
sharing of Q counters between DEVX contexts and kernel.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-19 08:06:57 +02:00
Christoph Hellwig
38417468d4 scsi: block: remove the cluster flag
Now that the the SCSI layer replaced the use of the cluster flag with
segment size limits and the DMA boundary we can remove the cluster flag
from the block layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:39:26 -05:00
Christoph Hellwig
4af14d113b scsi: remove the use_clustering flag
The same effects can be achieved by setting the dma_boundary to
PAGE_SIZE - 1 and the max_segment_size to PAGE_SIZE, so shift those
settings into the drivers.  Note that in many cases the setting might
be bogus, but this keeps the status quo.

[mkp: fix myrs and myrb]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:19:21 -05:00
Christoph Hellwig
50c2e9107f scsi: introduce a max_segment_size host_template parameters
This allows the host driver to indicate the maximum supported
segment size in a nice an easy way, so that the driver doesn't
have to worry about DMA-layer imposed limitations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:13:12 -05:00
Christoph Hellwig
2a3d4eb8e2 scsi: flip the default on use_clustering
Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page.  Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:13:12 -05:00
Davide Caratti
e2c4cf7f98 net: Use __kernel_clockid_t in uapi net_stamp.h
Herton reports the following error when building a userspace program that
includes net_stamp.h:

 In file included from foo.c:2:
 /usr/include/linux/net_tstamp.h:158:2: error: unknown type name
 ‘clockid_t’
   clockid_t clockid; /* reference clockid */
   ^~~~~~~~~

Fix it by using __kernel_clockid_t in place of clockid_t.

Fixes: 80b14dee2b ("net: Add a new socket option for a future transmit time.")
Cc: Timothy Redaelli <tredaelli@redhat.com>
Reported-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:59:29 -08:00
Linus Torvalds
62393dbcbe Merge tag 'for-linus-20181218' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
 "Correct an ioctl direction for the zoned ioctls"

* tag 'for-linus-20181218' of git://git.kernel.dk/linux-block:
  uapi: linux/blkzoned.h: fix BLKGETZONESZ and BLKGETNRZONES definitions
2018-12-18 15:47:40 -08:00
John Fastabend
3bdbd0228e bpf: sockmap, metadata support for reporting size of msg
This adds metadata to sk_msg_md for BPF programs to read the sk_msg
size.

When the SK_MSG program is running under an application that is using
sendfile the data is not copied into sk_msg buffers by default. Rather
the BPF program uses sk_msg_pull_data to read the bytes in. This
avoids doing the costly memcopy instructions when they are not in
fact needed. However, if we don't know the size of the sk_msg we
have to guess if needed bytes are available by doing a pull request
which may fail. By including the size of the sk_msg BPF programs can
check the size before issuing sk_msg_pull_data requests.

Additionally, the same applies for sendmsg calls when the application
provides multiple iovs. Here the BPF program needs to pull in data
to update data pointers but its not clear where the data ends without
a size parameter. In many cases "guessing" is not easy to do
and results in multiple calls to pull and without bounded loops
everything gets fairly tricky.

Clean this up by including a u32 size field. Note, all writes into
sk_msg_md are rejected already from sk_msg_is_valid_access so nothing
additional is needed there.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-19 00:27:23 +01:00
Heiner Kallweit
2b3e88ea65 net: phy: improve phy state checking
Add helpers phy_is_started() and __phy_is_started() to avoid open-coded
checks whether PHY has been started. To make the check easier move
PHY_HALTED before PHY_UP in enum phy_state. Further improvements:

phy_start_aneg():
Return -EBUSY and print warning if function is called from a non-started
state (DOWN, READY, HALTED). Better check because function is exported
and drivers may use it incorrectly.

phy_interrupt():
Return IRQ_NONE also if state is DOWN or READY. We should never receive
an interrupt in one of these states, but better play safe.

phy_stop():
Just return and print a warning if PHY is in a non-started state.
This warning should help to identify drivers with unbalanced calls to
phy_start() / phy_stop().

phy_state_machine():
Schedule state machine run only if PHY is in a started state.
E.g. if state is READY we don't need the state machine, it will be
started by phy_start().

v2:
- don't use __func__ within phy_warn_state
v3:
- use WARN() instead of printing error message to facilitate debugging

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:11:07 -08:00
Shamir Rabinovitch
af8d70375d RDMA/restrack: Resource-tracker should not use uobject pointers
Having uobject pointer embedded in ib core objects is not aligned with a
future shared ib_x model. The resource tracker only does this to keep
track of user/kernel objects - track this directly instead.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 15:38:26 -07:00
Moni Shoua
ad8a449675 IB/uverbs: Add support to advise_mr
Add new ioctl method for the MR object - ADVISE_MR.

This command can be used by users to give an advice or directions to the
kernel about an address range that belongs to memory regions.

A new ib_device callback, advise_mr(), is introduced here to suupport the
new command. This command takes the following arguments:

- pd:		The protection domain to which all memory regions belong
- advice: 	The type of the advice
	  	* IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH - Pre-fetch a range of
		an on-demand paging MR
	  	* IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE - Pre-fetch a range
		of an on-demand paging MR with write intention
- flags:	The properties of the advice
		* IB_UVERBS_ADVISE_MR_FLAG_FLUSH - Operation must end before
		return to the caller
- sg_list:	The list of memory ranges
- num_sge:	The number of memory ranges in the list
- attrs:	More attributes to be parsed by the provider

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 15:19:46 -07:00
Moni Shoua
cbfdd442c4 IB/uverbs: Add helper to get array size from ptr attribute
When the parser of an ioctl command has the knowledge that a ptr attribute
in a bundle represents an array of structures, it is useful for it to know
the number of elements in the array. This is done by dividing the
attribute length with the element size.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 15:19:45 -07:00
Matt Mullins
a38d1107f9 bpf: support raw tracepoints in modules
Distributions build drivers as modules, including network and filesystem
drivers which export numerous tracepoints.  This enables
bpf(BPF_RAW_TRACEPOINT_OPEN) to attach to those tracepoints.

Signed-off-by: Matt Mullins <mmullins@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-18 14:08:12 -08:00
Parav Pandit
bbc13cda37 RDMA/uverbs: Add an ioctl method to destroy an object
Add an ioctl method to destroy the PD, MR, MW, AH, flow, RWQ indirection
table and XRCD objects by handle which doesn't require any output response
during destruction.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 14:59:57 -07:00
Jason Gunthorpe
149d3845f4 RDMA/uverbs: Add a method to introspect handles in a context
Introduce a helper function gather_objects_handle() to copy object handles
under a spin lock.

Expose these objects handles via the uverbs ioctl interface.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-18 14:54:54 -07:00
Alexandre Belloni
9a03201170 rtc: enforce rtc_timer_init private_data type
All the remaining users of rtc_timers are passing the rtc_device as private
data. Enforce that and rename private_data to rtc.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2018-12-18 22:53:29 +01:00
David S. Miller
fde9cd69a5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2018-12-18

1) Fix error return code in xfrm_output_one()
   when no dst_entry is attached to the skb.
   From Wei Yongjun.

2) The xfrm state hash bucket count reported to
   userspace is off by one. Fix from Benjamin Poirier.

3) Fix NULL pointer dereference in xfrm_input when
   skb_dst_force clears the dst_entry.

4) Fix freeing of xfrm states on acquire. We use a
   dedicated slab cache for the xfrm states now,
   so free it properly with kmem_cache_free.
   From Mathias Krause.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 11:43:26 -08:00
Jason Gunthorpe
4785860e04 RDMA/uverbs: Implement an ioctl that can call write and write_ex handlers
Now that the handlers do not process their own udata we can make a
sensible ioctl that wrappers them. The ioctl follows the same format as
the write_ex() and has the user explicitly specify the core and driver
in/out opaque structures and a command number.

This works for all forms of write commands.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-18 14:12:48 -05:00
Boris Brezillon
f366d3854e Merge tag 'spi-nor/for-4.21' of git://git.infradead.org/linux-mtd into mtd/next
Core changes:
- Parse the 4BAIT SFDP section
- Add a bunch of SPI NOR entries to the flash_info table
- Add the concept of SFDP fixups and use it to fix a bug on MX25L25635F
- A bunch of minor cleanups/comestic changes
2018-12-18 20:00:52 +01:00
Boris Brezillon
ccec4a4a4f Merge tag 'nand/for-4.21' of git://git.infradead.org/linux-mtd into mtd/next
NAND core changes:
- kernel-doc miscellaneous fixes.
- Third batch of fixes/cleanup to the raw NAND core impacting various
  controller drivers (ams-delta, marvell, fsmc, denali, tegra, vf610):
  * Stopping to pass mtd_info objects to internal functions
  * Reorganizing code to avoid forward declarations
  * Dropping useless test in nand_legacy_set_defaults()
  * Moving nand_exec_op() to internal.h
  * Adding nand_[de]select_target() helpers
  * Passing the CS line to be selected in struct nand_operation
  * Making ->select_chip() optional when ->exec_op() is implemented
  * Deprecating the ->select_chip() hook
  * Moving the ->exec_op() method to nand_controller_ops
  * Moving ->setup_data_interface() to nand_controller_ops
  * Deprecating the dummy_controller field
  * Fixing JEDEC detection
  * Providing a helper for polling GPIO R/B pin

Raw NAND chip drivers changes:
- Macronix:
  * Flagging 1.8V AC chips with a broken GET_FEATURES(TIMINGS)

Raw NAND controllers drivers changes:
- Ams-delta:
  * Fixing the error path
  * SPDX tag added
  * May be compiled with COMPILE_TEST=y
  * Conversion to ->exec_op() interface
  * Dropping .IOADDR_R/W use
  * Use GPIO API for data I/O
- Denali:
  * Removing denali_reset_banks()
  * Removing ->dev_ready() hook
  * Including <linux/bits.h> instead of <linux/bitops.h>
  * Changes to comply with the above fixes/cleanup done in the core.
- FSMC:
  * Adding an SPDX tag to replace the license text
  * Making conversion from chip to fsmc consistent
  * Fixing unchecked return value in fsmc_read_page_hwecc
  * Changes to comply with the above fixes/cleanup done in the core.
- Marvell:
  * Preventing timeouts on a loaded machine (fix)
  * Changes to comply with the above fixes/cleanup done in the core.
- OMAP2:
  * Pass the parent of pdev to dma_request_chan() (fix)
- R852:
  * Use generic DMA API
- sh_flctl:
  * Converting to SPDX identifiers
- Sunxi:
  * Write pageprog related opcodes to the right register: WCMD_SET (fix)
- Tegra:
  * Stop implementing ->select_chip()
- VF610:
  * Adding an SPDX tag to replace the license text
  * Changes to comply with the above fixes/cleanup done in the core.
- Various trivial/spelling/coding style fixes.

SPI-NAND drivers changes:
- Removing the depreacated mt29f_spinand driver from staging.
- Adding support for:
  * Toshiba TC58CVG2S0H
  * GigaDevice GD5FxGQ4xA
  * Winbond W25N01GV
2018-12-18 19:59:16 +01:00
Linus Torvalds
ddfbab4653 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Three fixes: The t10-pi one is a regression from the 4.19 release, the
  qla2xxx one is a 4.20 merge window regression and the bnx2fc is a very
  old bug"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: t10-pi: Return correct ref tag when queue has no integrity profile
  scsi: bnx2fc: Fix NULL dereference in error handling
  Revert "scsi: qla2xxx: Fix NVMe Target discovery"
2018-12-18 09:38:34 -08:00
Thomas Gleixner
ff3730a497 Merge tag 'irqchip-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates from Marc Zyngier:

 - A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
 - Updates for new (and old) platforms (i.MX8MQ, F1C100s)
 - A number of SPDX cleanups
 - A workaround for a very broken GICv3 implementation
 - A platform-msi fix
 - Various cleanups
2018-12-18 18:37:27 +01:00
Oleksandr Andrushchenko
b3383974fe xen: Introduce shared buffer helpers for page directory...
based frontends. Currently the frontends which implement
similar code for sharing big buffers between frontend and
backend are para-virtualized DRM and sound drivers.
Both define the same way to share grant references of a
data buffer with the corresponding backend with little
differences.

Move shared code into a helper module, so there is a single
implementation of the same functionality for all.

This patch introduces code which is used by sound and display
frontend drivers without functional changes with the intention
to remove shared code from the corresponding drivers.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-12-18 12:15:55 -05:00
Sagi Grimberg
7b7ab780a0 block: make request_to_qc_t public
block consumers will need it for polling requests that
are sent with blk_execute_rq_nowait. Also, get rid of
blk_tag_to_qc_t and open-code it instead.

Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18 17:50:47 +01:00
David S. Miller
77c7a7b3e7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2018-12-18

1) Add xfrm policy selftest scripts.
   From Florian Westphal.

2) Split inexact policies into four different search list
   classes and use the rbtree infrastructure to store/lookup
   the policies. This is to improve the policy lookup
   performance after the flowcache removal.
   Patches from Florian Westphal.

3) Various coding style fixes, from Colin Ian King.

4) Fix policy lookup logic after adding the inexact policy
   search tree infrastructure. From Florian Westphal.

5) Remove a useless remove BUG_ON from xfrm6_dst_ifdown.
   From Li RongQing.

6) Use the correct policy direction for lookups on hash
   rebuilding. From Florian Westphal.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 08:49:48 -08:00
Arnd Bergmann
e4b92b108c timekeeping: remove obsolete time accessors
There are no more remaining users of these deprecated wrappers, so
let's remove them before new users have a chance to make it in.

See Documentation/core-api/timekeeping.rst for replacements when
porting old drivers that contain calls to this function.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: John Stultz <john.stultz@linaro.org>
2018-12-18 16:13:05 +01:00
Arnd Bergmann
437e78d3fd timekeeping: remove timespec_add/timespec_del
The last users were removed a while ago since everyone moved to ktime_t,
so we can remove the two unused interfaces for old timespec structures.

With those two gone, set_normalized_timespec() is also unused, so
remove that as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: John Stultz <john.stultz@linaro.org>
2018-12-18 16:13:05 +01:00
Arnd Bergmann
926617889d timekeeping: remove unused {read,update}_persistent_clock
After arch/sh has removed the last reference to these functions,
we can remove them completely and just rely on the 64-bit time_t
based versions. This cleans up a rather ugly use of __weak
functions.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: John Stultz <john.stultz@linaro.org>
2018-12-18 16:13:05 +01:00