Commit Graph

691501 Commits

Author SHA1 Message Date
Nicholas Piggin
837e72f78a powerpc/64: Drop reservation-clearing ldarx in context switch
There is no need to explicitly break the reservation in _switch,
because we are guaranteed that the context switch path will include a
larx/stcx.

Comment the guarantee and remove the reservation clear from _switch.

This is worth 1-2% in context switch performance.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
e4c0fc5f72 powerpc/64s: Leave interrupts hard enabled in context switch for radix
Commit 4387e9ff25 ("[POWERPC] Fix PMU + soft interrupt disable bug")
hard disabled interrupts over the low level context switch, because
the SLB management can't cope with a PMU interrupt accessing the stack
in that window.

Radix based kernel mapping does not use the SLB so it does not require
interrupts hard disabled here.

This is worth 1-2% in context switch performance on POWER9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
bc4f65e4cf powerpc/64: Avoid restore_math call if possible in syscall exit
The syscall exit code that branches to restore_math is quite heavy on
Book3S, consisting of 2 mtmsr instructions. Threads that don't use both
FP and vector can get caught here if the kernel ever uses FP or vector.
Lazy-FP/vec context switching also trips this case.

So check for lazy FP and vector before switching RI for restore_math.
Move most of this case out of line.

For threads that do want to restore math registers, the MSR switches are
still suboptimal. Future direction may be to use a soft-RI bit to avoid
MSR switches in kernel (similar to soft-EE), but for now at least the
no-restore

POWER9 context switch rate increases by about 5% due to sched_yield(2)
return performance. I haven't constructed a test to measure the syscall
cost.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Nicholas Piggin
acd7d8cef0 powerpc/64s: Optimize hypercall/syscall entry
After bc3551257a ("powerpc/64: Allow for relocation-on interrupts from
guest to host"), a getppid() system call goes from 307 cycles to 358
cycles (+17%) on POWER8. This is due significantly to the scratch SPR
used by the hypercall check.

It turns out there are some volatile registers common to both system
call and hypercall (in particular, r12, cr0, ctr), which can be used to
avoid the SPR and some other overheads. This brings getppid to 320 cycles
(+4%).

Testing hcall entry performance by running "sc 1" in guest userspace
gives 854 cycles before this patch and 826 afterwards, which is also a
small win.

POWER9 syscall is improved by about the same amount, hcall not tested.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:34:39 +10:00
Michael Ellerman
9abcc981de powerpc/mm/radix: Only add X for pages overlapping kernel text
Currently we map the whole linear mapping with PAGE_KERNEL_X. Instead we
should check if the page overlaps the kernel text and only then add
PAGE_KERNEL_X.

Note that we still use 1G pages if they're available, so this will
typically still result in a 1G executable page at KERNELBASE. So this fix is
primarily useful for catching stray branches to high linear mapping addresses.
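
The check described above amounts to a range-overlap test. The following is a minimal userspace sketch, not the kernel code: the enum values, function names and addresses are hypothetical and only model "add execute permission when the mapped range intersects kernel text".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protection values, for illustration only. */
enum page_prot_model { PAGE_PROT_NX = 0, PAGE_PROT_X = 1 };

/* True if [start, end) intersects [text_start, text_end). */
bool overlaps_text(uint64_t start, uint64_t end,
                   uint64_t text_start, uint64_t text_end)
{
    return start < text_end && text_start < end;
}

/* Pick executable protection only for ranges that touch kernel text. */
enum page_prot_model pick_prot(uint64_t start, uint64_t size,
                               uint64_t text_start, uint64_t text_end)
{
    assert(size > 0);
    return overlaps_text(start, start + size, text_start, text_end)
               ? PAGE_PROT_X : PAGE_PROT_NX;
}
```

With a 16M text region at offset 0, a 1G page at offset 0 comes out executable while the 1G page above it does not, which matches the note that the first 1G page typically stays executable.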

Without this patch, we can execute at 1G in xmon using:

  0:mon> m c000000040000000
  c000000040000000  00 l
  c000000040000000  00000000 01006038
  c000000040000004  00000000 2000804e
  c000000040000008  00000000 x
  0:mon> di c000000040000000
  c000000040000000  38600001      li      r3,1
  c000000040000004  4e800020      blr
  0:mon> p c000000040000000
  return value is 0x1

After the patch, we get a 400 exception as expected:

  0:mon> p c000000040000000
  *** 400 exception occurred

Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
2017-06-15 16:34:39 +10:00
Michael Ellerman
0edc2ca9cc Revert "powerpc: Handle simultaneous interrupts at once"
This reverts commit 45cb08f479.

For some reason this is causing IRQ problems on Freescale Book3E
machines, eg on my p5020ds:

  irq 25: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc3-gcc-6.3.1-00037-g45cb08f4791c #624
  Call Trace:
  [c0000000fffdbb10] [c00000000049962c] .dump_stack+0xa8/0xe8 (unreliable)
  [c0000000fffdbba0] [c0000000000babf4] .__report_bad_irq+0x54/0x140
  [c0000000fffdbc40] [c0000000000bb11c] .note_interrupt+0x324/0x380
  [c0000000fffdbd00] [c0000000000b7110] .handle_irq_event_percpu+0x68/0x88
  [c0000000fffdbd90] [c0000000000b718c] .handle_irq_event+0x5c/0xa8
  [c0000000fffdbe10] [c0000000000bc01c] .handle_fasteoi_irq+0xe4/0x298
  [c0000000fffdbe90] [c0000000000b59c4] .generic_handle_irq+0x50/0x74
  [c0000000fffdbf10] [c0000000000075d8] .__do_irq+0x74/0x1f0
  [c0000000fffdbf90] [c0000000000189f8] .call_do_irq+0x14/0x24
  [c0000000f7173060] [c0000000000077e4] .do_IRQ+0x90/0x120
  [c0000000f7173100] [c00000000001d93c] exc_0x500_common+0xfc/0x100
  --- interrupt: 501 at .prepare_to_wait_event+0xc/0x14c
      LR = .fsl_elbc_run_command+0xc8/0x23c
  [c0000000f71734d0] [c00000000065f418] .nand_reset+0xb8/0x168
  [c0000000f7173560] [c00000000065fec4] .nand_scan_ident+0x2b0/0x1638
  [c0000000f7173650] [c000000000666cd8] .fsl_elbc_nand_probe+0x34c/0x5f0
  ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
  [c0000000f7173750] [c0000000005a3c60] .platform_drv_probe+0x64/0xb0
  [c0000000f71737d0] [c0000000005a12e0] .really_probe+0x290/0x334
  [c0000000f7173870] [c0000000005a14a0] .__driver_attach+0x11c/0x120
  [c0000000f7173900] [c00000000059e6a0] .bus_for_each_dev+0x98/0xfc
  [c0000000f71739a0] [c0000000005a0b3c] .driver_attach+0x34/0x4c
  [c0000000f7173a20] [c0000000005a04b0] .bus_add_driver+0x1ac/0x2e0
  [c0000000f7173ac0] [c0000000005a2170] .driver_register+0x94/0x160
  [c0000000f7173b40] [c0000000005a3be0] .__platform_driver_register+0x60/0x7c
  [c0000000f7173bc0] [c000000000d6aab4] .fsl_elbc_nand_driver_init+0x24/0x38
  [c0000000f7173c30] [c000000000001934] .do_one_initcall+0x68/0x1b8
  [c0000000f7173d00] [c000000000d210f8] .kernel_init_freeable+0x260/0x338
  [c0000000f7173db0] [c0000000000021b0] .kernel_init+0x20/0xe70
  [c0000000f7173e30] [c0000000000009bc] .ret_from_kernel_thread+0x58/0x9c
  handlers:
  [<c000000000ed85c8>] .fsl_lbc_ctrl_irq
  Disabling IRQ #25

Ben also had concerns with the implementation being potentially slow on
some PICs, so revert it for now.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-15 16:20:46 +10:00
Paul Mackerras
46a704f840 KVM: PPC: Book3S HV: Preserve userspace HTM state properly
If userspace attempts to call the KVM_RUN ioctl when it has hardware
transactional memory (HTM) enabled, the values that it has put in the
HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by
guest values.  To fix this, we detect this condition and save those
SPR values in the thread struct, and disable HTM for the task.  If
userspace goes to access those SPRs or the HTM facility in future,
a TM-unavailable interrupt will occur and the handler will reload
those SPRs and re-enable HTM.

If userspace has started a transaction and suspended it, we would
currently lose the transactional state in the guest entry path and
would almost certainly get a "TM Bad Thing" interrupt, which would
cause the host to crash.  To avoid this, we detect this case and
return from the KVM_RUN ioctl with an EINVAL error, with the KVM
exit reason set to KVM_EXIT_FAIL_ENTRY.

Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-15 16:18:17 +10:00
Paul Mackerras
4c3bb4ccd0 KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
This restores several special-purpose registers (SPRs) to sane values
on guest exit that were missed before.

TAR and VRSAVE are readable and writable by userspace, and we need to
save and restore them to prevent the guest from potentially affecting
userspace execution (not that TAR or VRSAVE are used by any known
program that uses the KVM_RUN ioctl).  We save/restore these
in kvmppc_vcpu_run_hv() rather than on every guest entry/exit.

FSCR affects userspace execution in that it can prohibit access to
certain facilities by userspace.  We restore it to the normal value
for the task on exit from the KVM_RUN ioctl.

IAMR is normally 0, and is restored to 0 on guest exit.  However,
with a radix host on POWER9, it is set to a value that prevents the
kernel from executing user-accessible memory.  On POWER9, we save
IAMR on guest entry and restore it on guest exit to the saved value
rather than 0.  On POWER8 we continue to set it to 0 on guest exit.

PSPB is normally 0.  We restore it to 0 on guest exit to prevent
userspace taking advantage of the guest having set it non-zero
(which would allow userspace to set its SMT priority to high).

UAMOR is normally 0.  We restore it to 0 on guest exit to prevent
the AMR from being used as a covert channel between userspace
processes, since the AMR is not context-switched at present.

Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-06-15 16:17:09 +10:00
Al Viro
289dec5b89 ufs: more deadlock prevention on tail unpacking
->s_lock is not needed for ufs_change_blocknr()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-15 00:42:56 -04:00
Al Viro
09bf4f5b6e ufs: avoid grabbing ->truncate_mutex if possible
tail unpacking is done in a wrong place; the deadlocks galore
is best dealt with by doing that in ->write_iter() (and switching
to iomap, while we are at it), but that's rather painful to
backport.  The trouble comes from grabbing pages that cover
the beginning of tail from inside of ufs_new_fragments(); ongoing
pageout of any of those is going to deadlock on ->truncate_mutex
with process that got around to extending the tail holding that
and waiting for page to get unlocked, while ->writepage() on
that page is waiting on ->truncate_mutex.

The thing is, we don't need ->truncate_mutex when the fragment
we are trying to map is within the tail - the damn thing is
allocated (tail can't contain holes).

Let's do a plain lookup and if the fragment is present, we can
just pretend that we'd won the race in almost all cases.  The
only exception is a fragment between the end of tail and the
end of block containing tail.

Protect ->i_lastfrag with ->meta_lock - read_seqlock_excl() is
sufficient.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-15 00:41:18 -04:00
Prarit Bhargava
036e9ef8be dmaengine: Replace WARN_TAINT_ONCE() with pr_warn_once()
The WARN_TAINT_ONCE() prints out a loud stack trace on broken BIOSes.
The systems that have this problem are several years out of support and
no longer have BIOS updates available.  The stack trace isn't necessary
and a pr_warn_once() will do.

Change WARN_TAINT_ONCE() to pr_warn_once() and taint.
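
As a rough userspace illustration of "warn once, without a stack trace, and still taint", the sketch below uses a static flag. All names and the message text are hypothetical stand-ins, not the kernel's pr_warn_once() or taint machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool tainted;     /* stand-in for the kernel's taint state */
static int  warn_count;  /* how many times a message was actually printed */

/* Print msg only the first time; returns true if it printed. */
bool warn_once(bool *already_warned, const char *msg)
{
    assert(already_warned && msg);
    if (*already_warned)
        return false;
    *already_warned = true;
    fprintf(stderr, "warning: %s\n", msg);  /* one line, no stack trace */
    warn_count++;
    return true;
}

/* Called every time the broken-firmware condition is detected. */
void report_broken_firmware(void)
{
    static bool warned;
    warn_once(&warned, "firmware misconfiguration detected");
    tainted = true;  /* the taint is still recorded on every detection */
}
```

Calling report_broken_firmware() repeatedly prints a single line and leaves the taint flag set.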

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Duyck, Alexander H <alexander.h.duyck@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-06-15 09:50:37 +05:30
Fabio Estevam
d762e4f356 dmaengine: Kconfig: Extend the dependency for MXS_DMA
Currently it is not possible to select the mxs dma driver when only
mx6sx or mx7 are selected.

Extend the dependency to allow the mxs dma driver to be built whenever
ARCH_MXS or ARCH_MXC is selected.

This has the benefit of avoiding having to add new entries in the
MXS_DMA Kconfig every time a new i.MX SoC shows up and it also makes
it consistent with the other i.MX DMA engines, such as IMX_DMA and
IMX_SDMA.

While at it, also pass COMPILE_TEST for increasing the build coverage.

Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-06-15 09:48:42 +05:30
Fabio Estevam
4aff2f9355 dmaengine: mxs: Use %zu for printing a size_t variable
Use %zu for printing a size_t variable in order to fix the following
build warning:

drivers/dma/mxs-dma.c: In function 'mxs_dma_prep_dma_cyclic':
drivers/dma/mxs-dma.c:621:5: warning: format '%d' expects argument of type 'int', but argument 3 has type 'size_t' [-Wformat]
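
For reference, a minimal standalone example of the format specifier in question. The function name and message are made up for illustration; only the use of %zu with a size_t argument is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a size_t with %zu, which matches size_t on every ABI,
 * whereas %d expects an int and triggers -Wformat. */
int format_period_len(char *buf, size_t buflen, size_t period_len)
{
    assert(buf && buflen > 0);
    return snprintf(buf, buflen, "period length: %zu", period_len);
}
```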

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-06-15 09:44:45 +05:30
Jia-Ju Bai
640f93cc6e i40e: Fix a sleep-in-atomic bug
The driver may sleep under a spin lock, and the function call path is:
i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh)
  i40e_vsi_remove_pvid
    i40e_vlan_stripping_disable
      i40e_aq_update_vsi_params
        i40e_asq_send_command
          mutex_lock --> may sleep

To fix it, the spin lock is released before "i40e_vsi_remove_pvid", and
the lock is acquired again after this function.
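
The pattern can be modelled in userspace as follows. This is a simplified sketch under stated assumptions: the lock is a plain counter, the struct and function names are hypothetical, and the "may sleep" callee just records whether it was entered with the lock held.

```c
#include <assert.h>

struct vsi_model {
    int lock_depth;        /* > 0 means the modelled spin lock is held */
    int pvid;              /* current port VLAN id, 0 if none */
    int sleep_violations;  /* times a sleeping call ran under the lock */
};

void model_spin_lock(struct vsi_model *v)   { v->lock_depth++; }
void model_spin_unlock(struct vsi_model *v) { assert(v->lock_depth > 0); v->lock_depth--; }

/* Stand-in for a call chain that ends in mutex_lock(), which may sleep. */
void remove_pvid_may_sleep(struct vsi_model *v)
{
    if (v->lock_depth > 0)
        v->sleep_violations++;  /* this would be the sleep-in-atomic bug */
    v->pvid = 0;
}

void set_port_vlan(struct vsi_model *v, int new_pvid)
{
    model_spin_lock(v);
    if (v->pvid) {
        model_spin_unlock(v);      /* drop the lock around the sleeping call */
        remove_pvid_may_sleep(v);
        model_spin_lock(v);        /* and take it again afterwards */
    }
    v->pvid = new_pvid;
    model_spin_unlock(v);
}
```

Dropping a lock like this opens a window in which other code can run, so state read before the unlock generally needs to be treated as stale once the lock is retaken.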

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 23:45:22 -04:00
Al Viro
267309f394 ufs_get_locked_page(): make sure we have buffer_heads
callers rely upon that, but find_lock_page() racing with attempt of
page eviction by memory pressure might have left us with
	* try_to_free_buffers() successfully done
	* __remove_mapping() failed, leaving the page in our mapping
	* find_lock_page() returning an uptodate page with no
buffer_heads attached.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-14 23:32:19 -04:00
Yuantian Tang
b6f5e70193 ARM: dts: ls1021a: update the clockgen node
The qoriq clock driver has been updated to parse the clock configuration
information defined in the driver itself rather than in the dts.
Since the new implementation and the bindings have been merged,
it is time to update the clock related node and remove redundant clock
configuration information from the dts.

Signed-off-by: Tang Yuantian <andy.tang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2017-06-15 11:28:20 +08:00
Christoph Hellwig
b014e96d1a PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()
Every method in struct device_driver or structures derived from it like
struct pci_driver MUST provide exclusion vs the driver's ->remove() method,
usually by using device_lock().

Protect use of pci_error_handlers->reset_notify() by holding the device
lock while calling it.

Note:

  - pci_dev_lock() calls device_lock() in addition to blocking user-space
    config accesses.

  - pci_err_handlers->reset_notify() is used inside
    pci_dev_save_and_disable() and pci_dev_restore().  We could hold the
    device lock directly in pci_reset_notify(), but we expand the region
    since we have several calls following each other.

Without this, ->reset_notify() may race with ->remove() calls, which can be
easily triggered in NVMe.

[bhelgaas: changelog, add pci_reset_notify() comment]
[bhelgaas: fold in fix from Dan Carpenter <dan.carpenter@oracle.com>:
http://lkml.kernel.org/r/20170701135323.x5vaj4e2wcs2mcro@mwanda]
Link: http://lkml.kernel.org/r/20170601111039.8913-2-hch@lst.de
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-14 21:49:13 -05:00
Leonard Crestez
9b00064ba7 ARM: imx_v6_v7_defconfig: Set THERMAL_WRITABLE_TRIPS=y for testing
Setting trip points is supported by the imx thermal driver and it is
useful to be able to test this without adjusting config.

Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2017-06-15 09:35:09 +08:00
Dawid Kurek
d35fb61759 drm: Remove duplicate forward declaration
Forward declarations in C are great but I'm pretty sure one is enough.

Signed-off-by: Dawid Kurek <dawikur@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170614213518.GA3554@gmail.com
2017-06-14 21:25:17 -04:00
Tim Bird
48e42f91c1 kselftest: convert get_size to use stricter TAP13 format
1. Add the TAP13 header
2. remove variable data from the test description line
3. move the plan count to the end of the file, for consistency with
other kselftests
4. convert memory data from diagnostic (comment) format, to a YAML block
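
A sketch of what output following these four points could look like, written as a C formatter. The test description matches the list above in spirit only; the YAML key names and the function name are assumptions for illustration, not the actual get_size output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a TAP13-style report: version header first, a fixed test
 * description line, measured data in an indented YAML block, and the
 * plan ("1..1") at the end. */
int format_tap13_report(char *buf, size_t len,
                        unsigned long total_kb, unsigned long free_kb)
{
    assert(buf && len > 0);
    return snprintf(buf, len,
                    "TAP version 13\n"
                    "ok 1 get runtime memory use\n"
                    "  ---\n"
                    "  total_kb: %lu\n"
                    "  free_kb: %lu\n"
                    "  ...\n"
                    "1..1\n",
                    total_kb, free_kb);
}
```

Keeping variable data out of the "ok" line means the description stays identical from run to run, which makes results easier to compare mechanically.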

Signed-off-by: Tim Bird <tim.bird@sony.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-14 18:23:27 -06:00
Rafael J. Wysocki
9522933454 Merge branch 'acpica-fixes'
* acpica-fixes:
  ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
  Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
2017-06-15 01:52:32 +02:00
Rafael J. Wysocki
f63e4f7d41 Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'pm-devfreq'
* pm-cpufreq:
  cpufreq: conservative: Allow down_threshold to take values from 1 to 10
  Revert "cpufreq: schedutil: Reduce frequencies slower"

* pm-cpuidle:
  cpuidle: dt: Add missing 'of_node_put()'

* pm-devfreq:
  PM / devfreq: exynos-ppmu: Staticize event list
  PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
  PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
2017-06-15 01:51:33 +02:00
Stephen Boyd
9c861f3328 Merge branch 'clk-fixes' into clk-next
* clk-fixes:
  clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
  clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
  dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
  clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
  clk: sunxi-ng: v3s: Fix usb otg device reset bit
  clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
2017-06-14 16:48:21 -07:00
Stephen Boyd
949bdfed4b Merge tag 'sunxi-clk-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes
Allwinner clock fixes for 4.12

Fixes for some bindings that went into 4.12, for a few reset and
clock offsets, and for a build error.

* tag 'sunxi-clk-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
  clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
  dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
  clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
  clk: sunxi-ng: v3s: Fix usb otg device reset bit
  clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
2017-06-14 16:48:03 -07:00
Rafael J. Wysocki
33e4f80ee6 ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state.  However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up.  In fact,
quite often they should just be discarded.

Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.

For this reason, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.

In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume.  In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
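
The control flow described above can be modelled with a scripted list of events. This is a loose userspace sketch, not the kernel's suspend-to-idle loop: the struct, the function names and the way an event is classified as genuine or spurious are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct s2idle_model {
    const bool *events;   /* scripted SCI events: true = genuine wakeup */
    size_t nr_events;
    size_t next;          /* index of the next event to deliver */
    bool wakeup_pending;  /* set once a genuine wakeup has been seen */
    int wake_calls;
    int sync_calls;
};

/* Models the ->wake hook: note that the SCI fired while suspended. */
void model_wake(struct s2idle_model *m) { m->wake_calls++; }

/* Models the ->sync hook: flush event processing so the decision below
 * is based on a fully handled event. */
void model_sync(struct s2idle_model *m)
{
    m->sync_calls++;
    m->wakeup_pending = m->events[m->next++];
}

/* Loop around the idle state; returns how many times it was entered. */
int s2idle_loop_model(struct s2idle_model *m)
{
    int iterations = 0;

    while (m->next < m->nr_events) {
        iterations++;         /* freeze_enter() would block here until an IRQ */
        model_wake(m);
        model_sync(m);
        if (m->wakeup_pending)
            break;            /* genuine wakeup: resume fully */
        /* spurious event: discard it and go back to idle */
    }
    return iterations;
}
```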

In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled.  However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 00:55:44 +02:00
Hans de Goede
63dada87f7 platform/x86: Add driver for ACPI INT0002 Virtual GPIO device
Some peripherals on Bay Trail and Cherry Trail platforms signal a
Power Management Event (PME) to the Power Management Controller (PMC)
to wake up the system. When this happens, software needs to explicitly
clear the PME bus 0 status bit in the GPE0a_STS register to avoid an
IRQ storm on IRQ 9.

This is modelled in ACPI through the INT0002 ACPI device, which is
called a "Virtual GPIO controller" in ACPI because it defines the
event handler to call when the PME triggers through _AEI and _L02
methods as would be done for a real GPIO interrupt in ACPI.

This commit adds a driver which registers the Virtual GPIOs expected
by the DSDT on these devices, letting gpiolib-acpi claim the
virtual GPIO and install a GPIO-interrupt handler which calls the _L02
handler as it would for a real GPIO controller.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 00:55:44 +02:00
Rafael J. Wysocki
dc15e71eef PCI / PM: Restore PME Enable if skipping wakeup setup
The wakeup_prepared PCI device flag is used for preventing subsequent
changes of PCI device wakeup settings in the same way (e.g. enabling
device wakeup twice in a row).

However, in some cases PME Enable may be updated by things like PCI
configuration space restoration in the meantime and it may need to be
set again even though the rest of the settings need not change, so
modify __pci_enable_wake() to do that when it is about to return
early.

Also, it is reasonable to expect that __pci_enable_wake() will always
clear PME Status when invoked to disable device wakeup, so make it do
so even if it is going to return early then.
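
A small state-machine model of that early-return behaviour is sketched below. It is not the real __pci_enable_wake(): the struct fields and function name are hypothetical, and the hardware bits are modelled as plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

struct pci_dev_model {
    bool wakeup_prepared;  /* last wakeup setting that was fully applied */
    bool pme_enable;       /* modelled PME Enable bit */
    bool pme_status;       /* modelled PME Status bit */
    int  full_setups;      /* how often the full setup path ran */
};

void enable_wake_model(struct pci_dev_model *d, bool enable)
{
    assert(d);
    if (enable == d->wakeup_prepared) {
        /* Same request twice in a row: skip the full setup, but still
         * fix up the bits that may have changed behind our back. */
        if (enable)
            d->pme_enable = true;   /* e.g. lost to config space restoration */
        else
            d->pme_status = false;  /* disabling always clears PME Status */
        return;
    }
    d->full_setups++;
    d->pme_enable = enable;
    d->pme_status = false;
    d->wakeup_prepared = enable;
}
```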

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-15 00:55:44 +02:00
Rafael J. Wysocki
604d895857 PM / sleep: Print timing information if debug is enabled
Avoid printing the device suspend/resume timing information if
CONFIG_PM_DEBUG is not set to reduce the log noise level.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 00:55:43 +02:00
Rafael J. Wysocki
235d81a630 ACPI / PM: Clean up device wakeup enable/disable code
The wakeup.flags.enabled flag in struct acpi_device is not used
consistently, as there is no reason why it should only apply
to the enabling/disabling of the wakeup GPE, so put the invocation
of acpi_enable_wakeup_device_power() under it too.

Moreover, it is not necessary to call
acpi_enable_wakeup_devices() and acpi_disable_wakeup_devices() for
suspend-to-idle, so don't do that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 00:55:43 +02:00
Rafael J. Wysocki
190cab8471 ACPI / PM: Change log level of wakeup-related message
Change the log level of the "System wakeup enabled/disabled by ACPI"
message in acpi_pm_device_sleep_wake() to "debug" to reduce the log
noise level.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 00:55:43 +02:00
Rafael J. Wysocki
d438aa223e USB / PCI / PM: Allow the PCI core to do the resume cleanup
hcd_pci_resume_noirq() used as a universal _resume_noirq handler for
PCI USB controllers calls pci_back_from_sleep() which is unnecessary
and may become problematic.

It is unnecessary, because the PCI bus type carries out post-suspend
cleanup of all PCI devices during resume and that covers all things
done by the pci_back_from_sleep().  There is no reason why USB cannot
follow all of the other PCI devices in that respect.

It will become problematic after subsequent changes that make it
possible to go back to sleep again after executing dpm_resume_noirq()
if no valid system wakeup events have been detected at that point.
Namely, calling pci_back_from_sleep() at the _resume_noirq stage
will cause the wakeup status of the devices in question to be cleared
and if any of them has triggered system wakeup, that event may be
missed then.

For the above reasons, drop the pci_back_from_sleep() invocation
from hcd_pci_resume_noirq().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-15 00:55:43 +02:00
Rafael J. Wysocki
64fd1c7040 ACPI / PM: Run wakeup notify handlers synchronously
The work functions provided by the users of acpi_add_pm_notifier()
should be run synchronously before re-enabling the wakeup GPE in
case they are used to clear the status and/or disable the wakeup
signaling at the source.  Otherwise, which is the case currently in
the PCI bus type code, the same wakeup event may be signaled
multiple times while the execution of the work function in response
to it has already been queued up.

Fortunately, acpi_add_pm_notifier() is only used by PCI and by
ACPI device PM code internally, so the change is relatively
straightforward to make.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-15 00:55:42 +02:00
Jakub Kicinski
17530e71e0 PCI: Protect pci_driver->sriov_configure() usage with device_lock()
Every method in struct device_driver or structures derived from it like
struct pci_driver MUST provide exclusion vs the driver's ->remove() method,
usually by using device_lock().

Protect use of pci_driver->sriov_configure() by holding the device lock
while calling it.

The PCI core sets the pci_dev->driver pointer in local_pci_probe() before
calling ->probe() and only clears it after ->remove().  This means driver's
->sriov_configure() callback will happily race with probe() and remove(),
most likely leading to BUGs, since drivers don't expect this.

Remove the iov lock completely, since we remove the last user.

[bhelgaas: changelog, thanks to Christoph for locking rule]
Link: http://lkml.kernel.org/r/20170522225023.14010-1-jakub.kicinski@netronome.com
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-06-14 17:41:19 -05:00
Al Viro
c596961d1b ufs: fix s_size/s_dsize users
For UFS2 we need 64bit variants; we even store them in uspi, but
use 32bit ones instead.  One wrinkle is in handling of reserved
space - recalculating it every time had been stupid all along, but
now it would become really ugly.  Just calculate it once...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-14 16:43:03 -04:00
H. Nikolaus Schaller
5e6eb025b0 power: supply: twl4030-charger: move allocation of iio channel to the beginning
This is in preparation for EPROBE_DEFER handling because it is quite
likely that getting the (madc) iio channel is deferred more often than
later steps.

Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
2017-06-14 22:10:44 +02:00
H. Nikolaus Schaller
e8847c5654 power: supply: twl4030-charger: allocate iio by devm_iio_channel_get() and fix error path
Suggested-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
2017-06-14 22:10:43 +02:00
Arvind Yadav
355679b270 power: supply: core: constify psy_tcd_ops.
File size before:
text	data	bss	dec   hex filename
4240	 200	 80	4520 11a8 drivers/power/supply/power_supply_core.o

File size After adding 'const':
text	data	bss	dec   hex filename
4296	 136	 80	4512 11a0 drivers/power/supply/power_supply_core.o
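
In the figures above, data shrinks by 64 bytes (200 to 136) and text grows by 56 (4240 to 4296), since size(1) counts read-only data under text; the total drops from 4520 to 4512. The kind of change that produces this is sketched below with a hypothetical ops structure, not the actual psy_tcd_ops definition.

```c
#include <assert.h>

/* Hypothetical callback table, standing in for a driver ops structure. */
struct cooling_ops_model {
    int (*get_max_state)(void);
    int (*get_cur_state)(void);
};

int model_get_max(void) { return 3; }
int model_get_cur(void) { return 1; }

/* Declared const: the table is never written after initialisation, so
 * the toolchain may place it in read-only data instead of .data. */
const struct cooling_ops_model model_ops = {
    .get_max_state = model_get_max,
    .get_cur_state = model_get_cur,
};

/* Callers only need a pointer to const to use the table. */
int query_max_state(const struct cooling_ops_model *ops)
{
    assert(ops && ops->get_max_state);
    return ops->get_max_state();
}
```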

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
2017-06-14 22:10:43 +02:00
Tony Lindgren
1b0c6806d6 dt-bindings: power: supply: cpcap-battery: Add power-supplies property
The binding for cpcap-battery is missing the standard power-supplies
property as noted by Sebastian Reichel <sebastian.reichel@collabora.co.uk>.

Cc: devicetree@vger.kernel.org
Cc: Marcel Partap <mpartap@gmx.net>
Cc: Michael Scott <michael.scott@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
2017-06-14 22:10:42 +02:00
Sebastian Reichel
c159b38333 dt-bindings: power: supply: move max8903-charger.txt to proper location
This moves max8903-charger.txt to the proper location
for power-supply bindings.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-06-14 22:10:37 +02:00
Sebastian Reichel
a9b819f5fb dt-bindings: power: supply: move maxim,max14656.txt to proper location
This moves maxim,max14656.txt to the proper location for
power-supply bindings.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
2017-06-14 22:10:26 +02:00
Tejun Heo
b6053d40e3 cgroup: fix lockdep warning in debug controller
The debug controller grabs cgroup_mutex from interface file show
functions which can deadlock and triggers lockdep warnings.  Fix it by
using cgroup_kn_lock_live()/cgroup_kn_unlock() instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
2017-06-14 16:01:41 -04:00
Tejun Heo
2866c0b4cf cgroup: refactor cgroup_masks_read() in the debug controller
Factor out cgroup_masks_read_one() out of cgroup_masks_read() for
simplicity.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
2017-06-14 16:01:36 -04:00
Tejun Heo
8cc38fa7fa cgroup: make debug an implicit controller on cgroup2
Make debug an implicit controller on cgroup2 which is enabled by
"cgroup_debug" boot param.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
2017-06-14 16:01:32 -04:00
Waiman Long
575313f40f cgroup: Make debug cgroup support v2 and thread mode
Besides supporting cgroup v2 and thread mode, the following changes
are also made:
 1) current_* cgroup files now reside only at the root as we don't
    need duplicated files of the same function all over the cgroup
    hierarchy.
 2) The cgroup_css_links_read() function is modified to report
    the number of tasks that are skipped because of overflow.
 3) The number of extra unaccounted references is displayed.
 4) The current_css_set_read() function now prints out the addresses of
    the css'es associated with the current css_set.
 5) A new cgroup_subsys_states file is added to display the css objects
    associated with a cgroup.
 6) A new cgroup_masks file is added to display the various controller
    bit masks in the cgroup.

tj: Dropped thread mode related information for now so that debug
    controller changes aren't blocked on the thread mode.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00
Waiman Long
23b0be480f cgroup: Make Kconfig prompt of debug cgroup more accurate
Make the Kconfig prompt and description of the debug cgroup controller
more accurate by saying that it is for debug purposes only and that its
interfaces are unstable.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00
Waiman Long
a28f8f5e99 cgroup: Move debug cgroup to its own file
The debug cgroup currently resides within cgroup-v1.c and is enabled
only for v1 cgroup. To enable the debug cgroup also for v2, it makes
sense to put the code into its own file as it will no longer be v1
specific. There is no change to the debug cgroup specific code.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00
Waiman Long
73a7242a06 cgroup: Keep accurate count of tasks in each css_set
The reference count in the css_set data structure was used as a
proxy of the number of tasks attached to that css_set. However, that
count is actually not an accurate measure especially with thread mode
support. So a new variable nr_tasks is added to the css_set to keep
track of the actual task count. This new variable is protected by
the css_set_lock. Functions that require the actual task count are
updated to use the new variable.
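
To see why the reference count is only a proxy, consider the simplified model below. The struct and helpers are hypothetical and omit locking entirely (the real counter is protected by css_set_lock, as stated above).

```c
#include <assert.h>

struct css_set_model {
    int refcount;  /* references held by tasks and by any other user */
    int nr_tasks;  /* tasks actually attached to this set */
};

void model_get(struct css_set_model *s) { s->refcount++; }
void model_put(struct css_set_model *s) { assert(s->refcount > 0); s->refcount--; }

/* Attaching a task takes a reference and bumps the task count. */
void model_attach_task(struct css_set_model *s)
{
    model_get(s);
    s->nr_tasks++;
}

/* Detaching a task undoes both. */
void model_detach_task(struct css_set_model *s)
{
    assert(s->nr_tasks > 0);
    s->nr_tasks--;
    model_put(s);
}
```

As soon as anything other than a task holds a reference, refcount and nr_tasks diverge, and only nr_tasks answers "how many tasks are attached".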

tj: s/task_count/nr_tasks/ for consistency with cgroup_root->nr_cgrps.
    Refreshed on top of cgroup/for-v4.13 which dropped on
    css_set_populated() -> nr_tasks conversion.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-14 16:01:21 -04:00
Al Viro
b451cec4bb ufs: fix reserved blocks check
a) honour ->s_minfree; don't just go with default (5)
b) don't bother with capability checks until we know we'll need them
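
Both points can be sketched together in a userspace model. The function names, the percentage-based reserve formula and the fake capability check are assumptions for illustration, not the UFS code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

int capability_checks;  /* counts how often the modelled capability check ran */

/* Stand-in for a capability check; pretends the caller is unprivileged. */
bool model_capable(void)
{
    capability_checks++;
    return false;
}

/* Fragments held back for privileged users, derived from the
 * filesystem's own minfree percentage rather than a fixed default. */
uint64_t reserved_frags(uint64_t data_frags, unsigned minfree_pct)
{
    assert(minfree_pct <= 100);
    return data_frags / 100 * minfree_pct;  /* divide first to avoid overflow */
}

bool may_allocate(uint64_t free_frags, uint64_t data_frags, unsigned minfree_pct)
{
    if (free_frags > reserved_frags(data_frags, minfree_pct))
        return true;         /* enough free space: no capability check needed */
    return model_capable();  /* only now ask whether the reserve may be used */
}
```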

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-14 15:46:05 -04:00
David Howells
f7aec129a3 rxrpc: Cache the congestion window setting
Cache the congestion window setting that was determined during a call's
transmission phase when it finishes so that it can be used by the next call
to the same peer, thereby shortcutting the slow-start algorithm.

The value is stored in the rxrpc_peer struct and is accessed without
locking.  Each call takes the value that happens to be there when it starts
and just overwrites the value when it finishes.
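
The take-at-start, overwrite-at-finish scheme looks roughly like this in a single-threaded model. The struct and function names are hypothetical; the lack of locking mirrors the description above, where the last call to finish simply wins.

```c
#include <assert.h>

struct peer_model {
    unsigned int cong_cwnd;  /* cached congestion window for this peer */
};

struct call_model {
    struct peer_model *peer;
    unsigned int cong_cwnd;  /* this call's working congestion window */
};

/* A new call starts from whatever window the peer currently caches. */
void call_start(struct call_model *c, struct peer_model *p)
{
    assert(c && p);
    c->peer = p;
    c->cong_cwnd = p->cong_cwnd;
}

/* On completion the call overwrites the cached value with its own. */
void call_finish(struct call_model *c)
{
    assert(c && c->peer);
    c->peer->cong_cwnd = c->cong_cwnd;
}
```

A later call to the same peer then begins with the window the previous call ended on instead of ramping up from the initial value.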

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:42:45 -04:00
Weilin Chang
0430a26054 liquidio: fix VF driver off-by-one bug when setting ethtool -C ethX rx-frames
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 15:42:20 -04:00