Commit Graph

588765 Commits

Author SHA1 Message Date
Tomasz Nowicki
ffa7d6166a irqchip/gic-v3: Add ACPI support for GICv3/4 initialization
With the refator of gic_of_init(), GICv3/4 can be initialized
by gic_init_bases() with gic distributor base address and gic
redistributor region(s).

So get the redistributor region base addresses from MADT GIC
redistributor subtable, and the distributor base address from
GICD subtable to init GICv3 irqchip in ACPI way.

Note: GIC redistributor base address may also be provided in
GICC structures on systems supporting GICv3 and above if the GIC
Redistributors are not in the always-on power domain, this
patch didn't implement such feature yet.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 02:44:01 +00:00
Tomasz Nowicki
db57d7460e irqchip/gic-v3: Refactor gic_of_init() for GICv3 driver
Isolate hardware abstraction (FDT) code to gic_of_init().
Rest of the logic goes to gic_init_bases() and expects well
defined data to initialize GIC properly. The same solution
is used for GICv2 driver.

This is needed for ACPI initialization later.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09 02:44:00 +00:00
Adam Baker
630300d5fc hwmon: Create an NSA320 hardware monitoring driver
Create a driver to support the hardware monitoring chip present in
the Zyxel NSA320 and some of the other Zyxel NAS devices.

The driver reads fan speed and temperature from a suitably
pre-programmed MCU on the device.

Signed-off-by: Adam Baker <linux@baker-net.org.uk>
[groeck: Dropped .owner field initialization]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-03-08 18:40:49 -08:00
Jon Hunter
49023d2e4e spi: core: Fix deadlock when sending messages
The function __spi_pump_messages() is called by spi_pump_messages() and
__spi_sync(). The function __spi_sync() has an argument 'bus_locked'
that indicates if it is called with the SPI bus mutex held or not. If
'bus_locked' is false then __spi_sync() will acquire the mutex itself.

Commit 556351f14e ("spi: introduce accelerated read support for spi
flash devices") made a change to acquire the SPI bus mutex within
__spi_pump_messages(). However, this change did not check to see if the
mutex is already held. If __spi_sync() is called with the mutex held
(ie. 'bus_locked' is true), then a deadlock occurs when
__spi_pump_messages() is called.

Fix this deadlock by passing the 'bus_locked' state from __spi_sync() to
__spi_pump_messages() and only acquire the mutex if not already held. In
the case where __spi_pump_messages() is called from spi_pump_messages()
it is assumed that the mutex is not held and so call
__spi_pump_messages() with 'bus_locked' set to false. Finally, move the
unlocking of the mutex to the end of the __spi_pump_messages() function
to simplify the code and only call cond_resched() if there are no
errors.

Fixes: 556351f14e ("spi: introduce accelerated read support for spi flash devices")

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-03-09 09:31:39 +07:00
Axel Lin
3fab91ea28 gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free
The implementation of lp3943_gpio_request/lp3943_gpio_free test pin_used
for tracing the pin usage. However, gpiolib already checks FLAG_REQUESTED
flag for the same purpose. So remove the redundant implementation.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-03-09 09:17:55 +07:00
Manoj N. Kumar
83430833b4 cxlflash: Increase cmd_per_lun for better throughput
With the current value of cmd_per_lun at 16, the throughput
over a single adapter is limited to around 150kIOPS.

Increase the value of cmd_per_lun to 256 to improve
throughput. With this change a single adapter is able to
attain close to the maximum throughput (380kIOPS).
Also change the number of RRQ entries that can be queued.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Manoj N. Kumar
603ecce95f cxlflash: Fix to avoid unnecessary scan with internal LUNs
When switching to the internal LUN defined on the
IBM CXL flash adapter, there is an unnecessary
scan occurring on the second port. This scan leads
to the following extra lines in the log:

Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: cxlflash_queuecommand: (scp=c0000000fc1f0f00) 11/1/0/0 cdb=(A0000000-00000000-10000000-00000000)
Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0

By definition, both of the internal LUNs are on the first port/channel.

When the lun_mode is switched to internal LUN the
same value for host->max_channel is retained. This
causes an unnecessary scan over the second port/channel.

This fix alters the host->max_channel to 0 (1 port), if internal
LUNs are configured and switches it back to 1 (2 ports) while
going back to external LUNs.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Uma Krishnan
5d1952acd0 cxlflash: Reorder user context initialization
In order to support cxlflash in the PowerVM environment, underlying
hypervisor APIs have imposed a kernel API ordering change.

For the superpipe access to LUN, user applications need a context.
The cxlflash module creates this context by making a sequence of
cxl calls. In the current code, a context is initialized via
cxl_dev_context_init() followed by cxl_process_element(), a function
that obtains the process element id. Finally, cxl_start_work()
is called to attach the process element.

In the PowerVM environment, a process element id cannot be obtained
from the hypervisor until the process element is attached. The
cxlflash module is unable to create contexts without a valid
process element id.

To fix this problem, cxl_start_work() is called before obtaining
the process element id.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Matthew R. Ochs
8a96b52af5 cxlflash: Simplify attach path error cleanup
The cxlflash_disk_attach() routine currently uses a cascading error
gate strategy for its error cleanup path. While this strategy is
commonly used to handle cleanup scenarios, it is too restrictive when
function callouts need to be restructured. Problems range from
inserting error path bugs in previously 'good' code to the cleanup
path imposing design changes to how the normal path is structured.
A less restrictive approach is needed to support ordering changes
that come about when operating in different environments.

To overcome this restriction, the error cleanup path is modified to
have a single entrypoint and use conditional logic to cleanup where
necessary. Entities that require multiple cleanup steps must be
carefully vetted to ensure their APIs support state. In cases where
they do not (none as of this commit) additional local variables can
be used to maintain state on their behalf.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Matthew R. Ochs
5e6632d19e cxlflash: Split out context initialization
Presently, context information structures are allocated and
initialized in the same routine, create_context(). This imposes
an ordering restriction such that all pieces of information needed
to initialize a context must be known before the context is even
allocated.

This design point is not flexible when the order of context
creation needs to be modified. Specifically, this can lead to
problems when members of the context information structure are
a part of an ordering dependency (i.e. - the 'work' structure
embedded within the context).

To remedy, the allocation is left as-is, inside of the existing
create_context() routine and the initialization is transitioned
to a new void routine, init_context(). At the same time, in
anticipation of these routines not being called in sequence, a
state boolean is added to the context information structure to
track when the context has been initilized. The context teardown
routine, destroy_context(), is modified to support being called
with a non-initialized context.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Uma Krishnan
6ded8b3cbd cxlflash: Unmap problem state area before detaching master context
When operating in the PowerVM environment, the cxlflash module can
receive an error from the hypervisor indicating that there are
existing mappings in the page table for the process MMIO space.

This issue exists because term_afu() currently invokes term_mc()
before stop_afu(), allowing for the master context to be detached
first and the problem state area to be unmapped second.

To resolve this issue, stop_afu() should be called before term_mc().

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Manoj N. Kumar
961487e46a cxlflash: Simplify PCI registration
The calls to pci_request_regions(), pci_resource_start(),
pci_set_dma_mask(), pci_set_master() and pci_save_state() are all
unnecessary for the IBM CXL flash adapter since data buffers
are not required to be mapped to the device's memory.

The use of services such as pci_set_dma_mask() are problematic on
hypervisor managed systems as the IBM CXL flash adapter is operating
under a virtual PCI Host Bridge (virtual PHB) which does not support
these services.

cxlflash 0001:00:00.0: init_pci: Failed to set PCI DMA mask rc=-5

The resolution is to simplify init_pci(), to a point where it does the
bare minimum (pci_enable_device). Similarly, remove the call the
pci_release_regions() from cxlflash_remove().

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 21:17:33 -05:00
Linus Walleij
cc2a73a4a9 pinctrl: pxa2xx: export symbols
The pxa2xxx fails some automated builds because of unexported
symbols.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-03-09 09:10:02 +07:00
Christophe Lombard
cbffa3a514 cxl: Separate bare-metal fields in adapter and AFU data structures
Introduce sub-structures containing the bare-metal specific fields in
the structures describing the adapter (struct cxl) and AFU (struct
cxl_afu).
Update all their references.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:54 +11:00
Christophe Lombard
444c4ba461 cxl: New hcalls to support cxl adapters
The hypervisor calls provide an interface with a coherent platform
facility and function. It matches version 0.16 of the 'PAPR changes'
document.

The following hcalls are supported:
H_ATTACH_CA_PROCESS    Attach a process element to a coherent platform
                       function.
H_DETACH_CA_PROCESS    Detach a process element from a coherent
                       platform function.
H_CONTROL_CA_FUNCTION  Allow the partition to manipulate or query
                       certain coherent platform function behaviors.
H_COLLECT_CA_INT_INFO  Collect interrupt info about a coherent.
                       platform function after an interrupt occurred
H_CONTROL_CA_FAULTS    Control the operation of a coherent platform
                       function after a fault occurs.
H_DOWNLOAD_CA_FACILITY Support for downloading a base adapter image to
                       the coherent platform facility, and for
                       validating the entire image after the download.
H_CONTROL_CA_FACILITY  Allow the partition to manipulate or query
                       certain coherent platform facility behaviors.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:52 +11:00
Christophe Lombard
c0efa9aee8 powerpc: New possible return value from hcall
The hcalls introduced for cxl use a possible new value:
H_STATE (invalid state).

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:51 +11:00
Frederic Barrat
73d55c3b59 cxl: IRQ allocation for guests
The PSL interrupt cannot be multiplexed in a guest, as it is not
supported by the hypervisor. So an interrupt will be allocated
for it for each context. It will still be the first interrupt found in
the first interrupt range, but is treated almost like any other AFU
interrupt when creating/deleting the context. Only the handler is
different. Rework the code so that the range 0 is treated like the
other ranges.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:50 +11:00
Frederic Barrat
6d625ed9a7 cxl: Update cxl_irq() prototype
The context parameter when calling cxl_irq() should be strongly typed.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:49 +11:00
Frederic Barrat
ea2d1f95ef cxl: Isolate a few bare-metal-specific calls
A few functions are mostly common between bare-metal and guest and
just need minor tuning. To avoid crowding the backend API, introduce a
few 'if' based on the CPU being in HV mode.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:47 +11:00
Frederic Barrat
2b04cf310b cxl: Rename some bare-metal specific functions
Rename a few functions, changing the 'cxl_' prefix to either
'cxl_pci_' or 'cxl_native_', to make clear that the implementation is
bare-metal specific.

Those functions will have an equivalent implementation for a guest in
a later patch.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:46 +11:00
Frederic Barrat
5be587b111 cxl: Introduce implementation-specific API
The backend API (in cxl.h) lists some low-level functions whose
implementation is different on bare-metal and in a guest. Each
environment implements its own functions, and the common code uses
them through function pointers, defined in cxl_backend_ops

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:43 +11:00
Frederic Barrat
cca44c0192 cxl: Define process problem state area at attach time only
CXL kernel API was defining the process problem state area during
context initialization, making it possible to map the problem state
area before attaching the context. This won't work on a powerVM
guest. So force the logical behavior, like in userspace: attach first,
then map the problem state area.
Remove calls to cxl_assign_psn_space during init. The function is
already called on the attach paths.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:41 +11:00
Frederic Barrat
d56d301b51 cxl: Move bare-metal specific code to specialized files
Move a few functions around to better separate code specific to
bare-metal environment from code which will be commonly used between
guest and bare-metal.

Code specific to bare-metal is meant to be in native.c or pci.c
only. It's basically anything which touches the card p1 registers,
some p2 registers not needed from a guest and the PCI interface.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:39 +11:00
Christophe Lombard
8633186209 cxl: Move common code away from bare-metal-specific files
Move around some functions which will be accessed from the bare-metal
and guest environments.
Code in native.c and pci.c is meant to be bare-metal specific.
Other files contain code which may be shared with guests.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:37 +11:00
Vitaly Kuznetsov
ff06c5ffbc scsi: storvsc: fix SRB_STATUS_ABORTED handling
Commit 3209f9d780 ("scsi: storvsc: Fix a bug in the handling of SRB
status flags") filtered SRB_STATUS_AUTOSENSE_VALID out effectively making
the (SRB_STATUS_ABORTED | SRB_STATUS_AUTOSENSE_VALID) case a dead code. The
logic from this branch (e.g. storvsc_device_scan() call) is still required,
fix the check.

Cc: <stable@vger.kernel.org> #v4.4+
Fixes: 3209f9d780 ("scsi: storvsc: Fix a bug in the handling of SRB status flags")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 20:58:31 -05:00
Maurizio Lombardi
84bd64993f be2iscsi: set the boot_kset pointer to NULL in case of failure
In beiscsi_setup_boot_info(), the boot_kset pointer should be set to
NULL in case of failure otherwise an invalid pointer dereference may
occur later.

Cc: <stable@vger.kernel.org>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-08 20:58:31 -05:00
Martin K. Petersen
6540a65da9 sd: Fix discard granularity when LBPRZ=1
Commit 397737223c ("sd: Make discard granularity match logical block
size when LBPRZ=1") accidentally set the granularity to one byte instead
of one logical block on devices that provide deterministic zeroes after
UNMAP.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: 397737223c
Cc: <stable@vger.kernel.org> #v4.4+
2016-03-08 20:58:20 -05:00
Andrew Donnellan
949e9b827e powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
In eeh_pci_enable(), after making the request to set the new options, we
call eeh_ops->wait_state() to check that the request finished successfully.

At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
checking that it reflects the expected outcome. This can lead to callers
further up the chain incorrectly assuming the slot has been successfully
unfrozen and continuing to attempt recovery.

On powernv, this will occur if pnv_eeh_get_pe_state() or
pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
OPAL_EEH_PHB_ERROR respectively.

On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
Stopped states.

Obviously, none of these cases represent a successful completion of a
request to thaw MMIO or DMA.

Fix the check so that a wait_state() return value of 0 won't be considered
successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 11:33:30 +11:00
Chen Fan
e237a55184 x86/ACPI/PCI: Recognize that Interrupt Line 255 means "not connected"
Per the x86-specific footnote to PCI spec r3.0, sec 6.2.4, the value 255 in
the Interrupt Line register means "unknown" or "no connection."
Previously, when we couldn't derive an IRQ from the _PRT, we fell back to
using the value from Interrupt Line as an IRQ.  It's questionable whether
we should do that at all, but the spec clearly suggests we shouldn't do it
for the value 255 on x86.

Calling request_irq() with IRQ 255 may succeed, but the driver won't
receive any interrupts.  Or, if IRQ 255 is shared with another device, it
may succeed, and the driver's ISR will be called at random times when the
*other* device interrupts.  Or it may fail if another device is using IRQ
255 with incompatible flags.  What we *want* is for request_irq() to fail
predictably so the driver can fall back to polling.

On x86, assume 255 in the Interrupt Line means the INTx line is not
connected.  In that case, set dev->irq to IRQ_NOTCONNECTED so request_irq()
will fail gracefully with -ENOTCONN.

We found this problem on a system where Secure Boot firmware assigned
Interrupt Line 255 to an i801_smbus device and another device was already
using MSI-X IRQ 255.  This was in v3.10, where i801_probe() fails if
request_irq() fails:

  i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
  i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C
  i801_smbus 0000:00:1f.3: PCI INT C: no GSI
  genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa)
  CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1
  Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5
  Call Trace:
    dump_stack+0x19/0x1b
    __setup_irq+0x54a/0x570
    request_threaded_irq+0xcc/0x170
    i801_probe+0x32f/0x508 [i2c_i801]
    local_pci_probe+0x45/0xa0
  i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16
  i801_smbus: probe of 0000:00:1f.3 failed with error -16

After aeb8a3d16a ("i2c: i801: Check if interrupts are disabled"),
i801_probe() will fall back to polling if request_irq() fails.  But we
still need this patch because request_irq() may succeed or fail depending
on other devices in the system.  If request_irq() fails, i801_smbus will
work by falling back to polling, but if it succeeds, i801_smbus won't work
because it expects interrupts that it may not receive.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 01:23:35 +01:00
Heikki Krogerus
7781203416 device property: fwnode->secondary may contain ERR_PTR(-ENODEV)
This fixes BUG triggered when fwnode->secondary is not NULL,
but has ERR_PTR(-ENODEV) instead.

BUG: unable to handle kernel paging request at ffffffffffffffed
IP: [<ffffffff81677b86>] __fwnode_property_read_string+0x26/0x160
PGD 200e067 PUD 2010067 PMD 0
Oops: 0000 [#1] SMP KASAN
Modules linked in: dwc3_pci(+) dwc3
CPU: 0 PID: 1138 Comm: modprobe Not tainted 4.5.0-rc5+ #61
task: ffff88015aaf5b00 ti: ffff88007b958000 task.ti: ffff88007b958000
RIP: 0010:[<ffffffff81677b86>]  [<ffffffff81677b86>] __fwnode_property_read_string+0x26/0x160
RSP: 0018:ffff88007b95eff8  EFLAGS: 00010246
RAX: fffffbfffffffffd RBX: ffffffffffffffed RCX: ffff88015999cd37
RDX: dffffc0000000000 RSI: ffffffff81e11bc0 RDI: ffffffffffffffed
RBP: ffff88007b95f020 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88007b90f7cf R11: 0000000000000000 R12: ffff88007b95f0a0
R13: 00000000fffffffa R14: ffffffff81e11bc0 R15: ffff880159ea37a0
FS:  00007ff35f46c700(0000) GS:ffff88015b800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffffffffed CR3: 000000007b8be000 CR4: 00000000001006f0
Stack:
 ffff88015999cd20 ffffffff81e11bc0 ffff88007b95f0a0 ffff88007b383dd8
 ffff880159ea37a0 ffff88007b95f048 ffffffff81677d03 ffff88007b952460
 ffffffff81e11bc0 ffff88007b95f0a0 ffff88007b95f070 ffffffff81677d40
Call Trace:
 [<ffffffff81677d03>] fwnode_property_read_string+0x43/0x50
 [<ffffffff81677d40>] device_property_read_string+0x30/0x40
...

Fixes: 362c0b3024 (device property: Fallback to secondary fwnode if primary misses the property)
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 01:07:43 +01:00
Jon Hunter
41795a8a3c PM / Domains: Fix potential NULL pointer dereference
In the function of_genpd_get_from_provider(), we never check to see if
the argument 'genpdspec' is NULL before dereferencing it. Add error
checking to handle any NULL pointers.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:41:07 +01:00
Jon Hunter
beda5fc1ff PM / Domains: Fix removal of a subdomain
Commit 30e7a65b3f (PM / Domains: Ensure subdomain is not in use
before removing) added a test to ensure that a subdomain is not a
master to another subdomain or if any devices are using the subdomain
before removing. This change incorrectly used the "slave_links" list to
determine if the subdomain is a master to another subdomain, where it
should have been using the "master_links" list instead. The
"slave_links" list will never be empty for a subdomain and so a
subdomain can never be removed. Fix this by testing if the
"master_links" list is empty instead.

Fixes: 30e7a65b3f (PM / Domains: Ensure subdomain is not in use before removing)
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:41:06 +01:00
Laurent Pinchart
076395cae2 PM / Domains: Propagate start and restore errors during runtime resume
During runtime resume the return values of the start and restore steps
are ignored. As a result drivers are not notified of runtime resume
failures and can't propagate them up. Fix it by returning an error if
either the start or restore step fails, and clean up properly in the
error path.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:37:59 +01:00
Geert Uytterhoeven
0ba554e45c PM / Domains: Join state name and index in debugfs output
For low-power states, the state index is part of the state, hence join
them with a hyphen in the /sys/kernel/debug/pm_genpd/pm_genpd_summary
output. E.g. "off 0" becomes "off-0".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:36:03 +01:00
Geert Uytterhoeven
6954d43292 PM / Domains: Restore alignment of slaves in debugfs output
The slave domains are no longer aligned with the table header in the
/sys/kernel/debug/pm_genpd/pm_genpd_summary output. Worse, the alignment
differs depending on the actual name of the state.

Format the state name and index into a buffer, and print that like
before to restore alignment.

Use "%u" for unsigned int while we're at it.

Fixes: fc5cbf0c94 (PM / Domains: Support for multiple states)
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:36:03 +01:00
Jacob Pan
323ee64aa1 powercap/rapl: track lead cpu per package
RAPL driver operates on MSRs that are under package/socket
scope instead of core scope. However, the current code does not
keep track of which CPUs are available on each package for MSR
access. Therefore it has to search for an active CPU on a given
package each time.

This patch optimizes the package level operations by tracking a
per package lead CPU during initialization and CPU hotplug. The
runtime search for active CPU is avoided.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:26:44 +01:00
Jacob Pan
309557f558 powercap/rapl: add package reference per domain
This patch adds to each rapl domain a reference of the package
it belongs to. At runtime, we can then avoid searching the package
data for each access. It simplifies the domain level operations
which depend on package level information.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:26:44 +01:00
Jacob Pan
f14a1396d8 powercap/rapl: reduce ipi calls
Reduce remote CPU calls for MSR access by combining read
modify write into one function.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:26:43 +01:00
Jacob Pan
16ccbd87c2 cpumask: export cpumask_any_but
Export cpumask_any_but() for module use. This will be used by drivers
such as intel_rapl to locate an active cpu on a socket.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 00:26:43 +01:00
Gavin Shan
b6c7347f2f powerpc/eeh: Remove duplicated check in eeh_dump_pe_log()
When eeh_dump_pe_log() is only called by eeh_slot_error_detail(),
we already have the check that the PE isn't in PCI config blocked
state in eeh_slot_error_detail(). So we needn't the duplicated
check in eeh_dump_pe_log().

This removes the duplicated check in eeh_dump_pe_log(). No logical
changes introduced.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 10:25:35 +11:00
Gavin Shan
eca036ee1b powerpc/eeh: Synchronize recovery in host/guest
When passing through SRIOV VFs to guest, we possibly encounter EEH
error on PF. In this case, the VF PEs are put into frozen state.
The error could be reported to guest before it's captured by the
host. That means the guest could attempt to recover errors on VFs
before host gets chance to recover errors on PFs. The VFs won't be
recovered successfully.

This enforces the recovery order for above case: the recovery on
child PE in guest is hold until the recovery on parent PE in host
is completed.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:28 +11:00
Gavin Shan
3fa7bf7229 powerpc/eeh: Don't remove passed VFs
When we have partial hotplug as part of the error recovery on PF,
the VFs that are bound with vfio-pci driver will experience hotplug.
That's not allowed.

This checks if the VF PE is passed or not. If it does, we leave
the VF without removing it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:27 +11:00
Gavin Shan
2311cca555 powerpc/eeh: Don't propagate error to guest
When EEH error happened to the parent PE of those PEs that have
been passed through to guest, the error is propagated to guest
domain and the VFIO driver's error handlers are called. It's not
correct as the error in the host domain shouldn't be propagated
to guests and affect them.

This adds one more limitation when calling EEH error handlers.
If the PE has been passed through to guest, the error handlers
won't be called.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:25 +11:00
Wei Yang
67086e32b5 powerpc/eeh: powerpc/eeh: Support error recovery for VF PE
PFs are enumerated on PCI bus, while VFs are created by PF's driver.

In EEH recovery, it has two cases:
1. Device and driver is EEH aware, error handlers are called.
2. Device and driver is not EEH aware, un-plug the device and plug it again
by enumerating it.

The special thing happens on the second case. For a PF, we could use the
original pci core to enumerate the bus, while for VF we need to record the
VFs which aer un-plugged then plug it again.

Also The patch caches the VF index in pci_dn, which can be used to
calculate VF's bus, device and function number. Those information helps to
locate the VF's PCI device instance when doing hotplug during EEH recovery
if necessary.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:23 +11:00
Wei Yang
0dc2830e0a powerpc/powernv: Support PCI config restore for VFs
After PE reset, OPAL API opal_pci_reinit() is called on all devices
contained in the PE to reinitialize them. While skiboot is not aware of
VFs, we have to implement the function in kernel to reinitialize VFs after
reset on PE for VFs.

In this patch, two functions pnv_pci_fixup_vf_mps() and
pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a
VF it has three cases.

1. Normal creation for a VF
   In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper
   value compared with its parent.
2. EEH recovery without VF removed
   In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is
   called to restore it and reinitialize other part.
3. EEH recovery with VF removed
   In this case, VF will be removed then re-created. Both functions are
   called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS
   to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper
   thing.

This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's
MPS to make sure it is equal to parent's and store this value in pci_dn
for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by
restoring MPS, disabling completion timeout, enabling SERR, etc.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:22 +11:00
Wei Yang
9312bc5bab powerpc/powernv: Support EEH reset for VF PE
PEs for VFs don't have primary bus. So they have to have their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained
in the PE.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:21 +11:00
Wei Yang
c29fa27d26 powerpc/eeh: Create PE for VFs
This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:19 +11:00
Wei Yang
39218cd00e powerpc/eeh: EEH device for VF
VFs and their corresponding pdn are created and released dynamically
when their PF's SRIOV capability is enabled and disabled. This creates
and releases EEH devices for VFs when creating and releasing their pdn
instances, which means EEH devices and pdn instances have same life
cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn).

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:18 +11:00
Wei Yang
51c0e87e9a powerpc/eeh: Cache normal BARs, not windows or IOV BARs
This restricts the EEH address cache to use only the first 7 BARs. This
makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs.
As the result of this change, eeh_addr_cache_get_dev() will return VFs from
VF's resource addresses instead of parent PFs.

This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev()
to 7 BARs and this effectively excludes PCI bridges from being cached.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:17 +11:00
Wei Yang
971427f582 powerpc/pci: Remove VFs prior to PF
As commit ac205b7bb7 ("PCI: make sriov work with hotplug remove")
indicates, VFs which is on the same PCI bus as their PF, should be
removed before the PF. Otherwise, we might run into kernel crash
at PCI unplugging time.

This applies the above pattern to powerpc PCI hotplug path.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 09:58:15 +11:00