Commit Graph

1284 Commits

Author SHA1 Message Date
Ilpo Järvinen
73d73556ed PCI: pciehp: Use RMW accessors for changing LNKCTL
[ Upstream commit 5f75f96c61 ]

As hotplug is not the only driver touching LNKCTL, use the RMW capability
accessor which handles concurrent changes correctly.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Fixes: 7f822999e1 ("PCI: pciehp: Add Disable/enable link functions")
Link: https://lore.kernel.org/r/20230717120503.15276-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-23 10:59:47 +02:00
Igor Mammedov
9e5fe282f9 PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus
commit cc22522fd5 upstream.

40613da52b ("PCI: acpiphp: Reassign resources on bridge if necessary")
changed acpiphp hotplug to use pci_assign_unassigned_bridge_resources()
which depends on bridge being available, however enable_slot() can be
called without bridge associated:

  1. Legitimate case of hotplug on root bus (widely used in virt world)

  2. A (misbehaving) firmware, that sends ACPI Bus Check notifications to
     non existing root ports (Dell Inspiron 7352/0W6WV0), which end up at
     enable_slot(..., bridge = 0) where bus has no bridge assigned to it.
     acpihp doesn't know that it's a bridge, and bus specific 'PCI
     subsystem' can't augment ACPI context with bridge information since
     the PCI device to get this data from is/was not available.

Issue is easy to reproduce with QEMU's 'pc' machine, which supports PCI
hotplug on hostbridge slots. To reproduce, boot kernel at commit
40613da52b in VM started with following CLI (assuming guest root fs is
installed on sda1 partition):

  # qemu-system-x86_64 -M pc -m 1G -enable-kvm -cpu host \
        -monitor stdio -serial file:serial.log           \
        -kernel arch/x86/boot/bzImage                    \
        -append "root=/dev/sda1 console=ttyS0"           \
        guest_disk.img

Once guest OS is fully booted at qemu prompt:

  (qemu) device_add e1000

(check serial.log) it will cause NULL pointer dereference at:

  void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
  {
    struct pci_bus *parent = bridge->subordinate;

  BUG: kernel NULL pointer dereference, address: 0000000000000018

   ? pci_assign_unassigned_bridge_resources+0x1f/0x260
   enable_slot+0x21f/0x3e0
   acpiphp_hotplug_notify+0x13d/0x260
   acpi_device_hotplug+0xbc/0x540
   acpi_hotplug_work_fn+0x15/0x20
   process_one_work+0x1f7/0x370
   worker_thread+0x45/0x3b0

The issue was discovered on Dell Inspiron 7352/0W6WV0 laptop with following
sequence:

  1. Suspend to RAM
  2. Wake up with the same backtrace being observed:
  3. 2nd suspend to RAM attempt makes laptop freeze

Fix it by using __pci_bus_assign_resources() instead of
pci_assign_unassigned_bridge_resources() as we used to do, but only in case
when bus doesn't have a bridge associated (to cover for the case of ACPI
event on hostbridge or non existing root port).

That lets us keep hotplug on root bus working like it used to and at the
same time keeps resource reassignment usable on root ports (and other 1st
level bridges) that was fixed by 40613da52b.

Fixes: 40613da52b ("PCI: acpiphp: Reassign resources on bridge if necessary")
Link: https://lore.kernel.org/r/20230726123518.2361181-2-imammedo@redhat.com
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Tested-by: Michal Koutný <mkoutny@suse.com>
Link: https://lore.kernel.org/r/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30 16:27:26 +02:00
Igor Mammedov
7abd6dce29 PCI: acpiphp: Reassign resources on bridge if necessary
[ Upstream commit 40613da52b ]

When using ACPI PCI hotplug, hotplugging a device with large BARs may fail
if bridge windows programmed by firmware are not large enough.

Reproducer:
  $ qemu-kvm -monitor stdio -M q35  -m 4G \
      -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on \
      -device id=rp1,pcie-root-port,bus=pcie.0,chassis=4 \
      disk_image

 wait till linux guest boots, then hotplug device:
   (qemu) device_add qxl,bus=rp1

 hotplug on guest side fails with:
   pci 0000:01:00.0: [1b36:0100] type 00 class 0x038000
   pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff]
   pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x03ffffff]
   pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00001fff]
   pci 0000:01:00.0: reg 0x1c: [io  0x0000-0x001f]
   pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000]
   pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000]
   pci 0000:01:00.0: BAR 1: no space for [mem size 0x04000000]
   pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x04000000]
   pci 0000:01:00.0: BAR 2: assigned [mem 0xfe800000-0xfe801fff]
   pci 0000:01:00.0: BAR 3: assigned [io  0x1000-0x101f]
   qxl 0000:01:00.0: enabling device (0000 -> 0003)
   Unable to create vram_mapping
   qxl: probe of 0000:01:00.0 failed with error -12

However when using native PCIe hotplug
  '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'
it works fine, since kernel attempts to reassign unused resources.

Use the same machinery as native PCIe hotplug to (re)assign resources.

Link: https://lore.kernel.org/r/20230424191557.2464760-1-imammedo@redhat.com
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-30 16:27:20 +02:00
Rongguang Wei
331dce61c0 PCI: pciehp: Cancel bringup sequence if card is not present
[ Upstream commit e8afd0d9fc ]

If a PCIe hotplug slot has an Attention Button, the normal hot-add flow is:

  - Slot is empty and slot power is off
  - User inserts card in slot and presses Attention Button
  - OS blinks Power Indicator for 5 seconds
  - After 5 seconds, OS turns on Power Indicator, turns on slot power, and
    enumerates the device

Previously, if a user pressed the Attention Button on an *empty* slot,
pciehp logged the following messages and blinked the Power Indicator
until a second button press:

  [0.000] pciehp: Button press: will power on in 5 sec
  [0.001] # Power Indicator starts blinking
  [5.001] # 5 second timeout; slot is empty, so we should cancel the
            request to power on and turn off Power Indicator

  [7.000] # Power Indicator still blinking
  [8.000] # possible card insertion
  [9.000] pciehp: Button press: canceling request to power on

The first button press incorrectly left the slot in BLINKINGON_STATE, so
the second was interpreted as a "cancel power on" event regardless of
whether a card was present.

If the slot is empty, turn off the Power Indicator and return from
BLINKINGON_STATE to OFF_STATE after 5 seconds, effectively canceling the
request to power on.  Putting the slot in OFF_STATE also means the second
button press will correctly request a slot power on if the slot is
occupied.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230512021518.336460-1-clementwei90@163.com
Fixes: d331710ea7 ("PCI: pciehp: Become resilient to missed events")
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27 08:37:16 +02:00
Lukas Wunner
c2c3ffc798 PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock
commit f5eff5591b upstream.

In 2013, commits

  2e35afaefe ("PCI: pciehp: Add reset_slot() method")
  608c388122 ("PCI: Add slot reset option to pci_dev_reset()")

amended PCIe hotplug to mask Presence Detect Changed events during a
Secondary Bus Reset.  The reset thus no longer causes gratuitous slot
bringdown and bringup.

However the commits neglected to serialize reset with code paths reading
slot registers.  For instance, a slot bringup due to an earlier hotplug
event may see the Presence Detect State bit cleared during a concurrent
Secondary Bus Reset.

In 2018, commit

  5b3f7b7d06 ("PCI: pciehp: Avoid slot access during reset")

retrofitted the missing locking.  It introduced a reset_lock which
serializes a Secondary Bus Reset with other parts of pciehp.

Unfortunately the locking turns out to be overzealous:  reset_lock is
held for the entire enumeration and de-enumeration of hotplugged devices,
including driver binding and unbinding.

Driver binding and unbinding acquires device_lock while the reset_lock
of the ancestral hotplug port is held.  A concurrent Secondary Bus Reset
acquires the ancestral reset_lock while already holding the device_lock.
The asymmetric locking order in the two code paths can lead to AB-BA
deadlocks.

Michael Haeuptle reports such deadlocks on simultaneous hot-removal and
vfio release (the latter implies a Secondary Bus Reset):

  pciehp_ist()                                    # down_read(reset_lock)
    pciehp_handle_presence_or_link_change()
      pciehp_disable_slot()
        __pciehp_disable_slot()
          remove_board()
            pciehp_unconfigure_device()
              pci_stop_and_remove_bus_device()
                pci_stop_bus_device()
                  pci_stop_dev()
                    device_release_driver()
                      device_release_driver_internal()
                        __device_driver_lock()    # device_lock()

  SYS_munmap()
    vfio_device_fops_release()
      vfio_device_group_close()
        vfio_device_close()
          vfio_device_last_close()
            vfio_pci_core_close_device()
              vfio_pci_core_disable()             # device_lock()
                __pci_reset_function_locked()
                  pci_reset_bus_function()
                    pci_dev_reset_slot_function()
                      pci_reset_hotplug_slot()
                        pciehp_reset_slot()       # down_write(reset_lock)

Ian May reports the same deadlock on simultaneous hot-removal and an
AER-induced Secondary Bus Reset:

  aer_recover_work_func()
    pcie_do_recovery()
      aer_root_reset()
        pci_bus_error_reset()
          pci_slot_reset()
            pci_slot_lock()                       # device_lock()
            pci_reset_hotplug_slot()
              pciehp_reset_slot()                 # down_write(reset_lock)

Fix by releasing the reset_lock during driver binding and unbinding,
thereby splitting and shrinking the critical section.

Driver binding and unbinding is protected by the device_lock() and thus
serialized with a Secondary Bus Reset.  There's no need to additionally
protect it with the reset_lock.  However, pciehp does not bind and
unbind devices directly, but rather invokes PCI core functions which
also perform certain enumeration and de-enumeration steps.

The reset_lock's purpose is to protect slot registers, not enumeration
and de-enumeration of hotplugged devices.  That would arguably be the
job of the PCI core, not the PCIe hotplug driver.  After all, an
AER-induced Secondary Bus Reset may as well happen during boot-time
enumeration of the PCI hierarchy and there's no locking to prevent that
either.

Exempting *de-enumeration* from the reset_lock is relatively harmless:
A concurrent Secondary Bus Reset may foil config space accesses such as
PME interrupt disablement.  But if the device is physically gone, those
accesses are pointless anyway.  If the device is physically present and
only logically removed through an Attention Button press or the sysfs
"power" attribute, PME interrupts as well as DMA cannot come through
because pciehp_unconfigure_device() disables INTx and Bus Master bits.
That's still protected by the reset_lock in the present commit.

Exempting *enumeration* from the reset_lock also has limited impact:
The exempted call to pci_bus_add_device() may perform device accesses
through pcibios_bus_add_device() and pci_fixup_device() which are now
no longer protected from a concurrent Secondary Bus Reset.  Otherwise
there should be no impact.

In essence, the present commit seeks to fix the AB-BA deadlocks while
still retaining a best-effort reset protection for enumeration and
de-enumeration of hotplugged devices -- until a general solution is
implemented in the PCI core.

Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM
Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical.com
Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.com
Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huawei.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590
Fixes: 5b3f7b7d06 ("PCI: pciehp: Avoid slot access during reset")
Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.de
Reported-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reported-by: Ian May <ian.may@canonical.com>
Reported-by: Andrey Grodzovsky <andrey2805@gmail.com>
Reported-by: Rahul Kumar <rahul.kumar1@amd.com>
Reported-by: Jialin Zhang <zhangjialin11@huawei.com>
Tested-by: Anatoli Antonovitch <Anatoli.Antonovitch@amd.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.19+
Cc: Dan Stein <dstein@hpe.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Alex Michon <amichon@kalrayinc.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17 11:36:04 +02:00
Hans de Goede
6a24285f64 PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
commit 085a9f4343 upstream.

Use down_read_nested() and down_write_nested() when taking the
ctrl->reset_lock rw-sem, passing the number of PCIe hotplug controllers in
the path to the PCI root bus as lock subclass parameter.

This fixes the following false-positive lockdep report when unplugging a
Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:

  pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
  pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
  ============================================
  WARNING: possible recursive locking detected
  5.16.0-rc2+ #621 Not tainted
  --------------------------------------------
  irq/124-pciehp/86 is trying to acquire lock:
  ffff8e5ac4299ef8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80

  but task is already holding lock:
  ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180

   other info that might help us debug this:
   Possible unsafe locking scenario:

	 CPU0
	 ----
    lock(&ctrl->reset_lock);
    lock(&ctrl->reset_lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  3 locks held by irq/124-pciehp/86:
   #0: ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
   #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
   #2: ffff8e5ac1ee2248 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40

  stack backtrace:
  CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
  Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
  Call Trace:
   <TASK>
   dump_stack_lvl+0x59/0x73
   __lock_acquire.cold+0xc5/0x2c6
   lock_acquire+0xb5/0x2b0
   down_read+0x3e/0x50
   pciehp_check_presence+0x23/0x80
   pciehp_runtime_resume+0x5c/0xa0
   device_for_each_child+0x45/0x70
   pcie_port_device_runtime_resume+0x20/0x30
   pci_pm_runtime_resume+0xa7/0xc0
   __rpm_callback+0x41/0x110
   rpm_callback+0x59/0x70
   rpm_resume+0x512/0x7b0
   __pm_runtime_resume+0x4a/0x90
   __device_release_driver+0x28/0x240
   device_release_driver+0x26/0x40
   pci_stop_bus_device+0x68/0x90
   pci_stop_bus_device+0x2c/0x90
   pci_stop_and_remove_bus_device+0xe/0x20
   pciehp_unconfigure_device+0x6c/0x110
   pciehp_disable_slot+0x5b/0xe0
   pciehp_handle_presence_or_link_change+0xc3/0x2f0
   pciehp_ist+0x179/0x180

This lockdep warning is triggered because with Thunderbolt, hotplug ports
are nested. When removing multiple devices in a daisy-chain, each hotplug
port's reset_lock may be acquired recursively. It's never the same lock, so
the lockdep splat is a false positive.

Because locks at the same hierarchy level are never acquired recursively, a
per-level lockdep class is sufficient to fix the lockdep warning.

The choice to use one lockdep subclass per pcie-hotplug controller in the
path to the root-bus was made to conserve class keys because their number
is limited and the complexity grows quadratically with number of keys
according to Documentation/locking/lockdep-design.rst.

Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
Link: https://lore.kernel.org/r/20211217141709.379663-1-hdegoede@redhat.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208855
Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org
[lukas: backport to v5.4-stable]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17 11:36:04 +02:00
Manivannan Sadhasivam
6e2dff272c PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
[ Upstream commit 9f72d4757c ]

The Qualcomm PCI bridge device (Device ID 0x0110) found in chipsets such as
SM8450 does not set the Command Completed bit unless writes to the Slot
Command register change "Control" bits.

This results in timeouts like below:

  pcieport 0001:00:00.0: pciehp: Timeout on hotplug command 0x03c0 (issued 2020 msec ago)

Add the device to the Command Completed quirk to mark commands "completed"
immediately unless they change the "Control" bits.

Link: https://lore.kernel.org/r/20220210145003.135907-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15 14:18:33 +02:00
Liguang Zhang
accf175d0c PCI: pciehp: Clear cmd_busy bit in polling mode
commit 92912b1751 upstream.

Writes to a Downstream Port's Slot Control register are PCIe hotplug
"commands."  If the Port supports Command Completed events, software must
wait for a command to complete before writing to Slot Control again.

pcie_do_write_cmd() sets ctrl->cmd_busy when it writes to Slot Control.  If
software notification is enabled, i.e., PCI_EXP_SLTCTL_HPIE and
PCI_EXP_SLTCTL_CCIE are set, ctrl->cmd_busy is cleared by pciehp_isr().

But when software notification is disabled, as it is when pcie_init()
powers off an empty slot, pcie_wait_cmd() uses pcie_poll_cmd() to poll for
command completion, and it neglects to clear ctrl->cmd_busy, which leads to
spurious timeouts:

  pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x01c0 (issued 2264 msec ago)
  pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x05c0 (issued 2288 msec ago)

Clear ctrl->cmd_busy in pcie_poll_cmd() when it detects a Command Completed
event (PCI_EXP_SLTSTA_CC).

[bhelgaas: commit log]
Fixes: a5dd4b4b05 ("PCI: pciehp: Wait for hotplug command completion where necessary")
Link: https://lore.kernel.org/r/20211111054258.7309-1-zhangliguang@linux.alibaba.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215143
Link: https://lore.kernel.org/r/20211126173309.GA12255@wunner.de
Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org	# v4.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15 14:18:04 +02:00
Lukas Wunner
464da38ba8 PCI: pciehp: Fix infinite loop in IRQ handler upon power fault
commit 23584c1ed3 upstream.

The Power Fault Detected bit in the Slot Status register differs from
all other hotplug events in that it is sticky:  It can only be cleared
after turning off slot power.  Per PCIe r5.0, sec. 6.7.1.8:

  If a power controller detects a main power fault on the hot-plug slot,
  it must automatically set its internal main power fault latch [...].
  The main power fault latch is cleared when software turns off power to
  the hot-plug slot.

The stickiness used to cause interrupt storms and infinite loops which
were fixed in 2009 by commits 5651c48cfa ("PCI pciehp: fix power fault
interrupt storm problem") and 99f0169c17 ("PCI: pciehp: enable
software notification on empty slots").

Unfortunately in 2020 the infinite loop issue was inadvertently
reintroduced by commit 8edf5332c3 ("PCI: pciehp: Fix MSI interrupt
race"):  The hardirq handler pciehp_isr() clears the PFD bit until
pciehp's power_fault_detected flag is set.  That happens in the IRQ
thread pciehp_ist(), which never learns of the event because the hardirq
handler is stuck in an infinite loop.  Fix by setting the
power_fault_detected flag already in the hardirq handler.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=214989
Link: https://lore.kernel.org/linux-pci/DM8PR11MB5702255A6A92F735D90A4446868B9@DM8PR11MB5702.namprd11.prod.outlook.com
Fixes: 8edf5332c3 ("PCI: pciehp: Fix MSI interrupt race")
Link: https://lore.kernel.org/r/66eaeef31d4997ceea357ad93259f290ededecfd.1637187226.git.lukas@wunner.de
Reported-by: Joseph Bao <joseph.bao@intel.com>
Tested-by: Joseph Bao <joseph.bao@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.19+
Cc: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-05 12:35:36 +01:00
Vishal Aslot
f336aa92b4 PCI: ibmphp: Fix double unmap of io_mem
[ Upstream commit faa2e05ad0 ]

ebda_rsrc_controller() calls iounmap(io_mem) on the error path. Its caller,
ibmphp_access_ebda(), also calls iounmap(io_mem) on good and error paths.

Remove the iounmap(io_mem) invocation from ebda_rsrc_controller().

[bhelgaas: remove item from TODO]
Link: https://lore.kernel.org/r/20210818165751.591185-1-os.vaslot@gmail.com
Signed-off-by: Vishal Aslot <os.vaslot@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-22 12:26:44 +02:00
Feilong Lin
bbd7ba95bb ACPI / hotplug / PCI: Fix reference count leak in enable_slot()
[ Upstream commit 3bbfd31903 ]

In enable_slot(), if pci_get_slot() returns NULL, we clear the SLOT_ENABLED
flag. When pci_get_slot() finds a device, it increments the device's
reference count.  In this case, we did not call pci_dev_put() to decrement
the reference count, so the memory of the device (struct pci_dev type) will
eventually leak.

Call pci_dev_put() to decrement its reference count when pci_get_slot()
returns a PCI device.

Link: https://lore.kernel.org/r/b411af88-5049-a1c6-83ac-d104a1f429be@huawei.com
Signed-off-by: Feilong Lin <linfeilong@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-22 11:38:28 +02:00
Tyrel Datwyler
51a2b19b55 PCI: rpadlpar: Fix potential drc_name corruption in store functions
commit cc7a0bb058 upstream.

Both add_slot_store() and remove_slot_store() try to fix up the
drc_name copied from the store buffer by placing a NUL terminator at
nbyte + 1 or in place of a '\n' if present. However, the static buffer
that we copy the drc_name data into is not zeroed and can contain
anything past the n-th byte.

This is problematic if a '\n' byte appears in that buffer after nbytes
and the string copied into the store buffer was not NUL terminated to
start with as the strchr() search for a '\n' byte will mark this
incorrectly as the end of the drc_name string resulting in a drc_name
string that contains garbage data after the n-th byte.

Additionally it will cause us to overwrite that '\n' byte on the stack
with NUL, potentially corrupting data on the stack.

The following debugging shows an example of the drmgr utility writing
"PHB 4543" to the add_slot sysfs attribute, but add_slot_store()
logging a corrupted string value.

  drmgr: drmgr: -c phb -a -s PHB 4543 -d 1
  add_slot_store: drc_name = PHB 4543°|<82>!, rc = -19

Fix this by using strscpy() instead of memcpy() to ensure the string
is NUL terminated when copied into the static drc_name buffer.
Further, since the string is now NUL terminated the code only needs to
change '\n' to '\0' when present.

Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mpe: Reformat change log and add mention of possible stack corruption]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210315214821.452959-1-tyreld@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 11:26:43 +01:00
Stuart Hayes
4667358dab PCI: pciehp: Fix MSI interrupt race
[ Upstream commit 8edf5332c3 ]

Without this commit, a PCIe hotplug port can stop generating interrupts on
hotplug events, so device adds and removals will not be seen:

The pciehp interrupt handler pciehp_isr() reads the Slot Status register
and then writes back to it to clear the bits that caused the interrupt.  If
a different interrupt event bit gets set between the read and the write,
pciehp_isr() returns without having cleared all of the interrupt event
bits.  If this happens when the MSI isn't masked (which by default it isn't
in handle_edge_irq(), and which it will never be when MSI per-vector
masking is not supported), we won't get any more hotplug interrupts from
that device.

That is expected behavior, according to the PCIe Base Spec r5.0, section
6.7.3.4, "Software Notification of Hot-Plug Events".

Because the Presence Detect Changed and Data Link Layer State Changed event
bits can both get set at nearly the same time when a device is added or
removed, this is more likely to happen than it might seem.  The issue was
found (and can be reproduced rather easily) by connecting and disconnecting
an NVMe storage device on at least one system model where the NVMe devices
were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).

Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
Status register immediately after writing to it, until it sees that all of
the event status bits have been cleared.

[lukas: drop loop count limitation, write "events" instead of "status",
don't loop back in INTx and poll modes, tweak code comment & commit msg]
Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de
Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:17:52 +02:00
Rafael J. Wysocki
ae86233204 PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()
commit dae68d7fd4 upstream.

If context is not NULL in acpiphp_grab_context(), but the
is_going_away flag is set for the device's parent, the reference
counter of the context needs to be decremented before returning
NULL or the context will never be freed, so make that happen.

Fixes: edf5bf34d4 ("ACPI / dock: Use callback pointers from devices' ACPI hotplug contexts")
Reported-by: Vasily Averin <vvs@virtuozzo.com>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-21 13:05:20 +02:00
Mika Westerberg
01fad934f1 PCI: pciehp: Prevent deadlock on disconnect
[ Upstream commit 87d0f2a553 ]

This addresses deadlocks in these common cases in hierarchies containing
two switches:

  - All involved ports are runtime suspended and they are unplugged. This
    can happen easily if the drivers involved automatically enable runtime
    PM (xHCI for example does that).

  - System is suspended (e.g., closing the lid on a laptop) with a dock +
    something else connected, and the dock is unplugged while suspended.

These cases lead to the following deadlock:

  INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
  irq/126-pciehp  D    0   198      2 0x80000000
  Call Trace:
   schedule+0x2c/0x80
   schedule_timeout+0x246/0x350
   wait_for_completion+0xb7/0x140
   kthread_stop+0x49/0x110
   free_irq+0x32/0x70
   pcie_shutdown_notification+0x2f/0x50
   pciehp_remove+0x27/0x50
   pcie_port_remove_service+0x36/0x50
   device_release_driver+0x12/0x20
   bus_remove_device+0xec/0x160
   device_del+0x13b/0x350
   device_unregister+0x1a/0x60
   remove_iter+0x1e/0x30
   device_for_each_child+0x56/0x90
   pcie_port_device_remove+0x22/0x40
   pcie_portdrv_remove+0x20/0x60
   pci_device_remove+0x3e/0xc0
   device_release_driver_internal+0x18c/0x250
   device_release_driver+0x12/0x20
   pci_stop_bus_device+0x6f/0x90
   pci_stop_bus_device+0x31/0x90
   pci_stop_and_remove_bus_device+0x12/0x20
   pciehp_unconfigure_device+0x88/0x140
   pciehp_disable_slot+0x6a/0x110
   pciehp_handle_presence_or_link_change+0x263/0x400
   pciehp_ist+0x1c9/0x1d0
   irq_thread_fn+0x24/0x60
   irq_thread+0xeb/0x190
   kthread+0x120/0x140

  INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
  irq/190-pciehp  D    0  2288      2 0x80000000
  Call Trace:
   __schedule+0x2a2/0x880
   schedule+0x2c/0x80
   schedule_preempt_disabled+0xe/0x10
   mutex_lock+0x2c/0x30
   pci_lock_rescan_remove+0x15/0x20
   pciehp_unconfigure_device+0x4d/0x140
   pciehp_disable_slot+0x6a/0x110
   pciehp_handle_presence_or_link_change+0x263/0x400
   pciehp_ist+0x1c9/0x1d0
   irq_thread_fn+0x24/0x60
   irq_thread+0xeb/0x190
   kthread+0x120/0x140

What happens here is that the whole hierarchy is runtime resumed and the
parent PCIe downstream port, which got the hot-remove event, starts
removing devices below it, taking pci_lock_rescan_remove() lock. When the
child PCIe port is runtime resumed it calls pciehp_check_presence() which
ends up calling pciehp_card_present() and pciehp_check_link_active().  Both
of these use pcie_capability_read_word(), which notices that the underlying
device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the
capability value set to 0. When pciehp gets this value it thinks that its
child device is also hot-removed and schedules its IRQ thread to handle the
event.

The deadlock happens when the child's IRQ thread runs and tries to acquire
pci_lock_rescan_remove() which is already taken by the parent and the
parent waits for the child's IRQ thread to finish.

Prevent this from happening by checking the return value of
pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
performing any hot-removal activities.

[bhelgaas: add common scenarios to commit log]
Link: https://lore.kernel.org/r/20191029170022.57528-2-mika.westerberg@linux.intel.com
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-29 16:33:04 +02:00
Lukas Wunner
463181e64f PCI: pciehp: Fix indefinite wait on sysfs requests
commit 3e487d2e4a upstream.

David Hoyer reports that powering pciehp slots up or down via sysfs may
hang:  The call to wait_event() in pciehp_sysfs_enable_slot() and
_disable_slot() does not return because ctrl->ist_running remains true.

This flag, which was introduced by commit 157c1062fc ("PCI: pciehp: Avoid
returning prematurely from sysfs requests"), signifies that the IRQ thread
pciehp_ist() is running.  It is set to true at the top of pciehp_ist() and
reset to false at the end.  However there are two additional return
statements in pciehp_ist() before which the commit neglected to reset the
flag to false and wake up waiters for the flag.

That omission opens up the following race when powering up the slot:

* pciehp_ist() runs because a PCI_EXP_SLTSTA_PDC event was requested
  by pciehp_sysfs_enable_slot()

* pciehp_ist() turns on slot power via the following call stack:
  pciehp_handle_presence_or_link_change() -> pciehp_enable_slot() ->
  __pciehp_enable_slot() -> board_added() -> pciehp_power_on_slot()

* after slot power is turned on, the link comes up, resulting in a
  PCI_EXP_SLTSTA_DLLSC event

* the IRQ handler pciehp_isr() stores the event in ctrl->pending_events
  and returns IRQ_WAKE_THREAD

* the IRQ thread is already woken (it's bringing up the slot), but the
  genirq code remembers to re-run the IRQ thread after it has finished
  (such that it can deal with the new event) by setting IRQTF_RUNTHREAD
  via __handle_irq_event_percpu() -> __irq_wake_thread()

* the IRQ thread removes PCI_EXP_SLTSTA_DLLSC from ctrl->pending_events
  via board_added() -> pciehp_check_link_status() in order to deal with
  presence and link flaps per commit 6c35a1ac3d ("PCI: pciehp:
  Tolerate initially unstable link")

* after pciehp_ist() has successfully brought up the slot, it resets
  ctrl->ist_running to false and wakes up the sysfs requester

* the genirq code re-runs pciehp_ist(), which sets ctrl->ist_running
  to true but then returns with IRQ_NONE because ctrl->pending_events
  is empty

* pciehp_sysfs_enable_slot() is finally woken but notices that
  ctrl->ist_running is true, hence continues waiting

The only way to get the hung task going again is to trigger a hotplug
event which brings down the slot, e.g. by yanking out the card.

The same race exists when powering down the slot because remove_board()
likewise clears link or presence changes in ctrl->pending_events per commit
3943af9d01 ("PCI: pciehp: Ignore Link State Changes after powering off a
slot") and thereby may cause a re-run of pciehp_ist() which returns with
IRQ_NONE without resetting ctrl->ist_running to false.

Fix by adding a goto label before the teardown steps at the end of
pciehp_ist() and jumping to that label from the two return statements which
currently neglect to reset the ctrl->ist_running flag.

Fixes: 157c1062fc ("PCI: pciehp: Avoid returning prematurely from sysfs requests")
Link: https://lore.kernel.org/r/cca1effa488065cb055120aa01b65719094bdcb5.1584530321.git.lukas@wunner.de
Reported-by: David Hoyer <David.Hoyer@netapp.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org	# v4.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17 10:50:10 +02:00
Mika Westerberg
ee30108f7a PCI: pciehp: Do not disable interrupt twice on suspend
commit 75fcc0ce72 upstream.

We try to keep PCIe hotplug ports runtime suspended when entering system
suspend. Because the PCIe portdrv sets the DPM_FLAG_NEVER_SKIP flag, the PM
core always calls system suspend/resume hooks even if the device is left
runtime suspended. Since PCIe hotplug driver re-used the same function for
both runtime suspend and system suspend, it ended up disabling hotplug
interrupt twice and the second time following was printed:

  pciehp 0000:03:01.0:pcie204: pcie_do_write_cmd: no response from device

Prevent this from happening by checking whether the device is already
runtime suspended when the system suspend hook is called.

Fixes: 9c62f0bfb8 ("PCI: pciehp: Implement runtime PM callbacks")
Link: https://lore.kernel.org/r/20191029170022.57528-1-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:51 +01:00
Tyrel Datwyler
03c90248c5 PCI: rpaphp: Correctly match ibm, my-drc-index to drc-name when using drc-info
[ Upstream commit 4f9f2d3d7a ]

The newer ibm,drc-info property is a condensed description of the old
ibm,drc-* properties (ie. names, types, indexes, and power-domains).
When matching a drc-index to a drc-name we need to verify that the
index is within the start and last drc-index range and map it to a
drc-name using the drc-name-prefix and logical index.

Fix the mapping by checking that the index is within the range of the
current drc-info entry, and build the name from the drc-name-prefix
concatenated with the starting drc-name-suffix value and the sequential
index obtained by subtracting ibm,my-drc-index from this entries
drc-start-index.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1573449697-5448-10-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:18:00 +01:00
Tyrel Datwyler
1b49ded444 PCI: rpaphp: Annotate and correctly byte swap DRC properties
[ Upstream commit 0737686778 ]

The device tree is in big endian format and any properties directly
retrieved using OF helpers that don't explicitly byte swap should
be annotated. In particular there are several places where we grab
the opaque property value for the old ibm,drc-* properties and the
ibm,my-drc-index property.

Fix this for better static checking by annotating values we know to
explicitly big endian, and byte swap where appropriate.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1573449697-5448-9-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:57 +01:00
Tyrel Datwyler
7aae44bbc9 PCI: rpaphp: Don't rely on firmware feature to imply drc-info support
[ Upstream commit 52e2b0f165 ]

In the event that the partition is migrated to a platform with older
firmware that doesn't support the ibm,drc-info property the device
tree is modified to remove the ibm,drc-info property and replace it
with the older style ibm,drc-* properties for types, names, indexes,
and power-domains. One of the requirements of the drc-info firmware
feature is that the client is able to handle both the new property,
and old style properties at runtime. Therefore we can't rely on the
firmware feature alone to dictate which property is currently
present in the device tree.

Fix this short coming by checking explicitly for the ibm,drc-info
property, and falling back to the older ibm,drc-* properties if it
doesn't exist.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1573449697-5448-6-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:55 +01:00
Tyrel Datwyler
b8361f9a86 PCI: rpaphp: Fix up pointer to first drc-info entry
[ Upstream commit 9723c25f99 ]

The first entry of the ibm,drc-info property is an int encoded count
of the number of drc-info entries that follow. The "value" pointer
returned by of_prop_next_u32() is still pointing at the this value
when we call of_read_drc_info_cell(), but the helper function
expects that value to be pointing at the first element of an entry.

Fix up by incrementing the "value" pointer to point at the first
element of the first drc-info entry prior.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1573449697-5448-5-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:44 +01:00
Lukas Wunner
9bd9d12339 PCI: pciehp: Avoid returning prematurely from sysfs requests
commit 157c1062fc upstream.

A sysfs request to enable or disable a PCIe hotplug slot should not
return before it has been carried out.  That is sought to be achieved by
waiting until the controller's "pending_events" have been cleared.

However the IRQ thread pciehp_ist() clears the "pending_events" before
it acts on them.  If pciehp_sysfs_enable_slot() / _disable_slot() happen
to check the "pending_events" after they have been cleared but while
pciehp_ist() is still running, the functions may return prematurely
with an incorrect return value.

Fix by introducing an "ist_running" flag which must be false before a sysfs
request is allowed to return.

Fixes: 32a8cef274 ("PCI: pciehp: Enable/disable exclusively from IRQ thread")
Link: https://lore.kernel.org/linux-pci/1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com
Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de
Reported-and-tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21 11:04:27 +01:00
Mika Westerberg
ebbc1380a3 ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
commit 77adf93553 upstream.

Valerio and others reported that commit 84c8b58ed3 ("ACPI / hotplug /
PCI: Don't scan bridges managed by native hotplug") prevents some recent
LG and HP laptops from booting with endless loop of:

  ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
  ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
  ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
  ...

What seems to happen is that during boot, after the initial PCI enumeration
when EC is enabled the platform triggers ACPI Notify() to one of the root
ports. The root port itself looks like this:

  pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
  pci 0000:00:1b.0:   bridge window [mem 0xc4000000-0xda0fffff]
  pci 0000:00:1b.0:   bridge window [mem 0x80000000-0xa1ffffff 64bit pref]

The BIOS has configured the root port so that it does not have I/O bridge
window.

Now when the ACPI Notify() is triggered ACPI hotplug handler calls
acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system is
using native PCIe hotplug) and pci_assign_unassigned_bridge_resources() to
allocate resources.

The device connected to the root port is a PCIe switch (Thunderbolt
controller) with two hotplug downstream ports. Because of the hotplug ports
__pci_bus_size_bridges() tries to add "additional I/O" of 256 bytes to each
(DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k as that's the
minimum I/O window size so each hotplug port gets 4k I/O window and the
same happens for the root port (which is also hotplug port). This means
3 * 4k = 12k I/O window.

Because of this pci_assign_unassigned_bridge_resources() ends up opening a
I/O bridge window for the root port at first available I/O address which
seems to be in range 0x1000 - 0x3fff. Normally this range is used for ACPI
stuff such as GPE bits (below is part of /proc/ioports):

    1800-1803 : ACPI PM1a_EVT_BLK
    1804-1805 : ACPI PM1a_CNT_BLK
    1808-180b : ACPI PM_TMR
    1810-1815 : ACPI CPU throttle
    1850-1850 : ACPI PM2_CNT_BLK
    1854-1857 : pnp 00:05
    1860-187f : ACPI GPE0_BLK

However, when the ACPI Notify() happened this range was not yet reserved
for ACPI/PNP (that happens later) so PCI gets it. It then starts writing to
this range and accidentally stomps over GPE bits among other things causing
the endless stream of messages about missing GPE handler.

This problem does not happen if "pci=hpiosize=0" is passed in the kernel
command line. The reason is that then the kernel does not try to allocate
the additional 256 bytes for each hotplug port.

Fix this by allocating resources directly below the non-hotplug bridges
where a new device may appear as a result of ACPI Notify(). This avoids the
hotplug bridges and prevents opening the additional I/O window.

Fixes: 84c8b58ed3 ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203617
Link: https://lore.kernel.org/r/20191030150545.19885-1-mika.westerberg@linux.intel.com
Reported-by: Valerio Passini <passini.valerio@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 19:56:32 +01:00
Linus Torvalds
299d14d4c3 Merge tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate _HPP/_HPX stuff in pci-acpi.c and simplify it
     (Krzysztof Wilczynski)

   - Fix incorrect PCIe device types and remove dev->has_secondary_link
     to simplify code that deals with upstream/downstream ports (Mika
     Westerberg)

   - After suspend, restore Resizable BAR size bits correctly for 1MB
     BARs (Sumit Saxena)

   - Enable PCI_MSI_IRQ_DOMAIN support for RISC-V (Wesley Terpstra)

  Virtualization:

   - Add ACS quirks for iProc PAXB (Abhinav Ratna), Amazon Annapurna
     Labs (Ali Saidi)

   - Move sysfs SR-IOV functions to iov.c (Kelsey Skunberg)

   - Remove group write permissions from sysfs sriov_numvfs,
     sriov_drivers_autoprobe (Kelsey Skunberg)

  Hotplug:

   - Simplify pciehp indicator control (Denis Efremov)

  Peer-to-peer DMA:

   - Allow P2P DMA between root ports for whitelisted bridges (Logan
     Gunthorpe)

   - Whitelist some Intel host bridges for P2P DMA (Logan Gunthorpe)

   - DMA map P2P DMA requests that traverse host bridge (Logan
     Gunthorpe)

  Amazon Annapurna Labs host bridge driver:

   - Add DT binding and controller driver (Jonathan Chocron)

  Hyper-V host bridge driver:

   - Fix hv_pci_dev->pci_slot use-after-free (Dexuan Cui)

   - Fix PCI domain number collisions (Haiyang Zhang)

   - Use instance ID bytes 4 & 5 as PCI domain numbers (Haiyang Zhang)

   - Fix build errors on non-SYSFS config (Randy Dunlap)

  i.MX6 host bridge driver:

   - Limit DBI register length (Stefan Agner)

  Intel VMD host bridge driver:

   - Fix config addressing issues (Jon Derrick)

  Layerscape host bridge driver:

   - Add bar_fixed_64bit property to endpoint driver (Xiaowei Bao)

   - Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC drivers separately
     (Xiaowei Bao)

  Mediatek host bridge driver:

   - Add MT7629 controller support (Jianjun Wang)

  Mobiveil host bridge driver:

   - Fix CPU base address setup (Hou Zhiqiang)

   - Make "num-lanes" property optional (Hou Zhiqiang)

  Tegra host bridge driver:

   - Fix OF node reference leak (Nishka Dasgupta)

   - Disable MSI for root ports to work around design problem (Vidya
     Sagar)

   - Add Tegra194 DT binding and controller support (Vidya Sagar)

   - Add support for sideband pins and slot regulators (Vidya Sagar)

   - Add PIPE2UPHY support (Vidya Sagar)

  Misc:

   - Remove unused pci_block_cfg_access() et al (Kelsey Skunberg)

   - Unexport pci_bus_get(), etc (Kelsey Skunberg)

   - Hide PM, VC, link speed, ATS, ECRC, PTM constants and interfaces in
     the PCI core (Kelsey Skunberg)

   - Clean up sysfs DEVICE_ATTR() usage (Kelsey Skunberg)

   - Mark expected switch fall-through (Gustavo A. R. Silva)

   - Propagate errors for optional regulators and PHYs (Thierry Reding)

   - Fix kernel command line resource_alignment parameter issues (Logan
     Gunthorpe)"

* tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (112 commits)
  PCI: Add pci_irq_vector() and other stubs when !CONFIG_PCI
  arm64: tegra: Add PCIe slot supply information in p2972-0000 platform
  arm64: tegra: Add configuration for PCIe C5 sideband signals
  PCI: tegra: Add support to enable slot regulators
  PCI: tegra: Add support to configure sideband pins
  PCI: vmd: Fix shadow offsets to reflect spec changes
  PCI: vmd: Fix config addressing when using bus offsets
  PCI: dwc: Add validation that PCIe core is set to correct mode
  PCI: dwc: al: Add Amazon Annapurna Labs PCIe controller driver
  dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding
  PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port
  PCI/VPD: Prevent VPD access for Amazon's Annapurna Labs Root Port
  PCI: Add ACS quirk for Amazon Annapurna Labs root ports
  PCI: Add Amazon's Annapurna Labs vendor ID
  MAINTAINERS: Add PCI native host/endpoint controllers designated reviewer
  PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers
  dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries
  dt-bindings: PCI: tegra: Add sideband pins configuration entries
  PCI: tegra: Add Tegra194 PCIe support
  PCI: Get rid of dev->has_secondary_link flag
  ...
2019-09-23 19:16:01 -07:00
Bjorn Helgaas
c5048a73b4 Merge branch 'pci/trivial'
- Fix typos and whitespace errors (Bjorn Helgaas, Krzysztof Wilczynski)

  - Remove unnecessary "return" statements (Krzysztof Wilczynski)

  - Correct of_irq_parse_pci() function documentation (Lubomir Rintel)

* pci/trivial:
  PCI: Remove unnecessary returns
  PCI: OF: Correct of_irq_parse_pci() documentation
  PCI: Fix typos and whitespace errors
2019-09-23 16:10:31 -05:00
Bjorn Helgaas
203104c8d5 Merge branch 'pci/pciehp'
- Cleanup pciehp LED/indicator control with a new consolidated
    pciehp_set_indicators() interface that controls both Attention and
    Power Indicators (Denis Efremov)

* pci/pciehp:
  PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments
  PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()
  PCI: pciehp: Remove pciehp_set_attention_status()
  PCI: pciehp: Combine adjacent indicator updates
  PCI: pciehp: Add pciehp_set_indicators() to set both indicators
2019-09-23 16:10:14 -05:00
Linus Torvalds
45824fc0da Merge tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
 "This is a bit late, partly due to me travelling, and partly due to a
  power outage knocking out some of my test systems *while* I was
  travelling.

   - Initial support for running on a system with an Ultravisor, which
     is software that runs below the hypervisor and protects guests
     against some attacks by the hypervisor.

   - Support for building the kernel to run as a "Secure Virtual
     Machine", ie. as a guest capable of running on a system with an
     Ultravisor.

   - Some changes to our DMA code on bare metal, to allow devices with
     medium sized DMA masks (> 32 && < 59 bits) to use more than 2GB of
     DMA space.

   - Support for firmware assisted crash dumps on bare metal (powernv).

   - Two series fixing bugs in and refactoring our PCI EEH code.

   - A large series refactoring our exception entry code to use gas
     macros, both to make it more readable and also enable some future
     optimisations.

  As well as many cleanups and other minor features & fixups.

  Thanks to: Adam Zerella, Alexey Kardashevskiy, Alistair Popple, Andrew
  Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual,
  Balbir Singh, Benjamin Herrenschmidt, Cédric Le Goater, Christophe
  JAILLET, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig,
  Claudio Carvalho, Daniel Axtens, David Gibson, David Hildenbrand,
  Desnes A. Nunes do Rosario, Ganesh Goudar, Gautham R. Shenoy, Greg
  Kurz, Guerney Hunt, Gustavo Romero, Halil Pasic, Hari Bathini, Joakim
  Tjernlund, Jonathan Neuschafer, Jordan Niethe, Leonardo Bras, Lianbo
  Jiang, Madhavan Srinivasan, Mahesh Salgaonkar, Mahesh Salgaonkar,
  Masahiro Yamada, Maxiwell S. Garcia, Michael Anderson, Nathan
  Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver
  O'Halloran, Qian Cai, Ram Pai, Ravi Bangoria, Reza Arbab, Ryan Grimm,
  Sam Bobroff, Santosh Sivaraj, Segher Boessenkool, Sukadev Bhattiprolu,
  Thiago Bauermann, Thiago Jung Bauermann, Thomas Gleixner, Tom
  Lendacky, Vasant Hegde"

* tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (264 commits)
  powerpc/mm/mce: Keep irqs disabled during lockless page table walk
  powerpc: Use ftrace_graph_ret_addr() when unwinding
  powerpc/ftrace: Enable HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
  ftrace: Look up the address of return_to_handler() using helpers
  powerpc: dump kernel log before carrying out fadump or kdump
  docs: powerpc: Add missing documentation reference
  powerpc/xmon: Fix output of XIVE IPI
  powerpc/xmon: Improve output of XIVE interrupts
  powerpc/mm/radix: remove useless kernel messages
  powerpc/fadump: support holes in kernel boot memory area
  powerpc/fadump: remove RMA_START and RMA_END macros
  powerpc/fadump: update documentation about option to release opalcore
  powerpc/fadump: consider f/w load area
  powerpc/opalcore: provide an option to invalidate /sys/firmware/opal/core file
  powerpc/opalcore: export /sys/firmware/opal/core for analysing opal crashes
  powerpc/fadump: update documentation about CONFIG_PRESERVE_FA_DUMP
  powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel
  powerpc/fadump: improve how crashed kernel's memory is reserved
  powerpc/fadump: consider reserved ranges while releasing memory
  powerpc/fadump: make crash memory ranges array allocation generic
  ...
2019-09-20 11:48:06 -07:00
Bjorn Helgaas
4a06c2c382 PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments
The PCIe spec doesn't mention "green LEDs" or "amber LEDs".  Replace those
terms with "Power Indicator" and "Attention Indicator" so the comments
match the spec language.  Comment changes only.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-09-05 15:52:24 -05:00
Denis Efremov
9194094be4 PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()
Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators()
instead, since the code is mostly the same.

[bhelgaas: drop set_power_indicator() wrapper to reduce the number of
interfaces]
Link: https://lore.kernel.org/r/20190903111021.1559-5-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2019-09-05 15:45:00 -05:00
Denis Efremov
106feb2fdc PCI: pciehp: Remove pciehp_set_attention_status()
Remove pciehp_set_attention_status() and use pciehp_set_indicators()
instead, since the code is mostly the same.

Link: https://lore.kernel.org/r/20190903111021.1559-4-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2019-09-05 15:44:08 -05:00
Denis Efremov
94719ba090 PCI: pciehp: Combine adjacent indicator updates
Combine adjacent updates of power and attention indicators into a single
pciehp_set_indicators() call.  This sends one command to the hotplug
controller instead of two.

Link: https://lore.kernel.org/r/20190903111021.1559-3-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2019-09-05 15:43:33 -05:00
Denis Efremov
688033f52d PCI: pciehp: Add pciehp_set_indicators() to set both indicators
Add pciehp_set_indicators() to set power and attention indicators with a
single register write.

This is a minor optimization because we frequently set both indicators and
this can do it with a single command.  It also reduces the number of
interfaces related to the indicators and makes them more discoverable
because callers use the PCI_EXP_SLTCTL_ATTN_IND_* and
PCI_EXP_SLTCTL_PWR_IND_* definitions directly.

[bhelgaas: extend commit log, s/PCI_EXP_SLTCTL_.*_IND_NONE/INDICATOR_NOOP/
so they don't look like things defined by the spec, add function doc, mask
commands to make it obvious we only send valid commands
(pcie_do_write_cmd() does mask it, but requires more effort to verify)]
Link: https://lore.kernel.org/r/20190903111021.1559-2-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-09-05 15:39:36 -05:00
Oliver O'Halloran
018c49e999 pci-hotplug/pnv_php: Add attention indicator support
pnv_php is generally used with PCIe bridges which provide a native
interface for setting the attention and power indicator LEDs. Wire up
those interfaces even if firmware does not have support for them (yet...)

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-11-oohall@gmail.com
2019-09-05 14:22:39 +10:00
Oliver O'Halloran
a839bd87a2 pci-hotplug/pnv_php: Add support for IODA3 Power9 PHBs
Currently we check that an IODA2 compatible PHB is upstream of this slot.
This is mainly to avoid pnv_php creating slots for the various "virtual
PHBs" that we create for NVLink. There's no real need for this restriction
so allow it on IODA3.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-10-oohall@gmail.com
2019-09-05 14:22:39 +10:00
Oliver O'Halloran
7fd1fe4e48 pci-hotplug/pnv_php: Add a reset_slot() callback
When performing EEH recovery of devices in a hotplug slot we need to use
the slot driver's ->reset_slot() callback to prevent spurious hotplug
events due to spurious DLActive and PresDet change interrupts. Add a
reset_slot() callback to pnv_php so we can handle recovery of devices
in pnv_php managed slots.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-9-oohall@gmail.com
2019-09-05 14:22:39 +10:00
Krzysztof Wilczynski
55507aea58 PCI: Remove unnecessary returns
Remove unnecessary "return" statements at the end of void functions.  No
functional change intended.

Link: https://lore.kernel.org/r/20190825221039.6977-1-kw@linux.com
Link: https://lore.kernel.org/r/20190826095143.21353-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-08-30 14:00:34 -05:00
Christoph Hellwig
c7532b601e PCI/hotplug: remove the sgi_hotplug driver
The SGI SN2 support is about to be removed.  Remove this driver that
depends on the SN2 support.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lkml.kernel.org/r/20190813072514.23299-6-hch@lst.de
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-08-16 11:33:56 -07:00
Gustavo A. R. Silva
5a2e340690 PCI: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.

This fixes the following warning (Building: allmodconfig i386):

  drivers/pci/hotplug/ibmphp_res.c: In function ‘update_bridge_ranges’:
  drivers/pci/hotplug/ibmphp_res.c:1943:16: warning: this statement may fall through [-Wimplicit-fallthrough=]
	 function = 0x8;
	 ~~~~~~~~~^~~~~
  drivers/pci/hotplug/ibmphp_res.c:1944:6: note: here
	case PCI_HEADER_TYPE_MULTIBRIDGE:
	^~~~

Link: https://lore.kernel.org/r/20190802012248.GA22622@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2019-08-08 15:13:34 -05:00
Nathan Chancellor
0df3e42167 PCI: rpaphp: Avoid a sometimes-uninitialized warning
When building with -Wsometimes-uninitialized, clang warns:

drivers/pci/hotplug/rpaphp_core.c:243:14: warning: variable 'fndit' is
used uninitialized whenever 'for' loop exits because its condition is
false [-Wsometimes-uninitialized]
        for (j = 0; j < entries; j++) {
                    ^~~~~~~~~~~
drivers/pci/hotplug/rpaphp_core.c:256:6: note: uninitialized use occurs
here
        if (fndit)
            ^~~~~
drivers/pci/hotplug/rpaphp_core.c:243:14: note: remove the condition if
it is always true
        for (j = 0; j < entries; j++) {
                    ^~~~~~~~~~~
drivers/pci/hotplug/rpaphp_core.c:233:14: note: initialize the variable
'fndit' to silence this warning
        int j, fndit;
                    ^
                     = 0

fndit is only used to gate a sprintf call, which can be moved into the
loop to simplify the code and eliminate the local variable, which will
fix this warning.

Fixes: 2fcf3ae508 ("hotplug/drc-info: Add code to search ibm,drc-info property")
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/504
Link: https://lore.kernel.org/r/20190603221157.58502-1-natechancellor@gmail.com
2019-08-05 18:53:03 +10:00
Linus Torvalds
414147d99b Merge tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
 "Enumeration changes:

   - Add _HPX Type 3 settings support, which gives firmware more
     influence over device configuration (Alexandru Gagniuc)

   - Support fixed bus numbers from bridge Enhanced Allocation
     capabilities (Subbaraya Sundeep)

   - Add "external-facing" DT property to identify cases where we
     require IOMMU protection against untrusted devices (Jean-Philippe
     Brucker)

   - Enable PCIe services for host controller drivers that use managed
     host bridge alloc (Jean-Philippe Brucker)

   - Log PCIe port service messages with pci_dev, not the pcie_device
     (Frederick Lawler)

   - Convert pciehp from pciehp_debug module parameter to generic
     dynamic debug (Frederick Lawler)

  Peer-to-peer DMA:

   - Add whitelist of Root Complexes that support peer-to-peer DMA
     between Root Ports (Christian König)

  Native controller drivers:

   - Add PCI host bridge DMA ranges for bridges that can't DMA
     everywhere, e.g., iProc (Srinath Mannam)

   - Add Amazon Annapurna Labs PCIe host controller driver (Jonathan
     Chocron)

   - Fix Tegra MSI target allocation so DMA doesn't generate unwanted
     MSIs (Vidya Sagar)

   - Fix of_node reference leaks (Wen Yang)

   - Fix Hyper-V module unload & device removal issues (Dexuan Cui)

   - Cleanup R-Car driver (Marek Vasut)

   - Cleanup Keystone driver (Kishon Vijay Abraham I)

   - Cleanup i.MX6 driver (Andrey Smirnov)

  Significant bug fixes:

   - Reset Lenovo ThinkPad P50 GPU so nouveau works after reboot (Lyude
     Paul)

   - Fix Switchtec firmware update performance issue (Wesley Sheng)

   - Work around Pericom switch link retraining erratum (Stefan Mätje)"

* tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (141 commits)
  MAINTAINERS: Add Karthikeyan Mitran and Hou Zhiqiang for Mobiveil PCI
  PCI: pciehp: Remove pointless MY_NAME definition
  PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
  PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
  PCI: pciehp: Log messages with pci_dev, not pcie_device
  PCI: pciehp: Replace pciehp_debug module param with dyndbg
  PCI: pciehp: Remove pciehp_debug uses
  PCI/AER: Log messages with pci_dev, not pcie_device
  PCI/DPC: Log messages with pci_dev, not pcie_device
  PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc
  PCI: Replace printk(KERN_INFO) with pr_info(), etc
  PCI: Use dev_printk() when possible
  PCI: Cleanup setup-bus.c comments and whitespace
  PCI: imx6: Allow asynchronous probing
  PCI: dwc: Save root bus for driver remove hooks
  PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
  PCI: dwc: Free MSI in dw_pcie_host_init() error path
  PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi()
  ...
2019-05-14 10:30:10 -07:00
Bjorn Helgaas
f8587c80c6 Merge branch 'pci/printk-portdrv'
- Change some desirable KERN_DEBUG messages to KERN_INFO/KERN_ERR
    (Frederick Lawler)

  - Log PCIe port service messages with pci_dev, not the pcie_device
    (Frederick Lawler)

  - Convert pciehp from pciehp_debug module parameter to generic dynamic
    debug (Frederick Lawler)

* pci/printk-portdrv:
  PCI: pciehp: Remove pointless MY_NAME definition
  PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
  PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
  PCI: pciehp: Log messages with pci_dev, not pcie_device
  PCI: pciehp: Replace pciehp_debug module param with dyndbg
  PCI: pciehp: Remove pciehp_debug uses
  PCI/AER: Log messages with pci_dev, not pcie_device
  PCI/DPC: Log messages with pci_dev, not pcie_device
  PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
2019-05-13 18:34:47 -05:00
Bjorn Helgaas
e07ca82a5f PCI: pciehp: Remove pointless MY_NAME definition
MY_NAME is only used once and offers no benefit, so remove it.

Link: https://lore.kernel.org/lkml/20190509141456.223614-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-05-09 16:45:21 -05:00
Bjorn Helgaas
b498b6872d PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
PCIE_MODULE_NAME is only used once and offers no benefit, so remove it.

Link: https://lore.kernel.org/lkml/20190509141456.223614-10-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-05-09 16:45:21 -05:00
Frederick Lawler
742ee16bc3 PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
Replace the last uses of dbg() with the equivalent pr_debug(), then remove
unused dbg(), err(), info(), and warn() wrappers.

Link: https://lore.kernel.org/lkml/20190509141456.223614-9-helgaas@kernel.org
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-05-09 16:45:20 -05:00
Frederick Lawler
94dbc9562e PCI: pciehp: Log messages with pci_dev, not pcie_device
Log messages with pci_dev, not pcie_device.  Factor out common message
prefixes with dev_fmt().

Example output change:

  - pciehp 0000:00:06.0:pcie004: Slot(0) Powering on due to button press
  + pcieport 0000:00:06.0: pciehp: Slot(0) Powering on due to button press

Link: https://lore.kernel.org/lkml/20190509141456.223614-8-helgaas@kernel.org
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-05-09 16:45:20 -05:00
Frederick Lawler
017124725c PCI: pciehp: Replace pciehp_debug module param with dyndbg
Previously pciehp debug messages were enabled by the pciehp_debug module
parameter, e.g., by booting with this kernel command line option:

  pciehp.pciehp_debug=1

Convert this mechanism to use the generic dynamic debug (dyndbg) feature.
After this commit, pciehp debug messages are enabled by building the kernel
with CONFIG_DYNAMIC_DEBUG=y and booting with this command line option:

  dyndbg="file pciehp* +p"

The dyndbg facility is much more flexible: messages can be enabled at boot-
or run-time based on the file name, function name, line number, message
test, etc.  See Documentation/admin-guide/dynamic-debug-howto.rst for more
details.

Link: https://lore.kernel.org/lkml/20190509141456.223614-7-helgaas@kernel.org
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
[bhelgaas: commit log, comment, remove pciehp_debug parameter]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-05-09 16:45:20 -05:00
Bjorn Helgaas
7e696b8ae9 PCI: pciehp: Remove pciehp_debug uses
We're about to convert pciehp to the dyndbg mechanism, which means we can
eventually remove pciehp_debug.

Replace uses of pciehp_debug with dbg() and ctrl_dbg(), which check
pciehp_debug internally.

Link: https://lore.kernel.org/lkml/20190509141456.223614-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2019-05-09 16:45:19 -05:00
Tyrel Datwyler
91800660bb PCI: rpaphp: Get/put device node reference during slot alloc/dealloc
When allocating the slot structure we store a pointer to the associated
device_node.  We really should be incrementing the reference count, so add
an of_node_get() during slot alloc and an of_node_put() during slot
dealloc.

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-04-10 16:07:12 -05:00
Tyrel Datwyler
fb26228bfc PCI: rpadlpar: Fix leaked device_node references in add/remove paths
The find_dlpar_node() helper returns a device node with its reference
incremented.  Both the add and remove paths use this helper for find the
appropriate node, but fail to release the reference when done.

Annotate the find_dlpar_node() helper with a comment about the incremented
reference count and call of_node_put() on the obtained device_node in the
add and remove paths.  Also, fixup a reference leak in the find_vio_slot()
helper where we fail to call of_node_put() on the vdevice node after we
iterate over its children.

Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-04-10 16:07:12 -05:00
Sergey Miroshnichenko
3943af9d01 PCI: pciehp: Ignore Link State Changes after powering off a slot
During a safe hot remove, the OS powers off the slot, which may cause a
Data Link Layer State Changed event.  The slot has already been set to
OFF_STATE, so that event results in re-enabling the device, making it
impossible to safely remove it.

Clear out the Presence Detect Changed and Data Link Layer State Changed
events when the disabled slot has settled down.

It is still possible to re-enable the device if it remains in the slot
after pressing the Attention Button by pressing it again.

Fixes the problem that Micah reported below: an NVMe drive power button may
not actually turn off the drive.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203237
Reported-by: Micah Parrish <micah.parrish@hpe.com>
Tested-by: Micah Parrish <micah.parrish@hpe.com>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
[bhelgaas: changelog, add bugzilla URL]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org	# v4.19+
2019-04-10 16:06:43 -05:00