linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-07 13:43:09 +09:00

Author	SHA1	Message	Date
Mahesh Rajashekhara	b23b417427	scsi: sd: Defer spinning up drive while SANITIZE is in progress commit `505aa4b6a8` upstream. A drive being sanitized will return NOT READY / ASC 0x4 / ASCQ 0x1b ("LOGICAL UNIT NOT READY. SANITIZE IN PROGRESS"). Prevent spinning up the drive until this condition clears. [mkp: tweaked commit message] Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-05-01 12:58:19 -07:00
Arnd Bergmann	8afed2798e	scsi: fas216: fix sense buffer initialization [ Upstream commit `96d5eaa9bb` ] While testing with the ARM specific memset() macro removed, I ran into a compiler warning that shows an old bug: drivers/scsi/arm/fas216.c: In function 'fas216_rq_sns_done': drivers/scsi/arm/fas216.c:2014:40: error: argument to 'sizeof' in 'memset' call is the same expression as the destination; did you mean to provide an explicit length? [-Werror=sizeof-pointer-memaccess] It turns out that the definition of the scsi_cmd structure changed back in linux-2.6.25, so now we clear only four bytes (sizeof(pointer)) instead of 96 (SCSI_SENSE_BUFFERSIZE). I did not check whether we actually need to initialize the buffer here, but it's clear that if we do it, we should use the correct size. Fixes: `de25deb180` ("[SCSI] use dynamically allocated sense buffer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-26 11:02:10 +02:00
Xose Vazquez Perez	800fda575b	scsi: devinfo: fix format of the device list [ Upstream commit `3f884a0a8b` ] Replace "" with NULL for product revision level, and merge TEXEL duplicate entries. Cc: Hannes Reinecke <hare@suse.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-26 11:02:10 +02:00
himanshu.madhani@cavium.com	f3a7d11834	scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout() [ Upstream commit `7ac0c332f9` ] This patch fixes following Smatch warning: drivers/scsi/qla2xxx/qla_init.c:130 qla2x00_async_iocb_timeout() error: we previously assumed 'fcport' could be null (see line 107) Fixes: `5c25d45116` ("scsi: qla2xxx: Fix NULL pointer access for fcport structure") Reported by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-26 11:02:05 +02:00
Bill Kuzeja	b18daa09fe	scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure commit `6d6340672b` upstream. The code that fixes the crashes in the following commit introduced a small memory leak: commit `6a2cf8d366` ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure") Fixing this requires a bit of reworking, which I've explained. Also provide some code cleanup. There is a small window in qla2x00_probe_one where if qla2x00_alloc_queues fails, we end up never freeing req and rsp and leak 0xc0 and 0xc8 bytes respectively (the sizes of req and rsp). I originally put in checks to test for this condition which were based on the incorrect assumption that if ha->rsp_q_map and ha->req_q_map were allocated, then rsp and req were allocated as well. This is incorrect. There is a window between these allocations: ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp); goto probe_hw_failed; [if successful, both rsp and req allocated] base_vha = qla2x00_create_host(sht, ha); goto probe_hw_failed; ret = qla2x00_request_irqs(ha, rsp); goto probe_failed; if (qla2x00_alloc_queues(ha, req, rsp)) { goto probe_failed; [if successful, now ha->rsp_q_map and ha->req_q_map allocated] To simplify this, we should just set req and rsp to NULL after we free them. Sounds simple enough? The problem is that req and rsp are pointers defined in the qla2x00_probe_one and they are not always passed by reference to the routines that free them. Here are paths which can free req and rsp: PATH 1: qla2x00_probe_one ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp); [req and rsp are passed by reference, but if this fails, we currently do not NULL out req and rsp. Easily fixed] PATH 2: qla2x00_probe_one failing in qla2x00_request_irqs or qla2x00_alloc_queues probe_failed: qla2x00_free_device(base_vha); qla2x00_free_req_que(ha, req) qla2x00_free_rsp_que(ha, rsp) PATH 3: qla2x00_probe_one: failing in qla2x00_mem_alloc or qla2x00_create_host probe_hw_failed: qla2x00_free_req_que(ha, req) qla2x00_free_rsp_que(ha, rsp) PATH 1: This should currently work, but it doesn't because rsp and rsp are not set to NULL in qla2x00_mem_alloc. Easily remedied. PATH 2: req and rsp aren't passed in at all to qla2x00_free_device but are derived from ha->req_q_map[0] and ha->rsp_q_map[0]. These are only set up if qla2x00_alloc_queues succeeds. In qla2x00_free_queues, we are protected from crashing if these don't exist because req_qid_map and rsp_qid_map are only set on their allocation. We are guarded in this way: for (cnt = 0; cnt < ha->max_req_queues; cnt++) { if (!test_bit(cnt, ha->req_qid_map)) continue; PATH 3: This works. We haven't freed req or rsp yet (or they were never allocated if qla2x00_mem_alloc failed), so we'll attempt to free them here. To summarize, there are a few small changes to make this work correctly and (and for some cleanup): 1) (For PATH 1) Set rsp and req to NULL in case of failure in qla2x00_mem_alloc so these are correctly set to NULL back in qla2x00_probe_one 2) After jumping to probe_failed: and calling qla2x00_free_device, explicitly set rsp and req to NULL so further calls with these pointers do not crash, i.e. the free queue calls in the probe_hw_failed section we fall through to. 3) Fix return code check in the call to qla2x00_alloc_queues. We currently drop the return code on the floor. The probe fails but the caller of the probe doesn't have an error code, so it attaches to pci. This can result in a crash on module shutdown. 4) Remove unnecessary NULL checks in qla2x00_free_req_que, qla2x00_free_rsp_que, and the egregious NULL checks before kfrees and vfrees in qla2x00_mem_free. I tested this out running a scenario where the card breaks at various times during initialization. I made sure I forced every error exit path in qla2x00_probe_one. Cc: <stable@vger.kernel.org> # v4.16 Fixes: `6a2cf8d366` ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure") Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-19 08:56:18 +02:00
Shivasharan S	15dfb9baba	scsi: megaraid_sas: unload flag should be set after scsi_remove_host is called [ Upstream commit `f3f7920b39` ] Issue - Driver returns DID_NO_CONNECT when unload is in progress, indicated using instance->unload flag. In case of dynamic unload of driver, this flag is set before calling scsi_remove_host(). While doing manual driver unload, user will see lots of prints for Sync Cache command with DID_NO_CONNECT status. Fix - Set the instance->unload flag after scsi_remove_host(). Allow device removal process to be completed and do not block any command before that. SCSI commands (like SYNC_CACHE) are received (as part of scsi_remove_host) by driver during unload will be submitted further down to the drives. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:19 +02:00
Shivasharan S	7007705438	scsi: megaraid_sas: Error handling for invalid ldcount provided by firmware in RAID map [ Upstream commit `7ada701d0d` ] Currently driver does not validate ldcount provided by firmware. If the value is invalid, fail RAID map validation accordingly. This issue is rare to hit in field and is fixed as part of code review. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:19 +02:00
chenxiang	b728b7e24f	scsi: libsas: initialize sas_phy status according to response of DISCOVER [ Upstream commit `affc67788f` ] The status of SAS PHY is in sas_phy->enabled. There is an issue that the status of a remote SAS PHY may be initialized incorrectly: if disable remote SAS PHY through sysfs interface (such as echo 0 > /sys/class/sas_phy/phy-1:0:0/enable), then reboot the system, and we will find the status of remote SAS PHY which is disabled before is 1 (cat /sys/class/sas_phy/phy-1:0:0/enable). But actually the status of remote SAS PHY is disabled and the device attached is not found. In SAS protocol, NEGOTIATED LOGICAL LINK RATE field of DISCOVER response is 0x1 when remote SAS PHY is disabled. So initialize sas_phy->enabled according to the value of NEGOTIATED LOGICAL LINK RATE field. Signed-off-by: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:18 +02:00
Jason Yan	f890a23603	scsi: libsas: fix error when getting phy events [ Upstream commit `2b23d9509f` ] The intend purpose here was to goto out if smp_execute_task() returned error. Obviously something got screwed up. We will never get these link error statistics below: ~:/sys/class/sas_phy/phy-1:0:12 # cat invalid_dword_count 0 ~:/sys/class/sas_phy/phy-1:0:12 # cat running_disparity_error_count 0 ~:/sys/class/sas_phy/phy-1:0:12 # cat loss_of_dword_sync_count 0 ~:/sys/class/sas_phy/phy-1:0:12 # cat phy_reset_problem_count 0 Obviously we should goto error handler if smp_execute_task() returns non-zero. Fixes: `2908d778ab` ("[SCSI] aic94xx: new driver") Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: chenqilin <chenqilin2@huawei.com> CC: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:18 +02:00
Jason Yan	8644d14c32	scsi: libsas: fix memory leak in sas_smp_get_phy_events() [ Upstream commit `4a491b1ab1` ] We've got a memory leak with the following producer: while true; do cat /sys/class/sas_phy/phy-1:0:12/invalid_dword_count >/dev/null; done The buffer req is allocated and not freed after we return. Fix it. Fixes: `2908d778ab` ("[SCSI] aic94xx: new driver") Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: chenqilin <chenqilin2@huawei.com> CC: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:18 +02:00
Chaitra P B	b5d2cafbe3	scsi: mpt3sas: Proper handling of set/clear of "ATA command pending" flag. [ Upstream commit `f49d4aed13` ] 1. In IO path, setting of "ATA command pending" flag early before device removal, invalid device handle etc., checks causes any new commands to be always returned with SAM_STAT_BUSY and when the driver removes the drive the SML issues SYNC Cache command and that command is always returned with SAM_STAT_BUSY and thus making SYNC Cache command to requeued. 2. If the driver gets an ATA PT command for a SATA drive then the driver set "ATA command pending" flag in device specific data structure not to allow any further commands until the ATA PT command is completed. However, after setting the flag if the driver decides to return the command back to upper layers without actually issuing to the firmware (i.e., returns from qcmd failure return paths) then the corresponding flag is not cleared and this prevents the driver from sending any new commands to the drive. This patch fixes above two issues by setting of "ATA command pending" flag after checking for whether device deleted, invalid device handle, device busy with task management. And by setting "ATA command pending" flag to false in all of the qcmd failure return paths after setting the flag. Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:14 +02:00
Rafael David Tinoco	3807b6fec6	scsi: libiscsi: Allow sd_shutdown on bad transport [ Upstream commit `d754941225` ] If, for any reason, userland shuts down iscsi transport interfaces before proper logouts - like when logging in to LUNs manually, without logging out on server shutdown, or when automated scripts can't umount/logout from logged LUNs - kernel will hang forever on its sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd to all still existent paths. PID: 1 TASK: ffff8801a69b8000 CPU: 1 COMMAND: "systemd-shutdow" #0 [ffff8801a69c3a30] __schedule at ffffffff8183e9ee #1 [ffff8801a69c3a80] schedule at ffffffff8183f0d5 #2 [ffff8801a69c3a98] schedule_timeout at ffffffff81842199 #3 [ffff8801a69c3b40] io_schedule_timeout at ffffffff8183e604 #4 [ffff8801a69c3b70] wait_for_completion_io_timeout at ffffffff8183fc6c #5 [ffff8801a69c3bd0] blk_execute_rq at ffffffff813cfe10 #6 [ffff8801a69c3c88] scsi_execute at ffffffff815c3fc7 #7 [ffff8801a69c3cc8] scsi_execute_req_flags at ffffffff815c60fe #8 [ffff8801a69c3d30] sd_sync_cache at ffffffff815d37d7 #9 [ffff8801a69c3da8] sd_shutdown at ffffffff815d3c3c This happens because iscsi_eh_cmd_timed_out(), the transport layer timeout helper, would tell the queue timeout function (scsi_times_out) to reset the request timer over and over, until the session state is back to logged in state. Unfortunately, during server shutdown, this might never happen again. Other option would be "not to handle" the issue in the transport layer. That would trigger the error handler logic, which would also need the session state to be logged in again. Best option, for such case, is to tell upper layers that the command was handled during the transport layer error handler helper, marking it as DID_NO_CONNECT, which will allow completion and inform about the problem. After the session was marked as ISCSI_STATE_FAILED, due to the first timeout during the server shutdown phase, all subsequent cmds will fail to be queued, allowing upper logic to fail faster. Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-04-12 12:32:14 +02:00
Sreekanth Reddy	3748694f1b	scsi: mpt3sas: wait for and flush running commands on shutdown/unload commit `c666d3be99` upstream. This patch finishes all outstanding SCSI IO commands (but not other commands, e.g., task management) in the shutdown and unload paths. It first waits for the commands to complete (this is done after setting 'ioc->remove_host = 1 ', which prevents new commands to be queued) then it flushes commands that might still be running. This avoids triggering error handling (e.g., abort command) for all commands possibly completed by the adapter after interrupts disabled. [mauricfo: introduced something in commit message.] Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [mauricfo: backport to linux-4.14.y (a few updates to context lines)] Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-24 11:01:28 +01:00
Mauricio Faria de Oliveira	9d72b2696e	scsi: mpt3sas: fix oops in error handlers after shutdown/unload commit `9ff549ffb4` upstream. This patch adds checks for 'ioc->remove_host' in the SCSI error handlers, so not to access pointers/resources potentially freed in the PCI shutdown/module unload path. The error handlers may be invoked after shutdown/unload, depending on other components. This problem was observed with kexec on a system with a mpt3sas based adapter and an infiniband adapter which takes long enough to shutdown: The mpt3sas driver finished shutting down / disabled interrupt handling, thus some commands have not finished and timed out. Since the system was still running (waiting for the infiniband adapter to shutdown), the scsi error handler for task abort of mpt3sas was invoked, and hit an oops -- either in scsih_abort() because 'ioc->scsi_lookup' was NULL without commit `dbec4c9040` ("scsi: mpt3sas: lockless command submission"), or later up in scsih_host_reset() (with or without that commit), because it eventually called mpt3sas_base_get_iocstate(). After the above commit, the oops in scsih_abort() does not occur anymore (_scsih_scsi_lookup_find_by_scmd() is no longer called), but that commit is too big and out of the scope of linux-stable, where this patch might help, so still go for the changes. Also, this might help to prevent similar errors in the future, in case code changes and possibly tries to access freed stuff. Note the fix in scsih_host_reset() is still important anyway. Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-24 11:01:28 +01:00
James Smart	7b7e076f8c	scsi: lpfc: Fix issues connecting with nvme initiator [ Upstream commit `e06351a002` ] In the lpfc discovery engine, when as a nvme target, where the driver was performing mailbox io with the adapter for port login when a NVME PRLI is received from the host. Rather than queue and eventually get back to sending a response after the mailbox traffic, the driver rejected the io with an error response. Turns out this particular initiator didn't like the rejection values (unable to process command/command in progress) so it never attempted a retry of the PRLI. Thus the host never established nvme connectivity with the lpfc target. By changing the rejection values (to Logical Busy/nothing more), the initiator accepted the response and would retry the PRLI, resulting in nvme connectivity. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-24 11:01:27 +01:00
James Smart	1626beb0b5	scsi: lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled [ Upstream commit `9de416ac67` ] When enabled for both SCSI and NVME support, and connected pt2pt to a SCSI only target, the driver nodelist entry for the remote port is left in PRLI_ISSUE state and no SCSI LUNs are discovered. Works fine if only configured for SCSI support. Error was due to some of the prli points still reflecting the need to send only 1 PRLI. On a lot of fabric configs, targets were NVME only, which meant the fabric-reported protocol attributes were only telling the driver one protocol or the other. Thus things worked fine. With pt2pt, the driver must send a PRLI for both protocols as there are no hints on what the target supports. Thus pt2pt targets were hitting the multiple PRLI issues. Complete the dual PRLI support. Track explicitly whether scsi (fcp) or nvme prli's have been sent. Accurately track protocol support detected on each node as reported by the fabric or probed by PRLI traffic. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-24 11:01:27 +01:00
Shivasharan S	23e73e2ab4	scsi: megaraid_sas: Do not use 32-bit atomic request descriptor for Ventura controllers commit `9ff97fa8db` upstream. Problem Statement: Sending I/O through 32 bit descriptors to Ventura series of controller results in IO timeout on certain conditions. This error only occurs on systems with high I/O activity on Ventura series controllers. Changes in this patch will prevent driver from using 32 bit descriptor and use 64 bit Descriptors. Cc: <stable@vger.kernel.org> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-24 11:01:21 +01:00
Bill Kuzeja	c209d68794	scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure commit `6a2cf8d366` upstream. Because of the shifting around of code in qla2x00_probe_one recently, failures during adapter initialization can lead to problems, i.e. NULL pointer crashes and doubly freed data structures which cause eventual panics. This V2 version makes the relevant memory free routines idempotent, so repeat calls won't cause any harm. I also removed the problematic probe_init_failed exit point as it is not needed. Fixes: `d64d6c5671` ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure") Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-21 12:06:45 +01:00
Himanshu Madhani	91cb90636e	scsi: qla2xxx: Fix logo flag for qlt_free_session_done() commit `a2390348c1` upstream. Commit `3515832cc6` ("scsi: qla2xxx: Reset the logo flag, after target re-login.")fixed the target re-login after session relogin is complete, but missed out the qlt_free_session_done() path. This patch clears send_els_logo flag in qlt_free_session_done() callback. [mkp: checkpatch] Fixes: `3515832cc6` ("scsi: qla2xxx: Reset the logo flag, after target re-login.") Signed-off-by: Himanshu Madhani <hmadhani@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-21 12:06:45 +01:00
Quinn Tran	31de69d5c9	scsi: qla2xxx: Fix NULL pointer access for fcport structure commit `5c25d45116` upstream. when processing iocb in a timeout case, driver was trying to log messages without verifying if the fcport structure could have valid data. This results in a NULL pointer access. Fixes: 726b85487067("qla2xxx: Add framework for async fabric discovery") Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-21 12:06:45 +01:00
Himanshu Madhani	8cdd1908c1	scsi: qla2xxx: Fix smatch warning in qla25xx_delete_{rsp\|req}_que commit `62aa281470` upstream. This patch fixes following warnings reported by smatch: drivers/scsi/qla2xxx/qla_mid.c:586 qla25xx_delete_req_que() error: we previously assumed 'req' could be null (see line 580) drivers/scsi/qla2xxx/qla_mid.c:602 qla25xx_delete_rsp_que() error: we previously assumed 'rsp' could be null (see line 596) Fixes: `7867b98dce` ("scsi: qla2xxx: Fix memory leak in dual/target mode") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-21 12:06:45 +01:00
Xose Vazquez Perez	32097005dd	scsi: dh: add new rdac devices [ Upstream commit `4b3aec2bbb` ] Add IBM 3542 and 3552, arrays: FAStT200 and FAStT500. Add full STK OPENstorage family, arrays: 9176, D173, D178, D210, D220, D240 and D280. Add STK BladeCtlr family, arrays: B210, B220, B240 and B280. These changes were done in multipath-tools time ago. Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Christophe Varoqui <christophe.varoqui@opensvc.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Cc: device-mapper development <dm-devel@redhat.com> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-19 08:42:53 +01:00
Xose Vazquez Perez	97b8a9a878	scsi: devinfo: apply to HP XP the same flags as Hitachi VSP [ Upstream commit `b369a04715` ] Commit `56f3d383f3` ("scsi: scsi_devinfo: Add TRY_VPD_PAGES to HITACHI OPEN-V blacklist entry") modified some Hitachi entries: HITACHI is always supporting VPD pages, even though it's claiming to support SCSI Revision 3 only. The same should have been done also for HP-rebranded. [mkp: checkpatch and tweaked commit message] Cc: Hannes Reinecke <hare@suse.de> Cc: Takahiro Yasui <takahiro.yasui@hds.com> Cc: Matthias Rudolph <Matthias.Rudolph@hitachivantara.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-19 08:42:53 +01:00
Bart Van Assche	a60a3523b3	scsi: core: scsi_get_device_flags_keyed(): Always return device flags [ Upstream commit `a44c9d3650` ] Since scsi_get_device_flags_keyed() callers do not check whether or not the returned value is an error code, change that function such that it returns a flags value even if the 'key' argument is invalid. Note: since commit `28a0bc4120` ("scsi: sd: Implement blacklist option for WRITE SAME w/ UNMAP") bit 31 is a valid device information flag so checking whether bit 31 is set in the return value is not sufficient to tell the difference between an error code and a flags value. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-19 08:42:53 +01:00
Li Dongyang	65722e7308	scsi: ses: don't ask for diagnostic pages repeatedly during probe [ Upstream commit `9c0a50022b` ] We are testing if there is a match with the ses device in a loop by calling ses_match_to_enclosure(), which will issue scsi receive diagnostics commands to the ses device for every device on the same host. On one of our boxes with 840 disks, it takes a long time to load the driver: [root@g1b-oss06 ~]# time modprobe ses real 40m48.247s user 0m0.001s sys 0m0.196s With the patch: [root@g1b-oss06 ~]# time modprobe ses real 0m17.915s user 0m0.008s sys 0m0.053s Note that we still need to refresh page 10 when we see a new disk to create the link. Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au> Tested-by: Jason Ozolins <jason.ozolins@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-19 08:42:51 +01:00
himanshu.madhani@cavium.com	608d96fc43	scsi: qla2xxx: Fix recursion while sending terminate exchange commit `3efc31f76d` upstream. During error test case where switch port status is toggled from enable to disable, following stack trace is seen which indicates recursion trying to send terminate exchange. This regression was introduced by commit `82de802ad4` ("scsi: qla2xxx: Preparation for Target MQ.") BUG: stack guard page was hit at ffffb96488383ff8 (stack is ffffb96488384000..ffffb96488387fff) BUG: stack guard page was hit at ffffb964886c3ff8 (stack is ffffb964886c4000..ffffb964886c7fff) kernel stack overflow (double-fault): 0000 [#1] SMP qlt_term_ctio_exchange+0x9c/0xb0 [qla2xxx] qlt_term_ctio_exchange+0x9c/0xb0 [qla2xxx] qlt_term_ctio_exchange+0x9c/0xb0 [qla2xxx] qlt_term_ctio_exchange+0x9c/0xb0 [qla2xxx] qlt_term_ctio_exchange+0x9c/0xb0 [qla2xxx] Fixes: `82de802ad4` ("scsi: qla2xxx: Preparation for Target MQ.") Cc: <stable@vger.kernel.org> #4.10 Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:35 +01:00
himanshu.madhani@cavium.com	8540351ee8	scsi: qla2xxx: Fix NULL pointer crash due to probe failure commit `d64d6c5671` upstream. This patch fixes regression added by commit `d74595278f` ("scsi: qla2xxx: Add multiple queue pair functionality."). When driver is not able to get reqeusted IRQs from the system, driver will attempt tp clean up memory before failing hardware probe. During this cleanup, driver assigns NULL value to the pointer which has not been allocated by driver yet. This results in a NULL pointer access. Log file will show following message and stack trace qla2xxx [0000:a3:00.1]-00c7:21: MSI-X: Failed to enable support, giving up -- 32/-1. qla2xxx [0000:a3:00.1]-0037:21: Falling back-to MSI mode --1. qla2xxx [0000:a3:00.1]-003a:21: Failed to reserve interrupt 821 already in use. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffc010c4b6>] qla2x00_probe_one+0x18b6/0x2730 [qla2xxx] PGD 0 Oops: 0002 [#1] SMP Fixes: `d74595278f` ("scsi: qla2xxx: Add multiple queue pair functionality."). Cc: <stable@vger.kernel.org> # 4.10 Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:35 +01:00
himanshu.madhani@cavium.com	07b7495465	scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS commit `1514839b36` upstream. This patch fixes NULL pointer crash due to active timer running for abort IOCB. From crash dump analysis it was discoverd that get_next_timer_interrupt() encountered a corrupted entry on the timer list. #9 [ffff95e1f6f0fd40] page_fault at ffffffff914fe8f8 [exception RIP: get_next_timer_interrupt+440] RIP: ffffffff90ea3088 RSP: ffff95e1f6f0fdf0 RFLAGS: 00010013 RAX: ffff95e1f6451028 RBX: 000218e2389e5f40 RCX: 00000001232ad600 RDX: 0000000000000001 RSI: ffff95e1f6f0fdf0 RDI: 0000000001232ad6 RBP: ffff95e1f6f0fe40 R8: ffff95e1f6451188 R9: 0000000000000001 R10: 0000000000000016 R11: 0000000000000016 R12: 00000001232ad5f6 R13: ffff95e1f6450000 R14: ffff95e1f6f0fdf8 R15: ffff95e1f6f0fe10 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Looking at the assembly of get_next_timer_interrupt(), address came from %r8 (ffff95e1f6451188) which is pointing to list_head with single entry at ffff95e5ff621178. 0xffffffff90ea307a <get_next_timer_interrupt+426>: mov (%r8),%rdx 0xffffffff90ea307d <get_next_timer_interrupt+429>: cmp %r8,%rdx 0xffffffff90ea3080 <get_next_timer_interrupt+432>: je 0xffffffff90ea30a7 <get_next_timer_interrupt+471> 0xffffffff90ea3082 <get_next_timer_interrupt+434>: nopw 0x0(%rax,%rax,1) 0xffffffff90ea3088 <get_next_timer_interrupt+440>: testb $0x1,0x18(%rdx) crash> rd ffff95e1f6451188 10 ffff95e1f6451188: ffff95e5ff621178 ffff95e5ff621178 x.b.....x.b..... ffff95e1f6451198: ffff95e1f6451198 ffff95e1f6451198 ..E.......E..... ffff95e1f64511a8: ffff95e1f64511a8 ffff95e1f64511a8 ..E.......E..... ffff95e1f64511b8: ffff95e77cf509a0 ffff95e77cf509a0 ...\|.......\|.... ffff95e1f64511c8: ffff95e1f64511c8 ffff95e1f64511c8 ..E.......E..... crash> rd ffff95e5ff621178 10 ffff95e5ff621178: 0000000000000001 ffff95e15936aa00 ..........6Y.... ffff95e5ff621188: 0000000000000000 00000000ffffffff ................ ffff95e5ff621198: 00000000000000a0 0000000000000010 ................ ffff95e5ff6211a8: ffff95e5ff621198 000000000000000c ..b............. ffff95e5ff6211b8: 00000f5800000000 ffff95e751f8d720 ....X... ..Q.... ffff95e5ff621178 belongs to freed mempool object at ffff95e5ff621080. CACHE NAME OBJSIZE ALLOCATED TOTAL SLABS SSIZE ffff95dc7fd74d00 mnt_cache 384 19785 24948 594 16k SLAB MEMORY NODE TOTAL ALLOCATED FREE ffffdc5dabfd8800 ffff95e5ff620000 1 42 29 13 FREE / [ALLOCATED] ffff95e5ff621080 (cpu 6 cache) Examining the contents of that memory reveals a pointer to a constant string in the driver, "abort\0", which is set by qla24xx_async_abort_cmd(). crash> rd ffffffffc059277c 20 ffffffffc059277c: 6e490074726f6261 0074707572726574 abort.Interrupt. ffffffffc059278c: 00676e696c6c6f50 6920726576697244 Polling.Driver i ffffffffc059279c: 646f6d207325206e 6974736554000a65 n %s mode..Testi ffffffffc05927ac: 636976656420676e 786c252074612065 ng device at %lx ffffffffc05927bc: 6b63656843000a2e 646f727020676e69 ...Checking prod ffffffffc05927cc: 6f20444920746375 0a2e706968632066 uct ID of chip.. ffffffffc05927dc: 5120646e756f4600 204130303232414c .Found QLA2200A ffffffffc05927ec: 43000a2e70696843 20676e696b636568 Chip...Checking ffffffffc05927fc: 65786f626c69616d 6c636e69000a2e73 mailboxes...incl ffffffffc059280c: 756e696c2f656475 616d2d616d642f78 ude/linux/dma-ma crash> struct -ox srb_iocb struct srb_iocb { union { struct {...} logio; struct {...} els_logo; struct {...} tmf; struct {...} fxiocb; struct {...} abt; struct ct_arg ctarg; struct {...} mbx; struct {...} nack; [0x0 ] } u; [0xb8] struct timer_list timer; [0x108] void (timeout)(void ); } SIZE: 0x110 crash> ! bc ibase=16 obase=10 B8+40 F8 The object is a srb_t, and at offset 0xf8 within that structure (i.e. ffff95e5ff621080 + f8 -> ffff95e5ff621178) is a struct timer_list. Cc: <stable@vger.kernel.org> #4.4+ Fixes: `4440e46d5d` ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous handling.") Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:29 +01:00
Bart Van Assche	4dbc3e4d8b	scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops commit `3be8828fc5` upstream. Avoid that the recently introduced call_rcu() call in the SCSI core triggers a double call_rcu() call. Reported-by: Natanael Copa <ncopa@alpinelinux.org> Reported-by: Damien Le Moal <damien.lemoal@wdc.com> References: https://bugzilla.kernel.org/show_bug.cgi?id=198861 Fixes: `3bd6f43f5c` ("scsi: core: Ensure that the SCSI error handler gets woken up") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: Natanael Copa <ncopa@alpinelinux.org> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Alexandre Oliva <oliva@gnu.org> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:29 +01:00
himanshu.madhani@cavium.com	29060ff7c1	scsi: qla2xxx: Fix memory leak in dual/target mode commit `7867b98dce` upstream. When driver is loaded in Target/Dual mode, it creates QPair to support MQ and allocates resources for each QPair. This Qpair initialization is delayed until the FW personality is changed to Dual/Target mode by issuing chip reset. At the time of chip reset firmware is re-initilized in correct personality all the QPairs are initialized by sending MBC_INITIALIZE_MULTIQ (001Fh). This patch fixes memory leak by adding check to issue MBC_INITIALIZE_MULTIQ command only while deleting rsp/req queue when the flag is set for initiator mode, and clean up QPair resources correctly during the driver unload. This MBX does not need to be issued for Target/Dual mode because chip reset will reset ISP. Fixes: `d65237c7f0` ("scsi: qla2xxx: Fix mailbox failure while deleting Queue pairs") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:27 +01:00
Quinn Tran	0393270e9e	scsi: qla2xxx: Fix system crash in qlt_plogi_ack_unref commit `19759033e0` upstream. Fix system crash due to NULL pointer access. qlt_plogi_ack_t and fc_port structures were not properly bound before calling qlt_plogi_ack_unref(). RIP: 0010:qlt_plogi_ack_unref+0xa1/0x150 [qla2xxx] Call Trace: qla24xx_create_new_sess+0xb1/0x320 [qla2xxx] qla2x00_do_work+0x123/0x260 [qla2xxx] qla2x00_iocb_work_fn+0x30/0x40 [qla2xxx] process_one_work+0x1f3/0x530 worker_thread+0x4e/0x480 kthread+0x10c/0x140 Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:27 +01:00
Giridhar Malavali	e62c1051a4	scsi: qla2xxx: Remove aborting ELS IOCB call issued as part of timeout. commit `bf07ef86e8` upstream. This fix the spinlock recursion issue seen while unloading the driver. 14 [ffff9f2e21e03db8] native_queued_spin_lock_slowpath at ffffffffad0d8802 15 [ffff9f2e21e03dc0] do_raw_spin_lock at ffffffffad0d99e4 16 [ffff9f2e21e03dd8] _raw_spin_lock_irqsave at ffffffffad652471 17 [ffff9f2e21e03e00] qla2x00_els_dcmd_iocb_timeout at ffffffffc070cd63 18 [ffff9f2e21e03e40] qla2x00_sp_timeout at ffffffffc06f06d3 [qla2xxx] 19 [ffff9f2e21e03e68] call_timer_fn at ffffffffad0f97d8 20 [ffff9f2e21e03ed8] run_timer_softirq at ffffffffad0faf47 21 [ffff9f2e21e03f68] __softirqentry_text_start at ffffffffad655f32 Fixes: `6eb54715b5` ("qla2xxx: Added interface to send explicit LOGO.") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:27 +01:00
Giridhar Malavali	f5ff7098d9	scsi: qla2xxx: Defer processing of GS IOCB calls commit `5d3300a9b8` upstream. This patch defers processing of GS IOCB calls from interrupt context to avoid hardware spinlock recursion. Following stack trace is seen ? mod_timer+0x193/0x330 ? ql_dbg+0xa7/0xf0 [qla2xxx] _raw_spin_lock_irqsave+0x31/0x40 qla2x00_start_sp+0x3b/0x250 [qla2xxx] qla24xx_async_gnl+0x1d3/0x240 [qla2xxx] qla24xx_fcport_handle_login+0x285/0x290 [qla2xxx] ? vprintk_func+0x20/0x50 Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:27 +01:00
Quinn Tran	1bc43df121	scsi: qla2xxx: Clear loop id after delete commit `ba743f9148` upstream. Clear loop id after delete to prevent session invalidation of stale session. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:27 +01:00
Quinn Tran	21e4e9c6d8	scsi: qla2xxx: Fix scan state field for fcport commit `76f9a2dd4c` upstream. Add correct value of scan_state field indicating state of the FC port Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	0b42928ca5	scsi: qla2xxx: Replace fcport alloc with qla2x00_alloc_fcport commit `063b36d6b0` upstream. Current code manually allocate an fcport structure that is not properly initialize. Replace kzalloc with qla2x00_alloc_fcport, so that all fields are initialized. Also set set scan flag to port found Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	11739154e6	scsi: qla2xxx: Fix abort command deadlock due to spinlock commit `b0dcce746b` upstream. Original code acquires hardware_lock to add Abort IOCB onto driver request queue for processing. However, abort_command() will also acquire hardware lock to look up sp pointer before issuing abort IOCB command resulting into a deadlock. This patch safely removes the possible deadlock scenario by removing extra spinlock. Fixes: `6eb54715b5` ("qla2xxx: Added interface to send explicit LOGO.") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	4929c45233	scsi: qla2xxx: Fix PRLI state check commit `23c645595d` upstream. Get Port Database MBX cmd is to validate current Login state upon PRLI completion. Current code looks at the last login state for re-validation which was incorrect. This patch removed incorrect state check. Fixes: `15f30a5752` ("qla2xxx: Use IOCB interface to submit non-critical MBX.") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	f92ec32f33	scsi: qla2xxx: Fix Relogin being triggered too fast commit `4005a99566` upstream. Current driver design schedules relogin process via DPC thread every 1 second. In a large fabric, this DPC thread tries to schedule too many jobs and might get overloaded. As a result of this processing of DPC thread, it can schedule relogin earlier than 1 second. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Sawan Chandak	1411448e0a	scsi: qla2xxx: Fix NPIV host cleanup in target mode commit `3be63b1e18` upstream. Add check to make sure we are cleaning up global target host list only for NPIV hosts Fixes: `bdbe24de28` ("scsi: qla2xxx: Cleanup NPIV host in target mode during config teardown") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	4274e4a3be	scsi: qla2xxx: Fix login state machine stuck at GPDB commit `414d9ff3f8` upstream. This patch returns discovery state machine back to Login Complete. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	585f4ebd9e	scsi: qla2xxx: Serialize GPNID for multiple RSCN commit `2d73ac6102` upstream. GPNID is triggered by RSCN. For multiple RSCNs of the same affected NPORT ID, serialize the GPNID to prevent confusion. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:26 +01:00
Quinn Tran	a6d50e89f1	scsi: qla2xxx: Retry switch command on time out commit `25ad76b703` upstream. Retry GID_PN & GPN_ID switch commands for time out case. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:25 +01:00
Quinn Tran	8e6cbe51af	scsi: qla2xxx: Fix re-login for Nport Handle in use commit `a084fd68e1` upstream. When NPort Handle is in use, driver needs to mark the handle as used and pick another. Instead, the code clears the handle and re-pick the same handle. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:25 +01:00
Quinn Tran	fae72a2710	scsi: qla2xxx: Skip IRQ affinity for Target QPairs commit `d68b850e1b` upstream. Fix co-existence between Block MQ and Target Mode. Block MQ and initiator mode requires midlayer queue mapping to check for IRQ to be affinitized. For target mode, it's not the case. Fixes: `09620eeb62` ("scsi: qla2xxx: Add debug knob for user control workload") Cc: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:25 +01:00
Quinn Tran	2cd1f76b29	scsi: qla2xxx: Move session delete to driver work queue commit `a01c77d2cb` upstream. Move session delete from system work queue to driver's work queue for in time processing. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:25 +01:00
Quinn Tran	e0be82d780	scsi: qla2xxx: Fix gpnid error processing commit `22e786ea47` upstream. Stop GPNID command from advancing if command has failed. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:25 +01:00
Quinn Tran	f58abb5bbd	scsi: qla2xxx: Fix system crash for Notify ack timeout handling commit `2e01d0ba86` upstream. Fix NULL pointer crash due to missing timeout handling callback for Notify Ack IOCB. Fixes: `726b854870` ("qla2xxx: Add framework for async fabric discovery") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-15 10:54:25 +01:00
Cathy Avery	b7b0385937	scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error [ Upstream commit `d1b8b2391c` ] When an I/O is returned with an srb_status of SRB_STATUS_INVALID_LUN which has zero good_bytes it must be assigned an error. Otherwise the I/O will be continuously requeued and will cause a deadlock in the case where disks are being hot added and removed. sd_probe_async will wait forever for its I/O to complete while holding scsi_sd_probe_domain. Also returning the default error of DID_TARGET_FAILURE causes multipath to not retry the I/O resulting in applications receiving I/O errors before a failover can occur. Signed-off-by: Cathy Avery <cavery@redhat.com> Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-03 10:24:28 +01:00
Prasad B Munirathnam	7cfa95893c	scsi: aacraid: Fix I/O drop during reset [ Upstream commit `5771cfffdf` ] "FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to indicate command timeout. Using the same flag in reset handler causes the command to time out and the I/Os were dropped. Define a new flag "FIB_CONTEXT_FLAG_EH_RESET" to make sure I/O is properly handled in eh_reset handler. [mkp: tweaked commit message] Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-03 10:24:22 +01:00

1 2 3 4 5 ...

15569 Commits