linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-09 20:32:04 +09:00

Author	SHA1	Message	Date
Mikulas Patocka	ad6bf88a6c	block: fix an integer overflow in logical block size Logical block size has type unsigned short. That means that it can be at most 32768. However, there are architectures that can run with 64k pages (for example arm64) and on these architectures, it may be possible to create block devices with 64k block size. For exmaple (run this on an architecture with 64k pages): Mount will fail with this error because it tries to read the superblock using 2-sector access: device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536 EXT4-fs (dm-0): unable to read superblock This patch changes the logical block size from unsigned short to unsigned int to avoid the overflow. Cc: stable@vger.kernel.org Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-15 21:43:09 -07:00
Bijan Mottahedeh	797f3f535d	io_uring: clear req->result always before issuing a read/write request req->result is cleared when io_issue_sqe() calls io_read/write_pre() routines. Those routines however are not called when the sqe argument is NULL, which is the case when io_issue_sqe() is called from io_wq_submit_work(). io_issue_sqe() may then examine a stale result if a polled request had previously failed with -EAGAIN: if (ctx->flags & IORING_SETUP_IOPOLL) { if (req->result == -EAGAIN) return -EAGAIN; io_iopoll_req_issued(req); } and in turn cause a subsequently completed request to be re-issued in io_wq_submit_work(). Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-15 21:36:13 -07:00
Anand Lodnoor	824b72db50	scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1 Link: https://lore.kernel.org/r/1579000882-20246-12-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	4d1634b8d1	scsi: megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests Remove usage of device_busy counter from driver. Instead of device_busy counter now driver uses 'nr_active' counter of request_queue to get the number of inflight request for a LUN. Link: https://lore.kernel.org/r/1579000882-20246-11-git-send-email-anand.lodnoor@broadcom.com Link : https://patchwork.kernel.org/patch/11249297/ Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	56ee0c5856	scsi: megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault IOCTLs causing firmware fault may end up in failed controller resets and finally killing the adapter. This patch fixes this problem as stated below: In OCR sequence, driver will attempt refiring pended IOCTLs upto two times. If first two attempts fail, then in third attempt driver will return pended IOCTLs with EBUSY status to application. These changes are done to ensure if any of pended IOCTLs is causing firmware fault and resulting into OCR failure, then in last attempt of OCR driver will refrain firing it to firmware and saving adapter from being killed due to faulty IOCTL. Link: https://lore.kernel.org/r/1579000882-20246-10-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	6d7537270e	scsi: megaraid_sas: Do not initiate OCR if controller is not in ready state Driver initiates OCR if a DCMD command times out. But there is a deadlock if the driver attempts to invoke another OCR before the mutex lock (reset_mutex) is released from the previous session of OCR. This patch takes care of the above scenario using new flag MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1579000882-20246-9-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	201a810cc1	scsi: megaraid_sas: Re-Define enum DCMD_RETURN_STATUS DCMD_INIT is introduced to indicate the initial DCMD status, which was earlier set to MFI status. DCMD_BUSY indicates the resource is busy or locked. Link: https://lore.kernel.org/r/1579000882-20246-8-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	eeb63c23ff	scsi: megaraid_sas: Do not set HBA Operational if FW is not in operational state After issuing a adapter reset, driver blindly used to set adprecovery flag to OPERATIONAL state. Add a check to see if the FW is operational before setting the flag and marking reset adapter successful. Link: https://lore.kernel.org/r/1579000882-20246-7-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	9330a0fd82	scsi: megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled At the time of firmware initialization, if JBOD map or RAID map is not available, driver can function without these features in a limited functionality mode. Link: https://lore.kernel.org/r/1579000882-20246-6-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:03 -05:00
Anand Lodnoor	eb974f34bb	scsi: megaraid_sas: Do not kill host bus adapter, if adapter is already dead Link: https://lore.kernel.org/r/1579000882-20246-5-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:02 -05:00
Anand Lodnoor	6e73550670	scsi: megaraid_sas: Update optimal queue depth for SAS and NVMe devices Ideally, optimal queue depth will be provided by firmware. The driver defines will be used as a fallback mechanism in case the FW assisted QD is not supported. The driver defined values provide optimal queue depth for most of the drives and the workloads, as is learned from the firmware assisted QD results. Link: https://lore.kernel.org/r/1579000882-20246-4-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:02 -05:00
Anand Lodnoor	a7faf81d78	scsi: megaraid_sas: Set no_write_same only for Virtual Disk Disable WRITE_SAME (no_write_same) for Virtual Disks only. For System PDs and EPDs (Enhanced PDs), WRITE_SAME need not be disabled by default. Link: https://lore.kernel.org/r/1579000882-20246-3-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:02 -05:00
Anand Lodnoor	499e7246d6	scsi: megaraid_sas: Reset adapter if FW is not in READY state after device resume After device resume we expect the firmware to be in READY state. Transition to READY might fail due to unhandled exceptions, such as an internal error or a hardware failure. Retry initiating chip reset and wait for the controller to come to ready state. Link: https://lore.kernel.org/r/1579000882-20246-2-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:21:02 -05:00
Thomas Bogendoerfer	ba304e5b44	scsi: qla1280: Fix dma firmware download, if dma address is 64bit Do firmware download with 64bit LOAD_RAM command, if driver is using 64bit addressing. Link: https://lore.kernel.org/r/20200114160936.1517-1-tbogendoerfer@suse.de Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:09:11 -05:00
Dan Carpenter	28d76df18f	scsi: mptfusion: Fix double fetch bug in ioctl Tom Hatskevich reported that we look up "iocp" then, in the called functions we do a second copy_from_user() and look it up again. The problem that could cause is: drivers/message/fusion/mptctl.c 674 /* All of these commands require an interrupt or 675 * are unknown/illegal. 676 */ 677 if ((ret = mptctl_syscall_down(iocp, nonblock)) != 0) ^^^^ We take this lock. 678 return ret; 679 680 if (cmd == MPTFWDOWNLOAD) 681 ret = mptctl_fw_download(arg); ^^^ Then the user memory changes and we look up "iocp" again but a different one so now we are holding the incorrect lock and have a race condition. 682 else if (cmd == MPTCOMMAND) 683 ret = mptctl_mpt_command(arg); The security impact of this bug is not as bad as it could have been because these operations are all privileged and root already has enormous destructive power. But it's still worth fixing. This patch passes the "iocp" pointer to the functions to avoid the second lookup. That deletes 100 lines of code from the driver so it's a nice clean up as well. Link: https://lore.kernel.org/r/20200114123414.GA7957@kadam Reported-by: Tom Hatskevich <tom2001tom.23@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:05:52 -05:00
Long Li	7b571c19d4	scsi: storvsc: Correctly set number of hardware queues for IDE disk Commit `0ed8810276` ("scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue") introduced a regression for disks attached to IDE. For these disks the host VSP only offers one VMBUS channel. Setting multiple queues can overload the VMBUS channel and result in performance drop for high queue depth workload on system with large number of CPUs. Fix it by leaving the number of hardware queues to 1 (default value) for IDE disks. Fixes: `0ed8810276` ("scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue") Link: https://lore.kernel.org/r/1578960516-108228-1-git-send-email-longli@linuxonhyperv.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 23:02:24 -05:00
Arnd Bergmann	42ec15ceae	scsi: fnic: fix invalid stack access gcc -O3 warns that some local variables are not properly initialized: drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_hang_notify': drivers/scsi/fnic/vnic_dev.c:511:16: error: 'a0' is used uninitialized in this function [-Werror=uninitialized] vdev->args[0] = a0; ~~~~~~~~~~~~~~^~~~~ drivers/scsi/fnic/vnic_dev.c:691:6: note: 'a0' was declared here u64 a0, a1; ^~ drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized] vdev->args[1] = a1; ~~~~~~~~~~~~~~^~~~~ drivers/scsi/fnic/vnic_dev.c:691:10: note: 'a1' was declared here u64 a0, a1; ^~ drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_mac_addr': drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized] vdev->args[1] = *a1; ~~~~~~~~~~~~~~^~~~~ drivers/scsi/fnic/vnic_dev.c:698:10: note: 'a1' was declared here u64 a0, a1; ^~ Apparently the code relies on the local variables occupying adjacent memory locations in the same order, but this is of course not guaranteed. Use an array of two u64 variables where needed to make it work correctly. I suspect there is also an endianness bug here, but have not digged in deep enough to be sure. Fixes: `5df6d737dd` ("[SCSI] fnic: Add new Cisco PCI-Express FCoE HBA") Fixes: mmtom ("init/Kconfig: enable -O3 for all arches") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200107201602.4096790-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:59:59 -05:00
Gabriel Krisman Bertazi	f3c893e3db	scsi: iscsi: Fail session and connection on transport registration failure If the transport cannot be registered, the session/connection creation needs to be failed early to let the initiator know. Otherwise, the system will have an outstanding connection that cannot be used nor removed by open-iscsi. The result is similar to the error below, triggered by injecting a failure in the transport's registration path. openiscsi reports success: root@debian-vm:~# iscsiadm -m node -T iqn:lun1 -p 127.0.0.1 -l Logging in to [iface: default, target: iqn:lun1, portal: 127.0.0.1,3260] Login to [iface: default, target: iqn:lun1, portal:127.0.0.1,3260] successful. But cannot remove the session afterwards, since the kernel is in an inconsistent state. root@debian-vm:~# iscsiadm -m node -T iqn:lun1 -p 127.0.0.1 -u iscsiadm: No matching sessions found Link: https://lore.kernel.org/r/20200106185817.640331-4-krisman@collabora.com Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:55:37 -05:00
Gabriel Krisman Bertazi	cd7ea70bb0	scsi: drivers: base: Propagate errors through the transport component The transport registration may fail. Make sure the errors are propagated to the callers. Link: https://lore.kernel.org/r/20200106185817.640331-3-krisman@collabora.com Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:55:37 -05:00
Gabriel Krisman Bertazi	7c1ef33870	scsi: drivers: base: Support atomic version of attribute_container_device_trigger attribute_container_device_trigger invokes callbacks that may fail for one or more classdevs, for instance, the transport_add_class_device callback, called during transport creation, does memory allocation. This information, though, is not propagated to upper layers, and any driver using the attribute_container_device_trigger API will not know whether any, some, or all callbacks succeeded. This patch implements a safe version of this dispatcher, to either succeed all the callbacks or revert to the original state. Link: https://lore.kernel.org/r/20200106185817.640331-2-krisman@collabora.com Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:55:36 -05:00
Nick Black	54155ed419	scsi: iscsi: Don't destroy session if there are outstanding connections A faulty userspace that calls destroy_session() before destroying the connections can trigger the failure. This patch prevents the issue by refusing to destroy the session if there are outstanding connections. ------------[ cut here ]------------ kernel BUG at mm/slub.c:306! invalid opcode: 0000 [#1] SMP PTI CPU: 1 PID: 1224 Comm: iscsid Not tainted 5.4.0-rc2.iscsi+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:__slab_free+0x181/0x350 [...] [ 1209.686056] RSP: 0018:ffffa93d4074fae0 EFLAGS: 00010246 [ 1209.686694] RAX: ffff934efa5ad800 RBX: 000000008010000a RCX: ffff934efa5ad800 [ 1209.687651] RDX: ffff934efa5ad800 RSI: ffffeb4041e96b00 RDI: ffff934efd402c40 [ 1209.688582] RBP: ffffa93d4074fb80 R08: 0000000000000001 R09: ffffffffbb5dfa26 [ 1209.689425] R10: ffff934efa5ad800 R11: 0000000000000001 R12: ffffeb4041e96b00 [ 1209.690285] R13: ffff934efa5ad800 R14: ffff934efd402c40 R15: 0000000000000000 [ 1209.691213] FS: 00007f7945dfb540(0000) GS:ffff934efda80000(0000) knlGS:0000000000000000 [ 1209.692316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1209.693013] CR2: 000055877fd3da80 CR3: 0000000077384000 CR4: 00000000000006e0 [ 1209.693897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1209.694773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1209.695631] Call Trace: [ 1209.695957] ? __wake_up_common_lock+0x8a/0xc0 [ 1209.696712] iscsi_pool_free+0x26/0x40 [ 1209.697263] iscsi_session_teardown+0x2f/0xf0 [ 1209.698117] iscsi_sw_tcp_session_destroy+0x45/0x60 [ 1209.698831] iscsi_if_rx+0xd88/0x14e0 [ 1209.699370] netlink_unicast+0x16f/0x200 [ 1209.699932] netlink_sendmsg+0x21a/0x3e0 [ 1209.700446] sock_sendmsg+0x4f/0x60 [ 1209.700902] ___sys_sendmsg+0x2ae/0x320 [ 1209.701451] ? cp_new_stat+0x150/0x180 [ 1209.701922] __sys_sendmsg+0x59/0xa0 [ 1209.702357] do_syscall_64+0x52/0x160 [ 1209.702812] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1209.703419] RIP: 0033:0x7f7946433914 [...] [ 1209.706084] RSP: 002b:00007fffb99f2378 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 1209.706994] RAX: ffffffffffffffda RBX: 000055bc869eac20 RCX: 00007f7946433914 [ 1209.708082] RDX: 0000000000000000 RSI: 00007fffb99f2390 RDI: 0000000000000005 [ 1209.709120] RBP: 00007fffb99f2390 R08: 000055bc84fe9320 R09: 00007fffb99f1f07 [ 1209.710110] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000038 [ 1209.711085] R13: 000055bc8502306e R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace a2d933ede7f730d8 ]--- Link: https://lore.kernel.org/r/20191226203148.2172200-1-krisman@collabora.com Signed-off-by: Nick Black <nlb@google.com> Co-developed-by: Salman Qazi <sqazi@google.com> Signed-off-by: Salman Qazi <sqazi@google.com> Co-developed-by: Junho Ryu <jayr@google.com> Signed-off-by: Junho Ryu <jayr@google.com> Co-developed-by: Khazhismel Kumykov <khazhy@google.com> Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:48:34 -05:00
Stanley Chu	ea92c32bd3	scsi: ufs-mediatek: add apply_dev_quirks variant operation Add vendor-specific variant callback "apply_dev_quirks" to MediaTek UFS driver. Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Asutosh Das <asutoshd@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/1578726707-6596-3-git-send-email-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:23:13 -05:00
Stanley Chu	c40ad6b7fc	scsi: ufs: pass device information to apply_dev_quirks Pass UFS device information to vendor-specific variant callback "apply_dev_quirks" because some platform vendors need to know such information to apply special handling or quirks in specific devices. At the same time, modify existing vendor implementations according to the new interface for those vendor drivers which will be built-in or built as a module alone with UFS core driver. [mkp: clarified commit desc] Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Asutosh Das <asutoshd@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/1578726707-6596-2-git-send-email-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:23:13 -05:00
Colin Ian King	4362269711	scsi: BusLogic: use %lX for unsigned long rather than %X Currently the incorrect %X print format specifier is being used for several unsigned longs. Fix these by using %lX instead. Also join up some literal strings that are split. Link: https://lore.kernel.org/r/20200108193800.96706-1-colin.king@canonical.com Addresses-Coverity: ("Invalid type in argument to printf format specifier") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Khalid Aziz <khalid@gonehiking.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 22:19:49 -05:00
Stanley Chu	fd1fb4d556	scsi: ufs: remove "errors" word in ufshcd_print_err_hist() Remove "errors" word in output string by ufshcd_print_err_hist() since not all printed targets are "errors". Sometimes they are just "events". In addition, all events which can be treated as "errors" already have "err" or "fail" words in their names. Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Asutosh Das <asutoshd@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/1578147968-30938-4-git-send-email-stanley.chu@mediatek.com Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 21:59:44 -05:00
Stanley Chu	a5fe372d92	scsi: ufs: add device reset history for vendor implementations Device reset history shall be also added for vendor's device reset variant operation implementation. Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Asutosh Das <asutoshd@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/1578147968-30938-3-git-send-email-stanley.chu@mediatek.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 21:59:44 -05:00
Stanley Chu	645728a644	scsi: ufs: fix empty check of error history Currently checking if an error history element is empty or not is by its "value". In most cases, value is error code. However this checking is not correct because some errors or events do not specify any values in error history so values remain as 0, and this will lead to incorrect empty checking. Fix it by checking "timestamp" instead of "value" because timestamp will be always assigned for all history elements Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Asutosh Das <asutoshd@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Bean Huo <beanhuo@micron.com> Cc: Can Guo <cang@codeaurora.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/1578147968-30938-2-git-send-email-stanley.chu@mediatek.com Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-01-15 21:59:44 -05:00
Andrii Nakryiko	b65053cd94	selftests/bpf: Add whitelist/blacklist of test names to test_progs Add ability to specify a list of test name substrings for selecting which tests to run. So now -t is accepting a comma-separated list of strings, similarly to how -n accepts a comma-separated list of test numbers. Additionally, add ability to blacklist tests by name. Blacklist takes precedence over whitelist. Blacklisting is important for cases where it's known that some tests can't pass (e.g., due to perf hardware events that are not available within VM). This is going to be used for libbpf testing in Travis CI in its Github repo. Example runs with just whitelist and whitelist + blacklist: $ sudo ./test_progs -tattach,core/existence #1 attach_probe:OK #6 cgroup_attach_autodetach:OK #7 cgroup_attach_multi:OK #8 cgroup_attach_override:OK #9 core_extern:OK #10/44 existence:OK #10/45 existence___minimal:OK #10/46 existence__err_int_sz:OK #10/47 existence__err_int_type:OK #10/48 existence__err_int_kind:OK #10/49 existence__err_arr_kind:OK #10/50 existence__err_arr_value_type:OK #10/51 existence__err_struct_type:OK #10 core_reloc:OK #19 flow_dissector_reattach:OK #60 tp_attach_query:OK Summary: 8/8 PASSED, 0 SKIPPED, 0 FAILED $ sudo ./test_progs -tattach,core/existence -bcgroup,flow/arr #1 attach_probe:OK #9 core_extern:OK #10/44 existence:OK #10/45 existence___minimal:OK #10/46 existence__err_int_sz:OK #10/47 existence__err_int_type:OK #10/48 existence__err_int_kind:OK #10/51 existence__err_struct_type:OK #10 core_reloc:OK #60 tp_attach_query:OK Summary: 4/6 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Julia Kartseva <hex@fb.com> Link: https://lore.kernel.org/bpf/20200116005549.3644118-1-andriin@fb.com	2020-01-15 18:38:39 -08:00
Greentime Hu	20d2292754	riscv: make sure the cores stay looping in .Lsecondary_park The code in secondary_park is currently placed in the .init section. The kernel reclaims and clears this code when it finishes booting. That causes the cores parked in it to go to somewhere unpredictable, so we move this function out of init to make sure the cores stay looping there. The instruction bgeu a0, t0, .Lsecondary_park may have "a relocation truncated to fit" issue during linking time. It is because that sections are too far to jump. Let's use tail to jump to the .Lsecondary_park. Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Anup Patel <anup.patel@sifive.com> Cc: Andreas Schwab <schwab@suse.de> Cc: stable@vger.kernel.org Fixes: `76d2a0493a` ("RISC-V: Init and Halt Code") Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2020-01-15 18:07:54 -08:00
Dave Airlie	16a89856f0	Merge tag 'amd-drm-fixes-5.5-2020-01-15' of git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.5-2020-01-15: - Update golden settings for renoir - eDP fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115235141.4149-1-alexander.deucher@amd.com	2020-01-16 11:25:44 +10:00
Vineet Gupta	bd71c453db	ARC: wireup clone3 syscall Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2020-01-15 16:08:12 -08:00
Nicolas Saenz Julienne	d5c8dc0d4c	ARM: dts: bcm2711: Enable PCIe controller This enables bcm2711's PCIe bus, which is hardwired to a VIA Technologies XHCI USB 3.0 controller. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2020-01-15 15:41:11 -08:00
Nicolas Saenz Julienne	c5a1e5375d	ARM: dts: bcm283x: Unify CMA configuration With the introduction of the Raspberry Pi 4 we were forced to explicitly configure CMA's location, since arm64 defaults it into the ZONE_DMA32 memory area, which is not good enough to perform DMA operations on that device. To bypass this limitation a dedicated CMA DT node was created, explicitly indicating the acceptable memory range and size. That said, compatibility between boards is a must on the Raspberry Pi ecosystem so this creates a common CMA DT node so as for DT overlays to be able to update CMA's properties regardless of the board being used. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2020-01-15 15:41:00 -08:00
Alexei Starovoitov	7bcfea9615	Merge branch 'bpftool-improvements' Martin Lau says: ==================== When a map is storing a kernel's struct, its map_info->btf_vmlinux_value_type_id is set. The first map type supporting it is BPF_MAP_TYPE_STRUCT_OPS. This series adds support to dump this kind of map with BTF. The first two patches are bug fixes which are only applicable to bpf-next. Please see individual patches for details. v3: - Remove unnecessary #include "libbpf_internal.h" from patch 5 v2: - Expose bpf_find_kernel_btf() as a LIBBPF_API in patch 3 (Andrii) - Cache btf_vmlinux in bpftool/map.c (Andrii) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-01-15 15:24:17 -08:00
Martin KaFai Lau	4e1ea33292	bpftool: Support dumping a map with btf_vmlinux_value_type_id This patch makes bpftool support dumping a map's value properly when the map's value type is a type of the running kernel's btf. (i.e. map_info.btf_vmlinux_value_type_id is set instead of map_info.btf_value_type_id). The first usecase is for the BPF_MAP_TYPE_STRUCT_OPS. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200115230044.1103008-1-kafai@fb.com	2020-01-15 15:23:27 -08:00
Martin KaFai Lau	84c72ceee9	bpftool: Add struct_ops map name This patch adds BPF_MAP_TYPE_STRUCT_OPS to "struct_ops" name mapping so that "bpftool map show" can print the "struct_ops" map type properly. [root@arch-fb-vm1 bpf]# ~/devshare/fb-kernel/linux/tools/bpf/bpftool/bpftool map show id 8 8: struct_ops name dctcp flags 0x0 key 4B value 256B max_entries 1 memlock 4096B btf_id 7 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200115230037.1102674-1-kafai@fb.com	2020-01-15 15:23:27 -08:00
Martin KaFai Lau	fb2426ad00	libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API This patch exposes bpf_find_kernel_btf() as a LIBBPF_API. It will be used in 'bpftool map dump' in a following patch to dump a map with btf_vmlinux_value_type_id set. bpf_find_kernel_btf() is renamed to libbpf_find_kernel_btf() and moved to btf.c. As <linux/kernel.h> is included, some of the max/min type casting needs to be fixed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200115230031.1102305-1-kafai@fb.com	2020-01-15 15:23:27 -08:00
Martin KaFai Lau	188a486619	bpftool: Fix missing BTF output for json during map dump The btf availability check is only done for plain text output. It causes the whole BTF output went missing when json_output is used. This patch simplifies the logic a little by avoiding passing "int btf" to map_dump(). For plain text output, the btf_wtr is only created when the map has BTF (i.e. info->btf_id != 0). The nullness of "json_writer_t *wtr" in map_dump() alone can decide if dumping BTF output is needed. As long as wtr is not NULL, map_dump() will print out the BTF-described data whenever a map has BTF available (i.e. info->btf_id != 0) regardless of json or plain-text output. In do_dump(), the "int btf" is also renamed to "int do_plain_btf". Fixes: `99f9863a0c` ("bpftool: Match maps by name") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Paul Chaignon <paul.chaignon@orange.com> Link: https://lore.kernel.org/bpf/20200115230025.1101828-1-kafai@fb.com	2020-01-15 15:23:27 -08:00
Martin KaFai Lau	d7de72674a	bpftool: Fix a leak of btf object When testing a map has btf or not, maps_have_btf() tests it by actually getting a btf_fd from sys_bpf(BPF_BTF_GET_FD_BY_ID). However, it forgot to btf__free() it. In maps_have_btf() stage, there is no need to test it by really calling sys_bpf(BPF_BTF_GET_FD_BY_ID). Testing non zero info.btf_id is good enough. Also, the err_close case is unnecessary, and also causes double close() because the calling func do_dump() will close() all fds again. Fixes: `99f9863a0c` ("bpftool: Match maps by name") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Paul Chaignon <paul.chaignon@orange.com> Link: https://lore.kernel.org/bpf/20200115230019.1101352-1-kafai@fb.com	2020-01-15 15:23:27 -08:00
Mario Kleiner	3b7c59754c	drm/amd/display: Reorder detect_edp_sink_caps before link settings read. read_current_link_settings_on_detect() on eDP 1.4+ may use the edp_supported_link_rates table which is set up by detect_edp_sink_caps(), so that function needs to be called first. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Martin Leung <martin.leung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-01-15 17:54:54 -05:00
Aaron Liu	f2360e333b	drm/amdgpu: update goldensetting for renoir Update mmSDMA0_UTCL1_WATERMK golden setting for renoir. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-15 17:52:21 -05:00
Marek Marczykowski-Górecki	476878e4b2	xen-pciback: optionally allow interrupt enable flag writes QEMU running in a stubdom needs to be able to set INTX_DISABLE, and the MSI(-X) enable flags in the PCI config space. This adds an attribute 'allow_interrupt_control' which when set for a PCI device allows writes to this flag(s). The toolstack will need to set this for stubdoms. When enabled, guest (stubdomain) will be allowed to set relevant enable flags, but only one at a time - i.e. it refuses to enable more than one of INTx, MSI, MSI-X at a time. This functionality is needed only for config space access done by device model (stubdomain) serving a HVM with the actual PCI device. It is not necessary and unsafe to enable direct access to those bits for PV domain with the device attached. For PV domains, there are separate protocol messages (XEN_PCI_OP_{enable,disable}_{msi,msix}) for this purpose. Those ops in addition to setting enable bits, also configure MSI(-X) in dom0 kernel - which is undesirable for PCI passthrough to HVM guests. This should not introduce any new security issues since a malicious guest (or stubdom) can already generate MSIs through other ways, see [1] page 8. Additionally, when qemu runs in dom0, it already have direct access to those bits. This is the second iteration of this feature. First was proposed as a direct Xen interface through a new hypercall, but ultimately it was rejected by the maintainer, because of mixing pciback and hypercalls for PCI config space access isn't a good design. Full discussion at [2]. [1]: https://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf [2]: https://xen.markmail.org/thread/smpgpws4umdzizze [part of the commit message and sysfs handling] Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> [the rest] Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> [boris: A few small changes suggested by Roger, some formatting changes] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2020-01-15 16:50:13 -06:00
Daniel Borkmann	85ddd9c317	Merge branch 'bpf-sockmap-tls-fixes' John Fastabend says: ==================== To date our usage of sockmap/tls has been fairly simple, the BPF programs did only well-defined pop, push, pull and apply/cork operations. Now that we started to push more complex programs into sockmap we uncovered a series of issues addressed here. Further OpenSSL3.0 version should be released soon with kTLS support so its important to get any remaining issues on BPF and kTLS support resolved. Additionally, I have a patch under development to allow sockmap to be enabled/disabled at runtime for Cilium endpoints. This allows us to stress the map insert/delete with kTLS more than previously where Cilium only added the socket to the map when it entered ESTABLISHED state and never touched it from the control path side again relying on the sockets own close() hook to remove it. To test I have a set of test cases in test_sockmap.c that expose these issues. Once we get fixes here merged and in bpf-next I'll submit the tests to bpf-next tree to ensure we don't regress again. Also I've run these patches in the Cilium CI with OpenSSL (master branch) this will run tools such as netperf, ab, wrk2, curl, etc. to get a broad set of testing. I'm aware of two more issues that we are working to resolve in another couple (probably two) patches. First we see an auth tag corruption in kTLS when sending small 1byte chunks under stress. I've not pinned this down yet. But, guessing because its under 1B stress tests it must be some error path being triggered. And second we need to ensure BPF RX programs are not skipped when kTLS ULP is loaded. This breaks some of the sockmap selftests when running with kTLS. I'll send a follow up for this. v2: I dropped a patch that added !0 size check in tls_push_record this originated from a panic I caught awhile ago with a trace in the crypto stack. But I can not reproduce it anymore so will dig into that and send another patch later if needed. Anyways after a bit of thought it would be nicer if tls/crypto/bpf didn't require special case handling for the !0 size. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2020-01-15 23:26:23 +01:00
John Fastabend	7361d44896	bpf: Sockmap/tls, fix pop data with SK_DROP return code When user returns SK_DROP we need to reset the number of copied bytes to indicate to the user the bytes were dropped and not sent. If we don't reset the copied arg sendmsg will return as if those bytes were copied giving the user a positive return value. This works as expected today except in the case where the user also pops bytes. In the pop case the sg.size is reduced but we don't correctly account for this when copied bytes is reset. The popped bytes are not accounted for and we return a small positive value potentially confusing the user. The reason this happens is due to a typo where we do the wrong comparison when accounting for pop bytes. In this fix notice the if/else is not needed and that we have a similar problem if we push data except its not visible to the user because if delta is larger the sg.size we return a negative value so it appears as an error regardless. Fixes: `7246d8ed4d` ("bpf: helper to pop data from messages") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-9-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	9aaaa56845	bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining Its possible through a set of push, pop, apply helper calls to construct a skmsg, which is just a ring of scatterlist elements, with the start value larger than the end value. For example, end start \|_0_\|_1_\| ... \|_n_\|_n+1_\| Where end points at 1 and start points and n so that valid elements is the set {n, n+1, 0, 1}. Currently, because we don't build the correct chain only {n, n+1} will be sent. This adds a check and sg_chain call to correctly submit the above to the crypto and tls send path. Fixes: `d3b18ad31f` ("tls: add bpf support to sk_msg handling") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-8-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	d468e4775c	bpf: Sockmap/tls, tls_sw can create a plaintext buf > encrypt buf It is possible to build a plaintext buffer using push helper that is larger than the allocated encrypt buffer. When this record is pushed to crypto layers this can result in a NULL pointer dereference because the crypto API expects the encrypt buffer is large enough to fit the plaintext buffer. Kernel splat below. To resolve catch the cases this can happen and split the buffer into two records to send individually. Unfortunately, there is still one case to handle where the split creates a zero sized buffer. In this case we merge the buffers and unmark the split. This happens when apply is zero and user pushed data beyond encrypt buffer. This fixes the original case as well because the split allocated an encrypt buffer larger than the plaintext buffer and the merge simply moves the pointers around so we now have a reference to the new (larger) encrypt buffer. Perhaps its not ideal but it seems the best solution for a fixes branch and avoids handling these two cases, (a) apply that needs split and (b) non apply case. The are edge cases anyways so optimizing them seems not necessary unless someone wants later in next branches. [ 306.719107] BUG: kernel NULL pointer dereference, address: 0000000000000008 [...] [ 306.747260] RIP: 0010:scatterwalk_copychunks+0x12f/0x1b0 [...] [ 306.770350] Call Trace: [ 306.770956] scatterwalk_map_and_copy+0x6c/0x80 [ 306.772026] gcm_enc_copy_hash+0x4b/0x50 [ 306.772925] gcm_hash_crypt_remain_continue+0xef/0x110 [ 306.774138] gcm_hash_crypt_continue+0xa1/0xb0 [ 306.775103] ? gcm_hash_crypt_continue+0xa1/0xb0 [ 306.776103] gcm_hash_assoc_remain_continue+0x94/0xa0 [ 306.777170] gcm_hash_assoc_continue+0x9d/0xb0 [ 306.778239] gcm_hash_init_continue+0x8f/0xa0 [ 306.779121] gcm_hash+0x73/0x80 [ 306.779762] gcm_encrypt_continue+0x6d/0x80 [ 306.780582] crypto_gcm_encrypt+0xcb/0xe0 [ 306.781474] crypto_aead_encrypt+0x1f/0x30 [ 306.782353] tls_push_record+0x3b9/0xb20 [tls] [ 306.783314] ? sk_psock_msg_verdict+0x199/0x300 [ 306.784287] bpf_exec_tx_verdict+0x3f2/0x680 [tls] [ 306.785357] tls_sw_sendmsg+0x4a3/0x6a0 [tls] test_sockmap test signature to trigger bug, [TEST]: (1, 1, 1, sendmsg, pass,redir,start 1,end 2,pop (1,2),ktls,): Fixes: `d3b18ad31f` ("tls: add bpf support to sk_msg handling") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-7-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	cf21e9ba1e	bpf: Sockmap/tls, msg_push_data may leave end mark in place Leaving an incorrect end mark in place when passing to crypto layer will cause crypto layer to stop processing data before all data is encrypted. To fix clear the end mark on push data instead of expecting users of the helper to clear the mark value after the fact. This happens when we push data into the middle of a skmsg and have room for it so we don't do a set of copies that already clear the end flag. Fixes: `6fff607e2f` ("bpf: sk_msg program helper bpf_msg_push_data") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-6-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	6562e29cf6	bpf: Sockmap, skmsg helper overestimates push, pull, and pop bounds In the push, pull, and pop helpers operating on skmsg objects to make data writable or insert/remove data we use this bounds check to ensure specified data is valid, /* Bounds checks: start and pop must be inside message */ if (start >= offset + l \|\| last >= msg->sg.size) return -EINVAL; The problem here is offset has already included the length of the current element the 'l' above. So start could be past the end of the scatterlist element in the case where start also points into an offset on the last skmsg element. To fix do the accounting slightly different by adding the length of the previous entry to offset at the start of the iteration. And ensure its initialized to zero so that the first iteration does nothing. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Fixes: `6fff607e2f` ("bpf: sk_msg program helper bpf_msg_push_data") Fixes: `7246d8ed4d` ("bpf: helper to pop data from messages") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-5-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	33bfe20dd7	bpf: Sockmap/tls, push write_space updates through ulp updates When sockmap sock with TLS enabled is removed we cleanup bpf/psock state and call tcp_update_ulp() to push updates to TLS ULP on top. However, we don't push the write_space callback up and instead simply overwrite the op with the psock stored previous op. This may or may not be correct so to ensure we don't overwrite the TLS write space hook pass this field to the ULP and have it fixup the ctx. This completes a previous fix that pushed the ops through to the ULP but at the time missed doing this for write_space, presumably because write_space TLS hook was added around the same time. Fixes: `95fa145479` ("bpf: sockmap/tls, close can race with map free") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	7e81a35302	bpf: Sockmap, ensure sock lock held during tear down The sock_map_free() and sock_hash_free() paths used to delete sockmap and sockhash maps walk the maps and destroy psock and bpf state associated with the socks in the map. When done the socks no longer have BPF programs attached and will function normally. This can happen while the socks in the map are still "live" meaning data may be sent/received during the walk. Currently, though we don't take the sock_lock when the psock and bpf state is removed through this path. Specifically, this means we can be writing into the ops structure pointers such as sendmsg, sendpage, recvmsg, etc. while they are also being called from the networking side. This is not safe, we never used proper READ_ONCE/WRITE_ONCE semantics here if we believed it was safe. Further its not clear to me its even a good idea to try and do this on "live" sockets while networking side might also be using the socket. Instead of trying to reason about using the socks from both sides lets realize that every use case I'm aware of rarely deletes maps, in fact kubernetes/Cilium case builds map at init and never tears it down except on errors. So lets do the simple fix and grab sock lock. This patch wraps sock deletes from maps in sock lock and adds some annotations so we catch any other cases easier. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-3-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00

... 81 82 83 84 85 ...

900440 Commits