linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 20:07:46 +09:00

Author	SHA1	Message	Date
Dan Williams	24b1819718	cxl/hdm: Extend DVSEC range register emulation for region enumeration One motivation for mapping range registers to decoder objects is to use those settings for region autodiscovery. The need to map a region for devices programmed to use range registers is especially urgent now that the kernel no longer routes "Soft Reserved" ranges in the memory map to device-dax by default. The CXL memory range loses all access mechanisms. Complete the implementation by marking the DPA reservation and setting the endpoint-decoder state to signal autodiscovery. Note that the default settings of ways=1 and granularity=4096 set in cxl_decode_init() do not need to be updated. Fixes: `09d09e04d2` ("cxl/dax: Create dax devices for CXL RAM regions") Tested-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Gregory Price <gregory.price@memverge.com> Link: https://lore.kernel.org/r/168012575521.221280.14177293493678527326.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	52cc48ad2a	cxl/hdm: Limit emulation to the number of range registers Recall that range register emulation seeks to treat the 2 potential range registers as Linux CXL "decoder" objects. The number of range registers can be 1 or 2, while HDM decoder ranges can include more than 2. Be careful not to confuse DVSEC range count with HDM capability decoder count. Commit to range register earlier in devm_cxl_setup_hdm(). Otherwise, a device with more HDM decoders than range registers can set @cxlhdm->decoder_count to an invalid value. Avoid introducing a forward declaration by just moving the definition of should_emulate_decoders() earlier in the file. should_emulate_decoders() is unchanged. Tested-by: Dave Jiang <dave.jiang@intel.com> Fixes: `d7a2153762` ("cxl/hdm: Add emulation when HDM decoders are not committed") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168012574932.221280.15944705098679646436.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	9ff3eec958	cxl/region: Move coherence tracking into cxl_region_attach() Each time the contents of a given HPA are potentially changed in a cache incoherent manner the CXL core sets CXL_REGION_F_INCOHERENT to invalidate CPU caches before the region is used. Successful invocation of attach_target() indicates that DPA has been newly assigned to a given HPA in the dynamic region creation flow. However, attach_target() is also reused in the autodiscovery flow where the region was activated by platform firmware. In that case there is no need to invalidate caches because that region is already in active use and nothing about the autodiscovery flow modifies the HPA-to-DPA relationship. In the autodiscovery case cxl_region_attach() exits early after determining the endpoint decoder is already correctly attached to the region. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002858817.50647.1217607907088920888.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	030f880342	cxl/region: Fix region setup/teardown for RCDs RCDs (CXL memory devices that link train without VH capability and show up as root complex integrated endpoints), hide the presence of the link between the endpoint and the host-bridge. The CXL region setup/teardown paths assume that a link hop is present and go looking for at least one 'struct cxl_port' instance between the CXL root port-object and an endpoint port-object leading to crashes of the form: BUG: kernel NULL pointer dereference, address: 0000000000000008 [..] RIP: 0010:cxl_region_setup_targets+0x3e9/0xae0 [cxl_core] [..] Call Trace: <TASK> cxl_region_attach+0x46c/0x7a0 [cxl_core] cxl_create_region+0x20b/0x270 [cxl_core] cxl_mock_mem_probe+0x641/0x800 [cxl_mock_mem] platform_probe+0x5b/0xb0 Detect RCDs explicitly and skip walking the non-existent port hierarchy between root and endpoint in that case. While this has been a problem since: commit `0a19bfc8de` ("cxl/port: Add RCD endpoint port enumeration") ...it becomes a more reliable crash scenario with the new autodiscovery implementation. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002858268.50647.728091521032131326.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	d35b495ddf	cxl/port: Fix find_cxl_root() for RCDs and simplify it The find_cxl_root() helper is used to lookup root decoders and other CXL platform topology information for a given endpoint. It turns out that for RCDs it has never worked. The result of find_cxl_root(&cxlmd->dev) is always NULL for the RCH topology case because it expects to find a cxl_port at the host-bridge. RCH topologies only have the root cxl_port object with the host-bridge as a dport. While there are no reports of this being a problem to date, by inspection region enumeration should crash as a result of this problem, and it does in a local unit test for this scenario. However, an observation that ever since: commit `f17b558d66` ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") ...all callers of find_cxl_root() occur after the memdev connection to the port topology has been established. That means that find_cxl_root() can be simplified to a walk of the endpoint port topology to the root. Switch to that arrangement which also fixes the RCD bug. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002857715.50647.344876437247313909.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	b70c2cf95e	cxl/hdm: Skip emulation when driver manages mem_enable If the driver is allowed to enable memory operation itself then it can also turn on HDM decoder support at will. With this the second call to cxl_setup_hdm_decoder_from_dvsec(), when an HDM decoder is not committed, is not needed. Fixes: `b777e9bec9` ("cxl/hdm: Emulate HDM decoder from DVSEC range registers") Link: http://lore.kernel.org/r/20230220113657.000042e1@huawei.com Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167703068474.185722.664126485486344246.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	82f0832af2	cxl/hdm: Fix double allocation of @cxlhdm devm_cxl_setup_emulated_hdm() reallocates an instance of @cxlhdm that was already allocated at the start of devm_cxl_setup_hdm(). Only one is needed and devm_cxl_setup_emulated_hdm() does not do enough to warrant being an explicit helper. Fixes: `4474ce565e` ("cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders") Tested-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167703067936.185722.7908921750127154779.stgit@dwillia2-xfh.jf.intel.com Link: https://lore.kernel.org/r/168012574357.221280.5001364964799725366.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:12 -07:00
Keith Busch	38a8c4d1d4	blk-mq: directly poll requests Polling needs a bio with a valid bi_bdev, but neither of those are guaranteed for polled driver requests. Make request based polling directly use blk-mq's polling function instead. When executing a request from a polled hctx, we know the request's cookie, and that it's from a live blk-mq queue that supports polling, so we can safely skip everything that bio_poll provides. Cc: stable@kernel.org Reported-by: Martin Belanger <Martin.Belanger@dell.com> Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Tested-by: Daniel Wagner <dwagner@suse.de> Revieded-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20230331180056.1155862-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-04-04 16:11:47 -06:00
Linus Torvalds	76f598ba7d	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "PPC: - Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled s390: - Fix handling of external interrupts in protected guests x86: - Resample the pending state of IOAPIC interrupts when unmasking them - Fix usage of Hyper-V "enlightened TLB" on AMD - Small fixes to real mode exceptions - Suppress pending MMIO write exits if emulator detects exception Documentation: - Fix rST syntax" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: docs: kvm: x86: Fix broken field list KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE platform dependent KVM: s390: pv: fix external interruption loop not always detected KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection KVM: x86: Suppress pending MMIO write exits if emulator detects exception KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking KVM: irqfd: Make resampler_list an RCU list KVM: SVM: Flush Hyper-V TLB when required	2023-04-04 11:29:37 -07:00
Stefano Garzarella	9da667e50c	vdpa_sim_net: complete the initialization before register the device Initialization must be completed before calling _vdpa_register_device() since it can connect the device to the vDPA bus, so requests can arrive after that call. So for example vdpasim_net_work(), which uses the net->*_stats variables, can be scheduled before they are initialized. Let's move _vdpa_register_device() to the end of vdpasim_net_dev_add() and add a comment to avoid future issues. Fixes: `0899774cb3` ("vdpa_sim_net: vendor satistics") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20230329160321.187176-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2023-04-04 14:22:12 -04:00
Linus Torvalds	ceeea1b782	Merge tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a crash and a resource leak in NFSv4 COMPOUND processing - Fix issues with AUTH_SYS credential handling - Try again to address an NFS/NFSD/SUNRPC build dependency regression * tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: callback request does not use correct credential for AUTH_SYS NFS: Remove "select RPCSEC_GSS_KRB5 sunrpc: only free unix grouplist after RCU settles nfsd: call op_release, even when op_func returns an error NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL	2023-04-04 11:20:55 -07:00
Takahiro Itazuri	fb5015bc8b	docs: kvm: x86: Fix broken field list Add a missing ":" to fix a broken field list. Signed-off-by: Takahiro Itazuri <itazur@amazon.com> Fixes: `ba7bb663f5` ("KVM: x86: Provide per VM capability for disabling PMU virtualization") Message-Id: <20230331093116.99820-1-itazur@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-04-04 13:22:05 -04:00
Arnd Bergmann	656e9007ef	asm-generic: avoid __generic_cmpxchg_local warnings Code that passes a 32-bit constant into cmpxchg() produces a harmless sparse warning because of the truncation in the branch that is not taken: fs/erofs/zdata.c: note: in included file (through /home/arnd/arm-soc/arch/arm/include/asm/cmpxchg.h, /home/arnd/arm-soc/arch/arm/include/asm/atomic.h, /home/arnd/arm-soc/include/linux/atomic.h, ...): include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe) include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe) include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe) include/asm-generic/cmpxchg-local.h:30:42: warning: cast truncates bits from constant value (5f0edead becomes ad) include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe) include/asm-generic/cmpxchg-local.h:34:44: warning: cast truncates bits from constant value (5f0edead becomes dead) This was reported as a regression to Matt's recent __generic_cmpxchg_local patch, though this patch only added more warnings on top of the ones that were already there. Rewording the truncation to use an explicit bitmask instead of a cast to a smaller type avoids the warning but otherwise leaves the code unchanged. I had another look at why the cast is even needed for atomic_cmpxchg(), and as Matt describes the problem here is that atomic_t contains a signed 'int', but cmpxchg() takes an 'unsigned long' argument, and converting between the two leads to a 64-bit sign-extension of negative 32-bit atomics. I checked the other implementations of arch_cmpxchg() and did not find any others that run into the same problem as __generic_cmpxchg_local(), but it's easy to be on the safe side here and always convert the signed int into an unsigned int when calling arch_cmpxchg(), as this will work even when any of the arch_cmpxchg() implementations run into the same problem. Fixes: `6246541522` ("locking/atomic: cmpxchg: Make __generic_cmpxchg_local compare against zero-extended 'old' value") Reviewed-by: Matt Evans <mev@rivosinc.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2023-04-04 17:58:11 +02:00
Vladimir Oltean	05d3855b4d	asm-generic/io.h: suppress endianness warnings for relaxed accessors Copy the forced type casts from the normal MMIO accessors to suppress the sparse warnings that point out __raw_readl() returns a native endian word (just like readl()). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2023-04-04 17:58:11 +02:00
Vladimir Oltean	d564fa1ff1	asm-generic/io.h: suppress endianness warnings for readq() and writeq() Commit `c1d55d5013` ("asm-generic/io.h: Fix sparse warnings on big-endian architectures") missed fixing the 64-bit accessors. Arnd explains in the attached link why the casts are necessary, even if __raw_readq() and __raw_writeq() do not take endian-specific types. Link: https://lore.kernel.org/lkml/9105d6fc-880b-4734-857d-e3d30b87ccf6@app.fastmail.com/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2023-04-04 17:58:11 +02:00
Fuad Tabba	e81625218b	KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV2/3 to protected VMs The existing pKVM code attempts to advertise CSV2/3 using values initialized to 0, but never set. To advertise CSV2/3 to protected guests, pass the CSV2/3 values to hyp when initializing hyp's view of guests' ID_AA64PFR0_EL1. Similar to non-protected KVM, these are system-wide, rather than per cpu, for simplicity. Fixes: `6c30bfb18d` ("KVM: arm64: Add handlers for protected VM System Registers") Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20230404152321.413064-1-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-04-04 15:52:06 +00:00
Lingyu Liu	83c911dc5e	ice: Reset FDIR counter in FDIR init stage Reset the FDIR counters when FDIR inits. Without this patch, when VF initializes or resets, all the FDIR counters are not cleaned, which may cause unexpected behaviors for future FDIR rule create (e.g., rule conflict). Fixes: `1f7ea1cd6a` ("ice: Enable FDIR Configure for AVF") Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Signed-off-by: Lingyu Liu <lingyu.liu@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-04-04 08:34:52 -07:00
Simei Su	b4a01ace20	ice: fix wrong fallback logic for FDIR When adding a FDIR filter, if ice_vc_fdir_set_irq_ctx returns failure, the inserted fdir entry will not be removed and if ice_vc_fdir_write_fltr returns failure, the fdir context info for irq handler will not be cleared which may lead to inconsistent or memory leak issue. This patch refines failure cases to resolve this issue. Fixes: `1f7ea1cd6a` ("ice: Enable FDIR Configure for AVF") Signed-off-by: Simei Su <simei.su@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-04-04 08:34:52 -07:00
Eli Cohen	f0417e72ad	vdpa/mlx5: Add and remove debugfs in setup/teardown driver The right place to add the debugfs create is in setup_driver() and remove it in teardown_driver(). Current code adds the debugfs when creating the device but resetting a device will remove the debugfs subtree and subsequent set_driver will not be able to create the files since the debugfs pointer is NULL. Fixes: `2942210043` ("vdpa/mlx5: Add debugfs subtree") Signed-off-by: Eli Cohen <elic@nvidia.com> v3 -> v4: Fix error flow in setup_driver() Message-Id: <20230403114039.11102-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2023-04-04 11:08:30 -04:00
Ross Zwisler	9513c55ce3	tools/virtio: fix typo in README instructions We need to have a unique chardev for each data path, else the chardevs will collide and qemu will die with this message: qemu-system-x86_64: -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0, id=channel1,name=trace-path-cpu0: Property 'virtserialport.chardev' can't take value 'charchannel0': Device 'charchannel0' is in use Signed-off-by: Ross Zwisler <zwisler@google.com> Message-Id: <20230215223350.2658616-7-zwisler@google.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-04 11:01:58 -04:00
Mike Christie	4c363c81f6	vhost-scsi: Fix crash during LUN unmapping We normally clear the endpoint then unmap LUNs so the devices are fully shutdown when the LUN is unmapped, but it's legal to unmap before clearing. If the user does that while TMFs are running then we can end up crashing. vhost_scsi_port_unlink assumes that the LUN's tmf struct will always be on the tmf_queue list. However, if a TMF is running then it will have been removed while it's executing. If we do a LUN unmap at this time, then we assume the entry is on the list and just start accessing it and free it. This fixes the bug by just allocating the vhost_scsi_tmf struct when it's needed like is done with the se_tmr struct that's needed when we submit the TMF. In this path perf is not an issue and we can use GFP_KERNEL since it won't swing directly back on us, so we don't need to preallocate the struct. Signed-off-by: Mike Christie <michael.christie@oracle.com> Message-Id: <20230321020624.13323-3-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-04 11:01:58 -04:00
Mike Christie	e508efc3ae	vhost-scsi: Fix vhost_scsi struct use after free If vhost_scsi_setup_vq_cmds fails we leave the tpg->vhost_scsi pointer set. If the device is freed and then the user unmaps the LUN, the call to vhost_scsi_port_unlink -> vhost_scsi_hotunplug will see the that tpg->vhost_scsi is still set and try to use it. This has us clear the vhost_scsi pointer in the failure path. It also has us take tv_tpg_mutex in this failure path, because tv_tpg_vhost_count is accessed under this mutex in vhost_scsi_drop_nexus and in the future we will want to serialize access to tpg->vhost_scsi with that mutex instead of the vhost_scsi_mutex. Signed-off-by: Mike Christie <michael.christie@oracle.com> Message-Id: <20230321020624.13323-2-michael.christie@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-04 11:01:58 -04:00
Dmitry Fomichev	10805eb5d6	virtio-blk: fix ZBD probe in kernels without ZBD support When the kernel is built without support for zoned block devices, virtio-blk probe needs to error out any host-managed device scans to prevent such devices from appearing in the system as non-zoned. The current virtio-blk code simply bypasses all ZBD checks if CONFIG_BLK_DEV_ZONED is not defined and this leads to host-managed block devices being presented as non-zoned in the OS. This is one of the main problems this patch series is aimed to fix. In this patch, make VIRTIO_BLK_F_ZONED feature defined even when CONFIG_BLK_DEV_ZONED is not. This change makes the code compliant with the voted revision of virtio-blk ZBD spec. Modify the probe code to look at the situation when VIRTIO_BLK_F_ZONED is negotiated in a kernel that is built without ZBD support. In this case, the code checks the zoned model of the device and fails the probe is the device is host-managed. The patch also adds the comment to clarify that the call to perform the zoned device probe is correctly placed after virtio_device ready(). Fixes: `95bfec41bd` ("virtio-blk: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Message-Id: <20230330214953.1088216-3-dmitry.fomichev@wdc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-04 11:01:57 -04:00
Dmitry Fomichev	f1ba4e674f	virtio-blk: fix to match virtio spec The merged patch series to support zoned block devices in virtio-blk is not the most up to date version. The merged patch can be found at https://lore.kernel.org/linux-block/20221016034127.330942-3-dmitry.fomichev@wdc.com/ but the latest and reviewed version is https://lore.kernel.org/linux-block/20221110053952.3378990-3-dmitry.fomichev@wdc.com/ The reason is apparently that the correct mailing lists and maintainers were not copied. The differences between the two are mostly cleanups, but there is one change that is very important in terms of compatibility with the approved virtio-zbd specification. Before it was approved, the OASIS virtio spec had a change in VIRTIO_BLK_T_ZONE_APPEND request layout that is not reflected in the current virtio-blk driver code. In the running code, the status is the first byte of the in-header that is followed by some pad bytes and the u64 that carries the sector at which the data has been written to the zone back to the driver, aka the append sector. This layout turned out to be problematic for implementing in QEMU and the request status byte has been eventually made the last byte of the in-header. The current code doesn't expect that and this causes the append sector value always come as zero to the block layer. This needs to be fixed ASAP. Fixes: `95bfec41bd` ("virtio-blk: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Message-Id: <20230330214953.1088216-2-dmitry.fomichev@wdc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-04 11:01:57 -04:00
Oliver Hartkopp	1afae605e0	kvaser_usb: convert USB IDs to hexadecimal values USB IDs are usually represented in 16 bit hexadecimal values. To match the common representation in lsusb and for searching USB IDs in the internet convert the decimal values to lowercase hexadecimal. changes since v1: https://lore.kernel.org/all/20230327175344.4668-1-socketcan@hartkopp.net - drop the aligned block indentation (suggested by Jimmy) - use lowercase hex values (suggested by Alex) Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Alexander Dahl <ada@thorsis.com> Link: https://lore.kernel.org/all/20230329090915.3127-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2023-04-04 16:48:45 +02:00
Frank Jungclaus	c42fc36949	can: esd_usb: Add support for CAN_CTRLMODE_BERR_REPORTING Announce that the driver supports CAN_CTRLMODE_BERR_REPORTING by means of priv->can.ctrlmode_supported. Until now berr reporting always has been active without taking care of the berr-reporting parameter given to an "ip link set ..." command. Additionally apply some changes to function esd_usb_rx_event(): - If berr reporting is off and it is also no state change, then immediately return. - Unconditionally (even in case of the above "immediate return") store tx- and rx-error counters, so directly use priv->bec.txerr and priv->bec.rxerr instead of intermediate variables. - Not directly related, but to better point out the linkage between a failed alloc_can_err_skb() and stats->rx_dropped++: Move the increment of the rx_dropped statistic counter (back) to directly behind the err_skb allocation. Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu> Link: https://lore.kernel.org/all/20230330184446.2802135-1-frank.jungclaus@esd.eu Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2023-04-04 16:48:45 +02:00
Peng Fan	066b41a599	dt-bindings: can: fsl,flexcan: add optional power-domains property Add optional power-domains property for i.MX8 usage. Signed-off-by: Peng Fan <peng.fan@nxp.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/all/20230328054602.1974255-1-peng.fan@oss.nxp.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2023-04-04 16:48:45 +02:00
Geert Uytterhoeven	8e85d550c1	can: rcar_canfd: rcar_canfd_probe(): fix plain integer in transceivers[] init Fix the following compile warning with C=1: \| drivers/net/can/rcar/rcar_canfd.c:1852:59: warning: Using plain integer as NULL pointer Fixes: `a0340df7ec` ("can: rcar_canfd: Add transceiver support") Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230328145658.7fdbc394@kernel.org Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Link: https://lore.kernel.org/all/7f7b0dde0caa2d2977b4fb5b65b63036e75f5022.1680071972.git.geert+renesas@glider.be Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2023-04-04 16:48:42 +02:00
Dai Ngo	7de82c2f36	NFSD: callback request does not use correct credential for AUTH_SYS Currently callback request does not use the credential specified in CREATE_SESSION if the security flavor for the back channel is AUTH_SYS. Problem was discovered by pynfs 4.1 DELEG5 and DELEG7 test with error: DELEG5 st_delegation.testCBSecParms : FAILURE expected callback with uid, gid == 17, 19, got 0, 0 Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: `8276c902bb` ("SUNRPC: remove uid and gid from struct auth_cred") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-04-04 09:55:27 -04:00
Chuck Lever	8be8f170e8	NFS: Remove "select RPCSEC_GSS_KRB5 If CONFIG_CRYPTO=n (e.g. arm/shmobile_defconfig): WARNING: unmet direct dependencies detected for RPCSEC_GSS_KRB5 Depends on [n]: NETWORK_FILESYSTEMS [=y] && SUNRPC [=y] && CRYPTO [=n] Selected by [y]: - NFS_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFS_FS [=y] As NFSv4 can work without crypto enabled, remove the RPCSEC_GSS_KRB5 dependency altogether. Trond says: > It is possible to use the NFSv4.1 client with just AUTH_SYS, and > in fact there are plenty of people out there using only that. The > fact that RFC5661 gets its knickers in a twist about RPCSEC_GSS > support is largely irrelevant to those people. > > The other issue is that ’select’ enforces the strict dependency > that if the NFS client is compiled into the kernel, then the > RPCSEC_GSS and kerberos code needs to be compiled in as well: they > cannot exist as modules. Fixes: `e57d065277` ("NFS & NFSD: Update GSS dependencies") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se> Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-04-04 09:55:27 -04:00
Jeff Layton	5085e41f9e	sunrpc: only free unix grouplist after RCU settles While the unix_gid object is rcu-freed, the group_info list that it contains is not. Ensure that we only put the group list reference once we are really freeing the unix_gid object. Reported-by: Zhi Li <yieli@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2183056 Signed-off-by: Jeff Layton <jlayton@kernel.org> Fixes: `fd5d2f7826` ("SUNRPC: Make server side AUTH_UNIX use lockless lookups") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-04-04 09:55:27 -04:00
Corinna Vinschen	218c597325	net: stmmac: fix up RX flow hash indirection table when setting channels stmmac_reinit_queues() fails to fix up the RX hash. Even if the number of channels gets restricted, the output of `ethtool -x' indicates that all RX queues are used: $ ethtool -l enp0s29f2 Channel parameters for enp0s29f2: Pre-set maximums: RX: 8 TX: 8 Other: n/a Combined: n/a Current hardware settings: RX: 8 TX: 8 Other: n/a Combined: n/a $ ethtool -x enp0s29f2 RX flow hash indirection table for enp0s29f2 with 8 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 0 1 2 3 4 5 6 7 [...] $ ethtool -L enp0s29f2 rx 3 $ ethtool -x enp0s29f2 RX flow hash indirection table for enp0s29f2 with 3 RX ring(s): 0: 0 1 2 3 4 5 6 7 8: 0 1 2 3 4 5 6 7 [...] Fix this by setting the indirection table according to the number of specified queues. The result is now as expected: $ ethtool -L enp0s29f2 rx 3 $ ethtool -x enp0s29f2 RX flow hash indirection table for enp0s29f2 with 3 RX ring(s): 0: 0 1 2 0 1 2 0 1 8: 2 0 1 2 0 1 2 0 [...] Tested on Intel Elkhart Lake. Fixes: `0366f7e06a` ("net: stmmac: add ethtool support for get/set channels") Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Link: https://lore.kernel.org/r/20230403121120.489138-1-vinschen@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-04-04 15:17:15 +02:00
Jason Gunthorpe	13a0d1ae7e	iommufd: Do not corrupt the pfn list when doing batch carry If batch->end is 0 then setting npfns[0] before computing the new value of pfns will fail to adjust the pfn and result in various page accounting corruptions. It should be ordered after. This seems to result in various kinds of page meta-data corruption related failures: WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740 Modules linked in: CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:try_grab_folio+0x503/0x740 Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8 2f bf dd ff 48 8b 5d c8 31 ff 48 89 RSP: 0018:ffffc90000f37908 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26 RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002 RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008 R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540 R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540 FS: 00007fecbf659740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> internal_get_user_pages_fast+0xd32/0x2200 pin_user_pages_fast+0x65/0x90 pfn_reader_user_pin+0x376/0x390 pfn_reader_next+0x14a/0x7b0 pfn_reader_first+0x140/0x1b0 iopt_area_fill_domain+0x74/0x210 iopt_table_add_domain+0x30e/0x6e0 iommufd_device_selftest_attach+0x7f/0x140 iommufd_test+0x10ff/0x16f0 iommufd_fops_ioctl+0x206/0x330 __x64_sys_ioctl+0x10e/0x160 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Cc: <stable@vger.kernel.org> Fixes: `f394576eb1` ("iommufd: PFN handling for iopt_pages") Link: https://lore.kernel.org/r/3-v1-ceab6a4d7d7a+94-iommufd_syz_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2023-04-04 09:10:55 -03:00
Jason Gunthorpe	727c28c1ce	iommufd: Fix unpinning of pages when an access is present syzkaller found that the calculation of batch_last_index should use 'start_index' since at input to this function the batch is either empty or it has already been adjusted to cross any accesses so it will start at the point we are unmapping from. Getting this wrong causes the unmap to run over the end of the pages which corrupts pages that were never mapped. In most cases this triggers the num pinned debugging: WARNING: CPU: 0 PID: 557 at drivers/iommu/iommufd/pages.c:294 __iopt_area_unfill_domain+0x152/0x560 Modules linked in: CPU: 0 PID: 557 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__iopt_area_unfill_domain+0x152/0x560 Code: d2 0f ff 44 8b 64 24 54 48 8b 44 24 48 31 ff 44 89 e6 48 89 44 24 38 e8 fc d3 0f ff 45 85 e4 0f 85 eb 01 00 00 e8 0e d2 0f ff <0f> 0b e8 07 d2 0f ff 48 8b 44 24 38 89 5c 24 58 89 18 8b 44 24 54 RSP: 0018:ffffc9000108baf0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000ffffffff RCX: ffffffff821e3f85 RDX: 0000000000000000 RSI: ffff88800faf0000 RDI: 0000000000000002 RBP: ffffc9000108bd18 R08: 000000000003ca25 R09: 0000000000000014 R10: 000000000003ca00 R11: 0000000000000024 R12: 0000000000000004 R13: 0000000000000801 R14: 00000000000007ff R15: 0000000000000800 FS: 00007f3499ce1740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000243 CR3: 00000000179c2001 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> iopt_area_unfill_domain+0x32/0x40 iopt_table_remove_domain+0x23f/0x4c0 iommufd_device_selftest_detach+0x3a/0x90 iommufd_selftest_destroy+0x55/0x70 iommufd_object_destroy_user+0xce/0x130 iommufd_destroy+0xa2/0xc0 iommufd_fops_ioctl+0x206/0x330 __x64_sys_ioctl+0x10e/0x160 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Also add some useful WARN_ON sanity checks. Cc: <stable@vger.kernel.org> Fixes: `8d160cd4d5` ("iommufd: Algorithms for PFN storage") Link: https://lore.kernel.org/r/2-v1-ceab6a4d7d7a+94-iommufd_syz_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2023-04-04 09:10:55 -03:00
Jason Gunthorpe	e439570133	iommufd: Check for uptr overflow syzkaller found that setting up a map with a user VA that wraps past zero can trigger WARN_ONs, particularly from pin_user_pages weirdly returning 0 due to invalid arguments. Prevent creating a pages with a uptr and size that would math overflow. WARNING: CPU: 0 PID: 518 at drivers/iommu/iommufd/pages.c:793 pfn_reader_user_pin+0x2e6/0x390 Modules linked in: CPU: 0 PID: 518 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:pfn_reader_user_pin+0x2e6/0x390 Code: b1 11 e9 25 fe ff ff e8 28 e4 0f ff 31 ff 48 89 de e8 2e e6 0f ff 48 85 db 74 0a e8 14 e4 0f ff e9 4d ff ff ff e8 0a e4 0f ff <0f> 0b bb f2 ff ff ff e9 3c ff ff ff e8 f9 e3 0f ff ba 01 00 00 00 RSP: 0018:ffffc90000f9fa30 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff821e2b72 RDX: 0000000000000000 RSI: ffff888014184680 RDI: 0000000000000002 RBP: ffffc90000f9fa78 R08: 00000000000000ff R09: 0000000079de6f4e R10: ffffc90000f9f790 R11: ffff888014185418 R12: ffffc90000f9fc60 R13: 0000000000000002 R14: ffff888007879800 R15: 0000000000000000 FS: 00007f4227555740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000043 CR3: 000000000e748005 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> pfn_reader_next+0x14a/0x7b0 ? interval_tree_double_span_iter_update+0x11a/0x140 pfn_reader_first+0x140/0x1b0 iopt_pages_rw_slow+0x71/0x280 ? __this_cpu_preempt_check+0x20/0x30 iopt_pages_rw_access+0x2b2/0x5b0 iommufd_access_rw+0x19f/0x2f0 iommufd_test+0xd11/0x16f0 ? write_comp_data+0x2f/0x90 iommufd_fops_ioctl+0x206/0x330 __x64_sys_ioctl+0x10e/0x160 ? __pfx_iommufd_fops_ioctl+0x10/0x10 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Cc: <stable@vger.kernel.org> Fixes: `8d160cd4d5` ("iommufd: Algorithms for PFN storage") Link: https://lore.kernel.org/r/1-v1-ceab6a4d7d7a+94-iommufd_syz_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2023-04-04 09:10:55 -03:00
Paolo Abeni	b103bab094	Merge branch 'vsock-return-errors-other-than-enomem-to-socket' Arseniy Krasnov says: ==================== vsock: return errors other than -ENOMEM to socket this patchset removes behaviour, where error code returned from any transport was always switched to ENOMEM. This works in the same way as patch from Bobby Eshleman: commit `c43170b7e1` ("vsock: return errors other than -ENOMEM to socket"), but for receive calls. VMCI transport is also updated (both tx and rx SOCK_STREAM callbacks), because it returns VMCI specific error code to af_vsock.c (like VMCI_ERROR_*). Tx path is already merged to net, so it was excluded from patchset in v4. At the same time, virtio and Hyper-V transports are using general error codes, so there is no need to update them. vsock_test suite is also updated. Link to v1: https://lore.kernel.org/netdev/97f19214-ba04-c47e-7486-72e8aa16c690@sberdevices.ru/ Link to v2: https://lore.kernel.org/netdev/60abc0da-0412-6e25-eeb0-8e32e3ec21e7@sberdevices.ru/ Link to v3: https://lore.kernel.org/netdev/dead4842-333a-015e-028b-302151336ff9@sberdevices.ru/ ==================== Link: https://lore.kernel.org/r/0d20e25a-640c-72c1-2dcb-7a53a05e3132@sberdevices.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-04-04 12:46:27 +02:00
Arseniy Krasnov	b5d54eb589	vsock/test: update expected return values This updates expected return values for invalid buffer test. Now such values are returned from transport, not from af_vsock.c. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-04-04 12:46:24 +02:00
Arseniy Krasnov	02ab696feb	vsock: return errors other than -ENOMEM to socket This removes behaviour, where error code returned from any transport was always switched to ENOMEM. This works in the same way as: commit `c43170b7e1` ("vsock: return errors other than -ENOMEM to socket"), but for receive calls. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-04-04 12:46:24 +02:00
Arseniy Krasnov	f59f3006ca	vsock/vmci: convert VMCI error code to -ENOMEM on receive This adds conversion of VMCI specific error code to general -ENOMEM. It is preparation for the next patch, which changes af_vsock.c behaviour on receive to pass value returned from transport to the user. Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-04-04 12:46:24 +02:00
Siddharth Vadapalli	c6b486fb33	net: ethernet: ti: am65-cpsw: Fix mdio cleanup in probe In the am65_cpsw_nuss_probe() function's cleanup path, the call to of_platform_device_destroy() for the common->mdio_dev device is invoked unconditionally. It is possible that either the MDIO node is not present in the device-tree, or the MDIO node is disabled in the device-tree. In both these cases, the MDIO device is not created, resulting in a NULL pointer dereference when the of_platform_device_destroy() function is invoked on the common->mdio_dev device on the cleanup path. Fix this by ensuring that the common->mdio_dev device exists, before attempting to invoke of_platform_device_destroy(). Fixes: `a45cfcc69a` ("net: ethernet: ti: am65-cpsw-nuss: use of_platform_device_create() for mdio") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20230403090321.835877-1-s-vadapalli@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-04-04 12:30:28 +02:00
David Gow	a3046a618a	um: Only disable SSE on clang to work around old GCC bugs As part of the Rust support for UML, we disable SSE (and similar flags) to match the normal x86 builds. This both makes sense (we ideally want a similar configuration to x86), and works around a crash bug with SSE generation under Rust with LLVM. However, this breaks compiling stdlib.h under gcc < 11, as the x86_64 ABI requires floating-point return values be stored in an SSE register. gcc 11 fixes this by only doing register allocation when a function is actually used, and since we never use atof(), it shouldn't be a problem: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652 Nevertheless, only disable SSE on clang setups, as that's a simple way of working around everyone's bugs. Fixes: `8849818679` ("rust: arch/um: Disable FP/SIMD instruction to match x86") Reported-by: Roberto Sassu <roberto.sassu@huaweicloud.com> Link: https://lore.kernel.org/linux-um/6df2ecef9011d85654a82acd607fdcbc93ad593c.camel@huaweicloud.com/ Tested-by: Roberto Sassu <roberto.sassu@huaweicloud.com> Tested-by: SeongJae Park <sj@kernel.org> Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com> Tested-by: Arthur Grillo <arthurgrillo@riseup.net> Signed-off-by: Richard Weinberger <richard@nod.at>	2023-04-04 09:57:05 +02:00
Jakub Kicinski	b380339919	Merge branch 'sfc-support-unicast-ptp' Íñigo Huguet says: ==================== sfc: support unicast PTP Unicast PTP was not working with sfc NICs. The reason was that these NICs don't timestamp all incoming packets, but instead they only timestamp packets of the queues that are selected for that. Currently, only one RX queue is configured for timestamp: the RX queue of the PTP channel. The packets that are put in the PTP RX queue are selected according to firmware filters configured from the driver. Multicast PTP was already working because the needed filters are known in advance, so they're inserted when PTP is enabled. This patches add the ability to dynamically add filters for unicast addresses, extracted from the TX PTP-event packets. Since we don't know in advance how many filters we'll need, some info about the filters need to be saved. This will allow to check if a filter already exists or if a filter is too old and should be removed. Note that the previous point is unnecessary for multicast filters, but I've opted to change how they're handled to match the new unicast's filters to avoid having duplicate insert/remove_filters functions, once for each type of filter. Tested: With ptp4l, all combinations of IPv4/IPv6, master/slave and unicast/multicast Reported-by: Yalin Li <yalli@redhat.com> ==================== Link: https://lore.kernel.org/r/20230331111404.17256-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-03 19:02:53 -07:00
Íñigo Huguet	ad47655ead	sfc: remove expired unicast PTP filters Filters inserted to support unicast PTP mode might become unused after some time, so we need to remove them to avoid accumulating many of them. Refresh the expiration time of a filter each time it's used. Then check periodically if any filter hasn't been used for a long time (30s) and remove it. Reported-by: Yalin Li <yalli@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-03 19:02:51 -07:00
Íñigo Huguet	49ed35a0b6	sfc: support unicast PTP When sending a PTP event packet, add the correct filters that will make that future incoming unicast PTP event packets will be timestamped. The unicast address for the filter is gotten from the outgoing skb before sending it. Until now they were not timestamped because only filters that match with the PTP multicast addressed were being configured into the NIC for the PTP special channel. Packets received through different channels are not timestamped, getting "received SYNC without timestamp" error in ptp4l. Note that the inserted filters are never removed unless the NIC is stopped or reconfigured, so efx_ptp_stop is called. Removal of old filters will be handled by the next patch. Additionally, cleanup a bit efx_ptp_xmit_skb_mc to use the reverse xmas tree convention and remove an unnecessary assignment to rc variable in void function. Reported-by: Yalin Li <yalli@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-03 19:02:51 -07:00
Íñigo Huguet	75687cd066	sfc: allow insertion of filters for unicast PTP Add a second list for unicast filters and generalize the efx_ptp_insert/remove_filters functions to allow acting in any of the 2 lists. No filters for unicast are inserted yet. That will be done in the next patch. The reason to use 2 different lists instead of a single one is that, in next patches, we will want to check if unicast filters are already added and if they're expired. We don't need that for multicast filters. Reported-by: Yalin Li <yalli@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-03 19:02:51 -07:00
Íñigo Huguet	e790fc15bf	sfc: store PTP filters in a list Instead of using a fixed sized array for the PTP filters, use a list. This is not actually necessary at this point because the filters for multicast PTP are a fixed number, but this is a preparation for the following patches adding support for unicast PTP. To avoid confusion with the new struct type efx_ptp_rxfilter, change the name of some local variables from rxfilter to spec, given they're of the type efx_filter_spec. Reported-by: Yalin Li <yalli@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-03 19:02:51 -07:00
Lukas Wunner	abf04be0e7	PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y After a pci_doe_task completes, its work_struct needs to be destroyed to avoid a memory leak with CONFIG_DEBUG_OBJECTS=y. Fixes: `9d24322e88` ("PCI/DOE: Add DOE mailbox support functions") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/775768b4912531c3b887d405fc51a50e465e1bf9.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:17:21 -07:00
Lukas Wunner	92dc899c3b	PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL probing because pci_doe_submit_task() invokes INIT_WORK() instead of INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack. All callers of pci_doe_submit_task() allocate the work_struct on the stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable short-term fix. The long-term fix implemented by a subsequent commit is to move to a synchronous API which allocates the work_struct internally in the DOE library. Stacktrace for posterity: WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183 CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: pci_doe_submit_task+0x5d/0xd0 pci_doe_discovery+0xb4/0x100 pcim_doe_create_mb+0x219/0x290 cxl_pci_probe+0x192/0x430 local_pci_probe+0x41/0x80 pci_device_probe+0xb3/0x220 really_probe+0xde/0x380 __driver_probe_device+0x78/0x170 driver_probe_device+0x1f/0x90 __driver_attach_async_helper+0x5c/0xe0 async_run_entry_fn+0x30/0x130 process_one_work+0x294/0x5b0 Fixes: `9d24322e88` ("PCI/DOE: Add DOE mailbox support functions") Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/ Reported-by: Gregory Price <gregory.price@memverge.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/67a9117f463ecdb38a2dbca6a20391ce2f1e7a06.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:17:03 -07:00
Lukas Wunner	4fe2c13d59	cxl/pci: Handle excessive CDAT length If the length in the CDAT header is larger than the concatenation of the header and all table entries, then the CDAT exposed to user space contains trailing null bytes. Not every consumer may be able to handle that. Per Postel's robustness principle, "be liberal in what you accept" and silently reduce the cached length to avoid exposing those null bytes. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Link: https://lore.kernel.org/r/6d98b3c7da5343172bd3ccabfabbc1f31c079d74.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:16:49 -07:00
Lukas Wunner	b56faef231	cxl/pci: Handle truncated CDAT entries If truncated CDAT entries are received from a device, the concatenation of those entries constitutes a corrupt CDAT, yet is happily exposed to user space. Avoid by verifying response lengths and erroring out if truncation is detected. The last CDAT entry may still be truncated despite the checks introduced herein if the length in the CDAT header is too small. However, that is easily detectable by user space because it reaches EOF prematurely. A subsequent commit which rightsizes the CDAT response allocation closes that remaining loophole. The two lines introduced here which exceed 80 chars are shortened to less than 80 chars by a subsequent commit which migrates to a synchronous DOE API and replaces "t.task.rv" by "rc". The existing acpi_cdat_header and acpi_table_cdat struct definitions provided by ACPICA cannot be used because they do not employ __le16 or __le32 types. I believe that cannot be changed because those types are Linux-specific and ACPI is specified for little endian platforms only, hence doesn't care about endianness. So duplicate the structs. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Link: https://lore.kernel.org/r/bce3aebc0e8e18a1173425a7a865b232c3912963.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:16:34 -07:00

... 10 11 12 13 14 ...

1172621 Commits