Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is currently using an odd mix of legacy PCI DMA API and
generic DMA API calls, switch it over to the generic API entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Also reuse an existing helper (after fixing the error return) to set the
DMA mask instead of having three copies of the code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Also simplify setting the DMA mask a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is currently using an odd mix of legacy PCI DMA API and
generic DMA API calls, switch it over to the generic API entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is currently using an odd mix of legacy PCI DMA API and
generic DMA API calls, switch it over to the generic API entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is currently using an odd mix of legacy PCI DMA API and
generic DMA API calls, switch it over to the generic API entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is currently using an odd mix of legacy PCI DMA API and
generic DMA API calls, switch it over to the generic API entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Avoid function calls in the inner PIO loops. On a Centris 660av this
improves throughput for sequential read transfers by about 40% and
sequential write by about 10%.
Unfortunately it is not possible to have methods like .esp_write8 placed
inline so this is always going to be slow, even with LTO.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As a temporary measure, the code to implement PIO transfers was
duplicated in zorro_esp and mac_esp. Now that it has stabilized move the
common code into the core driver but don't build it unless needed.
This replaces the inline assembler with more portable writesb() calls.
Optimizing the m68k writesb() implementation is a separate patch.
[mkp: applied by hand]
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The concept of a 'slow command' as it appears in esp_scsi is confusing
because it could refer to an ESP command or a SCSI command. It turns out
that it refers to a particular ESP select command which the driver also
tracks as 'ESP_SELECT_MSGOUT'. For readability, it is better to use the
terminology from the datasheets.
The global ESP_FLAG_DOING_SLOWCMD flag is redundant anyway, as it can be
inferred from esp->select_state. Remove the ESP_FLAG_DOING_SLOWCMD cruft
and just use a boolean local variable.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A SCSI device is not granted disconnect privilege by an esp_scsi host
unless that device has its simple_tags flag set. However, a device may
support disconnect/reselect and not support command queueing. Allow such
devices to disconnect and thereby improve bus utilization.
Drop the redundant 'lp' check. The mid-layer invokes .slave_alloc and
.slave_destroy in such a way that we may rely on scmd->device->hostdata
for as long as scmd belongs to the low-level driver.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a target disconnects during a PIO data transfer the command may fail
when the target reconnects:
scsi host1: DMA length is zero!
scsi host1: cur adr[04380000] len[00000000]
The scsi bus is then reset. This happens because the residual reached
zero before the transfer was completed.
The usual residual calculation relies on the Transfer Count registers.
That works for DMA transfers but not for PIO transfers. Fix the problem
by storing the PIO transfer residual and using that to correctly
calculate bytes_sent.
Fixes: 6fe07aaffb ("[SCSI] m68k: new mac_esp scsi driver")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The core driver, esp_scsi, does not use the ESP_CONFIG2_FENAB bit, so the
chip's Transfer Counter register is only 16 bits wide (not 24). A larger
transfer cannot work and will theoretically result in a failed command
and a "DMA length is zero" error.
Fixes: 3109e5ae03 ("scsi: zorro_esp: New driver for Amiga Zorro NCR53C9x boards")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Cc: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Convert the driver from the legacy pci_* DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We need to transfer device ownership to the CPU before we can manipulate
the mapped data.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We can't just transfer ownership to the CPU and then unmap, as this will
break with swiotlb.
Instead unmap the command and sense buffer a little earlier in the I/O
completion handler and get rid of the pci_dma_sync_sg_for_cpu call
entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove the list wrappers, including the pointless list iteration before
deletion.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, i915 appears to rely on blocking modesets on
no-longer-present MSTB ports by simply returning NULL for
->best_encoder(), which in turn causes any new atomic commits that don't
disable the CRTC to fail. This is wrong however, since we still want to
allow userspace to disable CRTCs on no-longer-present MSTB ports by
changing the DPMS state to off and this still requires that we retrieve
an encoder.
So, fix this by always returning a valid encoder regardless of the state
of the MST port.
Changes since v1:
- Remove mst atomic helper, since this got replaced with a much simpler
solution
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-6-lyude@redhat.com
(cherry picked from commit a9f9ca33d1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Since we need to be able to allow DPMS on->off prop changes after an MST
port has disappeared from the system, we need to be able to make sure we
can compute a config for the resulting atomic commit. Currently this is
impossible when the port has disappeared, since the VCPI slot searching
we try to do in intel_dp_mst_compute_config() will fail with -EINVAL.
Since the only commits we want to allow on no-longer-present MST ports
are ones that shut off display hardware, we already know that no VCPI
allocations are needed. So, hardcode the VCPI slot count to 0 when
intel_dp_mst_compute_config() is called on an MST port that's gone.
Changes since V4:
- Don't use mst_port_gone at all, just check whether or not the drm
connector is registered - Daniel Vetter
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-5-lyude@redhat.com
(cherry picked from commit f67207d78c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently we set intel_connector->mst_port to NULL to signify that the
MST port has been removed from the system so that we can prevent further
action on the port such as connector probes, mode probing, etc.
However, we're going to need access to intel_connector->mst_port in
order to fixup ->best_encoder() so that it can always return the correct
encoder for an MST port to prevent legacy DPMS prop changes from
failing. This should be safe, so instead keep intel_connector->mst_port
always set and instead just check the status of
drm_connector->regustered to signify whether or not the connector has
disappeared from the system.
Changes since v2:
- Add a comment to mst_port_gone (Jani Nikula)
- Change mst_port_gone to a u8 instead of a bool, per the kernel bot.
Apparently bool is discouraged in structs these days
Changes since v4:
- Don't use mst_port_gone at all! Just check if the connector is
registered or not - Daniel Vetter
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181008232437.5571-4-lyude@redhat.com
(cherry picked from commit 6ed5bb1fba)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When we decide that a plane is attached to the wrong pipe we try
to turn off said plane. However we are passing around the crtc we
think that the plane is supposed to be using rather than the crtc
it is currently using. That doesn't work all that well because
we may have to do vblank waits etc. and the other pipe might
not even be enabled here. So let's pass the plane's current crtc to
intel_plane_disable_noatomic() so that it can its job correctly.
To do that semi-cleanly we also have to change the plane readout
to record the plane's visibility into the bitmasks of the crtc
where the plane is currently enabled rather than to the crtc
we want to use for the plane.
One caveat here is that our active_planes bitmask will get confused
if both planes are enabled on the same pipe. Fortunately we can use
plane_mask to reconstruct active_planes sufficiently since
plane_mask still has the same meaning (is the plane visible?)
during readout. We also have to do the same during the initial
plane readout as the second plane could clear the active_planes
bit the first plane had already set.
v2: Rely on fixup_active_planes() to populate active_planes fully (Daniel)
Add Daniel's proposed comment to better document why we do this
Drop the redundant intel_set_plane_visible() call
Cc: stable@vger.kernel.org # fcba862e8428 drm/i915: Have plane->get_hw_state() return the current pipe
Cc: stable@vger.kernel.org
Cc: Dennis <dennis.nezic@utoronto.ca>
Cc: Daniel Vetter <daniel@ffwll.ch>
Tested-by: Dennis <dennis.nezic@utoronto.ca>
Tested-by: Peter Nowee <peter.nowee@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105637
Fixes: b1e01595a6 ("drm/i915: Redo plane sanitation during readout")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181003145017.4527-1-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 62358aa4ee)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
commit 4e0b83a567 ("drm/i915: Extract per-platform plane->check()
functions") removed the plane max stride check for sprite planes.
I was going to add it back when introducing GTT remapping for the
display, but after further thought it seems better to re-introduce
it separately.
So let's add the max stride check back. And let's do it in a nicer
form than what we had before and do it for all plane types (easy
now that we have the ->max_stride() plane vfunc).
Only sprite planes really need this for now since primary planes
are capable of scanning out the current max fb size we allow, and
cursors have more stringent stride checks elsewhere.
Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: 4e0b83a567 ("drm/i915: Extract per-platform plane->check() functions")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180918140243.12207-1-ville.syrjala@linux.intel.com
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
(cherry picked from commit fc3fed5d29)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Right now we return EINVAL if a process does not have permission to dedupe a
file. This was an oversight on my part. EPERM gives a true description of
the nature of our error, and EINVAL is already used for the case that the
filesystem does not support dedupe.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The permission check in vfs_dedupe_file_range_one() is too coarse - We only
allow dedupe of the destination file if the user is root, or they have the
file open for write.
This effectively limits a non-root user from deduping their own read-only
files. In addition, the write file descriptor that the user is forced to
hold open can prevent execution of files. As file data during a dedupe
does not change, the behavior is unexpected and this has caused a number of
issue reports. For an example, see:
https://github.com/markfasheh/duperemove/issues/129
So change the check so we allow dedupe on the target if:
- the root or admin is asking for it
- the process has write access
- the owner of the file is asking for the dedupe
- the process could get write access
That way users can open read-only and still get dedupe.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch adds support for the Mylex DAC960 RAID controller,
supporting the newer, SCSI-based interface. The driver is a
re-implementation of the original DAC960 driver.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds support for the Mylex DAC960 RAID controller,
supporting the older, block-based interface only. The driver is a
re-implementation of the original DAC960 driver.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/advansys.c: In function 'adv_isr_callback':
drivers/scsi/advansys.c:5952:6: warning:
variable 'srb_tag' set but not used [-Wunused-but-set-variable]
It never used since introduction in
commit 9c17c62aed ("advansys: use shared host tag map for command lookup")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This message floods the log when enabling mask 0x7 for
/proc/sys/dev/scsi/logging_level:
xxxxxxxx kernel: scsi_block_when_processing_errors: rtn: 1
It's not needed and makes tracing just scsi_eh* messages way too
verbose so get rid of it.
[mkp: mangled patch, applied by hand]
Signed-off-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Chad Dupuis <chad.dupuis@cavium.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is currently a bug with the driver where there is never a call to
target_sess_cmd_list_set_waiting(), it only called
target_wait_for_sess_cmd(), which basically means that the sess_wait_list
would always be empty.
Thus, list_empty(&sess->sess_wait_list) = true,
(eg: no se_cmd I/O is quiesced, because no se_cmd in sess_wait_list),
since commit 712db3eb2c ("scsi: ibmvscsis: Properly deregister
target sessions") in 4.9.y code.
ibmvscsi_tgt does not remove the I_T Nexus when a VM is active so we can
fix this issue by removing the call to target_wait_for_sess_cmd()
altogether.
Signed-off-by: Bryant G. Ly <bly@catalogicsoftware.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With commit 10e5e37581 ("scsi: ufs: Add clock ungating to a separate
workqueue"), clock gating work was moved to a separate work queue with
WQ_MEM_RECLAIM set, since clock gating could occur from a memory reclaim
context. Unfortunately, clk_gating.gate_work was left queued via
schedule_delayed_work, which is a system workqueue that does not have
WQ_MEM_RECLAIM set. Because ufshcd_ungate_work attempts to cancel
gate_work, the following warning appears:
[ 14.174170] workqueue: WQ_MEM_RECLAIM ufs_clk_gating_0:ufshcd_ungate_work is flushing !WQ_MEM_RECLAIM events:ufshcd_gate_work
[ 14.174179] WARNING: CPU: 4 PID: 173 at kernel/workqueue.c:2440 check_flush_dependency+0x110/0x118
[ 14.205725] CPU: 4 PID: 173 Comm: kworker/u16:3 Not tainted 4.14.68 #1
[ 14.212437] Hardware name: Google Cheza (rev1) (DT)
[ 14.217459] Workqueue: ufs_clk_gating_0 ufshcd_ungate_work
[ 14.223107] task: ffffffc0f6a40080 task.stack: ffffff800a490000
[ 14.229195] PC is at check_flush_dependency+0x110/0x118
[ 14.234569] LR is at check_flush_dependency+0x110/0x118
[ 14.239944] pc : [<ffffff80080cad14>] lr : [<ffffff80080cad14>] pstate: 60c001c9
[ 14.333050] Call trace:
[ 14.427767] [<ffffff80080cad14>] check_flush_dependency+0x110/0x118
[ 14.434219] [<ffffff80080cafec>] start_flush_work+0xac/0x1fc
[ 14.440046] [<ffffff80080caeec>] flush_work+0x40/0x94
[ 14.445246] [<ffffff80080cb288>] __cancel_work_timer+0x11c/0x1b8
[ 14.451433] [<ffffff80080cb4b8>] cancel_delayed_work_sync+0x20/0x30
[ 14.457886] [<ffffff80085b9294>] ufshcd_ungate_work+0x24/0xd0
[ 14.463800] [<ffffff80080cfb04>] process_one_work+0x32c/0x690
[ 14.469713] [<ffffff80080d0154>] worker_thread+0x218/0x338
[ 14.475361] [<ffffff80080d527c>] kthread+0x120/0x130
[ 14.480470] [<ffffff8008084814>] ret_from_fork+0x10/0x18
The simple solution is to put the gate_work on the same WQ_MEM_RECLAIM
work queue as the ungate_work.
Fixes: 10e5e37581 ("scsi: ufs: Add clock ungating to a separate workqueue")
Signed-off-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ido Schimmel says:
====================
mlxsw: Add VxLAN support
This patchset adds support for VxLAN offload in the mlxsw driver.
With regards to the forwarding plane, VxLAN support is composed from two
main parts: Encapsulation and decapsulation.
In the device, NVE encapsulation (and VxLAN in particular) takes place
in the bridge. A packet can be encapsulated using VxLAN either because
it hit an FDB entry that forwards it to the router with the IP of the
remote VTEP or because it was flooded, in which case it is sent to a
list of remote VTEPs (in addition to local ports). In either case, the
VNI is derived from the filtering identifier (FID) the packet was
classified to at ingress and the underlay source IP is taken from a
device global configuration.
VxLAN decapsulation takes place in the underlay router, where packets
that hit a local route that corresponds to the source IP of the local
VTEP are decapsulated and injected to the bridge. The packets are
classified to a FID based on the VNI they came with.
The first six patches export the required APIs in the VxLAN and mlxsw
drivers in order to allow for the introduction of the NVE core in the
next two patches. The NVE core is designed to support a variety of NVE
encapsulations (e.g., VxLAN, NVGRE) and different ASICs, but currently
only VxLAN and Spectrum are supported. Spectrum-2 support will be added
in the future.
The last 10 patches add support for VxLAN decapsulation and
encapsulation and include the addition of the required switchdev APIs in
the VxLAN driver. These APIs allow capable drivers to get a notification
about the addition / deletion of FDB entries to / from the VxLAN's FDB.
Subsequent patchset will add selftests (generic and mlxsw-specific),
data plane learning, FDB extack and vetoing and support for VLAN-aware
bridges (one VNI per VxLAN device model).
v2:
* Implement netif_is_vxlan() using rtnl_link_ops->kind (Jakub & Stephen)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In the device, VxLAN encapsulation takes place in the FDB table where
certain {MAC, FID} entries are programmed with an underlay unicast IP.
MAC addresses that are not programmed in the FDB are flooded to the
relevant local ports and also to a list of underlay unicast IPs that are
programmed using the all zeros MAC address in the VxLAN driver.
One difference between the hardware and software data paths is the fact
that in the software data path there are two FDB lookups prior to the
encapsulation of the packet. First in the bridge's FDB table using {MAC,
VID} and another in the VxLAN's FDB table using {MAC, VNI}.
Therefore, when a new VxLAN FDB entry is notified, it is only programmed
to the device if there is a corresponding entry in the bridge's FDB
table. Similarly, when a new bridge FDB entry pointing to the VxLAN
device is notified, it is only programmed to the device if there is a
corresponding entry in the VxLAN's FDB table.
Note that the above scheme will result in a discrepancy between both
data paths if only one FDB table is populated in the software data path.
For example, if only the bridge's FDB is populated with an entry
pointing to a VxLAN device, then a packet hitting the entry will only be
flooded by the kernel to remote VTEPs whereas the device will also flood
the packets to other local ports member in the VLAN.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enslavement of VxLAN devices to offloaded bridges was never forbidden by
mlxsw, but this patch makes sure the required configuration is performed
in order to allow VxLAN encapsulation and decapsulation to take place in
the device.
The patch handles both the case where a VxLAN device is enslaved to an
already offloaded bridge and the case where the first mlxsw port is
enslaved to a bridge that already has VxLAN device configured.
Invalid configurations are sanitized and an error string is returned via
extack.
Since encapsulation and decapsulation do not occur when the VxLAN device
is down, the driver makes sure to enable / disable these functionalities
based on NETDEV_PRE_UP and NETDEV_DOWN events.
Note that NETDEV_PRE_UP is used in favor of NETDEV_UP, as the former
allows to veto the operation, if necessary.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, an FDB entry only ceases being offloaded when it is deleted.
This changes with VxLAN encapsulation.
Devices capable of performing VxLAN encapsulation usually have only one
FDB table, unlike the software data path which has two - one in the
bridge driver and another in the VxLAN driver.
Therefore, bridge FDB entries pointing to a VxLAN device are only
offloaded if there is a corresponding entry in the VxLAN FDB.
Allow clearing the offload indication in case the corresponding entry
was deleted from the VxLAN FDB.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>