commit 10bc8e4af6 upstream.
[backport comments for pre v5.15:
- ksmbd mentions are irrelevant - ksmbd hunks were dropped
- sb_write_started() is missing - assert was dropped
]
Commit 868f9f2f8e ("vfs: fix copy_file_range() regression in cross-fs
copies") removed fallback to generic_copy_file_range() for cross-fs
cases inside vfs_copy_file_range().
To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to
generic_copy_file_range() was added in nfsd and ksmbd code, but that
call is missing sb_start_write(), fsnotify hooks and more.
Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that
will take care of the fallback, but that code would be subtle and we have
gotten the vfs_copy_file_range() logic wrong too many times already.
Instead, add a flag to explicitly request vfs_copy_file_range() to
perform only generic_copy_file_range() and let nfsd and ksmbd use this
flag only in the fallback path.
This choice keeps the logic changes to a minimum in the non-nfsd/ksmbd code
paths to reduce the risk of further regressions.
Fixes: 868f9f2f8e ("vfs: fix copy_file_range() regression in cross-fs copies")
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 868f9f2f8e upstream.
[backport comments for pre v5.15:
- This commit has a bug fixed by commit 10bc8e4af6 ("vfs: fix
copy_file_range() averts filesystem freeze protection")
- ksmbd mentions are irrelevant - ksmbd hunks were dropped
]
A regression has been reported by Nicolas Boichat, found while using the
copy_file_range syscall to copy a tracefs file.
Before commit 5dae222a5f ("vfs: allow copy_file_range to copy across
devices") the kernel would return -EXDEV to userspace when trying to
copy a file across different filesystems. After this commit, the
syscall doesn't fail anymore and instead returns zero (zero bytes
copied), as this file's content is generated on-the-fly and thus reports
a size of zero.
Another regression has been reported by He Zhe - the assertion of
WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
copying from a sysfs file whose read operation may return -EOPNOTSUPP.
Since we do not have test coverage for copy_file_range() between any two
types of filesystems, the best way to avoid this sort of issue in the
future is for the kernel to be more picky about filesystems that are
allowed to do copy_file_range().
This patch restores some cross-filesystem copy restrictions that existed
prior to commit 5dae222a5f ("vfs: allow copy_file_range to copy across
devices"), namely, cross-sb copy is not allowed for filesystems that do
not implement ->copy_file_range().
Filesystems that do implement ->copy_file_range() have full control of
the result - if this method returns an error, the error is returned to
the user. Before this change this was only true for fs that did not
implement the ->remap_file_range() operation (i.e. nfsv3).
Filesystems that do not implement ->copy_file_range() still fall-back to
the generic_copy_file_range() implementation when the copy is within the
same sb. This helps the kernel maintain a more consistent story
about which filesystems support copy_file_range().
nfsd and ksmbd servers are modified to fall-back to the
generic_copy_file_range() implementation in case vfs_copy_file_range()
fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
server-side-copy.
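
The same try-then-fall-back pattern can be sketched from userspace. This is
a minimal Python illustration, not the nfsd or ksmbd code: it tries
os.copy_file_range() first and falls back to a plain read/write loop only
when the call fails with EXDEV or EOPNOTSUPP. The helper name
copy_with_fallback() and its injectable fast_copy parameter are hypothetical
and exist only for this example.

```python
import errno
import os

# Errors that mean "the fast path is not available for this pair of files".
FALLBACK_ERRNOS = (errno.EXDEV, errno.EOPNOTSUPP)
CHUNK = 64 * 1024


def copy_with_fallback(src_fd, dst_fd, length, fast_copy=None):
    """Copy up to `length` bytes from src_fd to dst_fd; return bytes copied.

    `fast_copy` defaults to os.copy_file_range where it exists (Linux,
    Python 3.8+). It is a parameter only so the fallback can be exercised.
    """
    if fast_copy is None:
        fast_copy = getattr(os, "copy_file_range", None)

    copied = 0
    if fast_copy is not None:
        try:
            while copied < length:
                n = fast_copy(src_fd, dst_fd, length - copied)
                if n == 0:          # end of input
                    return copied
                copied += n
            return copied
        except OSError as exc:
            if exc.errno not in FALLBACK_ERRNOS:
                raise               # any other error goes to the caller
            # EXDEV / EOPNOTSUPP: continue with the generic loop below

    # Generic fallback: plain read/write using the current file offsets.
    while copied < length:
        chunk = os.read(src_fd, min(CHUNK, length - copied))
        if not chunk:
            break
        done = 0
        while done < len(chunk):
            done += os.write(dst_fd, chunk[done:])
        copied += len(chunk)
    return copied
```

Only the two "not supported here" errors trigger the fallback; every other
error is propagated, so a filesystem that does implement the fast path keeps
control of the result.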
Fall-back to generic_copy_file_range() is not implemented for the smb
operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
change of behavior.
Fixes: 5dae222a5f ("vfs: allow copy_file_range to copy across devices")
Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/
Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/
Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Fixes: 64bf5ff58d ("vfs: no fallback for ->copy_file_range")
Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/
Reported-by: He Zhe <zhe.he@windriver.com>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216800
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29368e0939 upstream.
The call to rcu_cpu_starting() in mtrr_ap_init() is not early enough
in the CPU-hotplug onlining process, which results in lockdep splats
as follows:
=============================
WARNING: suspicious RCU usage
5.9.0+ #268 Not tainted
-----------------------------
kernel/kprobes.c:300 RCU-list traversed in non-reader section!!
other info that might help us debug this:
RCU used illegally from offline CPU!
rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/1/0.
stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0+ #268
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
dump_stack+0x77/0x97
__is_insn_slot_addr+0x15d/0x170
kernel_text_address+0xba/0xe0
? get_stack_info+0x22/0xa0
__kernel_text_address+0x9/0x30
show_trace_log_lvl+0x17d/0x380
? dump_stack+0x77/0x97
dump_stack+0x77/0x97
__lock_acquire+0xdf7/0x1bf0
lock_acquire+0x258/0x3d0
? vprintk_emit+0x6d/0x2c0
_raw_spin_lock+0x27/0x40
? vprintk_emit+0x6d/0x2c0
vprintk_emit+0x6d/0x2c0
printk+0x4d/0x69
start_secondary+0x1c/0x100
secondary_startup_64_no_verify+0xb8/0xbb
This is avoided by moving the call to rcu_cpu_starting up near
the beginning of the start_secondary() function. Note that the
raw_smp_processor_id() is required in order to avoid calling into lockdep
before RCU has declared the CPU to be watched for readers.
Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
Reported-by: Qian Cai <cai@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fix suspend error by vir dev and hw dev run SYSTEM_SLEEP_PM_OPS
Change-Id: I10971c3f43debf082278cf13aacf68eb97d2f0c3
Signed-off-by: Cai YiWei <cyw@rock-chips.com>
1. Fix brightness increasing to a stable value when the sensor's
exposure register holds a fixed value.
2. Fix cross stripes in the first 15 frames.
This patch delays 3 ms before frame start.
Signed-off-by: Su Yuefu <yuefu.su@rock-chips.com>
Change-Id: I40ea052ae9e4677b5dc0451ce683f5445feeeed5
BACKGROUND:
A DTS-HD bitstream occasionally produces noise on the Denon-AVR-X2700H.
We found that this sometimes happens with the PLL in frac mode, but not
with the PLL in int mode.
This patch assigns the parent of I2S5/6, which are used for HDMI0/1, to
GPLL to fix the occasional DTS-HD bitstream noise on the Denon-AVR-X2700H.
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: I071409278ab983af3c32e7b282de1e2819bb706b
BACKGROUND:
A DTS-HD bitstream occasionally produces noise on the Denon-AVR-X2700H.
We found that this sometimes happens with the PLL in frac mode, but not
with the PLL in int mode.
This patch adds "CLK_SET_RATE_NO_REPARENT" for I2S5/6, which are used
for HDMI0/1, to keep their parent fixed to GPLL (int mode) and fix the
occasional DTS-HD bitstream noise on the Denon-AVR-X2700H.
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: I5694c0a7839df817fd32b82ce69450f0eebdcf77
The dwc2 driver uses the NAK interrupt as the starting point
of an isoc-in transfer. The first NAK interrupt for isoc-in means
that an IN token has arrived and the dwc2 driver can obtain the
(micro)frame of the token to set the even/odd (micro)frame
field of DIEPCTL.
However, on some platforms (e.g. Rockchip rk3308) that don't
support "OTG_MULTI_PROC_INTRPT", all device endpoints share
the same NAK mask and interrupt. If the NAK interrupt is always
enabled, endpoints other than the isoc-in endpoint may trigger
a NAK interrupt storm. So disable the NAK interrupt when the
first isoc-in token is received if the "OTG_MULTI_PROC_INTRPT"
feature isn't enabled.
Signed-off-by: William Wu <william.wu@rock-chips.com>
Change-Id: I99c71a5e0d7903346fd8f71619b6736c3181c0ec
Because struct rga_external_buffer is not initialized before importbuffer.
Signed-off-by: Yu Qiaowei <cerf.yu@rock-chips.com>
Change-Id: I51e341b80aee6bb4ea70eee4f6c9a247947a8f85
The uvc_function_unbind() was calling the same code two times,
increasing a timeout that may occur. The duplicate code looks to have
come in during the merge of 5.10.117. Remove the duplicate code.
Bug: 261895714
Change-Id: I8957048bfad4a9e01baea033de9b628362b2d991
Signed-off-by: Dan Vacura <w36195@motorola.com>
Changes in 5.10.159
arm64: dts: rockchip: keep I2S1 disabled for GPIO function on ROCK Pi 4 series
arm: dts: rockchip: fix node name for hym8563 rtc
ARM: dts: rockchip: fix ir-receiver node names
arm64: dts: rockchip: fix ir-receiver node names
ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name
ARM: 9251/1: perf: Fix stacktraces for tracepoint events in THUMB2 kernels
ARM: 9266/1: mm: fix no-MMU ZERO_PAGE() implementation
ASoC: wm8962: Wait for updated value of WM8962_CLOCKING1 register
ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188
9p/fd: Use P9_HDRSZ for header size
regulator: slg51000: Wait after asserting CS pin
ALSA: seq: Fix function prototype mismatch in snd_seq_expand_var_event
btrfs: send: avoid unaligned encoded writes when attempting to clone range
ASoC: soc-pcm: Add NULL check in BE reparenting
regulator: twl6030: fix get status of twl6032 regulators
fbcon: Use kzalloc() in fbcon_prepare_logo()
usb: dwc3: gadget: Disable GUSB2PHYCFG.SUSPHY for End Transfer
9p/xen: check logical size for buffer size
net: usb: qmi_wwan: add u-blox 0x1342 composition
mm/khugepaged: take the right locks for page table retraction
mm/khugepaged: fix GUP-fast interaction by sending IPI
mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths
rtc: mc146818: Prevent reading garbage
rtc: mc146818: Detect and handle broken RTCs
rtc: mc146818: Dont test for bit 0-5 in Register D
rtc: cmos: remove stale REVISIT comments
rtc: mc146818-lib: change return values of mc146818_get_time()
rtc: Check return value from mc146818_get_time()
rtc: mc146818-lib: fix RTC presence check
rtc: mc146818-lib: extract mc146818_avoid_UIP
rtc: cmos: avoid UIP when writing alarm time
rtc: cmos: avoid UIP when reading alarm time
rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ
rtc: mc146818: Reduce spinlock section in mc146818_set_time()
xen/netback: Ensure protocol headers don't fall in the non-linear area
xen/netback: do some code cleanup
xen/netback: don't call kfree_skb() with interrupts disabled
media: videobuf2-core: take mmap_lock in vb2_get_unmapped_area()
Revert "ARM: dts: imx7: Fix NAND controller size-cells"
media: v4l2-dv-timings.c: fix too strict blanking sanity checks
memcg: fix possible use-after-free in memcg_write_event_control()
mm/gup: fix gup_pud_range() for dax
Bluetooth: btusb: Add debug message for CSR controllers
Bluetooth: Fix crash when replugging CSR fake controllers
KVM: s390: vsie: Fix the initialization of the epoch extension (epdx) field
drm/vmwgfx: Don't use screen objects when SEV is active
drm/shmem-helper: Remove errant put in error path
drm/shmem-helper: Avoid vm_open error paths
HID: usbhid: Add ALWAYS_POLL quirk for some mice
HID: hid-lg4ff: Add check for empty lbuf
HID: core: fix shift-out-of-bounds in hid_report_raw_event
can: af_can: fix NULL pointer dereference in can_rcv_filter
mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
rtc: cmos: Disable irq around direct invocation of cmos_interrupt()
rtc: mc146818-lib: fix locking in mc146818_set_time
rtc: mc146818-lib: fix signedness bug in mc146818_get_time()
netfilter: nft_set_pipapo: Actually validate intervals in fields after the first one
ieee802154: cc2520: Fix error return code in cc2520_hw_init()
ca8210: Fix crash by zero initializing data
netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark
drm/bridge: ti-sn65dsi86: Fix output polarity setting bug
gpio: amd8111: Fix PCI device reference count leak
e1000e: Fix TX dispatch condition
igb: Allocate MSI-X vector when testing
drm: bridge: dw_hdmi: fix preference of RGB modes over YUV420
af_unix: Get user_ns from in_skb in unix_diag_get_exact().
vmxnet3: correctly report encapsulated LRO packet
Bluetooth: 6LoWPAN: add missing hci_dev_put() in get_l2cap_conn()
Bluetooth: Fix not cleanup led when bt_init fails
net: dsa: ksz: Check return value
selftests: rtnetlink: correct xfrm policy rule in kci_test_ipsec_offload
mac802154: fix missing INIT_LIST_HEAD in ieee802154_if_add()
net: encx24j600: Add parentheses to fix precedence
net: encx24j600: Fix invalid logic in reading of MISTAT register
xen-netfront: Fix NULL sring after live migration
net: mvneta: Prevent out of bounds read in mvneta_config_rss()
i40e: Fix not setting default xps_cpus after reset
i40e: Fix for VF MAC address 0
i40e: Disallow ip4 and ip6 l4_4_bytes
NFC: nci: Bounds check struct nfc_target arrays
nvme initialize core quirks before calling nvme_init_subsystem
net: stmmac: fix "snps,axi-config" node property parsing
ip_gre: do not report erspan version on GRE interface
net: thunderx: Fix missing destroy_workqueue of nicvf_rx_mode_wq
net: hisilicon: Fix potential use-after-free in hisi_femac_rx()
net: hisilicon: Fix potential use-after-free in hix5hd2_rx()
tipc: Fix potential OOB in tipc_link_proto_rcv()
ipv4: Fix incorrect route flushing when source address is deleted
ipv4: Fix incorrect route flushing when table ID 0 is used
net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions()
tipc: call tipc_lxc_xmit without holding node_read_lock
ethernet: aeroflex: fix potential skb leak in greth_init_rings()
xen/netback: fix build warning
net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq()
ipv6: avoid use-after-free in ip6_fragment()
net: mvneta: Fix an out of bounds check
macsec: add missing attribute validation for offload
can: esd_usb: Allow REC and TEC to return to zero
Linux 5.10.159
Change-Id: I3ec26473c358ffda0ea8a8dd91ee265f58739029
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 918ee4911f ]
We don't get any further EVENT from an esd CAN USB device for changes
on REC or TEC while those counters converge to 0 (with ecc == 0). So
when handling the "Back to Error Active"-event force txerr = rxerr =
0, otherwise the berr-counters might stay on values like 95 forever.
Also, to make life easier during the ongoing development, a
netdev_dbg() has been introduced to allow dumping error events sent by
an esd CAN USB device.
Fixes: 96d8e90382 ("can: Add driver for esd CAN-USB/2 device")
Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
Link: https://lore.kernel.org/all/20221130202242.3998219-2-frank.jungclaus@esd.eu
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdd97383e1 ]
In an earlier commit, I added a bounds check to prevent an out of bounds
read and a WARN(). On further discussion and consideration that check
was probably too aggressive. Instead of returning -EINVAL, a better fix
would be to just prevent the out of bounds read but continue the process.
Background: The value of "pp->rxq_def" is a number between 0-7 by default,
or even higher depending on the value of "rxq_number", which is a module
parameter. If the value is more than the number of available CPUs then
it will trigger the WARN() in cpu_max_bits_warn().
Fixes: e8b4fc1390 ("net: mvneta: Prevent out of bounds read in mvneta_config_rss()")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/Y5A7d1E5ccwHTYPf@kadam
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7dfa764e02 ]
Commit ad7f402ae4 ("xen/netback: Ensure protocol headers don't fall in
the non-linear area") introduced a (valid) build warning. There have
even been reports of this problem breaking networking of Xen guests.
Fixes: ad7f402ae4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88956177db ]
When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
might be called, and it's pretty much like in tipc_rcv().
Currently the local 'node rw lock' is held while calling tipc_lxc_xmit()
to protect peer_net from being freed by another thread. However, when
receiving these packets, tipc_node_add_conn() might be called where the
peer 'node rw lock' is acquired. Then a deadlock warning is triggered by
the lockdep detector, although it is not a real deadlock:
WARNING: possible recursive locking detected
--------------------------------------------
conn_server/1086 is trying to acquire lock:
ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
but task is already holding lock:
ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
at: tipc_node_xmit+0x285/0xb30 [tipc]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&n->lock#2);
lock(&n->lock#2);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by conn_server/1086:
#0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
at: tipc_accept+0x9c0/0x10b0 [tipc]
#1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
at: tipc_accept+0x363/0x10b0 [tipc]
#2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
at: tipc_node_xmit+0x285/0xb30 [tipc]
#3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
at: tipc_sk_rcv+0x2da/0x1b40 [tipc]
Call Trace:
<TASK>
dump_stack_lvl+0x44/0x5b
__lock_acquire.cold.77+0x1f2/0x3d7
lock_acquire+0x1d2/0x610
_raw_write_lock_bh+0x38/0x80
tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
tipc_sk_finish_conn+0x21e/0x640 [tipc]
tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
tipc_sk_rcv+0xbb4/0x1b40 [tipc]
tipc_lxc_xmit+0x225/0x26b [tipc]
tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
__tipc_sendstream+0x879/0xff0 [tipc]
tipc_accept+0x966/0x10b0 [tipc]
do_accept+0x37d/0x590
This patch avoids this warning by not holding the 'node rw lock' while
calling tipc_lxc_xmit(). To protect 'peer_net', rcu_read_lock() should
be enough, because cleanup_net() calls synchronize_rcu() before it
continues freeing the netns.
Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
sense to call it under rcu_read_lock(). Note that the right lock order
must be:
rcu_read_lock();
tipc_node_read_lock(n);
tipc_node_read_unlock(n);
tipc_lxc_xmit();
rcu_read_unlock();
instead of:
tipc_node_read_lock(n);
rcu_read_lock();
tipc_node_read_unlock(n);
tipc_lxc_xmit();
rcu_read_unlock();
and we have to call tipc_node_read_lock/unlock() twice in
tipc_node_xmit().
Fixes: f73b12812a ("tipc: improve throughput between nodes in netns")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0d999348e ]
Cited commit added the table ID to the FIB info structure, but did not
properly initialize it when table ID 0 is used. This can lead to a route
in the default VRF with a preferred source address not being flushed
when the address is deleted.
Consider the following example:
# ip address add dev dummy1 192.0.2.1/28
# ip address add dev dummy1 192.0.2.17/28
# ip route add 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 100
# ip route add table 0 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 200
# ip route show 198.51.100.0/24
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 100
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200
Both routes are installed in the default VRF, but they are using two
different FIB info structures. One with a metric of 100 and table ID of
254 (main) and one with a metric of 200 and table ID of 0. Therefore,
when the preferred source address is deleted from the default VRF,
the second route is not flushed:
# ip address del dev dummy1 192.0.2.17/28
# ip route show 198.51.100.0/24
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200
Fix by storing a table ID of 254 instead of 0 in the route configuration
structure.
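
The effect of the fix can be illustrated with a small Python model. This is
a sketch under simplifying assumptions, not the kernel data structures: each
route is a dict, and flush_routes() stands in for the flush that happens
when a preferred source address is deleted. The names normalize_table_id()
and flush_routes() are hypothetical. The table numbers 0 (unspecified) and
254 (main) are the ones used in the example above.

```python
RT_TABLE_UNSPEC = 0    # "table 0" in the example above
RT_TABLE_MAIN = 254    # the main table of the default VRF


def normalize_table_id(tb_id):
    """Store the main table ID when the request did not specify a table."""
    return RT_TABLE_MAIN if tb_id == RT_TABLE_UNSPEC else tb_id


def flush_routes(routes, deleted_src, table_id):
    """Return the routes that survive deleting `deleted_src` in `table_id`."""
    return [
        r for r in routes
        if not (r["src"] == deleted_src and r["table"] == table_id)
    ]


# Before the fix: the second route keeps table ID 0 and is never matched.
routes_before_fix = [
    {"metric": 100, "src": "192.0.2.17", "table": RT_TABLE_MAIN},
    {"metric": 200, "src": "192.0.2.17", "table": RT_TABLE_UNSPEC},
]

# After the fix: table ID 0 is stored as 254, so both routes are matched.
routes_after_fix = [
    dict(r, table=normalize_table_id(r["table"])) for r in routes_before_fix
]

left_before = flush_routes(routes_before_fix, "192.0.2.17", RT_TABLE_MAIN)
left_after = flush_routes(routes_after_fix, "192.0.2.17", RT_TABLE_MAIN)
```

In this model left_before still holds the metric 200 route, which mirrors
the stale route shown above, while left_after is empty.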
Add a test case that fails before the fix:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Table ID 0
TEST: Route removed in default VRF when source address deleted [FAIL]
Tests passed: 8
Tests failed: 1
And passes after:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Table ID 0
TEST: Route removed in default VRF when source address deleted [ OK ]
Tests passed: 9
Tests failed: 0
Fixes: 5a56a0b3a4 ("net: Don't delete routes in different VRFs")
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f96a3d7455 ]
Cited commit added the table ID to the FIB info structure, but did not
prevent structures with different table IDs from being consolidated.
This can lead to routes being flushed from a VRF when an address is
deleted from a different VRF.
Fix by taking the table ID into account when looking for a matching FIB
info. This is already done for FIB info structures backed by a nexthop
object in fib_find_info_nh().
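
The consolidation problem can be sketched with a simplified Python model.
It is an illustration only, not the kernel's lookup code: the FibInfo
fields and the find_info()/add_route() helpers are hypothetical, and the
match_table_id switch exists just to contrast the behavior before and after
the fix.

```python
from typing import NamedTuple


class FibInfo(NamedTuple):
    tb_id: int       # routing table the route was added to
    prefsrc: str     # preferred source address
    gateway: str
    priority: int    # metric


def find_info(infos, candidate, match_table_id):
    """Return an existing entry that `candidate` may share, or None."""
    for existing in infos:
        same_params = (
            existing.prefsrc == candidate.prefsrc
            and existing.gateway == candidate.gateway
            and existing.priority == candidate.priority
        )
        # The fix: entries from different tables must not be shared.
        same_table = existing.tb_id == candidate.tb_id
        if same_params and (same_table or not match_table_id):
            return existing
    return None


def add_route(infos, candidate, match_table_id):
    """Reuse a matching entry if one exists, otherwise store a new one."""
    existing = find_info(infos, candidate, match_table_id)
    if existing is not None:
        return existing
    infos.append(candidate)
    return candidate


vrf_route = FibInfo(tb_id=10, prefsrc="192.0.2.17", gateway="192.0.2.2", priority=100)
main_route = FibInfo(tb_id=254, prefsrc="192.0.2.17", gateway="192.0.2.2", priority=100)

# Without the table ID check both routes end up sharing one entry.
old_infos = []
add_route(old_infos, vrf_route, match_table_id=False)
shared = add_route(old_infos, main_route, match_table_id=False)

# With the check each table gets its own entry.
new_infos = []
add_route(new_infos, vrf_route, match_table_id=True)
separate = add_route(new_infos, main_route, match_table_id=True)
```

When the entry is shared, flushing it on behalf of one VRF also affects the
route in the other, which is the cross-VRF flush described above.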
Add test cases that fail before the fix:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [FAIL]
TEST: Route in default VRF not removed [ OK ]
RTNETLINK answers: File exists
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [FAIL]
Tests passed: 6
Tests failed: 2
And pass after:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Tests passed: 8
Tests failed: 0
Fixes: 5a56a0b3a4 ("net: Don't delete routes in different VRFs")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>