linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-11 13:27:06 +09:00

Author	SHA1	Message	Date
Laurent Pinchart	e3a69496a1	media: Prefer designated initializers over memset for subdev pad ops Structures passed to subdev pad operations are all zero-initialized, but not always with the same kind of code constructs. While most drivers used designated initializers, which zero all the fields that are not specified, when declaring variables, some use memset(). Those two methods lead to the same end result, and, depending on compiler optimizations, may even be completely equivalent, but they're not consistent. Improve coding style consistency by using designated initializers instead of calling memset(). Where applicable, also move the variables to inner scopes of for loops to ensure correct initialization in all iterations. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Lad Prabhakar <prabhakar.csengg@gmail.com> # For am437x Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:46:06 +02:00
Laurent Pinchart	ecefa105cc	media: Zero-initialize all structures passed to subdev pad operations Several drivers call subdev pad operations, passing structures that are not fully zeroed. While the drivers initialize the fields they care about explicitly, this results in reserved fields having uninitialized values. Future kernel API changes that make use of those fields thus risk breaking proper driver operation in ways that could be hard to detect. To avoid this, make the code more robust by zero-initializing all the structures passed to subdev pad operation. Maintain a consistent coding style by preferring designated initializers (which zero-initialize all the fields that are not specified) over memset() where possible, and make variable declarations local to inner scopes where applicable. One notable exception to this rule is in the ipu3 driver, where a memset() is needed as the structure is not a local variable but a function parameter provided by the caller. Not all fields of those structures can be initialized when declaring the variables, as the values for those fields are computed later in the code. Initialize the 'which' field in all cases, and other fields when the variable declaration is so close to the v4l2_subdev_call() call that it keeps all the context easily visible when reading the code, to avoid hindering readability. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> # For vimc Reviewed-by: Lad Prabhakar <prabhakar.csengg@gmail.com> # For am437x Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> # For drivers/staging/media/imx/ Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:46:06 +02:00
Laurent Pinchart	a8ca0cf1ff	media: imx-jpeg: Fix incorrect indentation Variable initialization code in notify_src_chg() is incorrectly indented. Fix it. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:46:06 +02:00
Laurent Pinchart	a1e259872f	media: Fix indentation issues introduced by subdev-wide state struct Commit `0d346d2a6f` ("media: v4l2-subdev: add subdev-wide state struct") applied a large media tree-wide change produced by coccinelle. It was so large that a set of identical indentation issues went unnoticed during review. Fix them. While at it, and because it's easy to review both changes together, add a trailing comma for the last (and only) struct member initialization of the related structures, to avoid future changes should new fields need to be initialized. Fixes: `0d346d2a6f` ("media: v4l2-subdev: add subdev-wide state struct") Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:46:06 +02:00
Greg Kroah-Hartman	fba51482b6	Merge tag 'iio-for-6.4a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: 1st set of IIO new device support, features and cleanups for the 6.4 cycle. New device support * bosch,bmp280 - Add support for BMP580 - includes significant refactoring and general driver cleanup + support for non-volatile memory for trimming and config parameters. * rohm BU27034 - New driver for this 3 channel ambient light sensor. - New support library for devices where both integration time and amplifier gain are configurable. In these cases a scale change may require changing bother underlying values. This library module provides code to help with this. * st,accel - Add support for IIS328DQ (ID only as compatible wtih LIS331DL) * st,lsm6dsx - Add support for ASM330LHB automotive MEMS sensor. * ti,ads1100, ads1000 - New driver for these 16 bit ADCs. * ti,tmp117 - Add support for older tmp116 device. Includes some general driver cleanup. Staging driver drops * adi,ade7854 - Driver was a very long way from compliant with IIO infrastructure and ABI. If anyone wants a non staging version of this driver they are better off starting from scratch. Hence drop it and the associated meter.h header. Features * adi,ad7441r - Add DT binding to set sink current for digital input. * semtech,sx9324,9360 - Support older register mapping from firmware designed for windows. Core improvements. * Move iio_trigger_poll() docs to next to the implementation and add a note on expected caller context. * Rename iio_trigger_poll_chained() to iio_trigger_poll_nested() so as to use more standard / common terminology. * Improve main ABI docs references to offset and scale for raw values by making them consistent and clear. Cleanups and minor fixes: * adi,ad5592r - Add GPIO names - useful for debug. * adi,ad7441r - Fix current input, loop powered mode configuration setup. * adi,adis16475 - Fix wrong commented value for minimum advised lower rate. * adi,admv1013 - Use devm_clk_get_enabled() to reduce boilerplate. * adi,ads1210 - Fix wrong bits for writing config register (late fix and has been broken a long time so not rushed upstream) * amlogic,meson-saradc - Improve cleanup in error handling if BL30 handshake fails. * apex-embedded,stx104 - Migrate to regmap and use regmap_read_poll_timeout() to neatly handle retries. - Add local mutex to close various races. - Use define U16_MAX rather than value for limit. - Improve code readability with minor reorganization. * atmel,ad91-sama5d2 - Drop trivial dead code. * kionix,kx022a - Drop unused structure element. * linear,ltc2983 - Reorganize bindings doc to enable unevaluatedProperties to be set in one place for all child nodes. - Make binding for adi,custom-thermocouple accept signed values. * maxim,max44000 - Add OF Device matching (of_match_table was not correctly set). * maxim,max5522 - Missing static * measurement-computing,cio-dac - Fix wrong part name in comments. - Migrate to regmap. - Improve includes by replacing bitops.h with more direct bits.h * qcom,pm8xxx-xoadc - Remove a check that can never fail. * renesas,rcar-gyroadc - DT binding documentation improvements. - Tidy up an unused warning with __maybe_unused. * semtech,sx_common - Drop docs for a structure element that doesn't exist. * semtech,sx9500 - Drop ACPI_PTR() and of_match_ptr() protections that just complicate the code / block some firmware registration types that would otherwise work. * sensiron,sps30 - Comment formatting tidy up. * st,sensors - Drop duplicate text in DT binding. * st,stm32-adc - Add some missing static markings. * ti,ads1100 - Use correct return code in dev_err_probe() call. * x-powers,axp20x_adc - precursor series to simplify addition of AXP192. - General code cleanup / minor refactoring for better readabilty of code. - Switch from boolean value to mask for adc_en2 field to avoid hard coding a mask that will be different in AXP192 * tag 'iio-for-6.4a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (63 commits) MAINTAINERS: Add ROHM BU27034 iio: light: ROHM BU27034 Ambient Light Sensor dt-bindings: iio: light: Support ROHM BU27034 MAINTAINERS: Add IIO gain-time-scale helpers iio: light: Add gain-time-scale helpers doc: Make sysfs-bus-iio doc more exact iio: dac: set variable max5522_channels storage-class-specifier to static dt-bindings: iio: temperature: ltc2983: Make 'adi,custom-thermocouple' signed dt-bindings: iio: temperature: ltc2983: Fix child node unevaluated properties iio: addac: stx104: Use regmap_read_poll_timeout() for conversion poll iio: addac: stx104: Migrate to the regmap API iio: addac: stx104: Improve indentation in stx104_write_raw() iio: addac: stx104: Use define rather than hardcoded limit for write val iio: addac: stx104: Fix race condition when converting analog-to-digital iio: addac: stx104: Fix race condition for stx104_write_raw() dt-bindings: iio: st-sensors: Fix repeated text staging: iio: resolver: ads1210: fix config mode iio: adc: ti-ads1100: fix error code in probe() iio: accel: add support for IIS328DQ variant dt-bindings: iio: st-sensors: Add IIS328DQ accelerometer ...	2023-04-12 09:45:34 +02:00
David S. Miller	bbda0f0d15	Merge branch 'dsa-trace-events' Vladimir Oltean says: ==================== DSA trace events This series introduces the "dsa" trace event class, with the following events: $ trace-cmd list \| grep dsa dsa dsa:dsa_fdb_add_hw dsa:dsa_mdb_add_hw dsa:dsa_fdb_del_hw dsa:dsa_mdb_del_hw dsa:dsa_fdb_add_bump dsa:dsa_mdb_add_bump dsa:dsa_fdb_del_drop dsa:dsa_mdb_del_drop dsa:dsa_fdb_del_not_found dsa:dsa_mdb_del_not_found dsa:dsa_lag_fdb_add_hw dsa:dsa_lag_fdb_add_bump dsa:dsa_lag_fdb_del_hw dsa:dsa_lag_fdb_del_drop dsa:dsa_lag_fdb_del_not_found dsa:dsa_vlan_add_hw dsa:dsa_vlan_del_hw dsa:dsa_vlan_add_bump dsa:dsa_vlan_del_drop dsa:dsa_vlan_del_not_found These are useful to debug refcounting issues on CPU and DSA ports, where entries may remain lingering, or may be removed too soon, depending on bugs in higher layers of the network stack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-12 08:36:07 +01:00
Vladimir Oltean	02020bd70f	net: dsa: add trace points for VLAN operations These are not as critical as the FDB/MDB trace points (I'm not aware of outstanding VLAN related bugs), but maybe they are useful to somebody, either debugging something or simply trying to learn more. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-12 08:36:07 +01:00
Vladimir Oltean	9538ebce88	net: dsa: add trace points for FDB/MDB operations DSA performs non-trivial housekeeping of unicast and multicast addresses on shared (CPU and DSA) ports, and puts a bit of pressure on higher layers, requiring them to behave correctly (remove these addresses exactly as many times as they were added). Otherwise, either addresses linger around forever, or DSA returns -ENOENT complaining that entries that were already deleted must be deleted again. To aid debugging, introduce some trace points specifically for FDB and MDB - that's where some of the bugs still are right now. Some bugs I have seen were also due to race conditions, see: - `630fd4822a` ("net: dsa: flush switchdev workqueue on bridge join error path") - `a2614140dc` ("net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN") so it would be good to not disturb the timing too much, hence the choice to use trace points vs regular dev_dbg(). I've had these for some time on my computer in a less polished form, and they've proven useful. What I found most useful was to enable CONFIG_BOOTTIME_TRACING, add "trace_event=dsa" to the kernel cmdline, and run "cat /sys/kernel/debug/tracing/trace". This is to debug more complex environments with network managers started by the init system, things like that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-12 08:36:07 +01:00
Fritz Koenig	a47a3ae5fc	media: venus: Correct P010 buffer alignment According to msm_media_info.h the correct alignment for the stride of P010 buffers is 128. Signed-off-by: Fritz Koenig <frkoenig@chromium.org> Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:35:44 +02:00
Dikshita Agarwal	7493db46e4	venus: venc: add handling for VIDIOC_ENCODER_CMD Add handling for below commands in encoder: 1. V4L2_ENC_CMD_STOP 2. V4L2_ENC_CMD_START Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:35:44 +02:00
Viswanath Boma	90655e2e79	venus: Add support for min/max qp range. Currently QP range set from client is not communicated to video firmware. Add support for the QP range HFI to set the same to firmware. Signed-off-by: Viswanath Boma <quic_vboma@quicinc.com> Signed-off-by: Vikash Garodia <quic_vgarodia@quicinc.com> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:35:44 +02:00
Javier Martinez Canillas	a9d45ec74c	media: venus: dec: Fix capture formats enumeration order Commit `9593126dae` ("media: venus: Add a handling of QC08C compressed format") and commit `cef92b14e6` ("media: venus: Add a handling of QC10C compressed format") added support for the QC08C and QC10C compressed formats respectively. But these also caused a regression, because the new formats where added at the beginning of the vdec_formats[] array and the vdec_inst_init() function sets the default format output and capture using fixed indexes of that array: static void vdec_inst_init(struct venus_inst *inst) { ... inst->fmt_out = &vdec_formats[8]; inst->fmt_cap = &vdec_formats[0]; ... } Since now V4L2_PIX_FMT_NV12 is not the first entry in the array anymore, the default capture format is not set to that as it was done before. Both commits changed the first index to keep inst->fmt_out default format set to V4L2_PIX_FMT_H264, but did not update the latter to keep .fmt_out default format set to V4L2_PIX_FMT_NV12. Rather than updating the index to the current V4L2_PIX_FMT_NV12 position, let's reorder the entries so that this format is the first entry again. This would also make VIDIOC_ENUM_FMT report the V4L2_PIX_FMT_NV12 format with an index 0 as it did before the QC08C and QC10C formats were added. Fixes: `9593126dae` ("media: venus: Add a handling of QC08C compressed format") Fixes: `cef92b14e6` ("media: venus: Add a handling of QC10C compressed format") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:35:44 +02:00
Viswanath Boma	bfe1326573	venus: Fix for H265 decoding failure. Aligned the mismatch of persist1 and scratch1 buffer calculation, as per the firmware requirements. Signed-off-by: Vikash Garodia <vgarodia@qti.qualcomm.com> Signed-off-by: Viswanath Boma <quic_vboma@quicinc.com> Tested-by: Nathan Hebert <nhebert@chromium.org> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:35:44 +02:00
Michał Krawczyk	50248ad9f1	media: venus: dec: Fix handling of the start cmd The decoder driver should clear the last_buffer_dequeued flag of the capture queue upon receiving V4L2_DEC_CMD_START. The last_buffer_dequeued flag is set upon receiving EOS (which always happens upon receiving V4L2_DEC_CMD_STOP). Without this patch, after issuing the V4L2_DEC_CMD_STOP and V4L2_DEC_CMD_START, the vb2_dqbuf() function will always fail, even if the buffers are completed by the hardware. Fixes: `beac82904a` ("media: venus: make decoder compliant with stateful codec API") Signed-off-by: Michał Krawczyk <mk@semihalf.com> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:35:44 +02:00
Denis Plotnikov	7573099e10	qlcnic: check pci_reset_function result Static code analyzer complains to unchecked return value. The result of pci_reset_function() is unchecked. Despite, the issue is on the FLR supported code path and in that case reset can be done with pcie_flr(), the patch uses less invasive approach by adding the result check of pci_reset_function(). Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `7e2cf4feba` ("qlcnic: change driver hardware interface mechanism") Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-12 08:32:38 +01:00
Florian Fainelli	9c592f8ab1	media: rc: gpio-ir-recv: Fix support for wake-up The driver was intended from the start to be a wake-up source for the system, however due to the absence of a suitable call to device_set_wakeup_capable(), the device_may_wakeup() call used to decide whether to enable the GPIO interrupt as a wake-up source would never happen. Lookup the DT standard "wakeup-source" property and call device_init_wakeup() to ensure the device is flagged as being wakeup capable. Reported-by: Matthew Lear <matthew.lear@broadcom.com> Fixes: `fd0f6851eb` ("[media] rc: Add support for GPIO based IR Receiver driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:19:05 +02:00
Florian Fainelli	9fffbf9ca0	dt-bindings: media: gpio-ir-receiver: Document wakeup-souce property The GPIO IR receiver can be used as a wake-up source for the system, document that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2023-04-12 09:19:05 +02:00
Manivannan Sadhasivam	a4c716706f	dt-bindings: PCI: qcom: Update maintainers entry Stan is no longer working with MMSOL and expressed his interest to not continue maintaining Qcom PCIe driver. Since I took over the driver maintainership, I'm stepping in to maintain the binding also. Link: https://lore.kernel.org/r/20230308082424.140224-2-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>	2023-04-12 08:49:04 +02:00
Manivannan Sadhasivam	c0e1eb441b	PCI: qcom: Enable async probe by default Qcom PCIe RC driver waits for the PHY link to be up during the probe; this consumes several milliseconds during boot. Enable async probe by default so that other drivers can load in parallel while this driver waits for the link to be up. Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230320064644.5217-1-manivannan.sadhasivam@linaro.org Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org>	2023-04-12 08:49:04 +02:00
Manivannan Sadhasivam	ad9b9b6e36	PCI: qcom: Add support for system suspend and resume During the system suspend, vote for minimal interconnect bandwidth (1KiB) to keep the interconnect path active for config access and also turn OFF the resources like clock and PHY if there are no active devices connected to the controller. For the controllers with active devices, the resources are kept ON as removing the resources will trigger access violation during the late end of suspend cycle as kernel tries to access the config space of PCIe devices to mask the MSIs. Also, it is not desirable to put the link into L2/L3 state as that implies VDD supply will be removed and the devices may go into powerdown state. This will affect the lifetime of storage devices like NVMe. And finally, during resume, turn ON the resources if the controller was truly suspended (resources OFF) and update the interconnect bandwidth based on PCIe Gen speed. Suggested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Link: https://lore.kernel.org/r/20230403154922.20704-2-manivannan.sadhasivam@linaro.org Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Dhruva Gole <d-gole@ti.com>	2023-04-12 08:48:53 +02:00
Ye Bin	8ee81ed581	xfs: fix BUG_ON in xfs_getbmap() There's issue as follows: XFS: Assertion failed: (bmv->bmv_iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c, line: 329 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:102! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 14612 Comm: xfs_io Not tainted 6.3.0-rc2-next-20230315-00006-g2729d23ddb3b-dirty #422 RIP: 0010:assfail+0x96/0xa0 RSP: 0018:ffffc9000fa178c0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff888179a18000 RDX: 0000000000000000 RSI: ffff888179a18000 RDI: 0000000000000002 RBP: 0000000000000000 R08: ffffffff8321aab6 R09: 0000000000000000 R10: 0000000000000001 R11: ffffed1105f85139 R12: ffffffff8aacc4c0 R13: 0000000000000149 R14: ffff888269f58000 R15: 000000000000000c FS: 00007f42f27a4740(0000) GS:ffff88882fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000b92388 CR3: 000000024f006000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> xfs_getbmap+0x1a5b/0x1e40 xfs_ioc_getbmap+0x1fd/0x5b0 xfs_file_ioctl+0x2cb/0x1d50 __x64_sys_ioctl+0x197/0x210 do_syscall_64+0x39/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd Above issue may happen as follows: ThreadA ThreadB do_shared_fault __do_fault xfs_filemap_fault __xfs_filemap_fault filemap_fault xfs_ioc_getbmap -> Without BMV_IF_DELALLOC flag xfs_getbmap xfs_ilock(ip, XFS_IOLOCK_SHARED); filemap_write_and_wait do_page_mkwrite xfs_filemap_page_mkwrite __xfs_filemap_fault xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); iomap_page_mkwrite ... xfs_buffered_write_iomap_begin xfs_bmapi_reserve_delalloc -> Allocate delay extent xfs_ilock_data_map_shared(ip) xfs_getbmap_report_one ASSERT((bmv->bmv_iflags & BMV_IF_DELALLOC) != 0) -> trigger BUG_ON As xfs_filemap_page_mkwrite() only hold XFS_MMAPLOCK_SHARED lock, there's small window mkwrite can produce delay extent after file write in xfs_getbmap(). To solve above issue, just skip delalloc extents. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2023-04-12 15:49:44 +10:00
Darrick J. Wong	22ed903eee	xfs: verify buffer contents when we skip log replay syzbot detected a crash during log recovery: XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791 XFS (loop0): Torn write (CRC failure) detected at log block 0x180. Truncating head block from 0x200. XFS (loop0): Starting recovery (logdev: internal) ================================================================== BUG: KASAN: slab-out-of-bounds in xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813 Read of size 8 at addr ffff88807e89f258 by task syz-executor132/5074 CPU: 0 PID: 5074 Comm: syz-executor132 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:306 print_report+0x107/0x1f0 mm/kasan/report.c:417 kasan_report+0xcd/0x100 mm/kasan/report.c:517 xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813 xfs_btree_lookup+0x346/0x12c0 fs/xfs/libxfs/xfs_btree.c:1913 xfs_btree_simple_query_range+0xde/0x6a0 fs/xfs/libxfs/xfs_btree.c:4713 xfs_btree_query_range+0x2db/0x380 fs/xfs/libxfs/xfs_btree.c:4953 xfs_refcount_recover_cow_leftovers+0x2d1/0xa60 fs/xfs/libxfs/xfs_refcount.c:1946 xfs_reflink_recover_cow+0xab/0x1b0 fs/xfs/xfs_reflink.c:930 xlog_recover_finish+0x824/0x920 fs/xfs/xfs_log_recover.c:3493 xfs_log_mount_finish+0x1ec/0x3d0 fs/xfs/xfs_log.c:829 xfs_mountfs+0x146a/0x1ef0 fs/xfs/xfs_mount.c:933 xfs_fs_fill_super+0xf95/0x11f0 fs/xfs/xfs_super.c:1666 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f89fa3f4aca Code: 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fffd5fb5ef8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00646975756f6e2c RCX: 00007f89fa3f4aca RDX: 0000000020000100 RSI: 0000000020009640 RDI: 00007fffd5fb5f10 RBP: 00007fffd5fb5f10 R08: 00007fffd5fb5f50 R09: 000000000000970d R10: 0000000000200800 R11: 0000000000000206 R12: 0000000000000004 R13: 0000555556c6b2c0 R14: 0000000000200800 R15: 00007fffd5fb5f50 </TASK> The fuzzed image contains an AGF with an obviously garbage agf_refcount_level value of 32, and a dirty log with a buffer log item for that AGF. The ondisk AGF has a higher LSN than the recovered log item. xlog_recover_buf_commit_pass2 reads the buffer, compares the LSNs, and decides to skip replay because the ondisk buffer appears to be newer. Unfortunately, the ondisk buffer is corrupt, but recovery just read the buffer with no buffer ops specified: error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len, buf_flags, &bp, NULL); Skipping the buffer leaves its contents in memory unverified. This sets us up for a kernel crash because xfs_refcount_recover_cow_leftovers reads the buffer (which is still around in XBF_DONE state, so no read verification) and creates a refcountbt cursor of height 32. This is impossible so we run off the end of the cursor object and crash. Fix this by invoking the verifier on all skipped buffers and aborting log recovery if the ondisk buffer is corrupt. It might be smarter to force replay the log item atop the buffer and then see if it'll pass the write verifier (like ext4 does) but for now let's go with the conservative option where we stop immediately. Link: https://syzkaller.appspot.com/bug?extid=7e9494b8b399902e994e Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2023-04-12 15:49:23 +10:00
Darrick J. Wong	c95356ca88	xfs: _{attr,data}_map_shared should take ILOCK_EXCL until iread_extents is completely done While fuzzing the data fork extent count on a btree-format directory with xfs/375, I observed the following (excerpted) splat: XFS: Assertion failed: xfs_isilocked(ip, XFS_ILOCK_EXCL), file: fs/xfs/libxfs/xfs_bmap.c, line: 1208 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 43192 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs] Call Trace: <TASK> xfs_iread_extents+0x1af/0x210 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_dir_walk+0xb8/0x190 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_parent_count_parent_dentries+0x41/0x80 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_parent_validate+0x199/0x2e0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_parent+0xdf/0x130 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_scrub_metadata+0x2b8/0x730 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_scrubv_metadata+0x38b/0x4d0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_ioc_scrubv_metadata+0x111/0x160 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_file_ioctl+0x367/0xf50 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] __x64_sys_ioctl+0x82/0xa0 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 The cause of this is a race condition in xfs_ilock_data_map_shared, which performs an unlocked access to the data fork to guess which lock mode it needs: Thread 0 Thread 1 xfs_need_iread_extents <observe no iext tree> xfs_ilock(..., ILOCK_EXCL) xfs_iread_extents <observe no iext tree> <check ILOCK_EXCL> <load bmbt extents into iext> <notice iext size doesn't match nextents> xfs_need_iread_extents <observe iext tree> xfs_ilock(..., ILOCK_SHARED) <tear down iext tree> xfs_iunlock(..., ILOCK_EXCL) xfs_iread_extents <observe no iext tree> <check ILOCK_EXCL> BOOM Fix this race by adding a flag to the xfs_ifork structure to indicate that we have not yet read in the extent records and changing the predicate to look at the flag state, not if_height. The memory barrier ensures that the flag will not be set until the very end of the function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2023-04-12 15:49:10 +10:00
Dave Chinner	4b827b3f30	xfs: remove WARN when dquot cache insertion fails It just creates unnecessary bot noise these days. Reported-by: syzbot+6ae213503fb12e87934f@syzkaller.appspotmail.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>	2023-04-12 15:48:59 +10:00
Dave Chinner	aa88019851	xfs: don't consider future format versions valid In commit `fe08cc5044` we reworked the valid superblock version checks. If it is a V5 filesystem, it is always valid, then we checked if the version was less than V4 (reject) and then checked feature fields in the V4 flags to determine if it was valid. What we missed was that if the version is not V4 at this point, we shoudl reject the fs. i.e. the check current treats V6+ filesystems as if it was a v4 filesystem. Fix this. cc: stable@vger.kernel.org Fixes: `fe08cc5044` ("xfs: open code sb verifier feature checks") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>	2023-04-12 15:48:50 +10:00
Jakub Kicinski	adacf21f1c	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== iavf: fix racing in VLANs Ahmed Zaki says: This patchset mainly fixes a racing issue in the iavf where the number of VLANs in the vlan_filter_list might be more than the PF limit. To fix that, we get rid of the cvlans and svlans bitmaps and keep all the required info in the list. The second patch adds two new states that are needed so that we keep the VLAN info while the interface goes DOWN: -- DISABLE (notify PF, but keep the filter in the list) -- INACTIVE (dev is DOWN, filter is removed from PF) Finally, the current code keeps each state in a separate bit field, which is error prone. The first patch refactors that by replacing all bits with a single enum. The changes are minimal where each bit change is replaced with the new state value. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: remove active_cvlans and active_svlans bitmaps iavf: refactor VLAN filter states ==================== Link: https://lore.kernel.org/r/20230407210730.3046149-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-11 21:37:53 -07:00
Jakub Kicinski	160c13175e	Merge tag 'for-net-2023-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix not setting Dath Path for broadcast sink - Fix not cleaning up on LE Connection failure - SCO: Fix possible circular locking dependency - L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp} - Fix race condition in hidp_session_thread - btbcm: Fix logic error in forming the board name - btbcm: Fix use after free in btsdio_remove * tag 'for-net-2023-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp} Bluetooth: Set ISO Data Path on broadcast sink Bluetooth: hci_conn: Fix possible UAF Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm bluetooth: btbcm: Fix logic error in forming the board name. Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition Bluetooth: Fix race condition in hidp_session_thread Bluetooth: Fix printing errors if LE Connection times out Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure ==================== Link: https://lore.kernel.org/r/20230410172718.4067798-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-11 21:18:23 -07:00
Andrew Lunn	18bb56ab44	net: dsa: mv88e6xxx: Correct cmode to PHY_INTERFACE_ The switch can either take the MAC or the PHY role in an MII or RMII link. There are distinct PHY_INTERFACE_ macros for these two roles. Correct the mapping so that the `REV` version is used for the PHY role. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230411023541.2372609-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-11 21:17:51 -07:00
Yevgeny Kliteynik	108ff8215b	net/mlx5: DR, Add modify-header-pattern ICM pool There is a new ICM area for that memory, so we need to handle it as we did for the others ICM types. The patch added that specific pool with its requirements and management. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:38 -07:00
Yevgeny Kliteynik	1e5cc7369b	net/mlx5: DR, Prepare sending new WQE type The send engine should be ready to handle more opcodes in addition to RDMA_WRITE/RDMA_READ. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:38 -07:00
Yevgeny Kliteynik	977c4a3e7c	net/mlx5: Add new WQE for updating flow table Add new WQE type: FLOW_TBL_ACCESS, which will be used for writing modify header arguments. This type has specific control segment and special data segment. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:37 -07:00
Yevgeny Kliteynik	9fa7f1de3d	net/mlx5: Add mlx5_ifc bits for modify header argument Add enum value for modify-header argument object and mlx5_bits for the related capabilities. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:37 -07:00
Yevgeny Kliteynik	cee6484edd	net/mlx5: DR, Set counter ID on the last STE for STEv1 TX In STEv1 counter action can be set either by filling counter ID on STE, in which case it is executed before other actions on this STE, or as a single action, in which case it is executed in accordance with the actions order. FW steering on STEv1 devices implements counter as counter ID on STE, and this counter is set on the last STE. Fix SMFS to be consistent with this behaviour - move TX counter to the last STE, this way the counter will include all actions of the previous STEs that might have changed packet headers length, e.g. encap, vlan push, etc. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:37 -07:00
Parav Pandit	9df839a711	net/mlx5: Create a new profile for SFs Create a new profile for SFs in order to disable the command cache. Each function command cache consumes ~500KB of memory, when using a large number of SFs this savings is notable on memory constarined systems. Use a new profile to provide for future differences between SFs and PFs. The mr_cache not used for non-PF functions, so it is excluded from the new profile. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:37 -07:00
Vlad Buslov	55f3e740f7	net/mlx5: Bridge, add tracepoints for multicast Pass target struct net_device to mdb attach/detach handler in order to expose the port name to the new tracepoints. Implemented following tracepoints: - Attach mdb to port. - Detach mdb from port. Usage example: ># cd /sys/kernel/debug/tracing ># echo mlx5:mlx5_esw_bridge_port_mdb_attach >> set_event ># cat trace ... kworker/0:0-19071 [000] ..... 259004.253848: mlx5_esw_bridge_port_mdb_attach: net_device=enp8s0f0_0 addr=33:33:ff:00:00:01 vid=0 num_ports=1 offloaded=1 Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:37 -07:00
Vlad Buslov	70f0302b3f	net/mlx5: Bridge, implement mdb offload Implement support for add/del SWITCHDEV_OBJ_ID_PORT_MDB events. For mdb destination addresses configure egress table rules to replicate to per-port multicast tables of all ports that are member of the multicast group as illustrated by 'MDB1' rule in the following diagram: +--------+--+ +---------------------------------------> Port 1 \| \| \| +-^------+--+ \| \| \| \| +-----------------------------------------+ \| +---------------------------+ \| \| EGRESS table \| \| +--> PORT 1 multicast table \| \| +----------------------------------+ +-----------------------------------------+ \| \| +---------------------------+ \| \| INGRESS table \| \| \| \| \| \| \| \| +----------------------------------+ \| dst_mac=P1,vlan=X -> pop vlan, goto P1 +--+ \| \| FG0: \| \| \| \| \| dst_mac=P1,vlan=Y -> pop vlan, goto P1 \| \| \| src_port=dst_port -> drop \| \| \| src_mac=M1,vlan=X -> goto egress +---> dst_mac=P2,vlan=X -> pop vlan, goto P2 +--+ \| \| FG1: \| \| \| ... \| \| dst_mac=P2,vlan=Y -> goto P2 \| \| \| \| VLAN X -> pop, goto port \| \| \| \| \| dst_mac=MDB1,vlan=Y -> goto mcast P1,P2 +-----+ \| ... \| \| +----------------------------------+ \| \| \| \| \| VLAN Y -> pop, goto port +-------+ +-----------------------------------------+ \| \| \| FG3: \| \| \| \| matchall -> goto port \| \| \| \| \| \| \| +---------------------------+ \| \| \| \| \| \| +--------+--+ +---------------------------------------> Port 2 \| \| \| +-^------+--+ \| \| \| \| \| +---------------------------+ \| +--> PORT 2 multicast table \| \| +---------------------------+ \| \| \| \| \| FG0: \| \| \| src_port=dst_port -> drop \| \| \| FG1: \| \| \| VLAN X -> pop, goto port \| \| \| ... \| \| \| \| \| \| FG3: \| \| \| matchall -> goto port +-------+ \| \| +---------------------------+ MDB is managed by extending mlx5 bridge to store an entry in mlx5_esw_bridge->mdb_list linked list (used to iterate over all offloaded MDBs) and mlx5_esw_bridge->mdb_ht hash table (used to lookup existing MDB by MAC+VLAN). Every MDB entry can be attached to arbitrary amount of bridge ports that are stored in mlx5_esw_bridge_mdb_entry->ports xarray in order to allow both efficient lookup of the port and also iteration over all ports that the entry is attached to. Every time MDB is attached/detached to/from a port, the hardware rule is recreated with list of destinations corresponding to all attached ports. When the entry is detached from the last port it is removed from mdb and destroyed which means that the ports xarray also acts as implicit reference counting mechanism. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:37 -07:00
Vlad Buslov	b5e80625d1	net/mlx5: Bridge, support multicast VLAN pop When VLAN with 'untagged' flag is created on port also provision the per-port multicast table rule to pop the VLAN during packet replication. This functionality must be in per-port table because some subset of ports that are member of multicast group can require just a match on VLAN (trunk mode) while other subset can be configured to remove the VLAN tag from packets received on the ports (access mode). Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:36 -07:00
Vlad Buslov	272ecfc92f	net/mlx5: Bridge, add per-port multicast replication tables Multicast replication requires adding one more level of FDB_BR_OFFLOAD priority flow tables. The new level is used for per-port multicast-specific tables that have following flow groups structure (flow highest to lowest priority): - Flow group of size one that matches on source port metadata. This will have a static single rule that prevent packets from being replicated to their source port. - Flow group of size one that matches all packets and forwards them to the port that owns the table. Initialize the table dynamically on all bridge ports when adding a port to the bridge that has multicast enabled and on all existing bridge ports when receiving multicast enable notification. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:36 -07:00
Vlad Buslov	18c2916cee	net/mlx5: Bridge, snoop igmp/mld packets Handle SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED attribute notification to dynamically toggle bridge multicast offload. Set new MLX5_ESW_BRIDGE_MCAST_FLAG bridge flag when multicast offload is enabled. Put multicast-specific code into new bridge_mcast.c file. When initializing bridge multicast pipeline create a static rule for snooping on IGMP traffic and three rules for snooping on MLD traffic (for query, report and done message types). Note that matching MLD traffic requires having flexparser MLX5_FLEX_PROTO_ICMPV6 capability enabled. By default Linux bridge is created with multicast enabled which can be modified by 'mcast_snooping' argument: $ ip link set name my_bridge type bridge mcast_snooping 0 Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:36 -07:00
Vlad Buslov	b99c4ef29e	net/mlx5: Bridge, extract code to lookup parent bridge of port The pattern when function looks up a port by vport_num+vhca_id tuple in order to just obtain its parent bridge is repeated multiple times in bridge.c file. Further commits in this series use the pattern even more. Extract the pattern to standalone mlx5_esw_bridge_from_port_lookup() function to improve code readability. This commits doesn't change functionality. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:36 -07:00
Vlad Buslov	6767c97d7a	net/mlx5: Bridge, move additional data structures to priv header Following patches in series will require accessing flow tables and groups sizes, table levels and struct mlx5_esw_bridge from new the new source file dedicated to multicast code. Expose these data in bridge_priv.h to reduce clutter in following patches that will implement the actual functionality. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:36 -07:00
Vlad Buslov	9071b423c3	net/mlx5: Bridge, increase bridge tables sizes Bridge ingress and egress tables got more flow groups recently for QinQ support and will get more in following patches of this series. Increase the sizes of the tables to allow offloading more flows in each mode. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:36 -07:00
Vlad Buslov	e5688f6fb9	net/mlx5: Add mlx5_ifc definitions for bridge multicast support Add the required hardware definitions to mlx5_ifc: fdb_uplink_hairpin, fdb_multi_path_any_table_limit_regc, fdb_multi_path_any_table. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2023-04-11 20:57:35 -07:00
Eric Biggers	0483913921	fsverity: reject FS_IOC_ENABLE_VERITY on mode 3 fds Commit `56124d6c87` ("fsverity: support enabling with tree block size < PAGE_SIZE") changed FS_IOC_ENABLE_VERITY to use __kernel_read() to read the file's data, instead of direct pagecache accesses. An unintended consequence of this is that the 'WARN_ON_ONCE(!(file->f_mode & FMODE_READ))' in __kernel_read() became reachable by fuzz tests. This happens if FS_IOC_ENABLE_VERITY is called on a fd opened with access mode 3, which means "ioctl access only". Arguably, FS_IOC_ENABLE_VERITY should work on ioctl-only fds. But ioctl-only fds are a weird Linux extension that is rarely used and that few people even know about. (The documentation for FS_IOC_ENABLE_VERITY even specifically says it requires O_RDONLY.) It's probably not worthwhile to make the ioctl internally open a new fd just to handle this case. Thus, just reject the ioctl on such fds for now. Fixes: `56124d6c87` ("fsverity: support enabling with tree block size < PAGE_SIZE") Reported-by: syzbot+51177e4144d764827c45@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=2281afcbbfa8fdb92f9887479cc0e4180f1c6b28 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230406215106.235829-1-ebiggers@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2023-04-11 19:23:23 -07:00
Eric Biggers	39049b69ec	fsverity: explicitly check for buffer overflow in build_merkle_tree() The new Merkle tree construction algorithm is a bit fragile in that it may overflow the 'root_hash' array if the tree actually generated does not match the calculated tree parameters. This should never happen unless there is a filesystem bug that allows the file size to change despite deny_write_access(), or a bug in the Merkle tree logic itself. Regardless, it's fairly easy to check for buffer overflow here, so let's do so. This is a robustness improvement only; this case is not currently known to be reachable. I've added a Fixes tag anyway, since I recommend that this be included in kernels that have the mentioned commit. Fixes: `56124d6c87` ("fsverity: support enabling with tree block size < PAGE_SIZE") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230328041505.110162-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>	2023-04-11 19:23:23 -07:00
Eric Biggers	8eb8af4b3d	fsverity: use WARN_ON_ONCE instead of WARN_ON As per Linus's suggestion (https://lore.kernel.org/r/CAHk-=whefxRGyNGzCzG6BVeM=5vnvgb-XhSeFJVxJyAxAF8XRA@mail.gmail.com), use WARN_ON_ONCE instead of WARN_ON. This barely adds any extra overhead, and it makes it so that if any of these ever becomes reachable (they shouldn't, but that's the point), the logs can't be flooded. Link: https://lore.kernel.org/r/20230406181542.38894-1-ebiggers@kernel.org Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>	2023-04-11 19:23:15 -07:00
Darrick J. Wong	7ba83850ca	xfs: deprecate the ascii-ci feature This feature is a mess -- the hash function has been broken for the entire 15 years of its existence if you create names with extended ascii bytes; metadump name obfuscation has silently failed for just as long; and the feature clashes horribly with the UTF8 encodings that most systems use today. There is exactly one fstest for this feature. In other words, this feature is crap. Let's deprecate it now so we can remove it from the codebase in 2030. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2023-04-11 19:05:19 -07:00
Darrick J. Wong	6db09a8d03	xfs: test the ascii case-insensitive hash Now that we've made kernel and userspace use the same tolower code for computing directory index hashes, add that to the selftest code. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2023-04-11 19:05:05 -07:00
Darrick J. Wong	a9248538fa	xfs: stabilize the dirent name transformation function used for ascii-ci dir hash computation Back in the old days, the "ascii-ci" feature was created to implement case-insensitive directory entry lookups for latin1-encoded names and remove the large overhead of Samba's case-insensitive lookup code. UTF8 names were not allowed, but nobody explicitly wrote in the documentation that this was only expected to work if the system used latin1 names. The kernel tolower function was selected to prepare names for hashed lookups. There's a major discrepancy in the function that computes directory entry hashes for filesystems that have ASCII case-insensitive lookups enabled. The root of this is that the kernel and glibc's tolower implementations have differing behavior for extended ASCII accented characters. I wrote a program to spit out characters for which the tolower() return value is different from the input: glibc tolower: 65:A 66:B 67:C 68:D 69:E 70:F 71:G 72:H 73:I 74:J 75:K 76:L 77:M 78:N 79:O 80:P 81:Q 82:R 83:S 84:T 85:U 86:V 87:W 88:X 89:Y 90:Z kernel tolower: 65:A 66:B 67:C 68:D 69:E 70:F 71:G 72:H 73:I 74:J 75:K 76:L 77:M 78:N 79:O 80:P 81:Q 82:R 83:S 84:T 85:U 86:V 87:W 88:X 89:Y 90:Z 192:À 193:Á 194:Â 195:Ã 196:Ä 197:Å 198:Æ 199:Ç 200:È 201:É 202:Ê 203:Ë 204:Ì 205:Í 206:Î 207:Ï 208:Ð 209:Ñ 210:Ò 211:Ó 212:Ô 213:Õ 214:Ö 215:× 216:Ø 217:Ù 218:Ú 219:Û 220:Ü 221:Ý 222:Þ Which means that the kernel and userspace do not agree on the hash value for a directory filename that contains those higher values. The hash values are written into the leaf index block of directories that are larger than two blocks in size, which means that xfs_repair will flag these directories as having corrupted hash indexes and rewrite the index with hash values that the kernel now will not recognize. Because the ascii-ci feature is not frequently enabled and the kernel touches filesystems far more frequently than xfs_repair does, fix this by encoding the kernel's toupper predicate and tolower functions into libxfs. Give the new functions less provocative names to make it really obvious that this is a pre-hash name preparation function, and nothing else. This change makes userspace's behavior consistent with the kernel. Found by auditing obfuscate_name in xfs_metadump as part of working on parent pointers, wondering how it could possibly work correctly with ci filesystems, writing a test tool to create a directory with hash-colliding names, and watching xfs_repair flag it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2023-04-11 19:05:04 -07:00
Darrick J. Wong	4f5e304248	xfs: cross-reference rmap records with refcount btrees Strengthen the rmap btree record checker a little more by comparing OWN_REFCBT reverse mappings against the refcount btrees. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2023-04-11 19:00:39 -07:00

... 89 90 91 92 93 ...

1185157 Commits