From c92a7a52243871eb10e0ea2260685def47fb5094 Mon Sep 17 00:00:00 2001 From: David Vernet Date: Wed, 12 Oct 2022 18:20:14 -0500 Subject: [PATCH 01/57] bpf: Allow bpf_user_ringbuf_drain() callbacks to return 1 The bpf_user_ringbuf_drain() helper function allows a BPF program to specify a callback that is invoked when draining entries from a BPF_MAP_TYPE_USER_RINGBUF ring buffer map. The API is meant to allow the callback to return 0 if it wants to continue draining samples, and 1 if it's done draining. Unfortunately, bpf_user_ringbuf_drain() landed shortly after commit 1bfe26fb0827 ("bpf: Add verifier support for custom callback return range"), which changed the default behavior of callbacks to only support returning 0. This patch corrects that oversight by allowing bpf_user_ringbuf_drain() callbacks to return 0 or 1. A follow-on patch will update the user_ringbuf selftests to also return 1 from a bpf_user_ringbuf_drain() callback to prevent this from regressing in the future. Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper") Signed-off-by: David Vernet Signed-off-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/20221012232015.1510043-2-void@manifault.com --- kernel/bpf/verifier.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 6f6d2d511c06..9ab7188d8f68 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -6946,6 +6946,7 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env, __mark_reg_not_init(env, &callee->regs[BPF_REG_5]); callee->in_callback_fn = true; + callee->callback_ret_range = tnum_range(0, 1); return 0; } From 6e44b9f375a3135fc4960d76a9ea6720625cad73 Mon Sep 17 00:00:00 2001 From: David Vernet Date: Wed, 12 Oct 2022 18:20:15 -0500 Subject: [PATCH 02/57] selftests/bpf: Make bpf_user_ringbuf_drain() selftest callback return 1 In commit 1bfe26fb0827 ("bpf: Add verifier support for custom callback return range"), the verifier was updated to require callbacks to BPF helpers to explicitly specify the range of values that can be returned. bpf_user_ringbuf_drain() was merged after this in commit 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper"), and this change in default behavior was missed. This patch updates the BPF_MAP_TYPE_USER_RINGBUF selftests to also return 1 from a bpf_user_ringbuf_drain() callback so as to properly test this going forward. Signed-off-by: David Vernet Signed-off-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/20221012232015.1510043-3-void@manifault.com --- tools/testing/selftests/bpf/progs/user_ringbuf_success.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/user_ringbuf_success.c b/tools/testing/selftests/bpf/progs/user_ringbuf_success.c index 099c23d9aa21..b39093dd5715 100644 --- a/tools/testing/selftests/bpf/progs/user_ringbuf_success.c +++ b/tools/testing/selftests/bpf/progs/user_ringbuf_success.c @@ -47,14 +47,14 @@ record_sample(struct bpf_dynptr *dynptr, void *context) if (status) { bpf_printk("bpf_dynptr_read() failed: %d\n", status); err = 1; - return 0; + return 1; } } else { sample = bpf_dynptr_data(dynptr, 0, sizeof(*sample)); if (!sample) { bpf_printk("Unexpectedly failed to get sample\n"); err = 2; - return 0; + return 1; } stack_sample = *sample; } From 17747577bbcb496e1b1c4096d64c2fc1e7bc0fef Mon Sep 17 00:00:00 2001 From: Siarhei Volkau Date: Sun, 16 Oct 2022 18:35:48 +0300 Subject: [PATCH 03/57] pinctrl: Ingenic: JZ4755 bug fixes Fixes UART1 function bits and MMC groups typo. For pins 0x97,0x99 function 0 is designated to PWM3/PWM5 respectively, function is 1 designated to the UART1. Diff from v1: - sent separately - added tag Fixes Cc: stable@vger.kernel.org Fixes: b582b5a434d3 ("pinctrl: Ingenic: Add pinctrl driver for JZ4755.") Tested-by: Siarhei Volkau Signed-off-by: Siarhei Volkau Link: https://lore.kernel.org/r/20221016153548.3024209-1-lis8215@gmail.com Signed-off-by: Linus Walleij --- drivers/pinctrl/pinctrl-ingenic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pinctrl/pinctrl-ingenic.c b/drivers/pinctrl/pinctrl-ingenic.c index 7e732076dedf..9e46d83e5138 100644 --- a/drivers/pinctrl/pinctrl-ingenic.c +++ b/drivers/pinctrl/pinctrl-ingenic.c @@ -667,7 +667,7 @@ static u8 jz4755_lcd_24bit_funcs[] = { 1, 1, 1, 1, 0, 0, }; static const struct group_desc jz4755_groups[] = { INGENIC_PIN_GROUP("uart0-data", jz4755_uart0_data, 0), INGENIC_PIN_GROUP("uart0-hwflow", jz4755_uart0_hwflow, 0), - INGENIC_PIN_GROUP("uart1-data", jz4755_uart1_data, 0), + INGENIC_PIN_GROUP("uart1-data", jz4755_uart1_data, 1), INGENIC_PIN_GROUP("uart2-data", jz4755_uart2_data, 1), INGENIC_PIN_GROUP("ssi-dt-b", jz4755_ssi_dt_b, 0), INGENIC_PIN_GROUP("ssi-dt-f", jz4755_ssi_dt_f, 0), @@ -721,7 +721,7 @@ static const char *jz4755_ssi_groups[] = { "ssi-ce1-b", "ssi-ce1-f", }; static const char *jz4755_mmc0_groups[] = { "mmc0-1bit", "mmc0-4bit", }; -static const char *jz4755_mmc1_groups[] = { "mmc0-1bit", "mmc0-4bit", }; +static const char *jz4755_mmc1_groups[] = { "mmc1-1bit", "mmc1-4bit", }; static const char *jz4755_i2c_groups[] = { "i2c-data", }; static const char *jz4755_cim_groups[] = { "cim-data", }; static const char *jz4755_lcd_groups[] = { From d21f4b7ffc22c009da925046b69b15af08de9d75 Mon Sep 17 00:00:00 2001 From: Douglas Anderson Date: Fri, 14 Oct 2022 10:33:18 -0700 Subject: [PATCH 04/57] pinctrl: qcom: Avoid glitching lines when we first mux to output Back in the description of commit e440e30e26dd ("arm64: dts: qcom: sc7180: Avoid glitching SPI CS at bootup on trogdor") we described a problem that we were seeing on trogdor devices. I'll re-summarize here but you can also re-read the original commit. On trogdor devices, the BIOS is setting up the SPI chip select as: - mux special function (SPI chip select) - output enable - output low (unused because we've muxed as special function) In the kernel, however, we've moved away from using the chip select line as special function. Since the kernel wants to fully control the chip select it's far more efficient to treat the line as a GPIO rather than sending packet-like commands to the GENI firmware every time we want the line to toggle. When we transition from how the BIOS had the pin configured to how the kernel has the pin configured we end up glitching the line. That's because we _first_ change the mux of the line and then later set its output. This glitch is bad and can confuse the device on the other end of the line. The old commit e440e30e26dd ("arm64: dts: qcom: sc7180: Avoid glitching SPI CS at bootup on trogdor") fixed the glitch, though the solution was far from elegant. It essentially did the thing that everyone always hates: encoding a sequential program in device tree, even if it's a simple one. It also, unfortunately, got broken by commit b991f8c3622c ("pinctrl: core: Handling pinmux and pinconf separately"). After that commit we did all the muxing _first_ even though the config (set the pin to output high) was listed first. :( I looked at ideas for how to solve this more properly. My first thought was to use the "init" pinctrl state. In theory the "init" pinctrl state is supposed to be exactly for achieving glitch-free transitions. My dream would have been for the "init" pinctrl to do nothing at all. That would let us delay the automatic pin muxing until the driver could set things up and call pinctrl_init_done(). In other words, my dream was: /* Request the GPIO; init it 1 (because DT says GPIO_ACTIVE_LOW) */ devm_gpiod_get_index(dev, "cs", GPIOD_OUT_LOW); /* Output should be right, so we can remux, yay! */ pinctrl_init_done(dev); Unfortunately, it didn't work out. The primary reason is that the MSM GPIO driver implements gpio_request_enable(). As documented in pinmux.h, that function automatically remuxes a line as a GPIO. ...and it does this remuxing _before_ specifying the output of the pin. You can see in gpiod_get_index() that we call gpiod_request() before gpiod_configure_flags(). gpiod_request() isn't passed any flags so it has no idea what the eventual output will be. We could have debates about whether or not the automatic remuxing to GPIO for the MSM pinctrl was a good idea or not, but at this point I think there is a plethora of code that's relying on it and I certainly wouldn't suggest changing it. Alternatively, we could try to come up with a way to pass the initial output state to gpio_request_enable() and plumb all that through. That seems like it would be doable, but we'd have to plumb it through several layers in the stack. This patch implements yet another alternative. Here, we specifically avoid glitching the first time a pin is muxed to GPIO function if the direction of the pin is output. The idea is that we can read the state of the pin before we set the mux and make sure that the re-mux won't change the state. NOTES: - We only do this the first time since later swaps between mux states might want to preserve the old output value. In other words, I wouldn't want to break a driver that did: gpiod_set_value(g, 1); pinctrl_select_state(pinctrl, special_state); pinctrl_select_default_state(); /* We should be driving 1 even if "special_state" made the pin 0 */ - It's safe to do this the first time since the driver _couldn't_ have explicitly set a state. In order to even be able to control the GPIO (at least using gpiod) we have to have requested it which would have counted as the first mux. - In theory, instead of keeping track of the first time a pin was set as a GPIO we could enable the glitch-free behavior only when msm_pinmux_request_gpio() is in the callchain. That works an enables my "dream" implementation above where we use an "init" state to solve this. However, it's nice not to have to do this. By handling just the first transition to GPIO we can simply let the normal "default" remuxing happen and we can be assured that there won't be a glitch. Before this change I could see the glitch reported on the EC console when booting. It would say this when booting the kernel: Unexpected state 1 in CSNRE ISR After this change there is no error reported. Note that I haven't reproduced the original problem described in e440e30e26dd ("arm64: dts: qcom: sc7180: Avoid glitching SPI CS at bootup on trogdor") but I could believe it might happen in certain timing conditions. Fixes: b991f8c3622c ("pinctrl: core: Handling pinmux and pinconf separately") Signed-off-by: Douglas Anderson Reviewed-by: Stephen Boyd Link: https://lore.kernel.org/r/20221014103217.1.I656bb2c976ed626e5d37294eb252c1cf3be769dc@changeid Signed-off-by: Linus Walleij --- drivers/pinctrl/qcom/pinctrl-msm.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c index a2abfe987ab1..8bf8b21954fe 100644 --- a/drivers/pinctrl/qcom/pinctrl-msm.c +++ b/drivers/pinctrl/qcom/pinctrl-msm.c @@ -51,6 +51,7 @@ * detection. * @skip_wake_irqs: Skip IRQs that are handled by wakeup interrupt controller * @disabled_for_mux: These IRQs were disabled because we muxed away. + * @ever_gpio: This bit is set the first time we mux a pin to gpio_func. * @soc: Reference to soc_data of platform specific data. * @regs: Base addresses for the TLMM tiles. * @phys_base: Physical base address @@ -72,6 +73,7 @@ struct msm_pinctrl { DECLARE_BITMAP(enabled_irqs, MAX_NR_GPIO); DECLARE_BITMAP(skip_wake_irqs, MAX_NR_GPIO); DECLARE_BITMAP(disabled_for_mux, MAX_NR_GPIO); + DECLARE_BITMAP(ever_gpio, MAX_NR_GPIO); const struct msm_pinctrl_soc_data *soc; void __iomem *regs[MAX_NR_TILES]; @@ -218,6 +220,25 @@ static int msm_pinmux_set_mux(struct pinctrl_dev *pctldev, val = msm_readl_ctl(pctrl, g); + /* + * If this is the first time muxing to GPIO and the direction is + * output, make sure that we're not going to be glitching the pin + * by reading the current state of the pin and setting it as the + * output. + */ + if (i == gpio_func && (val & BIT(g->oe_bit)) && + !test_and_set_bit(group, pctrl->ever_gpio)) { + u32 io_val = msm_readl_io(pctrl, g); + + if (io_val & BIT(g->in_bit)) { + if (!(io_val & BIT(g->out_bit))) + msm_writel_io(io_val | BIT(g->out_bit), pctrl, g); + } else { + if (io_val & BIT(g->out_bit)) + msm_writel_io(io_val & ~BIT(g->out_bit), pctrl, g); + } + } + if (egpio_func && i == egpio_func) { if (val & BIT(g->egpio_present)) val &= ~BIT(g->egpio_enable); From 35cc9d622e8cd45029a1656ab2c6817538bc4180 Mon Sep 17 00:00:00 2001 From: Stanislav Fomichev Date: Fri, 14 Oct 2022 17:24:43 -0700 Subject: [PATCH 05/57] selftests/bpf: Add reproducer for decl_tag in func_proto return type It should trigger a WARN_ON_ONCE in btf_type_id_size. btf_func_proto_check kernel/bpf/btf.c:4447 [inline] btf_check_all_types kernel/bpf/btf.c:4723 [inline] btf_parse_type_sec kernel/bpf/btf.c:4752 [inline] btf_parse kernel/bpf/btf.c:5026 [inline] btf_new_fd+0x1926/0x1e70 kernel/bpf/btf.c:6892 bpf_btf_load kernel/bpf/syscall.c:4324 [inline] __sys_bpf+0xb7d/0x4cf0 kernel/bpf/syscall.c:5010 __do_sys_bpf kernel/bpf/syscall.c:5069 [inline] __se_sys_bpf kernel/bpf/syscall.c:5067 [inline] __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:5067 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Cc: Yonghong Song Cc: Martin KaFai Lau Signed-off-by: Stanislav Fomichev Acked-by: Yonghong Song Link: https://lore.kernel.org/r/20221015002444.2680969-1-sdf@google.com Signed-off-by: Martin KaFai Lau --- tools/testing/selftests/bpf/prog_tests/btf.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 127b8caa3dc1..24dd6214394e 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -3935,6 +3935,19 @@ static struct btf_raw_test raw_tests[] = { .btf_load_err = true, .err_str = "Invalid type_id", }, +{ + .descr = "decl_tag test #16, func proto, return type", + .raw_types = { + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_VAR_ENC(NAME_TBD, 1, 0), /* [2] */ + BTF_TYPE_ENC(NAME_TBD, BTF_INFO_ENC(BTF_KIND_DECL_TAG, 0, 0), 2), (-1), /* [3] */ + BTF_FUNC_PROTO_ENC(3, 0), /* [4] */ + BTF_END_RAW, + }, + BTF_STR_SEC("\0local\0tag1"), + .btf_load_err = true, + .err_str = "Invalid return type", +}, { .descr = "type_tag test #1", .raw_types = { From ea68376c8bed5cd156900852aada20c3a0874d17 Mon Sep 17 00:00:00 2001 From: Stanislav Fomichev Date: Fri, 14 Oct 2022 17:24:44 -0700 Subject: [PATCH 06/57] bpf: prevent decl_tag from being referenced in func_proto Syzkaller was able to hit the following issue: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3609 at kernel/bpf/btf.c:1946 btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946 Modules linked in: CPU: 0 PID: 3609 Comm: syz-executor361 Not tainted 6.0.0-syzkaller-02734-g0326074ff465 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 RIP: 0010:btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946 Code: ef e8 7f 8e e4 ff 41 83 ff 0b 77 28 f6 44 24 10 18 75 3f e8 6d 91 e4 ff 44 89 fe bf 0e 00 00 00 e8 20 8e e4 ff e8 5b 91 e4 ff <0f> 0b 45 31 f6 e9 98 02 00 00 41 83 ff 12 74 18 e8 46 91 e4 ff 44 RSP: 0018:ffffc90003cefb40 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: ffff8880259c0000 RSI: ffffffff81968415 RDI: 0000000000000005 RBP: ffff88801270ca00 R08: 0000000000000005 R09: 000000000000000e R10: 0000000000000011 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000011 R14: ffff888026ee6424 R15: 0000000000000011 FS: 000055555641b300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000f2e258 CR3: 000000007110e000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: btf_func_proto_check kernel/bpf/btf.c:4447 [inline] btf_check_all_types kernel/bpf/btf.c:4723 [inline] btf_parse_type_sec kernel/bpf/btf.c:4752 [inline] btf_parse kernel/bpf/btf.c:5026 [inline] btf_new_fd+0x1926/0x1e70 kernel/bpf/btf.c:6892 bpf_btf_load kernel/bpf/syscall.c:4324 [inline] __sys_bpf+0xb7d/0x4cf0 kernel/bpf/syscall.c:5010 __do_sys_bpf kernel/bpf/syscall.c:5069 [inline] __se_sys_bpf kernel/bpf/syscall.c:5067 [inline] __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:5067 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f0fbae41c69 Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc8aeb6228 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fbae41c69 RDX: 0000000000000020 RSI: 0000000020000140 RDI: 0000000000000012 RBP: 00007f0fbae05e10 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0fbae05ea0 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Looks like it tries to create a func_proto which return type is decl_tag. For the details, see Martin's spot on analysis in [0]. 0: https://lore.kernel.org/bpf/CAKH8qBuQDLva_hHxxBuZzyAcYNO4ejhovz6TQeVSk8HY-2SO6g@mail.gmail.com/T/#mea6524b3fcd6298347432226e81b1e6155efc62c Cc: Yonghong Song Cc: Martin KaFai Lau Fixes: bd16dee66ae4 ("bpf: Add BTF_KIND_DECL_TAG typedef support") Reported-by: syzbot+d8bd751aef7c6b39a344@syzkaller.appspotmail.com Signed-off-by: Stanislav Fomichev Acked-by: Yonghong Song Link: https://lore.kernel.org/r/20221015002444.2680969-2-sdf@google.com Signed-off-by: Martin KaFai Lau --- kernel/bpf/btf.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index eba603cec2c5..35c07afac924 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4436,6 +4436,11 @@ static int btf_func_proto_check(struct btf_verifier_env *env, return -EINVAL; } + if (btf_type_is_resolve_source_only(ret_type)) { + btf_verifier_log_type(env, t, "Invalid return type"); + return -EINVAL; + } + if (btf_type_needs_resolve(ret_type) && !env_type_is_resolved(env, ret_type_id)) { err = btf_resolve(env, ret_type, ret_type_id); From 9989bc33c4894e0751679b91fc6eb585772487b9 Mon Sep 17 00:00:00 2001 From: Sai Krishna Potthuri Date: Mon, 17 Oct 2022 18:33:02 +0530 Subject: [PATCH 07/57] Revert "pinctrl: pinctrl-zynqmp: Add support for output-enable and bias-high-impedance" This reverts commit ad2bea79ef0144043721d4893eef719c907e2e63. On systems with older PMUFW (Xilinx ZynqMP Platform Management Firmware) using these pinctrl properties can cause system hang because there is missing feature autodetection. When this feature is implemented in the PMUFW, support for these two properties should bring back. Cc: stable@vger.kernel.org Signed-off-by: Sai Krishna Potthuri Acked-by: Michal Simek Link: https://lore.kernel.org/r/20221017130303.21746-2-sai.krishna.potthuri@amd.com Signed-off-by: Linus Walleij --- drivers/pinctrl/pinctrl-zynqmp.c | 9 --------- 1 file changed, 9 deletions(-) diff --git a/drivers/pinctrl/pinctrl-zynqmp.c b/drivers/pinctrl/pinctrl-zynqmp.c index 7d2fbf8a02cd..c98f35ad8921 100644 --- a/drivers/pinctrl/pinctrl-zynqmp.c +++ b/drivers/pinctrl/pinctrl-zynqmp.c @@ -412,10 +412,6 @@ static int zynqmp_pinconf_cfg_set(struct pinctrl_dev *pctldev, break; case PIN_CONFIG_BIAS_HIGH_IMPEDANCE: - param = PM_PINCTRL_CONFIG_TRI_STATE; - arg = PM_PINCTRL_TRI_STATE_ENABLE; - ret = zynqmp_pm_pinctrl_set_config(pin, param, arg); - break; case PIN_CONFIG_MODE_LOW_POWER: /* * These cases are mentioned in dts but configurable @@ -424,11 +420,6 @@ static int zynqmp_pinconf_cfg_set(struct pinctrl_dev *pctldev, */ ret = 0; break; - case PIN_CONFIG_OUTPUT_ENABLE: - param = PM_PINCTRL_CONFIG_TRI_STATE; - arg = PM_PINCTRL_TRI_STATE_DISABLE; - ret = zynqmp_pm_pinctrl_set_config(pin, param, arg); - break; default: dev_warn(pctldev->dev, "unsupported configuration parameter '%u'\n", From ff8356060e3a5e126abb5e1f6b6e9931c220dec2 Mon Sep 17 00:00:00 2001 From: Sai Krishna Potthuri Date: Mon, 17 Oct 2022 18:33:03 +0530 Subject: [PATCH 08/57] Revert "dt-bindings: pinctrl-zynqmp: Add output-enable configuration" This reverts commit 133ad0d9af99bdca90705dadd8d31c20bfc9919f. On systems with older PMUFW (Xilinx ZynqMP Platform Management Firmware) using these pinctrl properties can cause system hang because there is missing feature autodetection. When this feature is implemented, support for these two properties should bring back. Cc: stable@vger.kernel.org Signed-off-by: Sai Krishna Potthuri Acked-by: Michal Simek Link: https://lore.kernel.org/r/20221017130303.21746-3-sai.krishna.potthuri@amd.com Signed-off-by: Linus Walleij --- .../devicetree/bindings/pinctrl/xlnx,zynqmp-pinctrl.yaml | 4 ---- 1 file changed, 4 deletions(-) diff --git a/Documentation/devicetree/bindings/pinctrl/xlnx,zynqmp-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/xlnx,zynqmp-pinctrl.yaml index 1e2b9b627b12..2722dc7bb03d 100644 --- a/Documentation/devicetree/bindings/pinctrl/xlnx,zynqmp-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/xlnx,zynqmp-pinctrl.yaml @@ -274,10 +274,6 @@ patternProperties: slew-rate: enum: [0, 1] - output-enable: - description: - This will internally disable the tri-state for MIO pins. - drive-strength: description: Selects the drive strength for MIO pins, in mA. From e9945b2633deccda74a769d94060df49c53ff181 Mon Sep 17 00:00:00 2001 From: Horatiu Vultur Date: Tue, 18 Oct 2022 09:09:59 +0200 Subject: [PATCH 09/57] pinctrl: ocelot: Fix incorrect trigger of the interrupt. The interrupt controller can detect only link changes. So in case an external device generated a level based interrupt, then the interrupt controller detected correctly the first edge. But the problem was that the interrupt controller was detecting also the edge when the interrupt was cleared. So it would generate another interrupt. The fix for this is to clear the second interrupt but still check the interrupt line status. Fixes: c297561bc98a ("pinctrl: ocelot: Fix interrupt controller") Signed-off-by: Horatiu Vultur Tested-by: Michael Walle Link: https://lore.kernel.org/r/20221018070959.1322606-1-horatiu.vultur@microchip.com Signed-off-by: Linus Walleij --- drivers/pinctrl/pinctrl-ocelot.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/pinctrl/pinctrl-ocelot.c b/drivers/pinctrl/pinctrl-ocelot.c index 62ce3957abe4..687aaa601555 100644 --- a/drivers/pinctrl/pinctrl-ocelot.c +++ b/drivers/pinctrl/pinctrl-ocelot.c @@ -1864,19 +1864,28 @@ static void ocelot_irq_unmask_level(struct irq_data *data) if (val & bit) ack = true; + /* Try to clear any rising edges */ + if (!active && ack) + regmap_write_bits(info->map, REG(OCELOT_GPIO_INTR, info, gpio), + bit, bit); + /* Enable the interrupt now */ gpiochip_enable_irq(chip, gpio); regmap_update_bits(info->map, REG(OCELOT_GPIO_INTR_ENA, info, gpio), bit, bit); /* - * In case the interrupt line is still active and the interrupt - * controller has not seen any changes in the interrupt line, then it - * means that there happen another interrupt while the line was active. + * In case the interrupt line is still active then it means that + * there happen another interrupt while the line was active. * So we missed that one, so we need to kick the interrupt again * handler. */ - if (active && !ack) { + regmap_read(info->map, REG(OCELOT_GPIO_IN, info, gpio), &val); + if ((!(val & bit) && trigger_level == IRQ_TYPE_LEVEL_LOW) || + (val & bit && trigger_level == IRQ_TYPE_LEVEL_HIGH)) + active = true; + + if (active) { struct ocelot_irq_work *work; work = kmalloc(sizeof(*work), GFP_ATOMIC); From 03cab65a07e083b6c1010fbc8f9b817e9aca75d9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ricardo=20Ca=C3=B1uelo?= Date: Mon, 10 Oct 2022 08:37:02 +0200 Subject: [PATCH 10/57] selftests/futex: fix build for clang MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Don't use the test-specific header files as source files to force a target dependency, as clang will complain if more than one source file is used for a compile command with a single '-o' flag. Use the proper Makefile variables instead as defined in tools/testing/selftests/lib.mk. Signed-off-by: Ricardo Cañuelo Reviewed-by: André Almeida Signed-off-by: Shuah Khan --- tools/testing/selftests/futex/functional/Makefile | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile index 732149011692..5a0e0df8de9b 100644 --- a/tools/testing/selftests/futex/functional/Makefile +++ b/tools/testing/selftests/futex/functional/Makefile @@ -3,11 +3,11 @@ INCLUDES := -I../include -I../../ -I../../../../../usr/include/ CFLAGS := $(CFLAGS) -g -O2 -Wall -D_GNU_SOURCE -pthread $(INCLUDES) $(KHDR_INCLUDES) LDLIBS := -lpthread -lrt -HEADERS := \ +LOCAL_HDRS := \ ../include/futextest.h \ ../include/atomic.h \ ../include/logging.h -TEST_GEN_FILES := \ +TEST_GEN_PROGS := \ futex_wait_timeout \ futex_wait_wouldblock \ futex_requeue_pi \ @@ -24,5 +24,3 @@ TEST_PROGS := run.sh top_srcdir = ../../../../.. DEFAULT_INSTALL_HDR_PATH := 1 include ../../lib.mk - -$(TEST_GEN_FILES): $(HEADERS) From beb7d862ed4ac6aa14625418970f22a7d55b8615 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ricardo=20Ca=C3=B1uelo?= Date: Mon, 10 Oct 2022 08:38:11 +0200 Subject: [PATCH 11/57] selftests/intel_pstate: fix build for ARCH=x86_64 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Handle the scenario where the build is launched with the ARCH envvar defined as x86_64. Signed-off-by: Ricardo Cañuelo Signed-off-by: Shuah Khan --- tools/testing/selftests/intel_pstate/Makefile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/intel_pstate/Makefile b/tools/testing/selftests/intel_pstate/Makefile index 39f0fa2a8fd6..05d66ef50c97 100644 --- a/tools/testing/selftests/intel_pstate/Makefile +++ b/tools/testing/selftests/intel_pstate/Makefile @@ -2,10 +2,10 @@ CFLAGS := $(CFLAGS) -Wall -D_GNU_SOURCE LDLIBS += -lm -uname_M := $(shell uname -m 2>/dev/null || echo not) -ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/) +ARCH ?= $(shell uname -m 2>/dev/null || echo not) +ARCH_PROCESSED := $(shell echo $(ARCH) | sed -e s/i.86/x86/ -e s/x86_64/x86/) -ifeq (x86,$(ARCH)) +ifeq (x86,$(ARCH_PROCESSED)) TEST_GEN_FILES := msr aperf endif From 2a8e366b23fea29a5308f71ba49555e3c8c664f1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ricardo=20Ca=C3=B1uelo?= Date: Mon, 10 Oct 2022 08:39:27 +0200 Subject: [PATCH 12/57] selftests/kexec: fix build for ARCH=x86_64 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Handle the scenario where the build is launched with the ARCH envvar defined as x86_64. Signed-off-by: Ricardo Cañuelo Signed-off-by: Shuah Khan --- tools/testing/selftests/kexec/Makefile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/kexec/Makefile b/tools/testing/selftests/kexec/Makefile index 806a150648c3..67fe7a46cb62 100644 --- a/tools/testing/selftests/kexec/Makefile +++ b/tools/testing/selftests/kexec/Makefile @@ -1,10 +1,10 @@ # SPDX-License-Identifier: GPL-2.0-only # Makefile for kexec tests -uname_M := $(shell uname -m 2>/dev/null || echo not) -ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/) +ARCH ?= $(shell uname -m 2>/dev/null || echo not) +ARCH_PROCESSED := $(shell echo $(ARCH) | sed -e s/i.86/x86/ -e s/x86_64/x86/) -ifeq ($(ARCH),$(filter $(ARCH),x86 ppc64le)) +ifeq ($(ARCH_PROCESSED),$(filter $(ARCH_PROCESSED),x86 ppc64le)) TEST_PROGS := test_kexec_load.sh test_kexec_file_load.sh TEST_FILES := kexec_common_lib.sh From eb6789b0c3424f84e8441c4796083db2f095c391 Mon Sep 17 00:00:00 2001 From: Zhao Gongyi Date: Tue, 11 Oct 2022 09:39:26 +0800 Subject: [PATCH 13/57] selftests/memory-hotplug: Remove the redundant warning information Remove the redundant warning information of online_all_offline_memory() since there is a warning in online_memory_expect_success(). Signed-off-by: Zhao Gongyi Reviewed-by: David Hildenbrand Signed-off-by: Shuah Khan --- tools/testing/selftests/memory-hotplug/mem-on-off-test.sh | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh b/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh index 74ee5067a8ce..611be86eaf3d 100755 --- a/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh +++ b/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh @@ -138,7 +138,6 @@ online_all_offline_memory() { for memory in `hotpluggable_offline_memory`; do if ! online_memory_expect_success $memory; then - echo "$FUNCNAME $memory: unexpected fail" >&2 retval=1 fi done From cb05c81ada76a30a25a5f79b249375e33473af33 Mon Sep 17 00:00:00 2001 From: Sven Schnelle Date: Mon, 10 Oct 2022 09:42:07 +0200 Subject: [PATCH 14/57] selftests/ftrace: fix dynamic_events dependency check commit 95c104c378dc ("tracing: Auto generate event name when creating a group of events") changed the syntax in the ftrace README file which is used by the selftests to check what features are support. Adjust the string to make test_duplicates.tc and trigger-synthetic-eprobe.tc work again. Fixes: 95c104c378dc ("tracing: Auto generate event name when creating a group of events") Signed-off-by: Sven Schnelle Acked-by: Steven Rostedt (Google) Acked-by: Masami Hiramatsu (Google) Signed-off-by: Shuah Khan --- .../testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc | 2 +- .../test.d/trigger/inter-event/trigger-synthetic-eprobe.tc | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc b/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc index db522577ff78..d3a79da215c8 100644 --- a/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc +++ b/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc @@ -1,7 +1,7 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0 # description: Generic dynamic event - check if duplicate events are caught -# requires: dynamic_events "e[:[/]] . []":README +# requires: dynamic_events "e[:[/][]] . []":README echo 0 > events/enable diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-eprobe.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-eprobe.tc index 914fe2e5d030..6461c375694f 100644 --- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-eprobe.tc +++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-eprobe.tc @@ -1,7 +1,7 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0 # description: event trigger - test inter-event histogram trigger eprobe on synthetic event -# requires: dynamic_events synthetic_events events/syscalls/sys_enter_openat/hist "e[:[/]] . []":README +# requires: dynamic_events synthetic_events events/syscalls/sys_enter_openat/hist "e[:[/][]] . []":README echo 0 > events/enable From 618887768bb71f0a475334fa5a4fba7dc98d7ab5 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 14 Oct 2022 12:37:25 +0300 Subject: [PATCH 15/57] kunit: update NULL vs IS_ERR() tests The alloc_string_stream() functions were changed from returning NULL on error to returning error pointers so these caller needs to be updated as well. Fixes: 78b1c6584fce ("kunit: string-stream: Simplify resource use") Signed-off-by: Dan Carpenter Reviewed-by: Daniel Latypov Reviewed-by: David Gow Signed-off-by: Shuah Khan --- lib/kunit/string-stream.c | 4 ++-- lib/kunit/test.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/kunit/string-stream.c b/lib/kunit/string-stream.c index f5ae79c37400..a608746020a9 100644 --- a/lib/kunit/string-stream.c +++ b/lib/kunit/string-stream.c @@ -56,8 +56,8 @@ int string_stream_vadd(struct string_stream *stream, frag_container = alloc_string_stream_fragment(stream->test, len, stream->gfp); - if (!frag_container) - return -ENOMEM; + if (IS_ERR(frag_container)) + return PTR_ERR(frag_container); len = vsnprintf(frag_container->fragment, len, fmt, args); spin_lock(&stream->lock); diff --git a/lib/kunit/test.c b/lib/kunit/test.c index 90640a43cf62..2a6992fe7c3e 100644 --- a/lib/kunit/test.c +++ b/lib/kunit/test.c @@ -265,7 +265,7 @@ static void kunit_fail(struct kunit *test, const struct kunit_loc *loc, kunit_set_failure(test); stream = alloc_string_stream(test, GFP_KERNEL); - if (!stream) { + if (IS_ERR(stream)) { WARN(true, "Could not allocate stream to print failed assertion in %s:%d\n", loc->file, From 31d8aaa87fcef1be5932f3813ea369e21bd3b11d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 20 Oct 2022 10:58:14 -0700 Subject: [PATCH 16/57] rcu: Keep synchronize_rcu() from enabling irqs in early boot Making polled RCU grace periods account for expedited grace periods required acquiring the leaf rcu_node structure's lock during early boot, but after rcu_init() was called. This lock is irq-disabled, but the code incorrectly assumes that irqs are always disabled when invoking synchronize_rcu(). The exception is early boot before the scheduler has started, which means that upon return from synchronize_rcu(), irqs will be incorrectly enabled. This commit fixes this bug by using irqsave/irqrestore locking primitives. Fixes: bf95b2bc3e42 ("rcu: Switch polled grace-period APIs to ->gp_seq_polled") Reported-by: Steven Rostedt Signed-off-by: Paul E. McKenney --- kernel/rcu/tree.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 6bb8e72bc815..93416afebd59 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1403,30 +1403,32 @@ static void rcu_poll_gp_seq_end(unsigned long *snap) // where caller does not hold the root rcu_node structure's lock. static void rcu_poll_gp_seq_start_unlocked(unsigned long *snap) { + unsigned long flags; struct rcu_node *rnp = rcu_get_root(); if (rcu_init_invoked()) { lockdep_assert_irqs_enabled(); - raw_spin_lock_irq_rcu_node(rnp); + raw_spin_lock_irqsave_rcu_node(rnp, flags); } rcu_poll_gp_seq_start(snap); if (rcu_init_invoked()) - raw_spin_unlock_irq_rcu_node(rnp); + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } // Make the polled API aware of the end of a grace period, but where // caller does not hold the root rcu_node structure's lock. static void rcu_poll_gp_seq_end_unlocked(unsigned long *snap) { + unsigned long flags; struct rcu_node *rnp = rcu_get_root(); if (rcu_init_invoked()) { lockdep_assert_irqs_enabled(); - raw_spin_lock_irq_rcu_node(rnp); + raw_spin_lock_irqsave_rcu_node(rnp, flags); } rcu_poll_gp_seq_end(snap); if (rcu_init_invoked()) - raw_spin_unlock_irq_rcu_node(rnp); + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } /* From dbe69b29988465b011f198f2797b1c2b6980b50e Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Tue, 18 Oct 2022 09:59:34 +0200 Subject: [PATCH 17/57] bpf: Fix dispatcher patchable function entry to 5 bytes nop The patchable_function_entry(5) might output 5 single nop instructions (depends on toolchain), which will clash with bpf_arch_text_poke check for 5 bytes nop instruction. Adding early init call for dispatcher that checks and change the patchable entry into expected 5 nop instruction if needed. There's no need to take text_mutex, because we are using it in early init call which is called at pre-smp time. Fixes: ceea991a019c ("bpf: Move bpf_dispatcher function out of ftrace locations") Signed-off-by: Jiri Olsa Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20221018075934.574415-1-jolsa@kernel.org Signed-off-by: Alexei Starovoitov --- arch/x86/net/bpf_jit_comp.c | 13 +++++++++++++ include/linux/bpf.h | 14 +++++++++++++- kernel/bpf/dispatcher.c | 6 ++++++ 3 files changed, 32 insertions(+), 1 deletion(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 99620428ad78..00127abd89ee 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -388,6 +389,18 @@ out: return ret; } +int __init bpf_arch_init_dispatcher_early(void *ip) +{ + const u8 *nop_insn = x86_nops[5]; + + if (is_endbr(*(u32 *)ip)) + ip += ENDBR_INSN_SIZE; + + if (memcmp(ip, nop_insn, X86_PATCH_SIZE)) + text_poke_early(ip, nop_insn, X86_PATCH_SIZE); + return 0; +} + int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, void *old_addr, void *new_addr) { diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 9e7d46d16032..0566705c1d4e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -27,6 +27,7 @@ #include #include #include +#include struct bpf_verifier_env; struct bpf_verifier_log; @@ -970,6 +971,8 @@ struct bpf_trampoline *bpf_trampoline_get(u64 key, struct bpf_attach_target_info *tgt_info); void bpf_trampoline_put(struct bpf_trampoline *tr); int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_funcs); +int __init bpf_arch_init_dispatcher_early(void *ip); + #define BPF_DISPATCHER_INIT(_name) { \ .mutex = __MUTEX_INITIALIZER(_name.mutex), \ .func = &_name##_func, \ @@ -983,6 +986,13 @@ int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_func }, \ } +#define BPF_DISPATCHER_INIT_CALL(_name) \ + static int __init _name##_init(void) \ + { \ + return bpf_arch_init_dispatcher_early(_name##_func); \ + } \ + early_initcall(_name##_init) + #ifdef CONFIG_X86_64 #define BPF_DISPATCHER_ATTRIBUTES __attribute__((patchable_function_entry(5))) #else @@ -1000,7 +1010,9 @@ int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_func } \ EXPORT_SYMBOL(bpf_dispatcher_##name##_func); \ struct bpf_dispatcher bpf_dispatcher_##name = \ - BPF_DISPATCHER_INIT(bpf_dispatcher_##name); + BPF_DISPATCHER_INIT(bpf_dispatcher_##name); \ + BPF_DISPATCHER_INIT_CALL(bpf_dispatcher_##name); + #define DECLARE_BPF_DISPATCHER(name) \ unsigned int bpf_dispatcher_##name##_func( \ const void *ctx, \ diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c index fa64b80b8bca..04f0a045dcaa 100644 --- a/kernel/bpf/dispatcher.c +++ b/kernel/bpf/dispatcher.c @@ -4,6 +4,7 @@ #include #include #include +#include /* The BPF dispatcher is a multiway branch code generator. The * dispatcher is a mechanism to avoid the performance penalty of an @@ -90,6 +91,11 @@ int __weak arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int n return -ENOTSUPP; } +int __weak __init bpf_arch_init_dispatcher_early(void *ip) +{ + return -ENOTSUPP; +} + static int bpf_dispatcher_prepare(struct bpf_dispatcher *d, void *image, void *buf) { s64 ips[BPF_DISPATCHER_MAX] = {}, *ipsp = &ips[0]; From 82cb4e4612c633a9ce320e1773114875604a3cce Mon Sep 17 00:00:00 2001 From: Xin Long Date: Tue, 18 Oct 2022 15:19:50 -0400 Subject: [PATCH 18/57] tipc: fix a null-ptr-deref in tipc_topsrv_accept syzbot found a crash in tipc_topsrv_accept: KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] Workqueue: tipc_rcv tipc_topsrv_accept RIP: 0010:kernel_accept+0x22d/0x350 net/socket.c:3487 Call Trace: tipc_topsrv_accept+0x197/0x280 net/tipc/topsrv.c:460 process_one_work+0x991/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e4/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 It was caused by srv->listener that might be set to null by tipc_topsrv_stop() in net .exit whereas it's still used in tipc_topsrv_accept() worker. srv->listener is protected by srv->idr_lock in tipc_topsrv_stop(), so add a check for srv->listener under srv->idr_lock in tipc_topsrv_accept() to avoid the null-ptr-deref. To ensure the lsock is not released during the tipc_topsrv_accept(), move sock_release() after tipc_topsrv_work_stop() where it's waiting until the tipc_topsrv_accept worker to be done. Note that sk_callback_lock is used to protect sk->sk_user_data instead of srv->listener, and it should check srv in tipc_topsrv_listener_data_ready() instead. This also ensures that no more tipc_topsrv_accept worker will be started after tipc_conn_close() is called in tipc_topsrv_stop() where it sets sk->sk_user_data to null. Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets") Reported-by: syzbot+c5ce866a8d30f4be0651@syzkaller.appspotmail.com Signed-off-by: Xin Long Acked-by: Jon Maloy Link: https://lore.kernel.org/r/4eee264380c409c61c6451af1059b7fb271a7e7b.1666120790.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski --- net/tipc/topsrv.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c index 14fd05fd6107..d92ec92f0b71 100644 --- a/net/tipc/topsrv.c +++ b/net/tipc/topsrv.c @@ -450,12 +450,19 @@ static void tipc_conn_data_ready(struct sock *sk) static void tipc_topsrv_accept(struct work_struct *work) { struct tipc_topsrv *srv = container_of(work, struct tipc_topsrv, awork); - struct socket *lsock = srv->listener; - struct socket *newsock; + struct socket *newsock, *lsock; struct tipc_conn *con; struct sock *newsk; int ret; + spin_lock_bh(&srv->idr_lock); + if (!srv->listener) { + spin_unlock_bh(&srv->idr_lock); + return; + } + lsock = srv->listener; + spin_unlock_bh(&srv->idr_lock); + while (1) { ret = kernel_accept(lsock, &newsock, O_NONBLOCK); if (ret < 0) @@ -489,7 +496,7 @@ static void tipc_topsrv_listener_data_ready(struct sock *sk) read_lock_bh(&sk->sk_callback_lock); srv = sk->sk_user_data; - if (srv->listener) + if (srv) queue_work(srv->rcv_wq, &srv->awork); read_unlock_bh(&sk->sk_callback_lock); } @@ -699,8 +706,9 @@ static void tipc_topsrv_stop(struct net *net) __module_get(lsock->sk->sk_prot_creator->owner); srv->listener = NULL; spin_unlock_bh(&srv->idr_lock); - sock_release(lsock); + tipc_topsrv_work_stop(srv); + sock_release(lsock); idr_destroy(&srv->conn_idr); kfree(srv); } From 94423589689124e8cd145b38a1034be7f25835b2 Mon Sep 17 00:00:00 2001 From: Yang Yingliang Date: Wed, 19 Oct 2022 14:41:04 +0800 Subject: [PATCH 19/57] net: netsec: fix error handling in netsec_register_mdio() If phy_device_register() fails, phy_device_free() need be called to put refcount, so memory of phy device and device name can be freed in callback function. If get_phy_device() fails, mdiobus_unregister() need be called, or it will cause warning in mdiobus_free() and kobject is leaked. Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Yang Yingliang Link: https://lore.kernel.org/r/20221019064104.3228892-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/socionext/netsec.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index 2240f6d0b89b..9b46579b5a10 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -1961,11 +1961,13 @@ static int netsec_register_mdio(struct netsec_priv *priv, u32 phy_addr) ret = PTR_ERR(priv->phydev); dev_err(priv->dev, "get_phy_device err(%d)\n", ret); priv->phydev = NULL; + mdiobus_unregister(bus); return -ENODEV; } ret = phy_device_register(priv->phydev); if (ret) { + phy_device_free(priv->phydev); mdiobus_unregister(bus); dev_err(priv->dev, "phy_device_register err(%d)\n", ret); From f8c1c66b99a570c08b9d26e4347276f00e49bba7 Mon Sep 17 00:00:00 2001 From: Horatiu Vultur Date: Wed, 19 Oct 2022 10:30:56 +0200 Subject: [PATCH 20/57] net: lan966x: Fix the rx drop counter Currently the rx drop is calculated as the sum of multiple HW drop counters. The issue is that not all the HW drop counters were added for the rx drop counter. So if for example you have a police that drops frames, they were not see in the rx drop counter. Fix this by updating how the rx drop counter is calculated. It is required to add also RX_RED_PRIO_* HW counters. Fixes: 12c2d0a5b8e2 ("net: lan966x: add ethtool configuration and statistics") Signed-off-by: Horatiu Vultur Link: https://lore.kernel.org/r/20221019083056.2744282-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski --- .../net/ethernet/microchip/lan966x/lan966x_ethtool.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_ethtool.c b/drivers/net/ethernet/microchip/lan966x/lan966x_ethtool.c index e58a27fd8b50..fea42542be28 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_ethtool.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_ethtool.c @@ -656,7 +656,15 @@ void lan966x_stats_get(struct net_device *dev, stats->rx_dropped = dev->stats.rx_dropped + lan966x->stats[idx + SYS_COUNT_RX_LONG] + lan966x->stats[idx + SYS_COUNT_DR_LOCAL] + - lan966x->stats[idx + SYS_COUNT_DR_TAIL]; + lan966x->stats[idx + SYS_COUNT_DR_TAIL] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_0] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_1] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_2] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_3] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_4] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_5] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_6] + + lan966x->stats[idx + SYS_COUNT_RX_RED_PRIO_7]; for (i = 0; i < LAN966X_NUM_TC; i++) { stats->rx_dropped += From ae108c48b5d2b34bcef3c4fb5076f42c922c426a Mon Sep 17 00:00:00 2001 From: Benjamin Poirier Date: Wed, 19 Oct 2022 18:10:41 +0900 Subject: [PATCH 21/57] selftests: net: Fix cross-tree inclusion of scripts When exporting and running a subset of selftests via kselftest, files from parts of the source tree which were not exported are not available. A few tests are trying to source such files. Address the problem by using symlinks. The problem can be reproduced by running: make -C tools/testing/selftests gen_tar TARGETS="drivers/net/bonding" [... extract archive ...] ./run_kselftest.sh or: make kselftest KBUILD_OUTPUT=/tmp/kselftests TARGETS="drivers/net/bonding" Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management") Fixes: eccd0a80dc7f ("selftests: net: dsa: add a stress test for unlocked FDB operations") Link: https://lore.kernel.org/netdev/40f04ded-0c86-8669-24b1-9a313ca21076@redhat.com/ Reported-by: Jonathan Toppins Signed-off-by: Benjamin Poirier Reviewed-by: Jonathan Toppins Signed-off-by: Jakub Kicinski --- tools/testing/selftests/drivers/net/bonding/Makefile | 4 +++- tools/testing/selftests/drivers/net/bonding/dev_addr_lists.sh | 2 +- .../selftests/drivers/net/bonding/net_forwarding_lib.sh | 1 + .../selftests/drivers/net/dsa/test_bridge_fdb_stress.sh | 4 ++-- tools/testing/selftests/drivers/net/team/Makefile | 4 ++++ tools/testing/selftests/drivers/net/team/dev_addr_lists.sh | 4 ++-- tools/testing/selftests/drivers/net/team/lag_lib.sh | 1 + .../testing/selftests/drivers/net/team/net_forwarding_lib.sh | 1 + tools/testing/selftests/lib.mk | 4 ++-- 9 files changed, 17 insertions(+), 8 deletions(-) create mode 120000 tools/testing/selftests/drivers/net/bonding/net_forwarding_lib.sh create mode 120000 tools/testing/selftests/drivers/net/team/lag_lib.sh create mode 120000 tools/testing/selftests/drivers/net/team/net_forwarding_lib.sh diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile index e9dab5f9d773..6b8d2e2f23c2 100644 --- a/tools/testing/selftests/drivers/net/bonding/Makefile +++ b/tools/testing/selftests/drivers/net/bonding/Makefile @@ -7,6 +7,8 @@ TEST_PROGS := \ bond-lladdr-target.sh \ dev_addr_lists.sh -TEST_FILES := lag_lib.sh +TEST_FILES := \ + lag_lib.sh \ + net_forwarding_lib.sh include ../../../lib.mk diff --git a/tools/testing/selftests/drivers/net/bonding/dev_addr_lists.sh b/tools/testing/selftests/drivers/net/bonding/dev_addr_lists.sh index e6fa24eded5b..5cfe7d8ebc25 100755 --- a/tools/testing/selftests/drivers/net/bonding/dev_addr_lists.sh +++ b/tools/testing/selftests/drivers/net/bonding/dev_addr_lists.sh @@ -14,7 +14,7 @@ ALL_TESTS=" REQUIRE_MZ=no NUM_NETIFS=0 lib_dir=$(dirname "$0") -source "$lib_dir"/../../../net/forwarding/lib.sh +source "$lib_dir"/net_forwarding_lib.sh source "$lib_dir"/lag_lib.sh diff --git a/tools/testing/selftests/drivers/net/bonding/net_forwarding_lib.sh b/tools/testing/selftests/drivers/net/bonding/net_forwarding_lib.sh new file mode 120000 index 000000000000..39c96828c5ef --- /dev/null +++ b/tools/testing/selftests/drivers/net/bonding/net_forwarding_lib.sh @@ -0,0 +1 @@ +../../../net/forwarding/lib.sh \ No newline at end of file diff --git a/tools/testing/selftests/drivers/net/dsa/test_bridge_fdb_stress.sh b/tools/testing/selftests/drivers/net/dsa/test_bridge_fdb_stress.sh index dca8be6092b9..a1f269ee84da 100755 --- a/tools/testing/selftests/drivers/net/dsa/test_bridge_fdb_stress.sh +++ b/tools/testing/selftests/drivers/net/dsa/test_bridge_fdb_stress.sh @@ -18,8 +18,8 @@ NUM_NETIFS=1 REQUIRE_JQ="no" REQUIRE_MZ="no" NETIF_CREATE="no" -lib_dir=$(dirname $0)/../../../net/forwarding -source $lib_dir/lib.sh +lib_dir=$(dirname "$0") +source "$lib_dir"/lib.sh cleanup() { echo "Cleaning up" diff --git a/tools/testing/selftests/drivers/net/team/Makefile b/tools/testing/selftests/drivers/net/team/Makefile index 642d8df1c137..6a86e61e8bfe 100644 --- a/tools/testing/selftests/drivers/net/team/Makefile +++ b/tools/testing/selftests/drivers/net/team/Makefile @@ -3,4 +3,8 @@ TEST_PROGS := dev_addr_lists.sh +TEST_FILES := \ + lag_lib.sh \ + net_forwarding_lib.sh + include ../../../lib.mk diff --git a/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh b/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh index debda7262956..9684163949f0 100755 --- a/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh +++ b/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh @@ -11,9 +11,9 @@ ALL_TESTS=" REQUIRE_MZ=no NUM_NETIFS=0 lib_dir=$(dirname "$0") -source "$lib_dir"/../../../net/forwarding/lib.sh +source "$lib_dir"/net_forwarding_lib.sh -source "$lib_dir"/../bonding/lag_lib.sh +source "$lib_dir"/lag_lib.sh destroy() diff --git a/tools/testing/selftests/drivers/net/team/lag_lib.sh b/tools/testing/selftests/drivers/net/team/lag_lib.sh new file mode 120000 index 000000000000..e1347a10afde --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/lag_lib.sh @@ -0,0 +1 @@ +../bonding/lag_lib.sh \ No newline at end of file diff --git a/tools/testing/selftests/drivers/net/team/net_forwarding_lib.sh b/tools/testing/selftests/drivers/net/team/net_forwarding_lib.sh new file mode 120000 index 000000000000..39c96828c5ef --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/net_forwarding_lib.sh @@ -0,0 +1 @@ +../../../net/forwarding/lib.sh \ No newline at end of file diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk index 9d4cb94cf437..a3ea3d4a206d 100644 --- a/tools/testing/selftests/lib.mk +++ b/tools/testing/selftests/lib.mk @@ -70,7 +70,7 @@ endef run_tests: all ifdef building_out_of_srctree @if [ "X$(TEST_PROGS)$(TEST_PROGS_EXTENDED)$(TEST_FILES)" != "X" ]; then \ - rsync -aq $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(OUTPUT); \ + rsync -aLq $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(OUTPUT); \ fi @if [ "X$(TEST_PROGS)" != "X" ]; then \ $(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) \ @@ -84,7 +84,7 @@ endif define INSTALL_SINGLE_RULE $(if $(INSTALL_LIST),@mkdir -p $(INSTALL_PATH)) - $(if $(INSTALL_LIST),rsync -a $(INSTALL_LIST) $(INSTALL_PATH)/) + $(if $(INSTALL_LIST),rsync -aL $(INSTALL_LIST) $(INSTALL_PATH)/) endef define INSTALL_RULE From b2c0921b926ca69cc399eb356162f35340598112 Mon Sep 17 00:00:00 2001 From: Benjamin Poirier Date: Wed, 19 Oct 2022 18:10:42 +0900 Subject: [PATCH 22/57] selftests: net: Fix netdev name mismatch in cleanup lag_lib.sh creates the interfaces dummy1 and dummy2 whereas dev_addr_lists.sh:destroy() deletes the interfaces dummy0 and dummy1. Fix the mismatch in names. Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management") Signed-off-by: Benjamin Poirier Reviewed-by: Jonathan Toppins Signed-off-by: Jakub Kicinski --- tools/testing/selftests/drivers/net/team/dev_addr_lists.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh b/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh index 9684163949f0..33913112d5ca 100755 --- a/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh +++ b/tools/testing/selftests/drivers/net/team/dev_addr_lists.sh @@ -18,7 +18,7 @@ source "$lib_dir"/lag_lib.sh destroy() { - local ifnames=(dummy0 dummy1 team0 mv0) + local ifnames=(dummy1 dummy2 team0 mv0) local ifname for ifname in "${ifnames[@]}"; do From c0605cd6750f2db9890c43a91ea4d77be8fb4908 Mon Sep 17 00:00:00 2001 From: Zhengchao Shao Date: Wed, 19 Oct 2022 17:57:51 +0800 Subject: [PATCH 23/57] net: hinic: fix incorrect assignment issue in hinic_set_interrupt_cfg() The value of lli_credit_cnt is incorrectly assigned, fix it. Fixes: a0337c0dee68 ("hinic: add support to set and get irq coalesce") Signed-off-by: Zhengchao Shao Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c index 94f470556295..27795288c586 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c @@ -877,7 +877,7 @@ int hinic_set_interrupt_cfg(struct hinic_hwdev *hwdev, if (err) return -EINVAL; - interrupt_info->lli_credit_cnt = temp_info.lli_timer_cnt; + interrupt_info->lli_credit_cnt = temp_info.lli_credit_cnt; interrupt_info->lli_timer_cnt = temp_info.lli_timer_cnt; err = hinic_msg_to_mgmt(&pfhwdev->pf_to_mgmt, HINIC_MOD_COMM, From 4c1f602df8956bc0decdafd7e4fc7eef50c550b1 Mon Sep 17 00:00:00 2001 From: Zhengchao Shao Date: Wed, 19 Oct 2022 17:57:52 +0800 Subject: [PATCH 24/57] net: hinic: fix memory leak when reading function table When the input parameter idx meets the expected case option in hinic_dbg_get_func_table(), read_data is not released. Fix it. Fixes: 5215e16244ee ("hinic: add support to query function table") Signed-off-by: Zhengchao Shao Signed-off-by: Jakub Kicinski --- .../net/ethernet/huawei/hinic/hinic_debugfs.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c index 19eb839177ec..061952c6c21a 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c @@ -85,6 +85,7 @@ static int hinic_dbg_get_func_table(struct hinic_dev *nic_dev, int idx) struct tag_sml_funcfg_tbl *funcfg_table_elem; struct hinic_cmd_lt_rd *read_data; u16 out_size = sizeof(*read_data); + int ret = ~0; int err; read_data = kzalloc(sizeof(*read_data), GFP_KERNEL); @@ -111,20 +112,25 @@ static int hinic_dbg_get_func_table(struct hinic_dev *nic_dev, int idx) switch (idx) { case VALID: - return funcfg_table_elem->dw0.bs.valid; + ret = funcfg_table_elem->dw0.bs.valid; + break; case RX_MODE: - return funcfg_table_elem->dw0.bs.nic_rx_mode; + ret = funcfg_table_elem->dw0.bs.nic_rx_mode; + break; case MTU: - return funcfg_table_elem->dw1.bs.mtu; + ret = funcfg_table_elem->dw1.bs.mtu; + break; case RQ_DEPTH: - return funcfg_table_elem->dw13.bs.cfg_rq_depth; + ret = funcfg_table_elem->dw13.bs.cfg_rq_depth; + break; case QUEUE_NUM: - return funcfg_table_elem->dw13.bs.cfg_q_num; + ret = funcfg_table_elem->dw13.bs.cfg_q_num; + break; } kfree(read_data); - return ~0; + return ret; } static ssize_t hinic_dbg_cmd_read(struct file *filp, char __user *buffer, size_t count, From 363cc87767f6ddcfb9158ad2e2afa2f8d5c4b94e Mon Sep 17 00:00:00 2001 From: Zhengchao Shao Date: Wed, 19 Oct 2022 17:57:53 +0800 Subject: [PATCH 25/57] net: hinic: fix the issue of CMDQ memory leaks When hinic_set_cmdq_depth() fails in hinic_init_cmdqs(), the cmdq memory is not released correctly. Fix it. Fixes: 72ef908bb3ff ("hinic: add three net_device_ops of vf") Signed-off-by: Zhengchao Shao Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c index 78190e88cd75..d39eec9c62bf 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c @@ -924,7 +924,7 @@ int hinic_init_cmdqs(struct hinic_cmdqs *cmdqs, struct hinic_hwif *hwif, err_set_cmdq_depth: hinic_ceq_unregister_cb(&func_to_io->ceqs, HINIC_CEQ_CMDQ); - + free_cmdq(&cmdqs->cmdq[HINIC_CMDQ_SYNC]); err_cmdq_ctxt: hinic_wqs_cmdq_free(&cmdqs->cmdq_pages, cmdqs->saved_wqs, HINIC_MAX_CMDQ_TYPES); From 8ec2f4c6b2e11a4249bba77460f0cfe6d95a82f8 Mon Sep 17 00:00:00 2001 From: Zhengchao Shao Date: Wed, 19 Oct 2022 17:57:54 +0800 Subject: [PATCH 26/57] net: hinic: fix the issue of double release MBOX callback of VF In hinic_vf_func_init(), if VF fails to register information with PF through the MBOX, the MBOX callback function of VF is released once. But it is released again in hinic_init_hwdev(). Remove one. Fixes: 7dd29ee12865 ("hinic: add sriov feature support") Signed-off-by: Zhengchao Shao Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/huawei/hinic/hinic_sriov.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_sriov.c b/drivers/net/ethernet/huawei/hinic/hinic_sriov.c index a5f08b969e3f..f7e05b41385b 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_sriov.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_sriov.c @@ -1174,7 +1174,6 @@ int hinic_vf_func_init(struct hinic_hwdev *hwdev) dev_err(&hwdev->hwif->pdev->dev, "Failed to register VF, err: %d, status: 0x%x, out size: 0x%x\n", err, register_info.status, out_size); - hinic_unregister_vf_mbox_cb(hwdev, HINIC_MOD_L2NIC); return -EIO; } } else { From 15a9dbec631cd69dfbbfc4e2cbf90c9dd8432a8f Mon Sep 17 00:00:00 2001 From: Sergiu Moga Date: Wed, 19 Oct 2022 15:09:32 +0300 Subject: [PATCH 27/57] net: macb: Specify PHY PM management done by MAC The `macb_resume`/`macb_suspend` methods already call the `phylink_start`/`phylink_stop` methods during their execution so explicitly say that the PM of the PHY is done by MAC by using the `mac_managed_pm` flag of the `struct phylink_config`. This also fixes the warning message issued during resume: WARNING: CPU: 0 PID: 237 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0x144/0x148 Depends-on: 96de900ae78e ("net: phylink: add mac_managed_pm in phylink_config structure") Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") Signed-off-by: Sergiu Moga Reviewed-by: Florian Fainelli Reviewed-by: Claudiu Beznea Link: https://lore.kernel.org/r/20221019120929.63098-1-sergiu.moga@microchip.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/cadence/macb_main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 51c9fd6f68a4..4f63f1ba3161 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -806,6 +806,7 @@ static int macb_mii_probe(struct net_device *dev) bp->phylink_config.dev = &dev->dev; bp->phylink_config.type = PHYLINK_NETDEV; + bp->phylink_config.mac_managed_pm = true; if (bp->phy_interface == PHY_INTERFACE_MODE_SGMII) { bp->phylink_config.poll_fixed_state = true; From e840d8f4a1b323973052a1af5ad4edafcde8ae3d Mon Sep 17 00:00:00 2001 From: Shang XiaoJing Date: Thu, 20 Oct 2022 11:05:05 +0800 Subject: [PATCH 28/57] nfc: virtual_ncidev: Fix memory leak in virtual_nci_send() skb should be free in virtual_nci_send(), otherwise kmemleak will report memleak. Steps for reproduction (simulated in qemu): cd tools/testing/selftests/nci make ./nci_dev BUG: memory leak unreferenced object 0xffff888107588000 (size 208): comm "nci_dev", pid 206, jiffies 4294945376 (age 368.248s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000008d94c8fd>] __alloc_skb+0x1da/0x290 [<00000000278bc7f8>] nci_send_cmd+0xa3/0x350 [<0000000081256a22>] nci_reset_req+0x6b/0xa0 [<000000009e721112>] __nci_request+0x90/0x250 [<000000005d556e59>] nci_dev_up+0x217/0x5b0 [<00000000e618ce62>] nfc_dev_up+0x114/0x220 [<00000000981e226b>] nfc_genl_dev_up+0x94/0xe0 [<000000009bb03517>] genl_family_rcv_msg_doit.isra.14+0x228/0x2d0 [<00000000b7f8c101>] genl_rcv_msg+0x35c/0x640 [<00000000c94075ff>] netlink_rcv_skb+0x11e/0x350 [<00000000440cfb1e>] genl_rcv+0x24/0x40 [<0000000062593b40>] netlink_unicast+0x43f/0x640 [<000000001d0b13cc>] netlink_sendmsg+0x73a/0xbf0 [<000000003272487f>] __sys_sendto+0x324/0x370 [<00000000ef9f1747>] __x64_sys_sendto+0xdd/0x1b0 [<000000001e437841>] do_syscall_64+0x3f/0x90 Fixes: e624e6c3e777 ("nfc: Add a virtual nci device driver") Signed-off-by: Shang XiaoJing Reviewed-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20221020030505.15572-1-shangxiaojing@huawei.com Signed-off-by: Jakub Kicinski --- drivers/nfc/virtual_ncidev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/nfc/virtual_ncidev.c b/drivers/nfc/virtual_ncidev.c index f577449e4935..85c06dbb2c44 100644 --- a/drivers/nfc/virtual_ncidev.c +++ b/drivers/nfc/virtual_ncidev.c @@ -54,16 +54,19 @@ static int virtual_nci_send(struct nci_dev *ndev, struct sk_buff *skb) mutex_lock(&nci_mutex); if (state != virtual_ncidev_enabled) { mutex_unlock(&nci_mutex); + kfree_skb(skb); return 0; } if (send_buff) { mutex_unlock(&nci_mutex); + kfree_skb(skb); return -1; } send_buff = skb_copy(skb, GFP_KERNEL); mutex_unlock(&nci_mutex); wake_up_interruptible(&wq); + consume_skb(skb); return 0; } From 46cdedf2a0fa20a99ca8be40bccde7487e13b77a Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Wed, 19 Oct 2022 15:35:51 -0700 Subject: [PATCH 29/57] ethtool: pse-pd: fix null-deref on genl_info in dump ethnl_default_dump_one() passes NULL as info. It's correct not to set extack during dump, as we should just silently skip interfaces which can't provide the information. Reported-by: syzbot+81c4b4bbba6eea2cfcae@syzkaller.appspotmail.com Fixes: 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment") Signed-off-by: Jakub Kicinski Reviewed-by: Oleksij Rempel Signed-off-by: David S. Miller --- net/ethtool/pse-pd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ethtool/pse-pd.c b/net/ethtool/pse-pd.c index 5a471e115b66..e8683e485dc9 100644 --- a/net/ethtool/pse-pd.c +++ b/net/ethtool/pse-pd.c @@ -64,7 +64,7 @@ static int pse_prepare_data(const struct ethnl_req_info *req_base, if (ret < 0) return ret; - ret = pse_get_pse_attributes(dev, info->extack, data); + ret = pse_get_pse_attributes(dev, info ? info->extack : NULL, data); ethnl_ops_complete(dev); From 4d814b329a4d54cd10eee4bd2ce5a8175646cc16 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Wed, 19 Oct 2022 19:19:13 -0700 Subject: [PATCH 30/57] MAINTAINERS: add keyword match on PTP Most of PTP drivers live under ethernet and we have to keep telling people to CC the PTP maintainers. Let's try a keyword match, we can refine as we go if it causes false positives. Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 8b73135ddb21..57ad33aa0eb8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16669,6 +16669,7 @@ F: Documentation/driver-api/ptp.rst F: drivers/net/phy/dp83640* F: drivers/ptp/* F: include/linux/ptp_cl* +K: (?:\b|_)ptp(?:\b|_) PTP VIRTUAL CLOCK SUPPORT M: Yangbo Lu From 3d05818707bb2cf133ccdcd29f2d5585b5bc1298 Mon Sep 17 00:00:00 2001 From: Hou Tao Date: Fri, 21 Oct 2022 19:49:12 +0800 Subject: [PATCH 31/57] bpf: Wait for busy refill_work when destroying bpf memory allocator A busy irq work is an unfinished irq work and it can be either in the pending state or in the running state. When destroying bpf memory allocator, refill_work may be busy for PREEMPT_RT kernel in which irq work is invoked in a per-CPU RT-kthread. It is also possible for kernel with arch_irq_work_has_interrupt() being false (e.g. 1-cpu arm32 host or mips) and irq work is inovked in timer interrupt. The busy refill_work leads to various issues. The obvious one is that there will be concurrent operations on free_by_rcu and free_list between irq work and memory draining. Another one is call_rcu_in_progress will not be reliable for the checking of pending RCU callback because do_call_rcu() may have not been invoked by irq work yet. The other is there will be use-after-free if irq work is freed before the callback of irq work is invoked as shown below: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 12ab94067 P4D 12ab94067 PUD 1796b4067 PMD 0 Oops: 0010 [#1] PREEMPT_RT SMP CPU: 5 PID: 64 Comm: irq_work/5 Not tainted 6.0.0-rt11+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffadc080293e78 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffffcdc07fb6a388 RCX: ffffa05000a2e000 RDX: ffffa05000a2e000 RSI: ffffffff96cc9827 RDI: ffffcdc07fb6a388 ...... Call Trace: irq_work_single+0x24/0x60 irq_work_run_list+0x24/0x30 run_irq_workd+0x23/0x30 smpboot_thread_fn+0x203/0x300 kthread+0x126/0x150 ret_from_fork+0x1f/0x30 Considering the ease of concurrency handling, no overhead for irq_work_sync() under non-PREEMPT_RT kernel and has-irq-work-interrupt kernel and the short wait time used for irq_work_sync() under PREEMPT_RT (When running two test_maps on PREEMPT_RT kernel and 72-cpus host, the max wait time is about 8ms and the 99th percentile is 10us), just using irq_work_sync() to wait for busy refill_work to complete before memory draining and memory freeing. Fixes: 7c8199e24fa0 ("bpf: Introduce any context BPF specific memory allocator.") Acked-by: Stanislav Fomichev Signed-off-by: Hou Tao Link: https://lore.kernel.org/r/20221021114913.60508-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov --- kernel/bpf/memalloc.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c index 5f83be1d2018..1cb24323826f 100644 --- a/kernel/bpf/memalloc.c +++ b/kernel/bpf/memalloc.c @@ -493,6 +493,16 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma) rcu_in_progress = 0; for_each_possible_cpu(cpu) { c = per_cpu_ptr(ma->cache, cpu); + /* + * refill_work may be unfinished for PREEMPT_RT kernel + * in which irq work is invoked in a per-CPU RT thread. + * It is also possible for kernel with + * arch_irq_work_has_interrupt() being false and irq + * work is invoked in timer interrupt. So waiting for + * the completion of irq work to ease the handling of + * concurrency. + */ + irq_work_sync(&c->refill_work); drain_mem_cache(c); rcu_in_progress += atomic_read(&c->call_rcu_in_progress); } @@ -507,6 +517,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma) cc = per_cpu_ptr(ma->caches, cpu); for (i = 0; i < NUM_CACHES; i++) { c = &cc->cache[i]; + irq_work_sync(&c->refill_work); drain_mem_cache(c); rcu_in_progress += atomic_read(&c->call_rcu_in_progress); } From fa4447cb73b2bfe7175f1b7ffdc70580fcfcb991 Mon Sep 17 00:00:00 2001 From: Hou Tao Date: Fri, 21 Oct 2022 19:49:13 +0800 Subject: [PATCH 32/57] bpf: Use __llist_del_all() whenever possbile during memory draining Except for waiting_for_gp list, there are no concurrent operations on free_by_rcu, free_llist and free_llist_extra lists, so use __llist_del_all() instead of llist_del_all(). waiting_for_gp list can be deleted by RCU callback concurrently, so still use llist_del_all(). Acked-by: Stanislav Fomichev Signed-off-by: Hou Tao Link: https://lore.kernel.org/r/20221021114913.60508-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov --- kernel/bpf/memalloc.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c index 1cb24323826f..4901fa1048cd 100644 --- a/kernel/bpf/memalloc.c +++ b/kernel/bpf/memalloc.c @@ -418,14 +418,17 @@ static void drain_mem_cache(struct bpf_mem_cache *c) /* No progs are using this bpf_mem_cache, but htab_map_free() called * bpf_mem_cache_free() for all remaining elements and they can be in * free_by_rcu or in waiting_for_gp lists, so drain those lists now. + * + * Except for waiting_for_gp list, there are no concurrent operations + * on these lists, so it is safe to use __llist_del_all(). */ llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu)) free_one(c, llnode); llist_for_each_safe(llnode, t, llist_del_all(&c->waiting_for_gp)) free_one(c, llnode); - llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist)) + llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist)) free_one(c, llnode); - llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra)) + llist_for_each_safe(llnode, t, __llist_del_all(&c->free_llist_extra)) free_one(c, llnode); } From f97fc7ef414603189d5ba6f529407c5341c03c2a Mon Sep 17 00:00:00 2001 From: Raju Rangoju Date: Thu, 20 Oct 2022 12:12:11 +0530 Subject: [PATCH 33/57] amd-xgbe: Yellow carp devices do not need rrc Link stability issues are noticed on Yellow carp platforms when Receiver Reset Cycle is issued. Since the CDR workaround is disabled on these platforms, the Receiver Reset Cycle is not needed. So, avoid issuing rrc on Yellow carp platforms. Fixes: dbb6c58b5a61 ("net: amd-xgbe: Add Support for Yellow Carp Ethernet device") Signed-off-by: Raju Rangoju Acked-by: Tom Lendacky Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/amd/xgbe/xgbe-pci.c | 5 +++++ drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 2 +- drivers/net/ethernet/amd/xgbe/xgbe.h | 1 + 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-pci.c b/drivers/net/ethernet/amd/xgbe/xgbe-pci.c index 2af3da4b2d05..f409d7bd1f1e 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-pci.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-pci.c @@ -285,6 +285,9 @@ static int xgbe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) /* Yellow Carp devices do not need cdr workaround */ pdata->vdata->an_cdr_workaround = 0; + + /* Yellow Carp devices do not need rrc */ + pdata->vdata->enable_rrc = 0; } else { pdata->xpcs_window_def_reg = PCS_V2_WINDOW_DEF; pdata->xpcs_window_sel_reg = PCS_V2_WINDOW_SELECT; @@ -483,6 +486,7 @@ static struct xgbe_version_data xgbe_v2a = { .tx_desc_prefetch = 5, .rx_desc_prefetch = 5, .an_cdr_workaround = 1, + .enable_rrc = 1, }; static struct xgbe_version_data xgbe_v2b = { @@ -498,6 +502,7 @@ static struct xgbe_version_data xgbe_v2b = { .tx_desc_prefetch = 5, .rx_desc_prefetch = 5, .an_cdr_workaround = 1, + .enable_rrc = 1, }; static const struct pci_device_id xgbe_pci_table[] = { diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c index 2156600641b6..19b943eba560 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c @@ -2640,7 +2640,7 @@ static int xgbe_phy_link_status(struct xgbe_prv_data *pdata, int *an_restart) } /* No link, attempt a receiver reset cycle */ - if (phy_data->rrc_count++ > XGBE_RRC_FREQUENCY) { + if (pdata->vdata->enable_rrc && phy_data->rrc_count++ > XGBE_RRC_FREQUENCY) { phy_data->rrc_count = 0; xgbe_phy_rrc(pdata); } diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h index b875c430222e..49d23abce73d 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe.h +++ b/drivers/net/ethernet/amd/xgbe/xgbe.h @@ -1013,6 +1013,7 @@ struct xgbe_version_data { unsigned int tx_desc_prefetch; unsigned int rx_desc_prefetch; unsigned int an_cdr_workaround; + unsigned int enable_rrc; }; struct xgbe_prv_data { From 1246d0862349f36b65db1043d27c466af930047d Mon Sep 17 00:00:00 2001 From: Raju Rangoju Date: Thu, 20 Oct 2022 12:12:12 +0530 Subject: [PATCH 34/57] amd-xgbe: use enums for mailbox cmd and sub_cmds Instead of using hardcoded values, use enumerations for mailbox command and sub commands. Signed-off-by: Raju Rangoju Acked-by: Tom Lendacky Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 29 ++++++++++++--------- drivers/net/ethernet/amd/xgbe/xgbe.h | 25 ++++++++++++++++++ 2 files changed, 41 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c index 19b943eba560..8cf5d81fca36 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c @@ -1989,7 +1989,7 @@ static void xgbe_phy_pll_ctrl(struct xgbe_prv_data *pdata, bool enable) } static void xgbe_phy_perform_ratechange(struct xgbe_prv_data *pdata, - unsigned int cmd, unsigned int sub_cmd) + enum xgbe_mb_cmd cmd, enum xgbe_mb_subcmd sub_cmd) { unsigned int s0 = 0; unsigned int wait; @@ -2036,7 +2036,7 @@ reenable_pll: static void xgbe_phy_rrc(struct xgbe_prv_data *pdata) { /* Receiver Reset Cycle */ - xgbe_phy_perform_ratechange(pdata, 5, 0); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_RRC, XGBE_MB_SUBCMD_NONE); netif_dbg(pdata, link, pdata->netdev, "receiver reset complete\n"); } @@ -2046,7 +2046,7 @@ static void xgbe_phy_power_off(struct xgbe_prv_data *pdata) struct xgbe_phy_data *phy_data = pdata->phy_data; /* Power off */ - xgbe_phy_perform_ratechange(pdata, 0, 0); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_POWER_OFF, XGBE_MB_SUBCMD_NONE); phy_data->cur_mode = XGBE_MODE_UNKNOWN; @@ -2061,14 +2061,17 @@ static void xgbe_phy_sfi_mode(struct xgbe_prv_data *pdata) /* 10G/SFI */ if (phy_data->sfp_cable != XGBE_SFP_CABLE_PASSIVE) { - xgbe_phy_perform_ratechange(pdata, 3, 0); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_10G_SFI, XGBE_MB_SUBCMD_ACTIVE); } else { if (phy_data->sfp_cable_len <= 1) - xgbe_phy_perform_ratechange(pdata, 3, 1); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_10G_SFI, + XGBE_MB_SUBCMD_PASSIVE_1M); else if (phy_data->sfp_cable_len <= 3) - xgbe_phy_perform_ratechange(pdata, 3, 2); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_10G_SFI, + XGBE_MB_SUBCMD_PASSIVE_3M); else - xgbe_phy_perform_ratechange(pdata, 3, 3); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_10G_SFI, + XGBE_MB_SUBCMD_PASSIVE_OTHER); } phy_data->cur_mode = XGBE_MODE_SFI; @@ -2083,7 +2086,7 @@ static void xgbe_phy_x_mode(struct xgbe_prv_data *pdata) xgbe_phy_set_redrv_mode(pdata); /* 1G/X */ - xgbe_phy_perform_ratechange(pdata, 1, 3); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_1G, XGBE_MB_SUBCMD_1G_KX); phy_data->cur_mode = XGBE_MODE_X; @@ -2097,7 +2100,7 @@ static void xgbe_phy_sgmii_1000_mode(struct xgbe_prv_data *pdata) xgbe_phy_set_redrv_mode(pdata); /* 1G/SGMII */ - xgbe_phy_perform_ratechange(pdata, 1, 2); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_1G, XGBE_MB_SUBCMD_1G_SGMII); phy_data->cur_mode = XGBE_MODE_SGMII_1000; @@ -2111,7 +2114,7 @@ static void xgbe_phy_sgmii_100_mode(struct xgbe_prv_data *pdata) xgbe_phy_set_redrv_mode(pdata); /* 100M/SGMII */ - xgbe_phy_perform_ratechange(pdata, 1, 1); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_1G, XGBE_MB_SUBCMD_100MBITS); phy_data->cur_mode = XGBE_MODE_SGMII_100; @@ -2125,7 +2128,7 @@ static void xgbe_phy_kr_mode(struct xgbe_prv_data *pdata) xgbe_phy_set_redrv_mode(pdata); /* 10G/KR */ - xgbe_phy_perform_ratechange(pdata, 4, 0); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_10G_KR, XGBE_MB_SUBCMD_NONE); phy_data->cur_mode = XGBE_MODE_KR; @@ -2139,7 +2142,7 @@ static void xgbe_phy_kx_2500_mode(struct xgbe_prv_data *pdata) xgbe_phy_set_redrv_mode(pdata); /* 2.5G/KX */ - xgbe_phy_perform_ratechange(pdata, 2, 0); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_2_5G, XGBE_MB_SUBCMD_NONE); phy_data->cur_mode = XGBE_MODE_KX_2500; @@ -2153,7 +2156,7 @@ static void xgbe_phy_kx_1000_mode(struct xgbe_prv_data *pdata) xgbe_phy_set_redrv_mode(pdata); /* 1G/KX */ - xgbe_phy_perform_ratechange(pdata, 1, 3); + xgbe_phy_perform_ratechange(pdata, XGBE_MB_CMD_SET_1G, XGBE_MB_SUBCMD_1G_KX); phy_data->cur_mode = XGBE_MODE_KX_1000; diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h index 49d23abce73d..71f24cb47935 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe.h +++ b/drivers/net/ethernet/amd/xgbe/xgbe.h @@ -611,6 +611,31 @@ enum xgbe_mdio_mode { XGBE_MDIO_MODE_CL45, }; +enum xgbe_mb_cmd { + XGBE_MB_CMD_POWER_OFF = 0, + XGBE_MB_CMD_SET_1G, + XGBE_MB_CMD_SET_2_5G, + XGBE_MB_CMD_SET_10G_SFI, + XGBE_MB_CMD_SET_10G_KR, + XGBE_MB_CMD_RRC +}; + +enum xgbe_mb_subcmd { + XGBE_MB_SUBCMD_NONE = 0, + + /* 10GbE SFP subcommands */ + XGBE_MB_SUBCMD_ACTIVE = 0, + XGBE_MB_SUBCMD_PASSIVE_1M, + XGBE_MB_SUBCMD_PASSIVE_3M, + XGBE_MB_SUBCMD_PASSIVE_OTHER, + + /* 1GbE Mode subcommands */ + XGBE_MB_SUBCMD_10MBITS = 0, + XGBE_MB_SUBCMD_100MBITS, + XGBE_MB_SUBCMD_1G_SGMII, + XGBE_MB_SUBCMD_1G_KX +}; + struct xgbe_phy { struct ethtool_link_ksettings lks; From fc75c032aee63e60170d80f44c8567ea45fc59da Mon Sep 17 00:00:00 2001 From: Raju Rangoju Date: Thu, 20 Oct 2022 12:12:13 +0530 Subject: [PATCH 35/57] amd-xgbe: enable PLL_CTL for fixed PHY modes only PLL control setting(RRC) is needed only in fixed PHY configuration to fix the peer-peer issues. Without the PLL control setting, the link up takes longer time in a fixed phy configuration. Driver implements SW RRC for Autoneg On configuration, hence PLL control setting (RRC) is not needed for AN On configuration, and can be skipped. Also, PLL re-initialization is not needed for PHY Power Off and RRC commands. Otherwise, they lead to mailbox errors. Added the changes accordingly. Fixes: daf182d360e5 ("net: amd-xgbe: Toggle PLL settings during rate change") Signed-off-by: Raju Rangoju Acked-by: Tom Lendacky Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c index 8cf5d81fca36..349ba0dc1fa2 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c @@ -1979,6 +1979,10 @@ static void xgbe_phy_rx_reset(struct xgbe_prv_data *pdata) static void xgbe_phy_pll_ctrl(struct xgbe_prv_data *pdata, bool enable) { + /* PLL_CTRL feature needs to be enabled for fixed PHY modes (Non-Autoneg) only */ + if (pdata->phy.autoneg != AUTONEG_DISABLE) + return; + XMDIO_WRITE_BITS(pdata, MDIO_MMD_PMAPMD, MDIO_VEND2_PMA_MISC_CTRL0, XGBE_PMA_PLL_CTRL_MASK, enable ? XGBE_PMA_PLL_CTRL_ENABLE @@ -2029,8 +2033,10 @@ static void xgbe_phy_perform_ratechange(struct xgbe_prv_data *pdata, xgbe_phy_rx_reset(pdata); reenable_pll: - /* Enable PLL re-initialization */ - xgbe_phy_pll_ctrl(pdata, true); + /* Enable PLL re-initialization, not needed for PHY Power Off and RRC cmds */ + if (cmd != XGBE_MB_CMD_POWER_OFF && + cmd != XGBE_MB_CMD_RRC) + xgbe_phy_pll_ctrl(pdata, true); } static void xgbe_phy_rrc(struct xgbe_prv_data *pdata) From 09c5f6bf11ac98874339e55f4f5f79a9dbc9b375 Mon Sep 17 00:00:00 2001 From: Raju Rangoju Date: Thu, 20 Oct 2022 12:12:14 +0530 Subject: [PATCH 36/57] amd-xgbe: fix the SFP compliance codes check for DAC cables The current XGBE code assumes that offset 6 of EEPROM SFP DAC (passive) cables is NULL. However, some cables (the 5 meter and 7 meter Molex passive cables) have non-zero data at offset 6. Fix the logic by moving the passive cable check above the active checks, so as not to be improperly identified as an active cable. This will fix the issue for any passive cable that advertises 1000Base-CX in offset 6. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju Acked-by: Tom Lendacky Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c index 349ba0dc1fa2..8c41ac5676d6 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c @@ -1151,7 +1151,10 @@ static void xgbe_phy_sfp_parse_eeprom(struct xgbe_prv_data *pdata) } /* Determine the type of SFP */ - if (sfp_base[XGBE_SFP_BASE_10GBE_CC] & XGBE_SFP_BASE_10GBE_CC_SR) + if (phy_data->sfp_cable == XGBE_SFP_CABLE_PASSIVE && + xgbe_phy_sfp_bit_rate(sfp_eeprom, XGBE_SFP_SPEED_10000)) + phy_data->sfp_base = XGBE_SFP_BASE_10000_CR; + else if (sfp_base[XGBE_SFP_BASE_10GBE_CC] & XGBE_SFP_BASE_10GBE_CC_SR) phy_data->sfp_base = XGBE_SFP_BASE_10000_SR; else if (sfp_base[XGBE_SFP_BASE_10GBE_CC] & XGBE_SFP_BASE_10GBE_CC_LR) phy_data->sfp_base = XGBE_SFP_BASE_10000_LR; @@ -1167,9 +1170,6 @@ static void xgbe_phy_sfp_parse_eeprom(struct xgbe_prv_data *pdata) phy_data->sfp_base = XGBE_SFP_BASE_1000_CX; else if (sfp_base[XGBE_SFP_BASE_1GBE_CC] & XGBE_SFP_BASE_1GBE_CC_T) phy_data->sfp_base = XGBE_SFP_BASE_1000_T; - else if ((phy_data->sfp_cable == XGBE_SFP_CABLE_PASSIVE) && - xgbe_phy_sfp_bit_rate(sfp_eeprom, XGBE_SFP_SPEED_10000)) - phy_data->sfp_base = XGBE_SFP_BASE_10000_CR; switch (phy_data->sfp_base) { case XGBE_SFP_BASE_1000_T: From 170a9e341a3b02c0b2ea0df16ef14a33a4f41de8 Mon Sep 17 00:00:00 2001 From: Raju Rangoju Date: Thu, 20 Oct 2022 12:12:15 +0530 Subject: [PATCH 37/57] amd-xgbe: add the bit rate quirk for Molex cables The offset 12 (bit-rate) of EEPROM SFP DAC (passive) cables is expected to be in the range 0x64 to 0x68. However, the 5 meter and 7 meter Molex passive cables have the rate ceiling 0x78 at offset 12. Add a quirk for Molex passive cables to extend the rate ceiling to 0x78. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju Acked-by: Tom Lendacky Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c index 8c41ac5676d6..4064c3e3dd49 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c @@ -239,6 +239,7 @@ enum xgbe_sfp_speed { #define XGBE_SFP_BASE_BR_1GBE_MAX 0x0d #define XGBE_SFP_BASE_BR_10GBE_MIN 0x64 #define XGBE_SFP_BASE_BR_10GBE_MAX 0x68 +#define XGBE_MOLEX_SFP_BASE_BR_10GBE_MAX 0x78 #define XGBE_SFP_BASE_CU_CABLE_LEN 18 @@ -284,6 +285,8 @@ struct xgbe_sfp_eeprom { #define XGBE_BEL_FUSE_VENDOR "BEL-FUSE " #define XGBE_BEL_FUSE_PARTNO "1GBT-SFP06 " +#define XGBE_MOLEX_VENDOR "Molex Inc. " + struct xgbe_sfp_ascii { union { char vendor[XGBE_SFP_BASE_VENDOR_NAME_LEN + 1]; @@ -834,7 +837,11 @@ static bool xgbe_phy_sfp_bit_rate(struct xgbe_sfp_eeprom *sfp_eeprom, break; case XGBE_SFP_SPEED_10000: min = XGBE_SFP_BASE_BR_10GBE_MIN; - max = XGBE_SFP_BASE_BR_10GBE_MAX; + if (memcmp(&sfp_eeprom->base[XGBE_SFP_BASE_VENDOR_NAME], + XGBE_MOLEX_VENDOR, XGBE_SFP_BASE_VENDOR_NAME_LEN) == 0) + max = XGBE_MOLEX_SFP_BASE_BR_10GBE_MAX; + else + max = XGBE_SFP_BASE_BR_10GBE_MAX; break; default: return false; From 0bda03623e6b8b9052da5ba0145608941bcc2eb0 Mon Sep 17 00:00:00 2001 From: Yinjun Zhang Date: Thu, 20 Oct 2022 09:14:11 +0100 Subject: [PATCH 38/57] nfp: only clean `sp_indiff` when application firmware is unloaded Currently `sp_indiff` is cleaned when driver is removed. This will cause problem in multi-PF/multi-host case, considering one PF is removed while another is still in use. Since `sp_indiff` is the application firmware property, it should only be cleaned when the firmware is unloaded. Now let management firmware to clean it when necessary, driver only set it. Fixes: b1e4f11e426d ("nfp: refine the ABI of getting `sp_indiff` info") Signed-off-by: Yinjun Zhang Signed-off-by: Simon Horman Link: https://lore.kernel.org/r/20221020081411.80186-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/nfp_main.c | 38 ++++++++----------- 1 file changed, 15 insertions(+), 23 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c index e66e548919d4..71301dbd8fb5 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_main.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c @@ -716,16 +716,26 @@ static u64 nfp_net_pf_get_app_cap(struct nfp_pf *pf) return val; } -static int nfp_pf_cfg_hwinfo(struct nfp_pf *pf, bool sp_indiff) +static void nfp_pf_cfg_hwinfo(struct nfp_pf *pf) { struct nfp_nsp *nsp; char hwinfo[32]; + bool sp_indiff; int err; nsp = nfp_nsp_open(pf->cpp); if (IS_ERR(nsp)) - return PTR_ERR(nsp); + return; + if (!nfp_nsp_has_hwinfo_set(nsp)) + goto end; + + sp_indiff = (nfp_net_pf_get_app_id(pf) == NFP_APP_FLOWER_NIC) || + (nfp_net_pf_get_app_cap(pf) & NFP_NET_APP_CAP_SP_INDIFF); + + /* No need to clean `sp_indiff` in driver, management firmware + * will do it when application firmware is unloaded. + */ snprintf(hwinfo, sizeof(hwinfo), "sp_indiff=%d", sp_indiff); err = nfp_nsp_hwinfo_set(nsp, hwinfo, sizeof(hwinfo)); /* Not a fatal error, no need to return error to stop driver from loading */ @@ -739,21 +749,8 @@ static int nfp_pf_cfg_hwinfo(struct nfp_pf *pf, bool sp_indiff) pf->eth_tbl = __nfp_eth_read_ports(pf->cpp, nsp); } +end: nfp_nsp_close(nsp); - return 0; -} - -static int nfp_pf_nsp_cfg(struct nfp_pf *pf) -{ - bool sp_indiff = (nfp_net_pf_get_app_id(pf) == NFP_APP_FLOWER_NIC) || - (nfp_net_pf_get_app_cap(pf) & NFP_NET_APP_CAP_SP_INDIFF); - - return nfp_pf_cfg_hwinfo(pf, sp_indiff); -} - -static void nfp_pf_nsp_clean(struct nfp_pf *pf) -{ - nfp_pf_cfg_hwinfo(pf, false); } static int nfp_pci_probe(struct pci_dev *pdev, @@ -856,13 +853,11 @@ static int nfp_pci_probe(struct pci_dev *pdev, goto err_fw_unload; } - err = nfp_pf_nsp_cfg(pf); - if (err) - goto err_fw_unload; + nfp_pf_cfg_hwinfo(pf); err = nfp_net_pci_probe(pf); if (err) - goto err_nsp_clean; + goto err_fw_unload; err = nfp_hwmon_register(pf); if (err) { @@ -874,8 +869,6 @@ static int nfp_pci_probe(struct pci_dev *pdev, err_net_remove: nfp_net_pci_remove(pf); -err_nsp_clean: - nfp_pf_nsp_clean(pf); err_fw_unload: kfree(pf->rtbl); nfp_mip_close(pf->mip); @@ -915,7 +908,6 @@ static void __nfp_pci_shutdown(struct pci_dev *pdev, bool unload_fw) nfp_net_pci_remove(pf); - nfp_pf_nsp_clean(pf); vfree(pf->dumpspec); kfree(pf->rtbl); nfp_mip_close(pf->mip); From 36abde8d24ad740371422a7678ca92b06cc8a3d5 Mon Sep 17 00:00:00 2001 From: "Luke D. Jones" Date: Mon, 10 Oct 2022 19:30:09 +1300 Subject: [PATCH 39/57] platform/x86: asus-wmi: Add support for ROG X16 tablet mode Add quirk for ASUS ROG X16 Flow 2-in-1 to enable tablet mode with lid flip (all screen rotations). Signed-off-by: Luke D. Jones Link: https://lore.kernel.org/r/20221010063009.32293-1-luke@ljones.dev Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede --- drivers/platform/x86/asus-nb-wmi.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c index 613c45c9fbe3..c685a705b73d 100644 --- a/drivers/platform/x86/asus-nb-wmi.c +++ b/drivers/platform/x86/asus-nb-wmi.c @@ -464,6 +464,15 @@ static const struct dmi_system_id asus_quirks[] = { }, .driver_data = &quirk_asus_tablet_mode, }, + { + .callback = dmi_matched, + .ident = "ASUS ROG FLOW X16", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."), + DMI_MATCH(DMI_PRODUCT_NAME, "GV601R"), + }, + .driver_data = &quirk_asus_tablet_mode, + }, {}, }; From a10d50983f7befe85acf95ea7dbf6ba9187c2d70 Mon Sep 17 00:00:00 2001 From: Jelle van der Waa Date: Wed, 19 Oct 2022 21:47:51 +0200 Subject: [PATCH 40/57] platform/x86: thinkpad_acpi: Fix reporting a non present second fan on some models thinkpad_acpi was reporting 2 fans on a ThinkPad T14s gen 1, even though the laptop has only 1 fan. The second, not present fan always reads 65535 (-1 in 16 bit signed), ignore fans which report 65535 to avoid reporting the non present fan. Signed-off-by: Jelle van der Waa Link: https://lore.kernel.org/r/20221019194751.5392-1-jvanderwaa@redhat.com Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede --- drivers/platform/x86/thinkpad_acpi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index 6a823b850a77..20e5c043a8e8 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -263,6 +263,8 @@ enum tpacpi_hkey_event_t { #define TPACPI_DBG_BRGHT 0x0020 #define TPACPI_DBG_MIXER 0x0040 +#define FAN_NOT_PRESENT 65535 + #define strlencmp(a, b) (strncmp((a), (b), strlen(b))) @@ -8876,7 +8878,7 @@ static int __init fan_init(struct ibm_init_struct *iibm) /* Try and probe the 2nd fan */ tp_features.second_fan = 1; /* needed for get_speed to work */ res = fan2_get_speed(&speed); - if (res >= 0) { + if (res >= 0 && speed != FAN_NOT_PRESENT) { /* It responded - so let's assume it's there */ tp_features.second_fan = 1; tp_features.second_fan_ctl = 1; From 6960d133f66ecddcd3af2b1cbd0c7dcd104268b8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=8D=C3=B1igo=20Huguet?= Date: Thu, 20 Oct 2022 09:53:10 +0200 Subject: [PATCH 41/57] atlantic: fix deadlock at aq_nic_stop MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit NIC is stopped with rtnl_lock held, and during the stop it cancels the 'service_task' work and free irqs. However, if CONFIG_MACSEC is set, rtnl_lock is acquired both from aq_nic_service_task and aq_linkstate_threaded_isr. Then a deadlock happens if aq_nic_stop tries to cancel/disable them when they've already started their execution. As the deadlock is caused by rtnl_lock, it causes many other processes to stall, not only atlantic related stuff. Fix it by introducing a mutex that protects each NIC's macsec related data, and locking it instead of the rtnl_lock from the service task and the threaded IRQ. Before this patch, all macsec data was protected with rtnl_lock, but maybe not all of it needs to be protected. With this new mutex, further efforts can be made to limit the protected data only to that which requires it. However, probably it doesn't worth it because all macsec's data accesses are infrequent, and almost all are done from macsec_ops or ethtool callbacks, called holding rtnl_lock, so macsec_mutex won't never be much contended. The issue appeared repeteadly attaching and deattaching the NIC to a bond interface. Doing that after this patch I cannot reproduce the bug. Fixes: 62c1c2e606f6 ("net: atlantic: MACSec offload skeleton") Reported-by: Li Liang Suggested-by: Andrew Lunn Signed-off-by: Íñigo Huguet Reviewed-by: Igor Russkikh Signed-off-by: David S. Miller --- .../ethernet/aquantia/atlantic/aq_macsec.c | 96 ++++++++++++++----- .../net/ethernet/aquantia/atlantic/aq_nic.h | 2 + 2 files changed, 74 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_macsec.c b/drivers/net/ethernet/aquantia/atlantic/aq_macsec.c index 3d0e16791e1c..a0180811305d 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_macsec.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_macsec.c @@ -1394,26 +1394,57 @@ static void aq_check_txsa_expiration(struct aq_nic_s *nic) egress_sa_threshold_expired); } +#define AQ_LOCKED_MDO_DEF(mdo) \ +static int aq_locked_mdo_##mdo(struct macsec_context *ctx) \ +{ \ + struct aq_nic_s *nic = netdev_priv(ctx->netdev); \ + int ret; \ + mutex_lock(&nic->macsec_mutex); \ + ret = aq_mdo_##mdo(ctx); \ + mutex_unlock(&nic->macsec_mutex); \ + return ret; \ +} + +AQ_LOCKED_MDO_DEF(dev_open) +AQ_LOCKED_MDO_DEF(dev_stop) +AQ_LOCKED_MDO_DEF(add_secy) +AQ_LOCKED_MDO_DEF(upd_secy) +AQ_LOCKED_MDO_DEF(del_secy) +AQ_LOCKED_MDO_DEF(add_rxsc) +AQ_LOCKED_MDO_DEF(upd_rxsc) +AQ_LOCKED_MDO_DEF(del_rxsc) +AQ_LOCKED_MDO_DEF(add_rxsa) +AQ_LOCKED_MDO_DEF(upd_rxsa) +AQ_LOCKED_MDO_DEF(del_rxsa) +AQ_LOCKED_MDO_DEF(add_txsa) +AQ_LOCKED_MDO_DEF(upd_txsa) +AQ_LOCKED_MDO_DEF(del_txsa) +AQ_LOCKED_MDO_DEF(get_dev_stats) +AQ_LOCKED_MDO_DEF(get_tx_sc_stats) +AQ_LOCKED_MDO_DEF(get_tx_sa_stats) +AQ_LOCKED_MDO_DEF(get_rx_sc_stats) +AQ_LOCKED_MDO_DEF(get_rx_sa_stats) + const struct macsec_ops aq_macsec_ops = { - .mdo_dev_open = aq_mdo_dev_open, - .mdo_dev_stop = aq_mdo_dev_stop, - .mdo_add_secy = aq_mdo_add_secy, - .mdo_upd_secy = aq_mdo_upd_secy, - .mdo_del_secy = aq_mdo_del_secy, - .mdo_add_rxsc = aq_mdo_add_rxsc, - .mdo_upd_rxsc = aq_mdo_upd_rxsc, - .mdo_del_rxsc = aq_mdo_del_rxsc, - .mdo_add_rxsa = aq_mdo_add_rxsa, - .mdo_upd_rxsa = aq_mdo_upd_rxsa, - .mdo_del_rxsa = aq_mdo_del_rxsa, - .mdo_add_txsa = aq_mdo_add_txsa, - .mdo_upd_txsa = aq_mdo_upd_txsa, - .mdo_del_txsa = aq_mdo_del_txsa, - .mdo_get_dev_stats = aq_mdo_get_dev_stats, - .mdo_get_tx_sc_stats = aq_mdo_get_tx_sc_stats, - .mdo_get_tx_sa_stats = aq_mdo_get_tx_sa_stats, - .mdo_get_rx_sc_stats = aq_mdo_get_rx_sc_stats, - .mdo_get_rx_sa_stats = aq_mdo_get_rx_sa_stats, + .mdo_dev_open = aq_locked_mdo_dev_open, + .mdo_dev_stop = aq_locked_mdo_dev_stop, + .mdo_add_secy = aq_locked_mdo_add_secy, + .mdo_upd_secy = aq_locked_mdo_upd_secy, + .mdo_del_secy = aq_locked_mdo_del_secy, + .mdo_add_rxsc = aq_locked_mdo_add_rxsc, + .mdo_upd_rxsc = aq_locked_mdo_upd_rxsc, + .mdo_del_rxsc = aq_locked_mdo_del_rxsc, + .mdo_add_rxsa = aq_locked_mdo_add_rxsa, + .mdo_upd_rxsa = aq_locked_mdo_upd_rxsa, + .mdo_del_rxsa = aq_locked_mdo_del_rxsa, + .mdo_add_txsa = aq_locked_mdo_add_txsa, + .mdo_upd_txsa = aq_locked_mdo_upd_txsa, + .mdo_del_txsa = aq_locked_mdo_del_txsa, + .mdo_get_dev_stats = aq_locked_mdo_get_dev_stats, + .mdo_get_tx_sc_stats = aq_locked_mdo_get_tx_sc_stats, + .mdo_get_tx_sa_stats = aq_locked_mdo_get_tx_sa_stats, + .mdo_get_rx_sc_stats = aq_locked_mdo_get_rx_sc_stats, + .mdo_get_rx_sa_stats = aq_locked_mdo_get_rx_sa_stats, }; int aq_macsec_init(struct aq_nic_s *nic) @@ -1435,6 +1466,7 @@ int aq_macsec_init(struct aq_nic_s *nic) nic->ndev->features |= NETIF_F_HW_MACSEC; nic->ndev->macsec_ops = &aq_macsec_ops; + mutex_init(&nic->macsec_mutex); return 0; } @@ -1458,7 +1490,7 @@ int aq_macsec_enable(struct aq_nic_s *nic) if (!nic->macsec_cfg) return 0; - rtnl_lock(); + mutex_lock(&nic->macsec_mutex); if (nic->aq_fw_ops->send_macsec_req) { struct macsec_cfg_request cfg = { 0 }; @@ -1507,7 +1539,7 @@ int aq_macsec_enable(struct aq_nic_s *nic) ret = aq_apply_macsec_cfg(nic); unlock: - rtnl_unlock(); + mutex_unlock(&nic->macsec_mutex); return ret; } @@ -1519,9 +1551,9 @@ void aq_macsec_work(struct aq_nic_s *nic) if (!netif_carrier_ok(nic->ndev)) return; - rtnl_lock(); + mutex_lock(&nic->macsec_mutex); aq_check_txsa_expiration(nic); - rtnl_unlock(); + mutex_unlock(&nic->macsec_mutex); } int aq_macsec_rx_sa_cnt(struct aq_nic_s *nic) @@ -1532,21 +1564,30 @@ int aq_macsec_rx_sa_cnt(struct aq_nic_s *nic) if (!cfg) return 0; + mutex_lock(&nic->macsec_mutex); + for (i = 0; i < AQ_MACSEC_MAX_SC; i++) { if (!test_bit(i, &cfg->rxsc_idx_busy)) continue; cnt += hweight_long(cfg->aq_rxsc[i].rx_sa_idx_busy); } + mutex_unlock(&nic->macsec_mutex); return cnt; } int aq_macsec_tx_sc_cnt(struct aq_nic_s *nic) { + int cnt; + if (!nic->macsec_cfg) return 0; - return hweight_long(nic->macsec_cfg->txsc_idx_busy); + mutex_lock(&nic->macsec_mutex); + cnt = hweight_long(nic->macsec_cfg->txsc_idx_busy); + mutex_unlock(&nic->macsec_mutex); + + return cnt; } int aq_macsec_tx_sa_cnt(struct aq_nic_s *nic) @@ -1557,12 +1598,15 @@ int aq_macsec_tx_sa_cnt(struct aq_nic_s *nic) if (!cfg) return 0; + mutex_lock(&nic->macsec_mutex); + for (i = 0; i < AQ_MACSEC_MAX_SC; i++) { if (!test_bit(i, &cfg->txsc_idx_busy)) continue; cnt += hweight_long(cfg->aq_txsc[i].tx_sa_idx_busy); } + mutex_unlock(&nic->macsec_mutex); return cnt; } @@ -1634,6 +1678,8 @@ u64 *aq_macsec_get_stats(struct aq_nic_s *nic, u64 *data) if (!cfg) return data; + mutex_lock(&nic->macsec_mutex); + aq_macsec_update_stats(nic); common_stats = &cfg->stats; @@ -1716,5 +1762,7 @@ u64 *aq_macsec_get_stats(struct aq_nic_s *nic, u64 *data) data += i; + mutex_unlock(&nic->macsec_mutex); + return data; } diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.h b/drivers/net/ethernet/aquantia/atlantic/aq_nic.h index 935ba889bd9a..ad33f8586532 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.h +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.h @@ -157,6 +157,8 @@ struct aq_nic_s { struct mutex fwreq_mutex; #if IS_ENABLED(CONFIG_MACSEC) struct aq_macsec_cfg *macsec_cfg; + /* mutex to protect data in macsec_cfg */ + struct mutex macsec_mutex; #endif /* PTP support */ struct aq_ptp_s *aq_ptp; From 0b6e6e149c136677f1cc859d4185b5a2db50ffbf Mon Sep 17 00:00:00 2001 From: Mario Limonciello Date: Thu, 20 Oct 2022 06:37:49 -0500 Subject: [PATCH 42/57] platform/x86/amd: pmc: Read SMU version during suspend on Cezanne systems commit b0c07116c894 ("platform/x86: amd-pmc: Avoid reading SMU version at probe time") adjusted the behavior for amd-pmc to avoid reading the SMU version at startup but rather on first use to improve boot time. However the SMU version is also used to decide whether to place a timer based wakeup in the OS_HINT message. If the idlemask hasn't been read before this message was sent then the SMU version will not have been cached. Ensure the SMU version has been read before deciding whether or not to run this codepath. Cc: stable@vger.kernel.org # 6.0 Reported-by: You-Sheng Yang Tested-by: Anson Tsao Fixes: b0c07116c894 ("platform/x86: amd-pmc: Avoid reading SMU version at probe time") Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20221020113749.6621-2-mario.limonciello@amd.com Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede --- drivers/platform/x86/amd/pmc.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/platform/x86/amd/pmc.c b/drivers/platform/x86/amd/pmc.c index ce859b300712..96e790e639a2 100644 --- a/drivers/platform/x86/amd/pmc.c +++ b/drivers/platform/x86/amd/pmc.c @@ -663,6 +663,13 @@ static int amd_pmc_verify_czn_rtc(struct amd_pmc_dev *pdev, u32 *arg) struct rtc_time tm; int rc; + /* we haven't yet read SMU version */ + if (!pdev->major) { + rc = amd_pmc_get_smu_version(pdev); + if (rc) + return rc; + } + if (pdev->major < 64 || (pdev->major == 64 && pdev->minor < 53)) return 0; From f8127476930b98fc9e9aa5de0bbf9eeaf45db219 Mon Sep 17 00:00:00 2001 From: Leon Romanovsky Date: Thu, 20 Oct 2022 08:28:28 +0300 Subject: [PATCH 43/57] net/mlx5e: Cleanup MACsec uninitialization routine The mlx5e_macsec_cleanup() routine has NULL pointer dereferencing if mlx5 device doesn't support MACsec (priv->macsec will be NULL). While at it delete comment line, assignment and extra blank lines, so fix everything in one patch. Fixes: 1f53da676439 ("net/mlx5e: Create advanced steering operation (ASO) object for MACsec") Reported-by: Dan Carpenter Signed-off-by: Leon Romanovsky Signed-off-by: David S. Miller --- .../net/ethernet/mellanox/mlx5/core/en_accel/macsec.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c index 41970067917b..4331235b21ee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c @@ -1846,25 +1846,16 @@ err_hash: void mlx5e_macsec_cleanup(struct mlx5e_priv *priv) { struct mlx5e_macsec *macsec = priv->macsec; - struct mlx5_core_dev *mdev = macsec->mdev; + struct mlx5_core_dev *mdev = priv->mdev; if (!macsec) return; mlx5_notifier_unregister(mdev, &macsec->nb); - mlx5e_macsec_fs_cleanup(macsec->macsec_fs); - - /* Cleanup workqueue */ destroy_workqueue(macsec->wq); - mlx5e_macsec_aso_cleanup(&macsec->aso, mdev); - - priv->macsec = NULL; - rhashtable_destroy(&macsec->sci_hash); - mutex_destroy(&macsec->lock); - kfree(macsec); } From ee24395f91b9cddccae5f6c11c37ee4ed78ff354 Mon Sep 17 00:00:00 2001 From: Henning Schild Date: Mon, 24 Oct 2022 11:20:27 +0200 Subject: [PATCH 44/57] leds: simatic-ipc-leds-gpio: fix incorrect LED to GPIO mapping For apollolake the mapping between LEDs and GPIO pins was off because of a refactoring when we introduced a new device model. In addition to the reordering the indices in the lookup table need to be updated as well. Fixes: a97126265dfe ("leds: simatic-ipc-leds-gpio: add new model 227G") Signed-off-by: Henning Schild Link: https://lore.kernel.org/r/20221024092027.4529-1-henning.schild@siemens.com Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede --- drivers/leds/simple/simatic-ipc-leds-gpio.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/leds/simple/simatic-ipc-leds-gpio.c b/drivers/leds/simple/simatic-ipc-leds-gpio.c index b9eeb8702df0..07f0d79d604d 100644 --- a/drivers/leds/simple/simatic-ipc-leds-gpio.c +++ b/drivers/leds/simple/simatic-ipc-leds-gpio.c @@ -20,12 +20,12 @@ static struct gpiod_lookup_table *simatic_ipc_led_gpio_table; static struct gpiod_lookup_table simatic_ipc_led_gpio_table_127e = { .dev_id = "leds-gpio", .table = { - GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 52, NULL, 1, GPIO_ACTIVE_LOW), - GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 53, NULL, 2, GPIO_ACTIVE_LOW), - GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 57, NULL, 3, GPIO_ACTIVE_LOW), - GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 58, NULL, 4, GPIO_ACTIVE_LOW), - GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 60, NULL, 5, GPIO_ACTIVE_LOW), - GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 51, NULL, 0, GPIO_ACTIVE_LOW), + GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 52, NULL, 0, GPIO_ACTIVE_LOW), + GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 53, NULL, 1, GPIO_ACTIVE_LOW), + GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 57, NULL, 2, GPIO_ACTIVE_LOW), + GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 58, NULL, 3, GPIO_ACTIVE_LOW), + GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 60, NULL, 4, GPIO_ACTIVE_LOW), + GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 51, NULL, 5, GPIO_ACTIVE_LOW), GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 56, NULL, 6, GPIO_ACTIVE_LOW), GPIO_LOOKUP_IDX("apollolake-pinctrl.0", 59, NULL, 7, GPIO_ACTIVE_HIGH), }, From 555a68dd681b7437a2708001d465c85f6dfa6955 Mon Sep 17 00:00:00 2001 From: Gayatri Kammela Date: Mon, 12 Sep 2022 16:33:07 -0700 Subject: [PATCH 45/57] platform/x86/intel: pmc/core: Add Raptor Lake support to pmc core driver Add Raptor Lake client parts (both RPL and RPL_S) support to pmc core driver. Raptor Lake client parts reuse all the Alder Lake PCH IPs. Cc: Srinivas Pandruvada Cc: Andy Shevchenko Cc: David Box Acked-by: Rajneesh Bhardwaj Signed-off-by: Gayatri Kammela Link: https://lore.kernel.org/r/20220912233307.409954-2-gayatri.kammela@linux.intel.com Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede --- drivers/platform/x86/intel/pmc/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c index a1fe1e0dcf4a..17ec5825d13d 100644 --- a/drivers/platform/x86/intel/pmc/core.c +++ b/drivers/platform/x86/intel/pmc/core.c @@ -1914,6 +1914,8 @@ static const struct x86_cpu_id intel_pmc_core_ids[] = { X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_N, &tgl_reg_map), X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_reg_map), X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, &tgl_reg_map), + X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE, &adl_reg_map), + X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_S, &adl_reg_map), {} }; From c99f0f7e68376dda5df8db7950cd6b67e73c6d3c Mon Sep 17 00:00:00 2001 From: Sean Anderson Date: Thu, 20 Oct 2022 11:50:41 -0400 Subject: [PATCH 46/57] net: fman: Use physical address for userspace interfaces Before 262f2b782e25 ("net: fman: Map the base address once"), the physical address of the MAC was exposed to userspace in two places: via sysfs and via SIOCGIFMAP. While this is not best practice, it is an external ABI which is in use by userspace software. The aforementioned commit inadvertently modified these addresses and made them virtual. This constitutes and ABI break. Additionally, it leaks the kernel's memory layout to userspace. Partially revert that commit, reintroducing the resource back into struct mac_device, while keeping the intended changes (the rework of the address mapping). Fixes: 262f2b782e25 ("net: fman: Map the base address once") Reported-by: Geert Uytterhoeven Signed-off-by: Sean Anderson Acked-by: Madalin Bucur Signed-off-by: David S. Miller --- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 4 ++-- drivers/net/ethernet/freescale/dpaa/dpaa_eth_sysfs.c | 2 +- drivers/net/ethernet/freescale/fman/mac.c | 12 ++++++------ drivers/net/ethernet/freescale/fman/mac.h | 2 +- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c index 31cfa121333d..fc68a32ce2f7 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -221,8 +221,8 @@ static int dpaa_netdev_init(struct net_device *net_dev, net_dev->netdev_ops = dpaa_ops; mac_addr = mac_dev->addr; - net_dev->mem_start = (unsigned long)mac_dev->vaddr; - net_dev->mem_end = (unsigned long)mac_dev->vaddr_end; + net_dev->mem_start = (unsigned long)priv->mac_dev->res->start; + net_dev->mem_end = (unsigned long)priv->mac_dev->res->end; net_dev->min_mtu = ETH_MIN_MTU; net_dev->max_mtu = dpaa_get_max_mtu(); diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth_sysfs.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth_sysfs.c index 258eb6c8f4c0..4fee74c024bd 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth_sysfs.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth_sysfs.c @@ -18,7 +18,7 @@ static ssize_t dpaa_eth_show_addr(struct device *dev, if (mac_dev) return sprintf(buf, "%llx", - (unsigned long long)mac_dev->vaddr); + (unsigned long long)mac_dev->res->start); else return sprintf(buf, "none"); } diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c index 7b7526fd7da3..65df308bad97 100644 --- a/drivers/net/ethernet/freescale/fman/mac.c +++ b/drivers/net/ethernet/freescale/fman/mac.c @@ -279,7 +279,6 @@ static int mac_probe(struct platform_device *_of_dev) struct device_node *mac_node, *dev_node; struct mac_device *mac_dev; struct platform_device *of_dev; - struct resource *res; struct mac_priv_s *priv; struct fman_mac_params params; u32 val; @@ -338,24 +337,25 @@ static int mac_probe(struct platform_device *_of_dev) of_node_put(dev_node); /* Get the address of the memory mapped registers */ - res = platform_get_mem_or_io(_of_dev, 0); - if (!res) { + mac_dev->res = platform_get_mem_or_io(_of_dev, 0); + if (!mac_dev->res) { dev_err(dev, "could not get registers\n"); return -EINVAL; } - err = devm_request_resource(dev, fman_get_mem_region(priv->fman), res); + err = devm_request_resource(dev, fman_get_mem_region(priv->fman), + mac_dev->res); if (err) { dev_err_probe(dev, err, "could not request resource\n"); return err; } - mac_dev->vaddr = devm_ioremap(dev, res->start, resource_size(res)); + mac_dev->vaddr = devm_ioremap(dev, mac_dev->res->start, + resource_size(mac_dev->res)); if (!mac_dev->vaddr) { dev_err(dev, "devm_ioremap() failed\n"); return -EIO; } - mac_dev->vaddr_end = mac_dev->vaddr + resource_size(res); if (!of_device_is_available(mac_node)) return -ENODEV; diff --git a/drivers/net/ethernet/freescale/fman/mac.h b/drivers/net/ethernet/freescale/fman/mac.h index b95d384271bd..13b69ca5f00c 100644 --- a/drivers/net/ethernet/freescale/fman/mac.h +++ b/drivers/net/ethernet/freescale/fman/mac.h @@ -20,8 +20,8 @@ struct mac_priv_s; struct mac_device { void __iomem *vaddr; - void __iomem *vaddr_end; struct device *dev; + struct resource *res; u8 addr[ETH_ALEN]; struct fman_port *port[2]; u32 if_support; From 15e4dabda11b0fa31d510a915d1a580f47dfc92e Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 20 Oct 2022 22:45:11 +0000 Subject: [PATCH 47/57] kcm: annotate data-races around kcm->rx_psock kcm->rx_psock can be read locklessly in kcm_rfree(). Annotate the read and writes accordingly. We do the same for kcm->rx_wait in the following patch. syzbot reported: BUG: KCSAN: data-race in kcm_rfree / unreserve_rx_kcm write to 0xffff888123d827b8 of 8 bytes by task 2758 on cpu 1: unreserve_rx_kcm+0x72/0x1f0 net/kcm/kcmsock.c:313 kcm_rcv_strparser+0x2b5/0x3a0 net/kcm/kcmsock.c:373 __strp_recv+0x64c/0xd20 net/strparser/strparser.c:301 strp_recv+0x6d/0x80 net/strparser/strparser.c:335 tcp_read_sock+0x13e/0x5a0 net/ipv4/tcp.c:1703 strp_read_sock net/strparser/strparser.c:358 [inline] do_strp_work net/strparser/strparser.c:406 [inline] strp_work+0xe8/0x180 net/strparser/strparser.c:415 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 read to 0xffff888123d827b8 of 8 bytes by task 5859 on cpu 0: kcm_rfree+0x14c/0x220 net/kcm/kcmsock.c:181 skb_release_head_state+0x8e/0x160 net/core/skbuff.c:841 skb_release_all net/core/skbuff.c:852 [inline] __kfree_skb net/core/skbuff.c:868 [inline] kfree_skb_reason+0x5c/0x260 net/core/skbuff.c:891 kfree_skb include/linux/skbuff.h:1216 [inline] kcm_recvmsg+0x226/0x2b0 net/kcm/kcmsock.c:1161 ____sys_recvmsg+0x16c/0x2e0 ___sys_recvmsg net/socket.c:2743 [inline] do_recvmmsg+0x2f1/0x710 net/socket.c:2837 __sys_recvmmsg net/socket.c:2916 [inline] __do_sys_recvmmsg net/socket.c:2939 [inline] __se_sys_recvmmsg net/socket.c:2932 [inline] __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2932 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0xffff88812971ce00 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 5859 Comm: syz-executor.3 Not tainted 6.0.0-syzkaller-12189-g19d17ab7c68b-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- net/kcm/kcmsock.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 27725464ec08..0109ef6ddf9a 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -178,7 +178,7 @@ static void kcm_rfree(struct sk_buff *skb) /* For reading rx_wait and rx_psock without holding lock */ smp_mb__after_atomic(); - if (!kcm->rx_wait && !kcm->rx_psock && + if (!kcm->rx_wait && !READ_ONCE(kcm->rx_psock) && sk_rmem_alloc_get(sk) < sk->sk_rcvlowat) { spin_lock_bh(&mux->rx_lock); kcm_rcv_ready(kcm); @@ -283,7 +283,8 @@ static struct kcm_sock *reserve_rx_kcm(struct kcm_psock *psock, kcm->rx_wait = false; psock->rx_kcm = kcm; - kcm->rx_psock = psock; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_psock, psock); spin_unlock_bh(&mux->rx_lock); @@ -310,7 +311,8 @@ static void unreserve_rx_kcm(struct kcm_psock *psock, spin_lock_bh(&mux->rx_lock); psock->rx_kcm = NULL; - kcm->rx_psock = NULL; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_psock, NULL); /* Commit kcm->rx_psock before sk_rmem_alloc_get to sync with * kcm_rfree From 0c745b5141a45a076f1cb9772a399f7ebcb0948a Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 20 Oct 2022 22:45:12 +0000 Subject: [PATCH 48/57] kcm: annotate data-races around kcm->rx_wait kcm->rx_psock can be read locklessly in kcm_rfree(). Annotate the read and writes accordingly. syzbot reported: BUG: KCSAN: data-race in kcm_rcv_strparser / kcm_rfree write to 0xffff88810784e3d0 of 1 bytes by task 1823 on cpu 1: reserve_rx_kcm net/kcm/kcmsock.c:283 [inline] kcm_rcv_strparser+0x250/0x3a0 net/kcm/kcmsock.c:363 __strp_recv+0x64c/0xd20 net/strparser/strparser.c:301 strp_recv+0x6d/0x80 net/strparser/strparser.c:335 tcp_read_sock+0x13e/0x5a0 net/ipv4/tcp.c:1703 strp_read_sock net/strparser/strparser.c:358 [inline] do_strp_work net/strparser/strparser.c:406 [inline] strp_work+0xe8/0x180 net/strparser/strparser.c:415 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 read to 0xffff88810784e3d0 of 1 bytes by task 17869 on cpu 0: kcm_rfree+0x121/0x220 net/kcm/kcmsock.c:181 skb_release_head_state+0x8e/0x160 net/core/skbuff.c:841 skb_release_all net/core/skbuff.c:852 [inline] __kfree_skb net/core/skbuff.c:868 [inline] kfree_skb_reason+0x5c/0x260 net/core/skbuff.c:891 kfree_skb include/linux/skbuff.h:1216 [inline] kcm_recvmsg+0x226/0x2b0 net/kcm/kcmsock.c:1161 ____sys_recvmsg+0x16c/0x2e0 ___sys_recvmsg net/socket.c:2743 [inline] do_recvmmsg+0x2f1/0x710 net/socket.c:2837 __sys_recvmmsg net/socket.c:2916 [inline] __do_sys_recvmmsg net/socket.c:2939 [inline] __se_sys_recvmmsg net/socket.c:2932 [inline] __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2932 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x01 -> 0x00 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 17869 Comm: syz-executor.2 Not tainted 6.1.0-rc1-syzkaller-00010-gbb1a1146467a-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- net/kcm/kcmsock.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 0109ef6ddf9a..63e32f181f43 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -162,7 +162,8 @@ static void kcm_rcv_ready(struct kcm_sock *kcm) /* Buffer limit is okay now, add to ready list */ list_add_tail(&kcm->wait_rx_list, &kcm->mux->kcm_rx_waiters); - kcm->rx_wait = true; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_wait, true); } static void kcm_rfree(struct sk_buff *skb) @@ -178,7 +179,7 @@ static void kcm_rfree(struct sk_buff *skb) /* For reading rx_wait and rx_psock without holding lock */ smp_mb__after_atomic(); - if (!kcm->rx_wait && !READ_ONCE(kcm->rx_psock) && + if (!READ_ONCE(kcm->rx_wait) && !READ_ONCE(kcm->rx_psock) && sk_rmem_alloc_get(sk) < sk->sk_rcvlowat) { spin_lock_bh(&mux->rx_lock); kcm_rcv_ready(kcm); @@ -237,7 +238,8 @@ try_again: if (kcm_queue_rcv_skb(&kcm->sk, skb)) { /* Should mean socket buffer full */ list_del(&kcm->wait_rx_list); - kcm->rx_wait = false; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_wait, false); /* Commit rx_wait to read in kcm_free */ smp_wmb(); @@ -280,7 +282,8 @@ static struct kcm_sock *reserve_rx_kcm(struct kcm_psock *psock, kcm = list_first_entry(&mux->kcm_rx_waiters, struct kcm_sock, wait_rx_list); list_del(&kcm->wait_rx_list); - kcm->rx_wait = false; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_wait, false); psock->rx_kcm = kcm; /* paired with lockless reads in kcm_rfree() */ @@ -1242,7 +1245,8 @@ static void kcm_recv_disable(struct kcm_sock *kcm) if (!kcm->rx_psock) { if (kcm->rx_wait) { list_del(&kcm->wait_rx_list); - kcm->rx_wait = false; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_wait, false); } requeue_rx_msgs(mux, &kcm->sk.sk_receive_queue); @@ -1795,7 +1799,8 @@ static void kcm_done(struct kcm_sock *kcm) if (kcm->rx_wait) { list_del(&kcm->wait_rx_list); - kcm->rx_wait = false; + /* paired with lockless reads in kcm_rfree() */ + WRITE_ONCE(kcm->rx_wait, false); } /* Move any pending receive messages to other kcm sockets */ requeue_rx_msgs(mux, &sk->sk_receive_queue); From c5884ef477b405aadf49419a417d62dc8137e7f3 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Thu, 20 Oct 2022 11:30:31 -0700 Subject: [PATCH 49/57] docs: netdev: offer performance feedback to contributors Some of us gotten used to producing large quantities of peer feedback at work, every 3 or 6 months. Extending the same courtesy to community members seems like a logical step. It may be hard for some folks to get validation of how important their work is internally, especially at smaller companies which don't employ many kernel experts. The concept of "peer feedback" may be a hyperscaler / silicon valley thing so YMMV. Hopefully we can build more context as we go. Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller --- Documentation/process/maintainer-netdev.rst | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst index d14007081595..1fa5ab8754d3 100644 --- a/Documentation/process/maintainer-netdev.rst +++ b/Documentation/process/maintainer-netdev.rst @@ -319,3 +319,13 @@ unpatched tree to confirm infrastructure didn't mangle it. Finally, go back and read :ref:`Documentation/process/submitting-patches.rst ` to be sure you are not repeating some common mistake documented there. + +My company uses peer feedback in employee performance reviews. Can I ask netdev maintainers for feedback? +--------------------------------------------------------------------------------------------------------- + +Yes, especially if you spend significant amount of time reviewing code +and go out of your way to improve shared infrastructure. + +The feedback must be requested by you, the contributor, and will always +be shared with you (even if you request for it to be submitted to your +manager). From d266935ac43d57586e311a087510fe6a084af742 Mon Sep 17 00:00:00 2001 From: Zhengchao Shao Date: Thu, 20 Oct 2022 10:42:13 +0800 Subject: [PATCH 50/57] net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed When the ops_init() interface is invoked to initialize the net, but ops->init() fails, data is released. However, the ptr pointer in net->gen is invalid. In this case, when nfqnl_nf_hook_drop() is invoked to release the net, invalid address access occurs. The process is as follows: setup_net() ops_init() data = kzalloc(...) ---> alloc "data" net_assign_generic() ---> assign "date" to ptr in net->gen ... ops->init() ---> failed ... kfree(data); ---> ptr in net->gen is invalid ... ops_exit_list() ... nfqnl_nf_hook_drop() *q = nfnl_queue_pernet(net) ---> q is invalid The following is the Call Trace information: BUG: KASAN: use-after-free in nfqnl_nf_hook_drop+0x264/0x280 Read of size 8 at addr ffff88810396b240 by task ip/15855 Call Trace: dump_stack_lvl+0x8e/0xd1 print_report+0x155/0x454 kasan_report+0xba/0x1f0 nfqnl_nf_hook_drop+0x264/0x280 nf_queue_nf_hook_drop+0x8b/0x1b0 __nf_unregister_net_hook+0x1ae/0x5a0 nf_unregister_net_hooks+0xde/0x130 ops_exit_list+0xb0/0x170 setup_net+0x7ac/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Allocated by task 15855: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0xa1/0xb0 __kmalloc+0x49/0xb0 ops_init+0xe7/0x410 setup_net+0x5aa/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 15855: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x40 ____kasan_slab_free+0x155/0x1b0 slab_free_freelist_hook+0x11b/0x220 __kmem_cache_free+0xa4/0x360 ops_init+0xb9/0x410 setup_net+0x5aa/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: f875bae06533 ("net: Automatically allocate per namespace data.") Signed-off-by: Zhengchao Shao Signed-off-by: David S. Miller --- net/core/net_namespace.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index 0ec2f5906a27..f64654df71a2 100644 --- a/net/core/net_namespace.c +++ b/net/core/net_namespace.c @@ -117,6 +117,7 @@ static int net_assign_generic(struct net *net, unsigned int id, void *data) static int ops_init(const struct pernet_operations *ops, struct net *net) { + struct net_generic *ng; int err = -ENOMEM; void *data = NULL; @@ -135,7 +136,13 @@ static int ops_init(const struct pernet_operations *ops, struct net *net) if (!err) return 0; + if (ops->id && ops->size) { cleanup: + ng = rcu_dereference_protected(net->gen, + lockdep_is_held(&pernet_ops_rwsem)); + ng->ptr[*ops->id] = NULL; + } + kfree(data); out: From 9c1eaa27ec599fcc25ed4970c0b73c247d147a2b Mon Sep 17 00:00:00 2001 From: Zhang Changzhong Date: Fri, 21 Oct 2022 09:32:24 +0800 Subject: [PATCH 51/57] net: lantiq_etop: don't free skb when returning NETDEV_TX_BUSY The ndo_start_xmit() method must not free skb when returning NETDEV_TX_BUSY, since caller is going to requeue freed skb. Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver") Signed-off-by: Zhang Changzhong Signed-off-by: David S. Miller --- drivers/net/ethernet/lantiq_etop.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/lantiq_etop.c b/drivers/net/ethernet/lantiq_etop.c index 59aab4086dcc..f5961bdcc480 100644 --- a/drivers/net/ethernet/lantiq_etop.c +++ b/drivers/net/ethernet/lantiq_etop.c @@ -485,7 +485,6 @@ ltq_etop_tx(struct sk_buff *skb, struct net_device *dev) len = skb->len < ETH_ZLEN ? ETH_ZLEN : skb->len; if ((desc->ctl & (LTQ_DMA_OWN | LTQ_DMA_C)) || ch->skb[ch->dma.desc]) { - dev_kfree_skb_any(skb); netdev_err(dev, "tx ring full\n"); netif_tx_stop_queue(txq); return NETDEV_TX_BUSY; From ec791d8149ff60c40ad2074af3b92a39c916a03f Mon Sep 17 00:00:00 2001 From: Lu Wei Date: Fri, 21 Oct 2022 12:06:22 +0800 Subject: [PATCH 52/57] tcp: fix a signed-integer-overflow bug in tcp_add_backlog() The type of sk_rcvbuf and sk_sndbuf in struct sock is int, and in tcp_add_backlog(), the variable limit is caculated by adding sk_rcvbuf, sk_sndbuf and 64 * 1024, it may exceed the max value of int and overflow. This patch reduces the limit budget by halving the sndbuf to solve this issue since ACK packets are much smaller than the payload. Fixes: c9c3321257e1 ("tcp: add tcp_add_backlog()") Signed-off-by: Lu Wei Reviewed-by: Eric Dumazet Acked-by: Kuniyuki Iwashima Signed-off-by: David S. Miller --- net/ipv4/tcp_ipv4.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 7a250ef9d1b7..87d440f47a70 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1874,11 +1874,13 @@ bool tcp_add_backlog(struct sock *sk, struct sk_buff *skb, __skb_push(skb, hdrlen); no_coalesce: + limit = (u32)READ_ONCE(sk->sk_rcvbuf) + (u32)(READ_ONCE(sk->sk_sndbuf) >> 1); + /* Only socket owner can try to collapse/prune rx queues * to reduce memory overhead, so add a little headroom here. * Few sockets backlog are possibly concurrently non empty. */ - limit = READ_ONCE(sk->sk_rcvbuf) + READ_ONCE(sk->sk_sndbuf) + 64*1024; + limit += 64 * 1024; if (unlikely(sk_add_backlog(sk, skb, limit))) { bh_unlock_sock(sk); From e9cf4d9b9a6fdb1df6401a59f5ac5d24006bfeae Mon Sep 17 00:00:00 2001 From: Dmitry Osipenko Date: Mon, 24 Oct 2022 17:12:10 +0300 Subject: [PATCH 53/57] ACPI: video: Fix missing native backlight on Chromebooks Chromebooks don't have backlight in ACPI table, they suppose to use native backlight in this case. Check presence of the CrOS embedded controller ACPI device and prefer the native backlight if EC found. Suggested-by: Hans de Goede Fixes: 2600bfa3df99 ("ACPI: video: Add acpi_video_backlight_use_native() helper") Signed-off-by: Dmitry Osipenko Reviewed-by: Hans de Goede Acked-by: Rafael J. Wysocki Link: https://lore.kernel.org/r/20221024141210.67784-1-dmitry.osipenko@collabora.com Signed-off-by: Hans de Goede --- drivers/acpi/video_detect.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index 0d9064a9804c..9cd8797d12bb 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -668,6 +668,11 @@ static const struct dmi_system_id video_detect_dmi_table[] = { { }, }; +static bool google_cros_ec_present(void) +{ + return acpi_dev_found("GOOG0004"); +} + /* * Determine which type of backlight interface to use on this system, * First check cmdline, then dmi quirks, then do autodetect. @@ -730,6 +735,13 @@ static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native) return acpi_backlight_video; } + /* + * Chromebooks that don't have backlight handle in ACPI table + * are supposed to use native backlight if it's available. + */ + if (google_cros_ec_present() && native_available) + return acpi_backlight_native; + /* No ACPI video (old hw), use vendor specific fw methods. */ return acpi_backlight_vendor; } From 3d2af9cce3133b3bc596a9d065c6f9d93419ccfb Mon Sep 17 00:00:00 2001 From: Neal Cardwell Date: Fri, 21 Oct 2022 17:08:21 +0000 Subject: [PATCH 54/57] tcp: fix indefinite deferral of RTO with SACK reneging This commit fixes a bug that can cause a TCP data sender to repeatedly defer RTOs when encountering SACK reneging. The bug is that when we're in fast recovery in a scenario with SACK reneging, every time we get an ACK we call tcp_check_sack_reneging() and it can note the apparent SACK reneging and rearm the RTO timer for srtt/2 into the future. In some SACK reneging scenarios that can happen repeatedly until the receive window fills up, at which point the sender can't send any more, the ACKs stop arriving, and the RTO fires at srtt/2 after the last ACK. But that can take far too long (O(10 secs)), since the connection is stuck in fast recovery with a low cwnd that cannot grow beyond ssthresh, even if more bandwidth is available. This fix changes the logic in tcp_check_sack_reneging() to only rearm the RTO timer if data is cumulatively ACKed, indicating forward progress. This avoids this kind of nearly infinite loop of RTO timer re-arming. In addition, this meets the goals of tcp_check_sack_reneging() in handling Windows TCP behavior that looks temporarily like SACK reneging but is not really. Many thanks to Jakub Kicinski and Neil Spring, who reported this issue and provided critical packet traces that enabled root-causing this issue. Also, many thanks to Jakub Kicinski for testing this fix. Fixes: 5ae344c949e7 ("tcp: reduce spurious retransmits due to transient SACK reneging") Reported-by: Jakub Kicinski Reported-by: Neil Spring Signed-off-by: Neal Cardwell Reviewed-by: Eric Dumazet Cc: Yuchung Cheng Tested-by: Jakub Kicinski Link: https://lore.kernel.org/r/20221021170821.1093930-1-ncardwell.kernel@gmail.com Signed-off-by: Jakub Kicinski --- net/ipv4/tcp_input.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index bc2ea12221f9..0640453fce54 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2192,7 +2192,8 @@ void tcp_enter_loss(struct sock *sk) */ static bool tcp_check_sack_reneging(struct sock *sk, int flag) { - if (flag & FLAG_SACK_RENEGING) { + if (flag & FLAG_SACK_RENEGING && + flag & FLAG_SND_UNA_ADVANCED) { struct tcp_sock *tp = tcp_sk(sk); unsigned long delay = max(usecs_to_jiffies(tp->srtt_us >> 4), msecs_to_jiffies(10)); From 720ca52bcef225b967a339e0fffb6d0c7e962240 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Fri, 21 Oct 2022 09:03:04 -0700 Subject: [PATCH 55/57] net-memcg: avoid stalls when under memory pressure As Shakeel explains the commit under Fixes had the unintended side-effect of no longer pre-loading the cached memory allowance. Even tho we previously dropped the first packet received when over memory limit - the consecutive ones would get thru by using the cache. The charging was happening in batches of 128kB, so we'd let in 128kB (truesize) worth of packets per one drop. After the change we no longer force charge, there will be no cache filling side effects. This causes significant drops and connection stalls for workloads which use a lot of page cache, since we can't reclaim page cache under GFP_NOWAIT. Some of the latency can be recovered by improving SACK reneg handling but nowhere near enough to get back to the pre-5.15 performance (the application I'm experimenting with still sees 5-10x worst latency). Apply the suggested workaround of using GFP_ATOMIC. We will now be more permissive than previously as we'll drop _no_ packets in softirq when under pressure. But I can't think of any good and simple way to address that within networking. Link: https://lore.kernel.org/all/20221012163300.795e7b86@kernel.org/ Suggested-by: Shakeel Butt Fixes: 4b1327be9fe5 ("net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()") Acked-by: Shakeel Butt Acked-by: Roman Gushchin Link: https://lore.kernel.org/r/20221021160304.1362511-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- include/net/sock.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/net/sock.h b/include/net/sock.h index 9e464f6409a7..22f8bab583dd 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2585,7 +2585,7 @@ static inline gfp_t gfp_any(void) static inline gfp_t gfp_memcg_charge(void) { - return in_softirq() ? GFP_NOWAIT : GFP_KERNEL; + return in_softirq() ? GFP_ATOMIC : GFP_KERNEL; } static inline long sock_rcvtimeo(const struct sock *sk, bool noblock) From a970174d7a1010cb29a5b0c9fa0626abdefcfcbe Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (Google)" Date: Mon, 24 Oct 2022 11:45:36 -0400 Subject: [PATCH 56/57] x86/mm: Do not verify W^X at boot up Adding on the kernel command line "ftrace=function" triggered: CPA detected W^X violation: 8000000000000063 -> 0000000000000063 range: 0xffffffffc0013000 - 0xffffffffc0013fff PFN 10031b WARNING: CPU: 0 PID: 0 at arch/x86/mm/pat/set_memory.c:609 verify_rwx+0x61/0x6d Call Trace: __change_page_attr_set_clr+0x146/0x8a6 change_page_attr_set_clr+0x135/0x268 change_page_attr_clear.constprop.0+0x16/0x1c set_memory_x+0x2c/0x32 arch_ftrace_update_trampoline+0x218/0x2db ftrace_update_trampoline+0x16/0xa1 __register_ftrace_function+0x93/0xb2 ftrace_startup+0x21/0xf0 register_ftrace_function_nolock+0x26/0x40 register_ftrace_function+0x4e/0x143 function_trace_init+0x7d/0xc3 tracer_init+0x23/0x2c tracing_set_tracer+0x1d5/0x206 register_tracer+0x1c0/0x1e4 init_function_trace+0x90/0x96 early_trace_init+0x25c/0x352 start_kernel+0x424/0x6e4 x86_64_start_reservations+0x24/0x2a x86_64_start_kernel+0x8c/0x95 secondary_startup_64_no_verify+0xe0/0xeb This is because at boot up, kernel text is writable, and there's no reason to do tricks to updated it. But the verifier does not distinguish updates at boot up and at run time, and causes a warning at time of boot. Add a check for system_state == SYSTEM_BOOTING and allow it if that is the case. [ These SYSTEM_BOOTING special cases are all pretty horrid, but the x86 text_poke() code does some odd things at bootup, forcing this for now - Linus ] Link: https://lore.kernel.org/r/20221024112730.180916b3@gandalf.local.home Fixes: 652c5bf380ad0 ("x86/mm: Refuse W^X violations") Signed-off-by: Steven Rostedt (Google) Signed-off-by: Linus Torvalds --- arch/x86/mm/pat/set_memory.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 97342c42dda8..2e5a045731de 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -587,6 +587,10 @@ static inline pgprot_t verify_rwx(pgprot_t old, pgprot_t new, unsigned long star { unsigned long end; + /* Kernel text is rw at boot up */ + if (system_state == SYSTEM_BOOTING) + return new; + /* * 32-bit has some unfixable W+X issues, like EFI code * and writeable data being in the same page. Disable From 1a2dcbdde82e3a5f1db9b2f4c48aa1aeba534fb2 Mon Sep 17 00:00:00 2001 From: Sreekanth Reddy Date: Tue, 13 Sep 2022 17:35:38 +0530 Subject: [PATCH 57/57] scsi: mpt3sas: re-do lost mpt3sas DMA mask fix This is a re-do of commit e0e0747de0ea ("scsi: mpt3sas: Fix return value check of dma_get_required_mask()"), which I ended up undoing in a mis-merge in commit 62e6e5940c0c ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi"). The original commit message was scsi: mpt3sas: Fix return value check of dma_get_required_mask() Fix the incorrect return value check of dma_get_required_mask(). Due to this incorrect check, the driver was always setting the DMA mask to 63 bit. Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com Fixes: ba27c5cf286d ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations") Signed-off-by: Sreekanth Reddy Signed-off-by: Martin K. Petersen and this fix was lost when I mis-merged the conflict with commit 9df650963bf6 ("scsi: mpt3sas: Don't change DMA mask while reallocating pools"). Reported-by: Juergen Gross Fixes: 62e6e5940c0c ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi") Link: https://lore.kernel.org/all/CAHk-=wjaK-TxrNaGtFDpL9qNHL1MVkWXO1TT6vObD5tXMSC4Zg@mail.gmail.com Signed-off-by: Linus Torvalds --- drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 8b22df8c1792..4e981ccaac41 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -2993,7 +2993,7 @@ _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev) u64 coherent_dma_mask, dma_mask; if (ioc->is_mcpu_endpoint || sizeof(dma_addr_t) == 4 || - dma_get_required_mask(&pdev->dev) <= 32) { + dma_get_required_mask(&pdev->dev) <= DMA_BIT_MASK(32)) { ioc->dma_mask = 32; coherent_dma_mask = dma_mask = DMA_BIT_MASK(32); /* Set 63 bit DMA mask for all SAS3 and SAS35 controllers */