A mask encoding of a cpu map is laid out as:
  u16 nr
  u16 long_size
  unsigned long mask[];
However, the mask may be 8-byte aligned meaning there is a 4-byte pad
after long_size. This means 32-bit and 64-bit builds see the mask as
being at different offsets. On top of this the structure is in the byte
data[] encoded as:
  u16 type
  char data[]
This means the mask's struct doesn't have the required 4 or 8 byte
alignment, but is offset by 2. Consequently the long reads and writes
cause undefined behavior as the alignment is broken.
Fix the mask struct by creating explicit 32 and 64-bit variants, use a
union to avoid data[] and casts; the struct must be packed so the
layout matches the existing perf.data layout. Taking an address of a
member of a packed struct breaks alignment so pass the packed
perf_record_cpu_map_data to functions, so they can access variables with
the right alignment.
As the 64-bit version has 4 bytes of padding, optimize writing to only
write the 32-bit version.
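A minimal sketch of the packed layout described above. The type and
field names are illustrative rather than the actual perf definitions,
fixed-size arrays stand in for the flexible array members, and the
packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names only; fixed-size arrays stand in for flexible ones. */
struct mask_cpu_map32 {
	uint16_t nr;		/* number of mask words */
	uint16_t long_size;	/* 4 */
	uint32_t mask[2];
};

struct mask_cpu_map64 {
	uint16_t nr;		/* number of mask words */
	uint16_t long_size;	/* 8 */
	char pad[4];		/* padding that used to be implicit */
	uint64_t mask[1];
};

/* Packed so the union starts at byte 2, right after the u16 type. */
struct __attribute__((packed)) cpu_map_data {
	uint16_t type;
	union {
		struct mask_cpu_map32 mask32_data;
		struct mask_cpu_map64 mask64_data;
	};
};

/* The mask words land at offsets that are not 4 or 8 byte aligned. */
static_assert(offsetof(struct cpu_map_data, mask32_data.mask) == 6,
	      "32-bit mask starts at byte 6");
static_assert(offsetof(struct cpu_map_data, mask64_data.mask) == 10,
	      "64-bit mask starts at byte 10");

/*
 * Take the packed container rather than a pointer to the mask, so the
 * compiler knows the words may be misaligned when it reads them.
 */
static int cpu_map_test_cpu(const struct cpu_map_data *data, unsigned int cpu)
{
	if (data->mask32_data.long_size == 4) {
		if (cpu / 32 >= data->mask32_data.nr)
			return 0;
		return (data->mask32_data.mask[cpu / 32] >> (cpu % 32)) & 1;
	}
	if (cpu / 64 >= data->mask64_data.nr)
		return 0;
	return (int)((data->mask64_data.mask[cpu / 64] >> (cpu % 64)) & 1);
}
```

nr and long_size sit at the same offsets in both variants, which is why
the sketch can read long_size through either one before choosing a word
size.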
Committer notes:
Disable warnings about 'packed' that break the build in some arches like
riscv64, but just around that specific struct.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220614143353.1559597-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the existing code in store_latency_data(), the memory operation (mem_op)
returned to the user is always OP_LOAD when in fact it should be OP_STORE.
This comes from the fact that the function simply grabs the information
from a data source map which covers only load accesses. Intel 12th gen CPUs
offer precise store sampling that captures both the data source and latency.
Therefore the function can use the data source mapping table but must
override the memory operation to reflect stores instead of loads.
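The override can be sketched in plain C as follows. This is not the
kernel code: the table contents are made up, the helper name is
hypothetical, and the constants are assumed to mirror
PERF_MEM_OP_LOAD/PERF_MEM_OP_STORE with mem_op in the low 5 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror PERF_MEM_OP_LOAD/STORE; mem_op is the low 5 bits. */
#define MEM_OP_LOAD	0x02u
#define MEM_OP_STORE	0x04u
#define MEM_OP_MASK	0x1fu

/* Made-up table: every entry is encoded as a load. */
static const uint64_t load_data_source[4] = {
	0x100u | MEM_OP_LOAD,
	0x200u | MEM_OP_LOAD,
	0x400u | MEM_OP_LOAD,
	0x800u | MEM_OP_LOAD,
};

/* Reuse the load table for a store sample, then fix up mem_op. */
static uint64_t store_data_source(unsigned int dse)
{
	uint64_t val = load_data_source[dse & 3];

	val &= ~(uint64_t)MEM_OP_MASK;	/* drop the LOAD encoding */
	val |= MEM_OP_STORE;		/* report a store instead */
	return val;
}
```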
Fixes: 61b985e3e7 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220818054613.1548130-1-eranian@google.com
The SDM explicitly states that PEBS Baseline implies Extended PEBS.
For cpu model forward compatibility (e.g. on ICX, SPR, ADL), it's
safe to stop using the FMS (family/model/stepping) table for things such
as setting pebs_capable and PMU_FL_PEBS_ALL, since they are already set
in intel_ds_init().
The Goldmont Plus is the only platform which supports extended PEBS
but doesn't have Baseline. Keep the status quo.
Reported-by: Like Xu <likexu@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/20220816114057.51307-1-likexu@tencent.com
On the platform with Arch LBR, the HW raw branch type encoding may leak
to the perf tool when the SAVE_TYPE option is not set.
In the intel_pmu_store_lbr(), the HW raw branch type is stored in
lbr_entries[].type. If the SAVE_TYPE option is set, the
lbr_entries[].type will be converted into the generic PERF_BR_* type
in the intel_pmu_lbr_filter() and exposed to the user tools.
But if the SAVE_TYPE option is NOT set by the user, the current perf
kernel doesn't clear the field. The HW raw branch type leaks.
There are two solutions to fix the issue for the Arch LBR.
One is to clear the field if the SAVE_TYPE option is NOT set.
The other solution is to unconditionally convert the branch type and
expose the generic type to the user tools.
The latter is implemented here, because
- The branch type is valuable information. I don't see a case where
you would not benefit from the branch type. (Stephane Eranian)
- Not having the branch type DOES NOT save any space in the
branch record (Stephane Eranian)
- The Arch LBR HW can retrieve the common branch types from the
LBR_INFO. It doesn't require high-overhead SW disassembly.
Fixes: 47125db27e ("perf/x86/intel/lbr: Support Architectural LBR")
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220816125612.2042397-1-kan.liang@linux.intel.com
Pull sound fixes from Takashi Iwai:
"The only significant core change is ASoC DPCM fix for asymmetric
setup; other remaining changes are device-specific fixes, including
the hardening of string manipulations.
One change in platform/x86 is the patch I forgot to apply from a
series for CS35L41 codec"
* tag 'sound-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
ALSA: hda/realtek: Add quirk for Clevo NS50PU, NS70PU
ALSA: info: Fix llseek return value when using callback
ALSA: hda/cs8409: Support new Dolphin Variants
platform/x86: serial-multi-instantiate: Add CLSA0101 Laptop
ALSA: hda/realtek: Add quirk for Lenovo Yoga7 14IAL7
ALSA: hda: cs35l41: Clarify support for CSC3551 without _DSD Properties
ALSA: hda/realtek: Add quirks for ASUS Zenbooks using CS35L41
ASoC: codec: tlv320aic32x4: fix mono playback via I2S
ASoC: rt5640: Fix the JD voltage dropping issue
ASoC: tas2770: Fix handling of mute/unmute
ASoC: tas2770: Drop conflicting set_bias_level power setting
ASoC: tas2770: Allow mono streams
ASoC: tas2770: Set correct FSYNC polarity
ASoC: Intel: fix sof_es8336 probe
ASoC: DPCM: Don't pick up BE without substream
ASoC: SOF: ipc3-topology: Fix clang -Wformat warning
ASoC: sh: rz-ssi: Improve error handling in rz_ssi_probe() error path
ASoC: SOF: Intel: hda: Fix potential buffer overflow by snprintf()
ASoC: SOF: debug: Fix potential buffer overflow by snprintf()
ASoC: Intel: avs: Fix potential buffer overflow by snprintf()
...
Pull drm fixes from Dave Airlie:
"Regular weekly fixes.
The nouveau patch just enables modesetting on GA103 hw which is like
other ampere cards that are already supported. amdgpu has 2 weeks of
fixes, as Alex was away, so a bit larger than usual, otherwise some
i915 and misc other fixes.
ttm:
- NULL ptr dereference
i915:
- disable pci resize on 32-bit systems
- don't leak the ccs state
- TLB invalidation fixes
nouveau:
- GA103 enablement
- off-by-one fix
amdgpu:
- Revert some DML stack changes
- Rounding fixes in KFD allocations
- atombios vram info table parsing fix
- DCN 3.1.4 fixes
- Clockgating fixes for various new IPs
- SMU 13.0.4 fixes
- DCN 3.1.4 FP fixes
- TMDS fixes for YCbCr420 4k modes
- DCN 3.2.x fixes
- USB 4 fixes
- SMU 13.0 fixes
- SMU driver unload memory leak fixes
- Display orientation fix
- Regression fix for generic fbdev conversion
- SDMA 6.x fixes
- SR-IOV fixes
- IH 6.x fixes
- Use after free fix in bo list handling
- Revert pipe1 support
- XGMI hive reset fix
amdkfd:
- Fix potential crash in kfd_create_indirect_link_prop()
imx:
- warning fix
meson:
- refcounting fix
lvds-codec:
- error check fix
sun4i:
- underflow fix
- dt-binding fix"
* tag 'drm-fixes-2022-08-19' of git://anongit.freedesktop.org/drm/drm: (109 commits)
Revert "drm/amd/amdgpu: add pipe1 hardware support"
drm/amdgpu: Fix use-after-free on amdgpu_bo_list mutex
drm/amdgpu: Fix interrupt handling on ih_soft ring
drm/amdgpu: Add secure display TA load for Renoir
drm/amd/display: Include scaling factor for SubVP command
drm/amdgpu/vcn: Return void from the stop_dbg_mode
drm/amdgpu: remove useless condition in amdgpu_job_stop_all_jobs_on_sched()
drm/amdgpu: Add decode_iv_ts helper for ih_v6 block
drm/amd/display: add chip revision to DCN32
drm/amd/display: avoid doing vm_init multiple time
drm/amd/display: Use pitch when calculating size to cache in MALL
drm/amd/display: Don't set DSC for phantom pipes
drm/amd/display: Update clock table policy for DCN314
drm/amd/display: Modify header inclusion pattern
drm/amd/display: Fix plug/unplug external monitor will hang while playback MPO video
drm/amd/display: Add debug parameter to retain default clock table
drm/amdgpu: Increase tlb flush timeout for sriov
drm/amd/display: do not compare integers of different widths
drm/amd/display: Add reserved dc_log_type.
drm/amd/display: Fix pixel clock programming
...
Pull bitmap updates from Yury Norov:
"cpumask: UP optimisation fixes follow-up
As an older version of the UP optimisation fixes was merged, not all
review feedback has been implemented.
This implements the feedback received on the merged version [1], and
the respin [2], for changes related to <linux/cpumask.h> and
lib/cpumask.c"
Link: https://lore.kernel.org/lkml/cover.1656777646.git.sander@svanheule.net/ [1]
Link: https://lore.kernel.org/lkml/cover.1659077534.git.sander@svanheule.net/ [2]
It spent for more than a week with no issues.
* tag 'bitmap-6.0-rc2' of https://github.com/norov/linux:
lib/cpumask: drop always-true preprocessor guard
lib/cpumask: add inline cpumask_next_wrap() for UP
cpumask: align signatures of UP implementations
Commit c164fbb40c43f ("x86/mm: thread pgprot_t through
init_memory_mapping()") mistakenly used __pgprot() which doesn't respect
__default_kernel_pte_mask when setting PUD mapping.
Fix it by only setting the one bit we actually need (PSE) and leaving
the other bits (that have been properly masked) alone.
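A small userspace illustration of this class of bug, under stated
assumptions: the bit positions follow the x86 page table format, the
macro and helper names are stand-ins, and the masked-out bit is assumed
to be GLOBAL.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT	(1ull << 0)
#define PAGE_RW		(1ull << 1)
#define PAGE_PSE	(1ull << 7)
#define PAGE_GLOBAL	(1ull << 8)

/* Stand-in for a full large-page constant, which includes GLOBAL. */
#define KERNEL_LARGE	(PAGE_PRESENT | PAGE_RW | PAGE_PSE | PAGE_GLOBAL)

/* Stand-in for __default_kernel_pte_mask with GLOBAL masked out. */
static const uint64_t default_pte_mask = ~PAGE_GLOBAL;

/* OR-ing in the full constant brings the masked bit back. */
static uint64_t large_prot_buggy(uint64_t prot)
{
	return prot | KERNEL_LARGE;
}

/* Setting only PSE leaves the already-masked bits alone. */
static uint64_t large_prot_fixed(uint64_t prot)
{
	return prot | PAGE_PSE;
}
```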
Fixes: c164fbb40c ("x86/mm: thread pgprot_t through init_memory_mapping()")
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following BUG was reported:
traps: Missing ENDBR: andw_ax_dx+0x0/0x10 [kvm]
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/traps.c:253!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<TASK>
asm_exc_control_protection+0x2b/0x30
RIP: 0010:andw_ax_dx+0x0/0x10 [kvm]
Code: c3 cc cc cc cc 0f 1f 44 00 00 66 0f 1f 00 48 19 d0 c3 cc cc cc
cc 0f 1f 40 00 f3 0f 1e fa 20 d0 c3 cc cc cc cc 0f 1f 44 00 00
<66> 0f 1f 00 66 21 d0 c3 cc cc cc cc 0f 1f 40 00 66 0f 1f 00 21
d0
? andb_al_dl+0x10/0x10 [kvm]
? fastop+0x5d/0xa0 [kvm]
x86_emulate_insn+0x822/0x1060 [kvm]
x86_emulate_instruction+0x46f/0x750 [kvm]
complete_emulated_mmio+0x216/0x2c0 [kvm]
kvm_arch_vcpu_ioctl_run+0x604/0x650 [kvm]
kvm_vcpu_ioctl+0x2f4/0x6b0 [kvm]
? wake_up_q+0xa0/0xa0
The BUG occurred because the ENDBR in the andw_ax_dx() fastop function
had been incorrectly "sealed" (converted to a NOP) by apply_ibt_endbr().
Objtool marked it to be sealed because KVM has no compile-time
references to the function. Instead KVM calculates its address at
runtime.
Prevent objtool from annotating fastop functions as sealable by creating
throwaway dummy compile-time references to the functions.
Fixes: 6649fa876d ("x86/ibt,kvm: Add ENDBR to fastops")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Debugged-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Message-Id: <0d4116f90e9d0c1b754bb90c585e6f0415a1c508.1660837839.git.jpoimboe@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a macro which prevents a function from getting sealed if there are
no compile-time references to it.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Message-Id: <20220818213927.e44fmxkoq4yj6ybn@treble>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The motivation of this renaming is to make these variables and related
helper functions less mmu_notifier bound so that they can also be used
for non-mmu_notifier based page invalidation. mmu_invalidate_* was chosen
to better describe the purpose of 'invalidating' a page that those
variables are used for.
- mmu_notifier_seq/range_start/range_end are renamed to
mmu_invalidate_seq/range_start/range_end.
- mmu_notifier_retry{_hva} helper functions are renamed to
mmu_invalidate_retry{_hva}.
- mmu_notifier_count is renamed to mmu_invalidate_in_progress to
avoid confusion with mn_active_invalidate_count.
- While here, also update kvm_inc/dec_notifier_count() to
kvm_mmu_invalidate_begin/end() to match the change for
mmu_notifier_count.
No functional change intended.
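For readers unfamiliar with how these fields interact, here is a
single-threaded sketch using the new names. The struct is a
hypothetical stand-in for the relevant part of struct kvm, and the
locking and memory barriers that the real helpers need are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant fields of struct kvm. */
struct vm_sketch {
	unsigned long mmu_invalidate_seq;
	long mmu_invalidate_in_progress;
};

static void mmu_invalidate_begin(struct vm_sketch *vm)
{
	vm->mmu_invalidate_in_progress++;
}

static void mmu_invalidate_end(struct vm_sketch *vm)
{
	/* Bump the sequence so earlier snapshots are seen as stale. */
	vm->mmu_invalidate_seq++;
	vm->mmu_invalidate_in_progress--;
}

/* True if work based on the snapshot 'seq' must be redone. */
static bool mmu_invalidate_retry(const struct vm_sketch *vm, unsigned long seq)
{
	if (vm->mmu_invalidate_in_progress)
		return true;
	return vm->mmu_invalidate_seq != seq;
}
```

A caller snapshots mmu_invalidate_seq, does its lookup, and discards the
result if mmu_invalidate_retry() reports that an invalidation ran or is
still running.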
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_PRIVATE_MEM_SLOTS defaults to zero, so it is not necessary to
define it in MIPS's asm/kvm_host.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Invoke kvm_coalesced_mmio_init() from kvm_create_vm() now that allocating
and initializing coalesced MMIO objects is separate from registering any
associated devices. Moving coalesced MMIO cleans up the last oddity
where KVM does VM creation/initialization after kvm_create_vm(), and more
importantly after kvm_arch_post_init_vm() is called and the VM is added
to the global vm_list, i.e. after the VM is fully created as far as KVM
is concerned.
Originally, kvm_coalesced_mmio_init() was called by kvm_create_vm(), but
the original implementation was completely devoid of error handling.
Commit 6ce5a090a9 ("KVM: coalesced_mmio: fix kvm_coalesced_mmio_init()'s
error handling") fixed the various bugs, and in doing so rightly moved the
call to after kvm_create_vm() because kvm_coalesced_mmio_init() also
registered the coalesced MMIO device. Commit 2b3c246a68 ("KVM: Make
coalesced mmio use a device per zone") cleaned up that mess by having
each zone register a separate device, i.e. moved device registration to
its logical home in kvm_vm_ioctl_register_coalesced_mmio(). As a result,
kvm_coalesced_mmio_init() is now a "pure" initialization helper and can
be safely called from kvm_create_vm().
Opportunistically drop the #ifdef, as KVM provides stubs for
kvm_coalesced_mmio_{init,free}() when CONFIG_KVM_MMIO=n (s390).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220816053937.2477106-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unconditionally get a reference to the /dev/kvm module when creating a VM
instead of using try_module_get(), which will fail if the module is in
the process of being forcefully unloaded. The error handling when
try_module_get() fails doesn't properly unwind all that has been done,
e.g. doesn't call kvm_arch_pre_destroy_vm() and doesn't remove the VM
from the global list. Not removing VMs from the global list tends to be
fatal, e.g. leads to use-after-free explosions.
The obvious alternative would be to add proper unwinding, but the
justification for using try_module_get(), "rmmod --wait", is completely
bogus as support for "rmmod --wait", i.e. delete_module() without
O_NONBLOCK, was removed by commit 3f2b9c9cdf ("module: remove rmmod
--wait option.") nearly a decade ago.
It's still possible for try_module_get() to fail due to the module dying
(more like being killed), as the module will be tagged MODULE_STATE_GOING
by "rmmod --force", i.e. delete_module(..., O_TRUNC), but playing nice
with forced unloading is an exercise in futility and gives a false sense
of security. Using try_module_get() only prevents acquiring _new_
references, it doesn't magically put the references held by other VMs,
and forced unloading doesn't wait, i.e. "rmmod --force" on KVM is all but
guaranteed to cause spectacular fireworks; the window where KVM will fail
try_module_get() is tiny compared to the window where KVM is building and
running the VM with an elevated module refcount.
Addressing KVM's inability to play nice with "rmmod --force" is firmly
out-of-scope. Forcefully unloading any module taints the kernel (for obvious
reasons) _and_ requires the kernel to be built with
CONFIG_MODULE_FORCE_UNLOAD=y, which is off by default and comes with the
amusing disclaimer that it's "mainly for kernel developers and desperate
users". In other words, KVM is free to scoff at bug reports due to using
"rmmod --force" while VMs may be running.
Fixes: 5f6de5cbeb ("KVM: Prevent module exit until all VMs are freed")
Cc: stable@vger.kernel.org
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220816053937.2477106-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Properly unwind VM creation if kvm_create_vm_debugfs() fails. A recent
change to invoke kvm_create_vm_debugfs() in kvm_create_vm() was led astray
by buggy try_module_get() handling added by commit 5f6de5cbeb ("KVM:
Prevent module exit until all VMs are freed"). The debugfs error path
effectively inherits the bad error path of try_module_get(), e.g. KVM
leaves the to-be-freed VM on vm_list even though KVM appears to do the
right thing by calling module_put() and falling through.
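The hazard is the usual one with goto-based unwinding: once an object
has been published on a global list, every later failure path has to
unlink it before freeing it. The sketch below shows that general
pattern only. It is not KVM's code, and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical VM object on a singly linked global list. */
struct vm_sketch {
	struct vm_sketch *next;
};

static struct vm_sketch *vm_list;	/* stand-in for the global vm_list */

static void vm_list_add(struct vm_sketch *vm)
{
	vm->next = vm_list;
	vm_list = vm;
}

static void vm_list_del(struct vm_sketch *vm)
{
	struct vm_sketch **p = &vm_list;

	while (*p && *p != vm)
		p = &(*p)->next;
	if (*p)
		*p = vm->next;
}

static struct vm_sketch *create_vm(bool late_step_fails)
{
	struct vm_sketch *vm = calloc(1, sizeof(*vm));

	if (!vm)
		return NULL;

	vm_list_add(vm);		/* the VM is now globally visible */

	if (late_step_fails)		/* e.g. a late setup step failing */
		goto out_unlink;

	return vm;

out_unlink:
	vm_list_del(vm);		/* skipping this leaves a dangling entry */
	free(vm);
	return NULL;
}
```

An alternative that avoids the unlink step entirely is to run every
fallible step before the object is added to the list.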
Opportunistically hoist kvm_create_vm_debugfs() above the call to
kvm_arch_post_init_vm() so that the "post-init" arch hook is actually
invoked after the VM is initialized (ignoring kvm_coalesced_mmio_init()
for the moment). x86 is the only non-nop implementation of the post-init
hook, and it doesn't allocate/initialize any objects that are reachable
via debugfs code (spawns a kthread worker for the NX huge page mitigation).
Leave the buggy try_module_get() alone for now; it will be fixed in a
separate commit.
Fixes: b74ed7a68e ("KVM: Actually create debugfs in kvm_create_vm()")
Reported-by: syzbot+744e173caec2e1627ee0@syzkaller.appspotmail.com
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Message-Id: <20220816053937.2477106-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Petr Machata says:
====================
selftests: mlxsw: Add ordering tests for unified bridge model
Amit Cohen writes:
Commit 798661c736 ("Merge branch 'mlxsw-unified-bridge-conversion-part-6'")
converted mlxsw driver to use unified bridge model. In the legacy model,
when a RIF was created / destroyed, it was firmware's responsibility to
update it in the relevant FID classification records. In the unified bridge
model, this responsibility moved to software.
This set adds tests to check the order of configuration for the following
classifications:
1. {Port, VID} -> FID
2. VID -> FID
3. VNI -> FID (after decapsulation)
In addition, in the legacy model, software is responsible to update a
table which is used to determine the packet's egress VID. Add a test to
check that the order of configuration does not impact switch behavior.
See more details in the commit messages.
Note that the tests are supposed to pass with the legacy model as well;
they are added now because with the new model they test the driver and
not the firmware.
Patch set overview:
Patch #1 adds test for {Port, VID} -> FID
Patch #2 adds test for VID -> FID
Patch #3 adds test for VNI -> FID
Patch #4 adds test for egress VID classification
====================
Link: https://lore.kernel.org/r/cover.1660747162.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After routing, the device always consults a table that determines the
packet's egress VID based on {egress RIF, egress local port}. In the
unified bridge model, it is up to software to maintain this table via
REIV register.
The table needs to be updated in the following flows:
1. When a RIF is set on a FID, for each FID's {Port, VID} mapping, a new
{RIF, Port}->VID mapping should be created.
2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
a new {RIF, Port}->VID mapping should be created.
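The two flows can be modelled in a few lines of C. This is a toy
software model with hypothetical names and sizes, not the mlxsw driver
code. It only shows why both flows have to write the table for the
result to be independent of configuration order.

```c
#include <assert.h>
#include <string.h>

#define NPORTS	4
#define NRIFS	2
#define NO_RIF	(-1)

/* Toy model of the {RIF, Port}->VID table kept by software. */
struct dev_sketch {
	unsigned short egress_vid[NRIFS][NPORTS];
};

/* Toy model of one FID: its RIF and its {Port, VID} mappings. */
struct fid_sketch {
	int rif;				/* NO_RIF until a RIF is set */
	unsigned short port_vid[NPORTS];	/* 0 means "not mapped" */
};

static void fid_init(struct fid_sketch *fid)
{
	memset(fid, 0, sizeof(*fid));
	fid->rif = NO_RIF;
}

/* Flow 1: a RIF is set on a FID that may already have mappings. */
static void fid_rif_set(struct dev_sketch *dev, struct fid_sketch *fid, int rif)
{
	fid->rif = rif;
	for (int port = 0; port < NPORTS; port++)
		if (fid->port_vid[port])
			dev->egress_vid[rif][port] = fid->port_vid[port];
}

/* Flow 2: a {Port, VID} is mapped to a FID that may already have a RIF. */
static void fid_port_vid_map(struct dev_sketch *dev, struct fid_sketch *fid,
			     int port, unsigned short vid)
{
	fid->port_vid[port] = vid;
	if (fid->rif != NO_RIF)
		dev->egress_vid[fid->rif][port] = vid;
}
```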
Add a test to verify that packets get the correct VID after routing,
regardless of the order of the configuration.
# ./egress_vid_classification.sh
TEST: Add RIF for existing {port, VID}->FID mapping [ OK ]
TEST: Add {port, VID}->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.
For VXLAN decapsulation, the FID classification is done according to the
VNI. When a RIF is added on top of a FID, the existing VNI->FID mapping
should be updated by the software with the new RIF. In addition, when a new
mapping is added for FID which already has a RIF, the correct RIF should
be used for it.
Add a test to verify that packets can be routed after decapsulation which
is done after VNI->FID classification, regardless of the order of the
configuration.
# ./ingress_rif_conf_vxlan.sh
TEST: Add RIF for existing VNI->FID mapping [ OK ]
TEST: Add VNI->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.
For VLAN-aware bridges (802.1Q), the FID classification is done according
to VID. When a RIF is added on top of a FID, the existing VID->FID mapping
should be updated by the software with the new RIF.
We never map multiple VLANs to the same FID using VID->FID, so we cannot
create VID->FID for FID which already has a RIF using 802.1Q. Anyway,
verify that packets can be routed via port which is added after the FID
already has a RIF.
Add a test to verify that packets can be routed after VID->FID
classification, regardless of the order of the configuration.
# ./ingress_rif_conf_1q.sh
TEST: Add RIF for existing VID->FID mapping [ OK ]
TEST: Add port to VID->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.
For VLAN-unaware bridges (802.1D), the FID classification is done according
to {Port, VID}. When a RIF is added on top of a FID, all the existing
{Port, VID}->FID mappings should be updated by the software with the new
RIF. In addition, when a new mapping is added for FID which already has a
RIF, the correct RIF should be used for it.
Add a test to verify that packets can be routed after {Port, VID}->FID
classification, regardless of the order of the configuration.
# ./ingress_rif_conf_1d.sh
TEST: Add RIF for existing {port, VID}->FID mapping [ OK ]
TEST: Add {port, VID}->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - regressions:
- tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF
socket maps get data out of the TCP stack)
- tls: rx: react to strparser initialization errors
- netfilter: nf_tables: fix scheduling-while-atomic splat
- net: fix suspicious RCU usage in bpf_sk_reuseport_detach()
Current release - new code bugs:
- mlxsw: ptp: fix a couple of races, static checker warnings and
error handling
Previous releases - regressions:
- netfilter:
- nf_tables: fix possible module reference underflow in error path
- make conntrack helpers deal with BIG TCP (skbs > 64kB)
- nfnetlink: re-enable conntrack expectation events
- net: fix potential refcount leak in ndisc_router_discovery()
Previous releases - always broken:
- sched: cls_route: disallow handle of 0
- neigh: fix possible local DoS due to net iface start/stop loop
- rtnetlink: fix module refcount leak in rtnetlink_rcv_msg
- sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu
- virtio_net: fix endian-ness for RSS
- dsa: mv88e6060: prevent crash on an unused port
- fec: fix timer capture timing in `fec_ptp_enable_pps()`
- ocelot: stats: fix races, integer wrapping and reading incorrect
registers (the change of register definitions here accounts for
bulk of the changed LoC in this PR)"
* tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
net: moxa: MAC address reading, generating, validity checking
tcp: handle pure FIN case correctly
tcp: refactor tcp_read_skb() a bit
tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
tcp: fix sock skb accounting in tcp_read_skb()
igb: Add lock to avoid data race
dt-bindings: Fix incorrect "the the" corrections
net: genl: fix error path memory leak in policy dumping
stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
net/mlx5e: Allocate flow steering storage during uplink initialization
net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
net: mscc: ocelot: make struct ocelot_stat_layout array indexable
net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
net: mscc: ocelot: turn stats_lock into a spinlock
net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters
net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
...
Pull Kselftest fix from Shuah Khan:
- fix landlock test build regression
* tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/landlock: fix broken include of linux/landlock.h
Pull rtla tool fixes from Steven Rostedt:
"Fixes for the Real-Time Linux Analysis tooling:
- Fix tracer name in comments and prints
- Fix setting up symlinks
- Allow extra flags to be set in build
- Consolidate and show all necessary libraries not found in build
error"
* tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
rtla: Consolidate and show all necessary libraries that failed for building
tools/rtla: Build with EXTRA_{C,LD}FLAGS
tools/rtla: Fix command symlinks
rtla: Fix tracer name
Some (Juniper MX5) SFP link partners exhibit a disinclination to
autonegotiate with X550 configured in SFI mode. This patch enables
a manual AN-37 restart to work around the problem.
Signed-off-by: Jeff Daly <jeffd@silicom-usa.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fix the warning:
arch/riscv/kernel/signal.c:316:27: warning: no previous prototype for function 'do_notify_resume' [-Wmissing-prototypes]
asmlinkage __visible void do_notify_resume(struct pt_regs *regs,
All other functions in the file are static & none of the existing
headers stood out as an obvious location. Create signal.h to hold the
declaration.
Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220814141237.493457-4-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Wei Fang says:
====================
Add DT property to disable hibernation mode
The patches add the ability to disable the hibernation mode of AR803x
PHYs. Hibernation mode defaults to enabled after hardware reset on
these PHYs. If the AR803x PHYs enter hibernation mode, they will not
provide any clock. Some MACs might need the clocks provided by the PHYs
to support their own hardware logic.
So, the patches add the support to disable hibernation mode by adding
a boolean:
qca,disable-hibernation-mode
If one wishes to disable hibernation mode to better match a specific
MAC, just add this property in the phy node of DT.
====================
Link: https://lore.kernel.org/r/20220818030054.1010660-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the cable is unplugged, the Atheros AR803x PHYs will enter
hibernation mode after about 10 seconds if the hibernation mode
is enabled and will not provide any clock to the MAC. But for
some MACs, this feature might cause unexpected issues due to the
logic of MACs.
Taking SYNP MAC (stmmac) as an example, if the cable is unplugged
and the "eth0" interface is down, the AR803x PHY will enter
hibernation mode. If "ifconfig eth0 up" is then performed, the
stmmac cannot complete its software reset and fails to initialize
its own DMA, so the "eth0" interface fails to come up. Why does
this happen? The software reset operation of the stmmac is
designed to depend on the RX_CLK of the PHY.
So, this patch offers an option for the user to determine whether
to disable the hibernation mode of AR803x PHYs.
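As a rough userspace model of the opt-out: the struct, the helper names
and the position of the enable bit are all assumptions made for
illustration. In the driver the flag would come from the boolean DT
property and the bit would live in a PHY debug register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the hibernation enable bit. */
#define HIB_CTRL_PS_HIB_EN	(1u << 15)

/* Hypothetical model of the PHY state relevant here. */
struct phy_sketch {
	bool disable_hibernation;	/* set when the DT property is present */
	uint16_t hib_ctrl;		/* stand-in for the control register */
};

/* Hibernation defaults to enabled after a hardware reset. */
static void phy_hw_reset(struct phy_sketch *phy)
{
	phy->hib_ctrl |= HIB_CTRL_PS_HIB_EN;
}

/* Clear the enable bit only when the user asked for it. */
static void phy_config_hibernation(struct phy_sketch *phy)
{
	if (!phy->disable_hibernation)
		return;
	phy->hib_ctrl &= (uint16_t)~HIB_CTRL_PS_HIB_EN;
}
```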
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>