Commit Graph

900440 Commits

Author SHA1 Message Date
Mark Rutland
1cfbb484de KVM: arm/arm64: Correct AArch32 SPSR on exception entry
Confusingly, there are three SPSR layouts that a kernel may need to deal
with:

(1) An AArch64 SPSR_ELx view of an AArch64 pstate
(2) An AArch64 SPSR_ELx view of an AArch32 pstate
(3) An AArch32 SPSR_* view of an AArch32 pstate

When the KVM AArch32 support code deals with SPSR_{EL2,HYP}, it's either
dealing with #2 or #3 consistently. On arm64 the PSR_AA32_* definitions
match the AArch64 SPSR_ELx view, and on arm the PSR_AA32_* definitions
match the AArch32 SPSR_* view.

However, when we inject an exception into an AArch32 guest, we have to
synthesize the AArch32 SPSR_* that the guest will see. Thus, an AArch64
host needs to synthesize layout #3 from layout #2.

This patch adds a new host_spsr_to_spsr32() helper for this, and makes
use of it in the KVM AArch32 support code. For arm64 we need to shuffle
the DIT bit around, and remove the SS bit, while for arm we can use the
value as-is.
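
As an illustration of the arm64 case described above, here is a minimal sketch of moving DIT and dropping SS. The function name and the bit positions (DIT at bit 24 and SS at bit 21 in the SPSR_ELx view, DIT at bit 21 in the AArch32 view) are assumptions taken from my reading of the architecture, not from this patch, and should be checked against the Arm ARM.

```c
#include <stdint.h>

/* Assumed bit positions; verify against the Arm ARM before relying on them. */
#define SK_SPSR_ELX_AA32_DIT (UINT64_C(1) << 24) /* DIT, AArch64 SPSR_ELx view */
#define SK_SPSR_ELX_AA32_SS  (UINT64_C(1) << 21) /* SS, AArch64 SPSR_ELx view */
#define SK_SPSR32_DIT        (UINT64_C(1) << 21) /* DIT, AArch32 SPSR_* view */

/* Hypothetical helper: convert layout #2 into layout #3. */
uint64_t spsr_elx_to_spsr32_sketch(uint64_t spsr)
{
	int dit = (spsr & SK_SPSR_ELX_AA32_DIT) != 0;

	/* Clear both positions: this removes SS and vacates the old DIT slot. */
	spsr &= ~(SK_SPSR_ELX_AA32_DIT | SK_SPSR_ELX_AA32_SS);

	/* Put DIT back where the AArch32 view expects it. */
	if (dit)
		spsr |= SK_SPSR32_DIT;

	return spsr;
}
```

All other bits pass through untouched, which matches the "use the value as-is" behaviour for everything except DIT and SS.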

I've open-coded the bit manipulation for now to avoid having to rework
the existing PSR_* definitions into PSR64_AA32_* and PSR32_AA32_*
definitions. I hope to perform a more thorough refactoring in future so
that we can handle pstate view manipulation more consistently across the
kernel tree.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-4-mark.rutland@arm.com
2020-01-19 18:06:14 +00:00
Mark Rutland
3c2483f154 KVM: arm/arm64: Correct CPSR on exception entry
When KVM injects an exception into a guest, it generates the CPSR value
from scratch, configuring CPSR.{M,A,I,T,E}, and setting all other
bits to zero.

This isn't correct, as the architecture specifies that some CPSR bits
are (conditionally) cleared or set upon an exception, and others are
unchanged from the original context.

This patch adds logic to match the architectural behaviour. To make this
simple to follow/audit/extend, documentation references are provided,
and bits are configured in order of their layout in SPSR_EL2. This
layout can be seen in the diagram on ARM DDI 0487E.a page C5-426.

Note that this code is used by both arm and arm64, and is intended to
function with the SPSR_EL2 and SPSR_HYP layouts.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-3-mark.rutland@arm.com
2020-01-19 18:06:14 +00:00
Mark Rutland
a425372e73 KVM: arm64: Correct PSTATE on exception entry
When KVM injects an exception into a guest, it generates the PSTATE
value from scratch, configuring PSTATE.{M[4:0],DAIF}, and setting all
other bits to zero.

This isn't correct, as the architecture specifies that some PSTATE bits
are (conditionally) cleared or set upon an exception, and others are
unchanged from the original context.

This patch adds logic to match the architectural behaviour. To make this
simple to follow/audit/extend, documentation references are provided,
and bits are configured in order of their layout in SPSR_EL2. This
layout can be seen in the diagram on ARM DDI 0487E.a page C5-429.
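
A much simplified sketch of the idea, assuming an exception taken to EL1h: start from zero, copy the bits that are unchanged from the old context, then set the bits that are forced on entry. The function name is hypothetical, the bit positions are assumptions from my reading of the SPSR_EL2 layout, and the conditional bits (for example PAN and SSBS, which depend on system register state) are deliberately left out.

```c
#include <stdint.h>

/* Assumed SPSR_EL2 bit positions; verify against ARM DDI 0487. */
#define SK_PSR_MODE_EL1H UINT64_C(0x5)
#define SK_PSR_F   (UINT64_C(1) << 6)
#define SK_PSR_I   (UINT64_C(1) << 7)
#define SK_PSR_A   (UINT64_C(1) << 8)
#define SK_PSR_D   (UINT64_C(1) << 9)
#define SK_PSR_DIT (UINT64_C(1) << 24)
#define SK_PSR_NZCV (UINT64_C(0xf) << 28)

/* Hypothetical helper: PSTATE seen by the guest handler after entry. */
uint64_t pstate_on_exception_sketch(uint64_t old)
{
	uint64_t pstate = 0;

	/* Unchanged from the interrupted context. */
	pstate |= old & SK_PSR_NZCV;
	pstate |= old & SK_PSR_DIT;

	/* Forced on exception entry: all of DAIF masked, target mode EL1h. */
	pstate |= SK_PSR_D | SK_PSR_A | SK_PSR_I | SK_PSR_F;
	pstate |= SK_PSR_MODE_EL1H;

	/* Everything else (e.g. IL, SS) stays zero in this sketch. */
	return pstate;
}
```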

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-2-mark.rutland@arm.com
2020-01-19 18:06:13 +00:00
James Morse
1559b7583f KVM: arm/arm64: Re-check VMA on detecting a poisoned page
When we check for a poisoned page, we use the VMA to tell userspace
about the looming disaster. But we pass a pointer to this VMA
after having released the mmap_sem, which isn't a good idea.

Instead, stash the shift value that goes with this pfn while
we are holding the mmap_sem.
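
The pattern can be sketched in userspace as follows. This is not the kernel code: the mutex stands in for mmap_sem, and every name here is hypothetical. The point is only that the value is copied while the lock is held, so nothing dereferences the VMA afterwards.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; only the field we need. */
struct vma_sketch {
	unsigned int page_shift;
};

pthread_mutex_t mmap_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
struct vma_sketch *current_vma_sketch; /* only valid while the lock is held */

/* Returns 0 and fills *shift on success, -1 if there is no mapping. */
int snapshot_page_shift_sketch(unsigned int *shift)
{
	int ret = -1;

	pthread_mutex_lock(&mmap_lock_sketch);
	if (current_vma_sketch) {
		/* Copy the value; do not keep the pointer past the unlock. */
		*shift = current_vma_sketch->page_shift;
		ret = 0;
	}
	pthread_mutex_unlock(&mmap_lock_sketch);

	return ret;
}
```

A caller would then report the poisoned page using the stashed shift alone.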

Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Link: https://lore.kernel.org/r/20191211165651.7889-3-maz@kernel.org
Link: https://lore.kernel.org/r/20191217123809.197392-1-james.morse@arm.com
2020-01-19 18:05:20 +00:00
YueHaibing
de9375634b KVM: arm: Remove duplicate include
Remove duplicate header which is included twice.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20191113014045.15276-1-yuehaibing@huawei.com
2020-01-19 18:03:33 +00:00
Shannon Zhao
c3e35409b5 KVM: ARM: Call hyp_cpu_pm_exit at the right place
There is no need to call hyp_cpu_pm_exit() in init_hyp_mode() when an
error occurs. hyp_cpu_pm_exit() only needs to be called in
kvm_arch_init() if init_subsystems() fails. So move hyp_cpu_pm_exit()
out of teardown_hyp_mode() and call it directly in kvm_arch_init().

Signed-off-by: Shannon Zhao <shannon.zhao@linux.alibaba.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1575272531-3204-1-git-send-email-shannon.zhao@linux.alibaba.com
2020-01-19 18:03:31 +00:00
Alex Sverdlin
927d780ee3 ARM: 8950/1: ftrace/recordmcount: filter relocation types
Scenario 1, ARMv7
=================

If code in arch/arm/kernel/ftrace.c operates on the mcount() pointer,
the following may be generated:

00000230 <prealloc_fixed_plts>:
 230:   b5f8            push    {r3, r4, r5, r6, r7, lr}
 232:   b500            push    {lr}
 234:   f7ff fffe       bl      0 <__gnu_mcount_nc>
                        234: R_ARM_THM_CALL     __gnu_mcount_nc
 238:   f240 0600       movw    r6, #0
                        238: R_ARM_THM_MOVW_ABS_NC      __gnu_mcount_nc
 23c:   f8d0 1180       ldr.w   r1, [r0, #384]  ; 0x180

FTRACE currently is not able to deal with it:

WARNING: CPU: 0 PID: 0 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1ad/0x230()
...
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.116-... #1
...
[<c0314e3d>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
[<c03115e9>] (show_stack) from [<c051a7f1>] (dump_stack+0x81/0xa8)
[<c051a7f1>] (dump_stack) from [<c0321c5d>] (warn_slowpath_common+0x69/0x90)
[<c0321c5d>] (warn_slowpath_common) from [<c0321cf3>] (warn_slowpath_null+0x17/0x1c)
[<c0321cf3>] (warn_slowpath_null) from [<c038ee9d>] (ftrace_bug+0x1ad/0x230)
[<c038ee9d>] (ftrace_bug) from [<c038f1f9>] (ftrace_process_locs+0x27d/0x444)
[<c038f1f9>] (ftrace_process_locs) from [<c08915bd>] (ftrace_init+0x91/0xe8)
[<c08915bd>] (ftrace_init) from [<c0885a67>] (start_kernel+0x34b/0x358)
[<c0885a67>] (start_kernel) from [<00308095>] (0x308095)
---[ end trace cb88537fdc8fa200 ]---
ftrace failed to modify [<c031266c>] prealloc_fixed_plts+0x8/0x60
 actual: 44:f2:e1:36
ftrace record flags: 0
 (0)   expected tramp: c03143e9

Scenario 2, ARMv4T
==================

ftrace: allocating 14435 entries in 43 pages
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2029 ftrace_bug+0x204/0x310
CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.5 #1
Hardware name: Cirrus Logic EDB9302 Evaluation Board
[<c0010a24>] (unwind_backtrace) from [<c000ecb0>] (show_stack+0x20/0x2c)
[<c000ecb0>] (show_stack) from [<c03c72e8>] (dump_stack+0x20/0x30)
[<c03c72e8>] (dump_stack) from [<c0021c18>] (__warn+0xdc/0x104)
[<c0021c18>] (__warn) from [<c0021d7c>] (warn_slowpath_null+0x4c/0x5c)
[<c0021d7c>] (warn_slowpath_null) from [<c0095360>] (ftrace_bug+0x204/0x310)
[<c0095360>] (ftrace_bug) from [<c04dabac>] (ftrace_init+0x3b4/0x4d4)
[<c04dabac>] (ftrace_init) from [<c04cef4c>] (start_kernel+0x20c/0x410)
[<c04cef4c>] (start_kernel) from [<00000000>] (  (null))
---[ end trace 0506a2f5dae6b341 ]---
ftrace failed to modify
[<c000c350>] perf_trace_sys_exit+0x5c/0xe8
 actual:   1e:ff:2f:e1
Initializing ftrace call sites
ftrace record flags: 0
 (0)
 expected tramp: c000fb24

This problem has already been analysed; refer to the link below.

Fix the above problems by allowing only selected reloc types in
__mcount_loc. The list itself comes from the legacy recordmcount.pl
script.
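
A minimal sketch of such a filter is shown below. The allowlist (branch-style relocations only) is an assumption made for illustration, since the text above only says the list is taken from recordmcount.pl. The relocation numbers are my recollection of the ARM ELF ABI, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Relocation type numbers as I recall them from the ARM ELF ABI. */
enum {
	SK_R_ARM_PC24 = 1,
	SK_R_ARM_THM_CALL = 10,
	SK_R_ARM_CALL = 28,
	SK_R_ARM_THM_MOVW_ABS_NC = 47,
};

/* ELF32 keeps the relocation type in the low 8 bits of r_info. */
unsigned int elf32_r_type_sketch(uint32_t r_info)
{
	return r_info & 0xff;
}

/* Hypothetical filter: only call/branch relocations count as mcount sites. */
bool reloc_is_mcount_call_sketch(uint32_t r_info)
{
	switch (elf32_r_type_sketch(r_info)) {
	case SK_R_ARM_PC24:
	case SK_R_ARM_THM_CALL:
	case SK_R_ARM_CALL:
		return true;
	default:
		/* e.g. a MOVW that merely takes the address of mcount */
		return false;
	}
}
```

With a filter like this, the R_ARM_THM_MOVW_ABS_NC reference in scenario 1 would not be recorded in __mcount_loc, while the R_ARM_THM_CALL site would.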

Link: https://lore.kernel.org/lkml/56961010.6000806@pengutronix.de/
Cc: stable@vger.kernel.org
Fixes: ed60453fa8 ("ARM: 6511/1: ftrace: add ARM support for C version of recordmcount")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-01-19 16:08:25 +00:00
Russell King
f5523423de arm64: kvm: Fix IDMAP overlap with HYP VA
Booting 5.4 on LX2160A reveals that KVM is non-functional:

kvm: Limiting the IPA size due to kernel Virtual Address limit
kvm [1]: IPA Size Limit: 43bits
kvm [1]: IDMAP intersecting with HYP VA, unable to continue
kvm [1]: error initializing Hyp mode: -22

Debugging shows:

kvm [1]: IDMAP page: 81a26000
kvm [1]: HYP VA range: 0:22ffffffff

as RAM is located at:

80000000-fbdfffff : System RAM
2080000000-237fffffff : System RAM

Comparing this with the same kernel on Armada 8040 shows:

kvm: Limiting the IPA size due to kernel Virtual Address limit
kvm [1]: IPA Size Limit: 43bits
kvm [1]: IDMAP page: 2a26000
kvm [1]: HYP VA range: 4800000000:493fffffff
...
kvm [1]: Hyp mode initialized successfully

which indicates that hyp_va_msb is set, and is always set to the
opposite value of the idmap page to avoid the overlap. This does not
happen with the LX2160A.

Further debugging shows vabits_actual = 39, kva_msb = 38 on LX2160A and
kva_msb = 33 on Armada 8040. Looking at the bit layout of the HYP VA,
there is still one bit available for hyp_va_msb. Set this bit
appropriately. This allows KVM to be functional on the LX2160A, but
without any HYP VA randomisation:

kvm: Limiting the IPA size due to kernel Virtual Address limit
kvm [1]: IPA Size Limit: 43bits
kvm [1]: IDMAP page: 81a24000
kvm [1]: HYP VA range: 4000000000:62ffffffff
...
kvm [1]: Hyp mode initialized successfully

Fixes: ed57cac83e ("arm64: KVM: Introduce EL2 VA randomisation")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
[maz: small additional cleanups, preserved case where the tag
 is legitimately 0 and we can just use the mask, Fixes tag]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/E1ilAiY-0000MA-RG@rmk-PC.armlinux.org.uk
2020-01-19 16:05:23 +00:00
Zenghui Yu
5f675c56ed KVM: arm/arm64: vgic: Handle GICR_PENDBASER.PTZ field as RAZ
Although a guest will rarely read and use the PTZ (Pending Table Zero)
bit in GICR_PENDBASER, let us emulate the architecture strictly.
As per IHI 0069E 9.11.30, PTZ field is WO, and reads as 0.
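
In sketch form, read-as-zero emulation amounts to masking the bit on the read path. The helper name is hypothetical and the position of PTZ (bit 62) is an assumption from my reading of the GICv3 specification.

```c
#include <stdint.h>

/* Assumed position of PTZ in GICR_PENDBASER; verify against IHI 0069. */
#define SK_GICR_PENDBASER_PTZ (UINT64_C(1) << 62)

/* Hypothetical read handler: PTZ is write-only, so it always reads as 0. */
uint64_t pendbaser_read_sketch(uint64_t stored)
{
	return stored & ~SK_GICR_PENDBASER_PTZ;
}
```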

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20191220111833.1422-1-yuzenghui@huawei.com
2020-01-19 16:05:11 +00:00
Eric Auger
8c58be3449 KVM: arm/arm64: vgic-its: Fix restoration of unmapped collections
Saving/restoring an unmapped collection is a valid scenario. For
example this happens if a MAPTI command was sent, featuring an
unmapped collection. At the moment the CTE fails to be restored.
Only compare against the number of online vcpus if the rdist
base is set.

Fixes: ea1ad53e1e ("KVM: arm64: vgic-its: Collection table save/restore")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191213094237.19627-1-eric.auger@redhat.com
2020-01-19 16:05:11 +00:00
Christoffer Dall
b6ae256afd KVM: arm64: Only sign-extend MMIO up to register width
On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit
register, and we should only sign extend the register up to the width of
the register as specified in the operation (by using the 32-bit Wn or
64-bit Xn register specifier).

As it turns out, the architecture provides this decoding information in
the SF ("Sixty-Four" -- how cute...) bit.

Let's take advantage of this with the usual 32-bit/64-bit header file
dance and do the right thing on AArch64 hosts.
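
A minimal sketch of the intended behaviour, with hypothetical names: sign-extend from the access size, then truncate to 32 bits unless the instruction targets a 64-bit register (the sixty_four argument stands in for the SF bit).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * data: value returned by the MMIO read
 * len:  access size in bytes (1, 2 or 4)
 * sixty_four: true for an Xn destination, false for Wn
 */
uint64_t mmio_sign_extend_sketch(uint64_t data, unsigned int len, bool sixty_four)
{
	uint64_t sign_bit = UINT64_C(1) << (len * 8 - 1);
	uint64_t mask = (sign_bit << 1) - 1;

	/* Keep only the bytes that were actually loaded. */
	data &= mask;

	/* Classic xor/subtract sign extension to 64 bits. */
	data = (data ^ sign_bit) - sign_bit;

	/* A Wn destination must not see sign bits above bit 31. */
	if (!sixty_four)
		data &= UINT32_MAX;

	return data;
}
```

For a one-byte load of 0x80, the Wn form yields 0xffffff80 and the Xn form yields 0xffffffffffffff80.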

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com
2020-01-19 16:05:10 +00:00
David S. Miller
4ee9e6e027 Merge branch 'mlxsw-Add-tunnel-devlink-trap-support'
Ido Schimmel says:

====================
mlxsw: Add tunnel devlink-trap support

This patch set from Amit adds support in mlxsw for tunnel traps and a
few additional layer 3 traps that can report drops and exceptions via
devlink-trap.

These traps allow the user to more quickly diagnose problems relating to
tunnel decapsulation errors, such as a packet being too short to
decapsulate or a packet containing the wrong GRE key in its GRE header.

Patch set overview:

Patches #1-#4 add three additional layer 3 traps, two of which are
mlxsw-specific as they relate to hardware-specific errors. The patches
include documentation of each trap and selftests.

Patches #5-#8 are preparations. They ensure that the correct ECN bits
are set in the outer header during IPinIP encapsulation and that packets
with an invalid ECN combination in underlay and overlay are trapped to
the kernel and not decapsulated in hardware.

Patches #9-#15 add support for two tunnel related traps. Each trap is
documented and selftested using both VXLAN and IPinIP tunnels, if
applicable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:53 +01:00
Amit Cohen
b3073dfba8 selftests: devlink_trap_tunnel_vxlan: Add test case for overlay_smac_is_mc
Test that the trap is triggered under the right conditions and that
devlink counters increase when action is trap.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
3aed0722f7 mlxsw: Add OVERLAY_SMAC_MC trap
Add a trap for NVE packets that the device decided to drop because their
overlay source MAC is multicast.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
c3cae4916e devlink: Add overlay source MAC is multicast trap
Add packet trap that can report NVE packets that the device decided to
drop because their overlay source MAC is multicast.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
27942c7000 selftests: devlink_trap_tunnel_ipip: Add test case for decap_error
Test that the trap is triggered under the right conditions and that
devlink counters increase.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
ca264ef6ed selftests: devlink_trap_tunnel_vxlan: Add test case for decap_error
Test that the trap is triggered under the right conditions and that
devlink counters increase.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
a318bf621a mlxsw: Add tunnel devlink-trap support
Add the trap IDs and trap group used to report tunnel drops. Register
tunnel packet traps and associated tunnel trap group with devlink
during driver initialization.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
13c056ec7d devlink: Add tunnel generic packet traps
Add packet traps that can report packets that were dropped during tunnel
decapsulation.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
f528dfc460 mlxsw: spectrum_trap: Reorder cases according to enum order
Move L3_DROPS case to appear after L2_DROPS case.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
4a44ee67a7 mlxsw: Add ECN configurations with IPinIP tunnels
Initialize ECN mapping registers during router init according to
INET_ECN_encapsulate() and INET_ECN_decapsulate().

For IP-in-IP encapsulation, this is required to ensure that ECN bits in
the underlay are set in accordance with the kernel. For decapsulation,
this is required to ensure that packets with invalid ECN combination in
underlay and overlay are trapped to the kernel and not forwarded.
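
For the encapsulation side, the mapping can be sketched as a four-entry table indexed by the overlay ECN value. This mirrors my understanding of INET_ECN_encapsulate() (a CE inner packet is sent with ECT(0) in the outer header, everything else is copied). The names are hypothetical and the hardware register format is not modelled.

```c
#include <stdint.h>

/* ECN codepoints, as defined by RFC 3168. */
enum {
	SK_ECN_NOT_ECT = 0,
	SK_ECN_ECT_1 = 1,
	SK_ECN_ECT_0 = 2,
	SK_ECN_CE = 3,
	SK_ECN_MASK = 3,
};

/* Outer (underlay) ECN for a given inner (overlay) ECN. */
uint8_t ecn_encapsulate_sketch(uint8_t inner)
{
	inner &= SK_ECN_MASK;
	return inner == SK_ECN_CE ? SK_ECN_ECT_0 : inner;
}

/* Fill a table that a driver could then program into the device. */
void fill_encap_ecn_map_sketch(uint8_t map[4])
{
	for (uint8_t inner = 0; inner < 4; inner++)
		map[inner] = ecn_encapsulate_sketch(inner);
}
```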

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
839607e2ec mlxsw: reg: Add Tunneling IPinIP Decapsulation ECN Mapping Register
This register configures the actions that are done during IPinIP
decapsulation based on the ECN bits.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
20174900ad mlxsw: reg: Add Tunneling IPinIP Encapsulation ECN Mapping Register
This register performs mapping from overlay ECN to underlay ECN during
IPinIP encapsulation.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
196442ec5f mlxsw: Add NON_ROUTABLE trap
Add a trap for packets that the device decided to drop because they are
not supposed to be routed. For example, IGMP queries can be flooded by
the device in layer 2 and reach the router. Such packets should not be
routed and instead dropped.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
95f0ead8f0 devlink: Add non-routable packet trap
Add packet trap that can report packets that reached the router, but are
non-routable. For example, IGMP queries can be flooded by the device in
layer 2 and reach the router. Such packets should not be routed and
instead dropped.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
740e87bc3d selftests: devlink_trap_l3_drops: Add test cases of irif and erif disabled
Add test cases to check that packets routed through disabled RIFs and
packets routed from disabled RIFs are dropped and devlink counters
increase when the action is trap.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
Amit Cohen
5b05162160 mlxsw: Add irif and erif disabled traps
IRIF_DISABLED and ERIF_DISABLED are driver specific traps. Packets are
dropped for these reasons when they need to be routed through/from
existing router interfaces (RIF) which are disabled.

Add devlink driver-specific traps and mlxsw trap IDs used to report
these traps.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:23:52 +01:00
David S. Miller
95ae2d1d11 Merge branch 'for-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
Mellanox, mlx5 E-Switch chains and prios

This series has two parts:

1) A merge commit with the mlx5-next branch that includes updates for mlx5
HW layouts needed for this and upcoming submissions.

2) From Paul, increase the number of chains and prios

Currently the Mellanox driver supports offloading tc rules that
are defined on the first 4 chains and the first 16 priorities.
The restriction stems from the firmware flow level enforcement
requiring a flow table of a certain level to point to a flow
table of a higher level. This limitation may be ignored by setting
the ignore_flow_level bit when creating flow table entries.
Use unmanaged tables and ignore flow level to create more tables than
declared by fs_core steering. Manually manage the connections between the
tables themselves.

A HW table is instantiated for every tc <chain,prio> tuple. The miss rule
of every table either jumps to the next <chain,prio> table, or continues
to slow_fdb. This logic is realized by following this sequence:

1. Create an auto-grouped flow table for the specified priority with
    reserved entries

Reserved entries are allocated at the end of the flow table.
Flow groups are evaluated in sequence and therefore it is guaranteed
that the flow group defined on the last FTEs will be the last to evaluate.

Define a "match all" flow group on the reserved entries, providing
the platform to add table miss actions.

2. Set the miss rule action to jump to the next <chain,prio> table
    or the slow_fdb.

3. Link the previous priority table to point to the new table by
    updating its miss rule.

Please pull and let me know if there's any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:17:07 +01:00
Rahul Lakkireddy
b2383ad987 cxgb4: reject overlapped queues in TC-MQPRIO offload
A queue can't belong to multiple traffic classes. So, reject
any such configuration that results in overlapped queues for a
traffic class.
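
A minimal sketch of such a check, assuming each traffic class is described by a queue offset and count as in an mqprio configuration. The struct and function names are hypothetical and this is not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-traffic-class queue range: [offset, offset + count). */
struct tc_queue_range_sketch {
	uint16_t offset;
	uint16_t count;
};

/* Returns true if any two traffic classes share at least one queue. */
bool tc_queues_overlap_sketch(const struct tc_queue_range_sketch *tc,
			      unsigned int num_tc)
{
	for (unsigned int i = 0; i < num_tc; i++) {
		uint32_t start_i = tc[i].offset;
		uint32_t end_i = start_i + tc[i].count;

		for (unsigned int j = i + 1; j < num_tc; j++) {
			uint32_t start_j = tc[j].offset;
			uint32_t end_j = start_j + tc[j].count;

			/* Empty ranges cannot overlap anything. */
			if (start_i == end_i || start_j == end_j)
				continue;

			/* Two half-open ranges intersect iff each starts before the other ends. */
			if (start_i < end_j && start_j < end_i)
				return true;
		}
	}

	return false;
}
```

A configuration for which this returns true would be rejected before any hardware state is touched.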

Fixes: b1396c2bd6 ("cxgb4: parse and configure TC-MQPRIO offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:12:53 +01:00
Rahul Lakkireddy
c856e2b6fc cxgb4: fix Tx multi channel port rate limit
T6 can support 2 egress traffic management channels per port to
double the total number of traffic classes that can be configured.
In this configuration, if the class belongs to the other channel,
then all the queues must be bound again explicitly to the new class,
for the rate limit parameters on the other channel to take effect.

So, always explicitly bind all queues to the port rate limit traffic
class, regardless of the traffic management channel that it belongs
to. Also, only bind queues to port rate limit traffic class, if all
the queues don't already belong to an existing different traffic
class.

Fixes: 4ec4762d8e ("cxgb4: add TC-MATCHALL classifier egress offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:12:02 +01:00
Dejin Zheng
0c58ac1e01 net: phy: adin: fix a warning about msleep
Found a warning with the following command:
./scripts/checkpatch.pl -f drivers/net/phy/adin.c

WARNING: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.rst
 #628: FILE: drivers/net/phy/adin.c:628:
+	msleep(10);

Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:06:42 +01:00
Eric Dumazet
09d4f10a5e net: sched: act_ctinfo: fix memory leak
Implement a cleanup method to properly free ci->params

BUG: memory leak
unreferenced object 0xffff88811746e2c0 (size 64):
  comm "syz-executor617", pid 7106, jiffies 4294943055 (age 14.250s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    c0 34 60 84 ff ff ff ff 00 00 00 00 00 00 00 00  .4`.............
  backtrace:
    [<0000000015aa236f>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
    [<0000000015aa236f>] slab_post_alloc_hook mm/slab.h:586 [inline]
    [<0000000015aa236f>] slab_alloc mm/slab.c:3320 [inline]
    [<0000000015aa236f>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549
    [<000000002c946bd1>] kmalloc include/linux/slab.h:556 [inline]
    [<000000002c946bd1>] kzalloc include/linux/slab.h:670 [inline]
    [<000000002c946bd1>] tcf_ctinfo_init+0x21a/0x530 net/sched/act_ctinfo.c:236
    [<0000000086952cca>] tcf_action_init_1+0x400/0x5b0 net/sched/act_api.c:944
    [<000000005ab29bf8>] tcf_action_init+0x135/0x1c0 net/sched/act_api.c:1000
    [<00000000392f56f9>] tcf_action_add+0x9a/0x200 net/sched/act_api.c:1410
    [<0000000088f3c5dd>] tc_ctl_action+0x14d/0x1bb net/sched/act_api.c:1465
    [<000000006b39d986>] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424
    [<00000000fd6ecace>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
    [<0000000047493d02>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
    [<00000000bdcf8286>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
    [<00000000bdcf8286>] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328
    [<00000000fc5b92d9>] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917
    [<00000000da84d076>] sock_sendmsg_nosec net/socket.c:639 [inline]
    [<00000000da84d076>] sock_sendmsg+0x54/0x70 net/socket.c:659
    [<0000000042fb2eee>] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330
    [<000000008f23f67e>] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384
    [<00000000d838e4f6>] __sys_sendmsg+0x80/0xf0 net/socket.c:2417
    [<00000000289a9cb1>] __do_sys_sendmsg net/socket.c:2426 [inline]
    [<00000000289a9cb1>] __se_sys_sendmsg net/socket.c:2424 [inline]
    [<00000000289a9cb1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424

Fixes: 24ec483cec ("net: sched: Introduce act_ctinfo action")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:02:15 +01:00
David S. Miller
d82f28726f Merge branch 'Rate-adaptation-for-Felix-DSA-switch'
Vladimir Oltean says:

====================
Rate adaptation for Felix DSA switch

When operating the MAC at 2.5Gbps (2500Base-X and USXGMII/QSXGMII) and
in combination with certain PHYs, it is possible that the copper side
may operate at lower link speeds. In this case, the PHY, which has a
MAC inside it, emits pause frames towards the switch's MAC,
telling it to slow down so that the transmission is lossless.

These patches are the support needed for the switch side of things to
work.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:00:17 +01:00
Alex Marginean
74984a1904 net: dsa: felix: Allow PHY to AN 10/100/1000 with 2500 serdes link
If the serdes link is set to 2500 using interface type 2500base-X, lower
link speeds on the line side should still be supported.
Rate adaptation is done out of band; in our case, with AQR PHYs, it is
done using flow control.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:00:17 +01:00
Alex Marginean
f3660937e1 net: dsa: felix: Handle PAUSE RX regardless of AN result
Flow control is used with 2500Base-X and AQR PHYs to do rate adaptation
between line side 100/1000 links and MAC running at 2.5G.

This is independent of the flow control configuration settled on the line
side through AN.

In general, allowing the MAC to handle flow control even if not
negotiated with the link partner should not be a problem, so the patch
just enables it in all cases.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:00:17 +01:00
Greg Kroah-Hartman
7b2d7faa09 Merge tag 'iio-for-5.6b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:

Second set of new device support, features and minor fixes for IIO in the 5.6 cycle

Just a small set this time.

As we are very near the merge window, I've rolled a few fixes in here
rather than adding noise just before release.  A short delay here will
do little harm.

New device support
* adis16480
  - Add support for adis16490. After earlier rework this is simple ID plus
    chip info.

Features
* kxcjk1013
  - mount matrix support.
* st_lsm6dsx
  - mount matrix support.

Cleanups / minor or late breaking fixes
* ad7124
  - add support to ad-sigma-delta and use it in this driver to allow
    the interrupt type to be IRQF_TRIGGER_LOW unlike most other devices
    using this framework.
* adis
  - use delay structure now available in SPI to handle transfer delays
  - introduce a timeouts structure to allow support of new devices
* ak8975
  - drop platform data support.  No one is using it and it adds complexity.
  - use device_get_match_data rather than open coding much the same thing.
* dht11
  - drop meaningless todo
* at91-sama5d2_adc
  - switch to dma_request_chan
* atlas-sensor
  - add a helper function to compute number of channels.  Needed for new device
    support that is under review.
* bma400
  - add a lower bound check on scale.
* inv_mpu6050
  - add support for temperature data in the fifos for all chips.
  - support an odd situation where a board supports only interrupt triggering
    on both edges.
* st_lsm6dsx
  - check and handle potential error return.
* st_sensors
  - fix some values for the LSM9DS0 which is ever so slightly different from
    other devices using the same whoami value.
  - switch over to generic functions from dt ones, avoiding need for separate
    ACPI support.
* stm32-adc
  - switch to dma_request_chan
  - suppress an error print in deferred probe case.
* stm32-dac
  - drop private data structure element for reset controller as only used in
    probe.
  - reflect more cleanly that the reset controller is optional whilst ensuring
    that if it is specified any errors are caught.
* stm32-dfsdm
  - switch to dma_request_chan
  - fix missing application of formatting to single conversions.
  - ensure the sampling rate is updated when the oversampling ratio is changed.

* tag 'iio-for-5.6b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (29 commits)
  iio: dac: stm32-dac: better handle reset controller failures
  iio: dac: stm32-dac: use reset controller only at probe time
  dt-bindings: iio: accel: kxcjk1013: Document mount-matrix property
  iio: accel: kxcjk1013: Support orientation matrix
  iio: imu: st_lsm6dsx: add mount matrix support
  iio: adc: stm32-adc: don't print an error on probe deferral
  dt-bindings: iio: adis16480: add compatible entry for ADIS16490
  iio: imu: adis16480: Add support for ADIS16490
  iio: accel: bma400: prevent setting accel scale too low
  iio: imu/mpu6050: support dual-edge IRQ
  iio: imu: inv_mpu6050: add fifo temperature data support
  iio: magnetometer: ak8975: Convert to use device_get_match_data()
  iio: magnetometer: ak8975: Get rid of platform data
  iio: adc: ad7124: Set IRQ type to falling
  iio: adc: ad-sigma-delta: Allow custom IRQ flags
  iio: imu: adis: use new `delay` structure for SPI transfer delays
  iio: adc: stm32-dfsdm: adapt sampling rate to oversampling ratio
  iio: adc: stm32-dfsdm: fix single conversion
  iio: st_sensors: Make use of device properties
  iio: st_sensors: Drop redundant parameter from st_sensors_of_name_probe()
  ...
2020-01-19 14:59:05 +01:00
David S. Miller
7f013edeba Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next, they are:

1) Incorrect uapi header comment in bitwise, from Jeremy Sowden.

2) Fetch flow statistics if flow is still active.

3) Restrict flow matching on hardware based on input device.

4) Add nf_flow_offload_work_alloc() helper function.

5) Remove the last client of the FLOW_OFFLOAD_DYING flag, use teardown
   instead.

6) Use atomic bitwise operation to operate with flow flags.

7) Add nf_flowtable_hw_offload() helper function to check for the
   NF_FLOWTABLE_HW_OFFLOAD flag.

8) Add NF_FLOW_HW_REFRESH to retry hardware offload from the flowtable
   software datapath.

9) Remove indirect calls in xt_hashlimit, from Florian Westphal.

10) Add nf_flow_offload_tuple() helper to consolidate code.

11) Add nf_flow_table_offload_cmd() helper function.

12) A few whitespace cleanups in nf_tables in bitwise and the bitmap/hash
    set types, from Jeremy Sowden.

13) Cleanup netlink attribute checks in bitwise, from Jeremy Sowden.

14) Replace goto by return in error path of nft_bitwise_dump(), from
    Jeremy Sowden.

15) Add bitwise operation netlink attribute, also from Jeremy.

16) Add nft_bitwise_init_bool(), from Jeremy Sowden.

17) Add nft_bitwise_eval_bool(), also from Jeremy.

18) Add nft_bitwise_dump_bool(), from Jeremy Sowden.

19) Disallow hardware offload for operations other than NFT_BITWISE_BOOL,
    from Jeremy Sowden.

20) Add NFTA_BITWISE_DATA netlink attribute, again from Jeremy.

21) Add support for bitwise shift operation, from Jeremy Sowden.
====================
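
As an illustration of item 6 (atomic bitwise operations on flow flags), here
is a minimal user-space sketch using C11 atomics. It is not the kernel code:
the kernel has its own bit operations, and the struct, enum and function
names below are hypothetical and chosen only for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit numbers, not the kernel's definitions. */
enum flow_flag_bit {
	FLOW_BIT_TEARDOWN = 0,
	FLOW_BIT_HW = 1,
};

/* Hypothetical flow object carrying one atomic word of flag bits. */
struct flow {
	atomic_ulong flags;
};

/* Atomically set one flag bit. */
static void flow_set_flag(struct flow *f, unsigned int bit)
{
	atomic_fetch_or(&f->flags, 1UL << bit);
}

/* Atomically clear one flag bit. */
static void flow_clear_flag(struct flow *f, unsigned int bit)
{
	atomic_fetch_and(&f->flags, ~(1UL << bit));
}

/* Read one flag bit. */
static bool flow_test_flag(struct flow *f, unsigned int bit)
{
	return (atomic_load(&f->flags) & (1UL << bit)) != 0;
}

/* Atomically set one flag bit and report whether it was already set. */
static bool flow_test_and_set_flag(struct flow *f, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	return (atomic_fetch_or(&f->flags, mask) & mask) != 0;
}
```

Because each update is a single read-modify-write, two contexts that set
different bits at the same time cannot lose each other's update, which is
the hazard with a plain `flags |= mask` on a shared word.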

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 10:29:05 +01:00
Michael Walle
ccfb9299a0 mtd: spi-nor: Add support for at25sl321
This was tested in single, dual and quad mode on a custom board with the
NXP FlexSPI controller.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
2020-01-19 08:45:55 +02:00
Michael Walle
f3418718c0 mtd: spi-nor: Add support for w25q32jwm
Add support for the Winbond W25Q32JW-xM flashes. These have a
programmable QE bit. There is also the W25Q32JW-xQ variant which shares
the ID with the W25Q32DW and W25Q32FW parts. The W25Q32JW-xQ has the QE
bit hard strapped to 1 and thus doesn't support the /HOLD and /WP pins.

This was tested in single, dual and quad mode on a custom board with the
NXP FlexSPI controller. Also the BP bits as well as the TB bit were
tested.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
2020-01-19 08:33:02 +02:00
Olof Johansson
fc585d4a5c riscv: Less inefficient gcc tishift helpers (and export their symbols)
The existing __lshrti3 was really inefficient, and the other two helpers
are also needed to compile some modules.

Add the missing versions, and export all of the symbols like arm64
already does.

This code is based on the assembly generated by libgcc builds.

This fixes a build break triggered by ubsan:

riscv64-unknown-linux-gnu-ld: lib/ubsan.o: in function `.L2':
ubsan.c:(.text.unlikely+0x38): undefined reference to `__ashlti3'
riscv64-unknown-linux-gnu-ld: ubsan.c:(.text.unlikely+0x42): undefined reference to `__ashrti3'
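
For readers unfamiliar with these helpers: on a 64-bit target the compiler
may emit calls to __ashlti3, __ashrti3 and __lshrti3 for 128-bit shifts. The
portable C sketch below shows what the three operations compute on a value
split into two 64-bit halves. It is an illustration only, not the assembly
added by this patch. The struct and function names are hypothetical, and the
shift count is assumed to satisfy 0 <= n < 128.

```c
#include <stdint.h>

/* Hypothetical two-word representation of a 128-bit integer. */
struct u128_parts {
	uint64_t lo;
	uint64_t hi;
};

/* Shift left, the operation __ashlti3 provides. */
static struct u128_parts shl128(struct u128_parts v, unsigned int n)
{
	struct u128_parts r;

	if (n == 0)
		return v;
	if (n >= 64) {
		/* The low word moves entirely into the high word. */
		r.hi = v.lo << (n - 64);
		r.lo = 0;
	} else {
		r.hi = (v.hi << n) | (v.lo >> (64 - n));
		r.lo = v.lo << n;
	}
	return r;
}

/* Logical shift right, the operation __lshrti3 provides. */
static struct u128_parts lshr128(struct u128_parts v, unsigned int n)
{
	struct u128_parts r;

	if (n == 0)
		return v;
	if (n >= 64) {
		r.lo = v.hi >> (n - 64);
		r.hi = 0;
	} else {
		r.lo = (v.lo >> n) | (v.hi << (64 - n));
		r.hi = v.hi >> n;
	}
	return r;
}

/* Arithmetic shift right, the operation __ashrti3 provides. */
static struct u128_parts ashr128(struct u128_parts v, unsigned int n)
{
	/* All ones if the 128-bit value is negative, otherwise zero. */
	uint64_t sign = (v.hi >> 63) ? UINT64_MAX : 0;
	struct u128_parts r;

	if (n == 0)
		return v;
	if (n >= 64) {
		r.lo = v.hi >> (n - 64);
		if (n > 64)
			r.lo |= sign << (128 - n);
		r.hi = sign;
	} else {
		r.lo = (v.lo >> n) | (v.hi << (64 - n));
		r.hi = (v.hi >> n) | (sign << (64 - n));
	}
	return r;
}
```

The n == 0 case is handled separately because shifting a 64-bit value by 64
is undefined in C.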

Signed-off-by: Olof Johansson <olof@lixom.net>
[paul.walmsley@sifive.com: use SYM_FUNC_{START,END} instead of
 ENTRY/ENDPROC; note libgcc origin]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-18 19:13:41 -08:00
Linus Torvalds
8f8972a312 Merge tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
 "Raw NAND:
   - GPMI: Fix the suspend/resume

  SPI-NOR:
   - Fix quad enable on Spansion like flashes
   - Fix selection of 4-byte addressing opcodes on Spansion"

* tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: gpmi: Restore nfc timing setup after suspend/resume
  mtd: rawnand: gpmi: Fix suspend/resume problem
  mtd: spi-nor: Fix quad enable for Spansion like flashes
  mtd: spi-nor: Fix selection of 4-byte addressing opcodes on Spansion
2020-01-18 16:34:17 -08:00
Rob Herring
62b5efc919 arm64: dts: rockchip: Kill off "simple-panel" compatibles
"simple-panel" is a Linux driver and has never been an accepted upstream
compatible string, so remove it.

Cc: Heiko Stuebner <heiko@sntech.de>
Cc: linux-rockchip@lists.infradead.org
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20200117230851.25434-1-robh@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2020-01-18 23:58:56 +01:00
Rob Herring
8039c828a6 ARM: dts: rockchip: Kill off "simple-panel" compatibles
"simple-panel" is a Linux driver and has never been an accepted upstream
compatible string, so remove it.

Cc: Heiko Stuebner <heiko@sntech.de>
Cc: linux-rockchip@lists.infradead.org
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20200117230851.25434-1-robh@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2020-01-18 23:57:39 +01:00
Johan Jonker
3ef7c2558f arm64: dts: rockchip: rename dwmmc node names to mmc
Current dts files with 'dwmmc' nodes are manually verified.
In order to automate this process, rockchip-dw-mshc.txt
has to be converted to yaml. In the new setup,
rockchip-dw-mshc.yaml will inherit properties from
mmc-controller.yaml and synopsys-dw-mshc-common.yaml.
'dwmmc' will no longer be a valid name for a node,
so change them all to 'mmc'.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/20200115185244.18149-2-jbx6244@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2020-01-18 23:56:10 +01:00
Johan Jonker
fed1fc5194 ARM: dts: rockchip: rename dwmmc node names to mmc
Current dts files with 'dwmmc' nodes are manually verified.
In order to automate this process, rockchip-dw-mshc.txt
has to be converted to yaml. In the new setup,
rockchip-dw-mshc.yaml will inherit properties from
mmc-controller.yaml and synopsys-dw-mshc-common.yaml.
'dwmmc' will no longer be a valid name for a node,
so change them all to 'mmc'.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/20200115185244.18149-1-jbx6244@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2020-01-18 23:54:15 +01:00
Linus Torvalds
244dc26890 Merge tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Back from LCA2020. Fixes weren't too busy last week and seem to have
  quietened down appropriately: some amdgpu, i915, then a core mst fix and
  one fix for virtio-gpu and one for rockchip:

  core mst:
   - serialize down messages and clear timeslots on unplug

  amdgpu:
   - Update golden settings for renoir
   - eDP fix

  i915:
   - uAPI fix: Remove dash and colon from PMU names to comply with
     tools/perf
   - Fix for include file that was indirectly included
   - Two fixes to make sure VMA are marked active for error capture

  virtio:
   - maintain obj reservation lock when submitting cmds

  rockchip:
   - increase link rate var size to accommodate rates"

* tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Reorder detect_edp_sink_caps before link settings read.
  drm/amdgpu: update goldensetting for renoir
  drm/dp_mst: Have DP_Tx send one msg at a time
  drm/dp_mst: clear time slots for ports invalid
  drm/i915/pmu: Do not use colons or dashes in PMU names
  drm/rockchip: fix integer type used for storing dp data rate
  drm/i915/gt: Mark ring->vma as active while pinned
  drm/i915/gt: Mark context->state vma as active while pinned
  drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings
  drm/i915: Add missing include file <linux/math64.h>
  drm/virtio: add missing virtio_gpu_array_lock_resv call
2020-01-18 13:57:31 -08:00
Ilie Halip
95f4d9cced riscv: delete temporary files
Temporary files used in the VDSO build process linger on even after make
mrproper: vdso-dummy.o.tmp, vdso.so.dbg.tmp.

Delete them once they're no longer needed.

Signed-off-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-18 13:22:13 -08:00
Linus Torvalds
0cc2682d8b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - a resctrl fix for uninitialized objects found by debugobjects

   - a resctrl memory leak fix

   - fix the unintended re-enabling of the SME and SEV CPU flags if
     memory encryption was disabled at bootup via the MSR space"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
  x86/resctrl: Fix potential memory leak
  x86/resctrl: Fix an imbalance in domain_remove_cpu()
2020-01-18 13:02:12 -08:00
Linus Torvalds
7ff15cd045 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
 "Three fixes: fix link failure on Alpha, fix a Sparse warning and
  annotate/robustify a lockless access in the NOHZ code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick/sched: Annotate lockless access to last_jiffies_update
  lib/vdso: Make __cvdso_clock_getres() static
  time/posix-stubs: Provide compat itimer supoprt for alpha
2020-01-18 13:00:59 -08:00
Linus Torvalds
9e79c52332 Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu/SMT fix from Ingo Molnar:
 "Fix a build bug on CONFIG_HOTPLUG_SMT=y && !CONFIG_SYSFS kernels"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/SMT: Fix x86 link error without CONFIG_SYSFS
2020-01-18 12:57:41 -08:00