Commit Graph

694398 Commits

Author SHA1 Message Date
Linus Torvalds
26273939ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix handling of initial STATE message in TIPC, from Jon Paul Maloy.

 2) Fix stats handling in bcm_sysport_get_stats(), from Florian
    Fainelli.

 3) Reject 16777215 VNI value in geneve_validate(), from Girish
    Moodalbail.

 4) Fix initial IGMP sysctl setting regression, from Nikolay Borisov.

 5) Once a UFO fragmented frame is treated as UFO, we should continue
    doing so. Likewise once a frame has been segmented, we should
    continue doing that and not try to convert it to a UFO frame. From
    Willem de Bruijn.

 6) Test the AF_PACKET RX/TX ring pg_vec state under the socket lock to
    prevent races. From Willem de Bruijn.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  packet: fix tp_reserve race in packet_set_ring
  udp: consistently apply ufo or fragmentation
  net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target
  igmp: Fix regression caused by igmp sysctl namespace code.
  geneve: maximum value of VNI cannot be used
  net: systemport: Fix software statistics for SYSTEMPORT Lite
  tipc: remove premature ESTABLISH FSM event at link synchronization
2017-08-10 10:30:29 -07:00
Willem de Bruijn
c27927e372 packet: fix tp_reserve race in packet_set_ring
Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:52:12 -07:00
Willem de Bruijn
85f1bd9a7b udp: consistently apply ufo or fragmentation
When iteratively building a UDP datagram with MSG_MORE and that
datagram exceeds MTU, consistently choose UFO or fragmentation.

Once skb_is_gso, always apply ufo. Conversely, once a datagram is
split across multiple skbs, do not consider ufo.

Sendpage already maintains the first invariant, only add the second.
IPv6 does not have a sendpage implementation to modify.

A gso skb must have a partial checksum, do not follow sk_no_check_tx
in udp_send_skb.

Found by syzkaller.

Fixes: e89e9cf539 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:52:12 -07:00
David S. Miller
2e2d5d767c Merge branch 'rtnetlink-fix-initial-rtnl-pushdown-fallout'
Florian Westphal says:

====================
rtnetlink: fix initial rtnl pushdown fallout

This series fixes various bugs and splats reported since the
allow-handler-to-run-with-no-rtnl series went in.

Last patch adds a script that can be used to add further
tests in case more bugs are reported.
In case you prefer reverting the original series instead of
fixing fallout I can resend this patch on its own.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:22 -07:00
Florian Westphal
33b01b7b4f selftests: add rtnetlink test script
add a simple script to exercise some rtnetlink call paths, so KASAN,
lockdep etc. can yell at developer before patches are sent upstream.

This can be extended to also cover bond, team, vrf and the like.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:22 -07:00
Florian Westphal
8caa38b56c rtnetlink: fallback to UNSPEC if current family has no doit callback
We need to use PF_UNSPEC in case the requested family has no doit
callback, otherwise this now fails with EOPNOTSUPP instead of running the
unspec doit callback, as before.

Fixes: 6853dd4881 ("rtnetlink: protect handler table with rcu")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:22 -07:00
Florian Westphal
d38a65125f rtnetlink: init handler refcounts to 1
If using CONFIG_REFCOUNT_FULL=y we get following splat:
 refcount_t: increment on 0; use-after-free.
WARNING: CPU: 0 PID: 304 at lib/refcount.c:152 refcount_inc+0x47/0x50
Call Trace:
 rtnetlink_rcv_msg+0x191/0x260
 ...

This warning is harmless (0 is "no callback running", not "memory
was freed").

Use '1' as the new 'no handler is running' base instead of 0 to avoid
this.

Fixes: 019a316992 ("rtnetlink: add reference counting to prevent module unload while dump is in progress")
Reported-by: Sabrina Dubroca <sdubroca@redhat.com>
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:22 -07:00
Florian Westphal
8515ae3843 rtnetlink: switch rtnl_link_get_slave_info_data_size to rcu
David Ahern reports following splat:
 RTNL: assertion failed at net/core/dev.c (5717)
 netdev_master_upper_dev_get+0x5f/0x70
 if_nlmsg_size+0x158/0x240
 rtnl_calcit.isra.26+0xa3/0xf0

rtnl_link_get_slave_info_data_size currently assumes RTNL protection, but
there appears to be no hard requirement for this, so use rcu instead.

At the time of this writing, there are three 'get_slave_size' callbacks
(now invoked under rcu): bond_get_slave_size, vrf_get_slave_size and
br_port_get_slave_size, all return constant only (i.e. they don't sleep).

Fixes: 6853dd4881 ("rtnetlink: protect handler table with rcu")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:22 -07:00
Florian Westphal
5c2bb9b6e2 rtnetlink: do not use RTM_GETLINK directly
Userspace sends RTM_GETLINK type, but the kernel substracts
RTM_BASE from this, i.e. 'type' doesn't contain RTM_GETLINK
anymore but instead RTM_GETLINK - RTM_BASE.

This caused the calcit callback to not be invoked when it
should have been (and vice versa).

While at it, also fix a off-by one when checking family index. vs
handler array size.

Fixes: e1fa6d216d ("rtnetlink: call rtnl_calcit directly")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:22 -07:00
Florian Westphal
377cb24884 rtnetlink: use rcu_dereference_raw to silence rcu splat
Ido reports a rcu splat in __rtnl_register.
The splat is correct; as rtnl_register doesn't grab any logs
and doesn't use rcu locks either.  It has always been like this.
handler families are not registered in parallel so there are no
races wrt. the kmalloc ordering.

The only reason to use rcu_dereference in the first place was to
avoid sparse from complaining about this.

Thus this switches to _raw() to not have rcu checks here.

The alternative is to add rtnl locking to register/unregister,
however, I don't see a compelling reason to do so as this has been
lockless for the past twenty years or so.

Fixes: 6853dd4881 ("rtnetlink: protect handler table with rcu")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:50:21 -07:00
Linus Torvalds
f213ad386b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:

 1) Recognize M8 cpus, just basic chip ID matching, from Allen Pais.

 2) Prevent crashes when bringing up sunvdc virtual block devices in
    some environments. From Jim Quigley.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain
  sparc64: Increase max_phys_bits to 51 and VA bits to 53 for M8.
  sparc64: recognize and support sparc M8 cpu type
  sparc64: properly name the cpu constants
2017-08-10 09:36:06 -07:00
John Crispin
2d57164564 net: core: fix compile error inside flow_dissector due to new dsa callback
The following error was introduced by
commit 43e665287f ("net-next: dsa: fix flow dissection")
due to a missing #if guard

net/core/flow_dissector.c: In function '__skb_flow_dissect':
net/core/flow_dissector.c:448:18: error: 'struct net_device' has no member named 'dsa_ptr'
ops = skb->dev->dsa_ptr->tag_ops;
                ^
make[3]: *** [net/core/flow_dissector.o] Error 1

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 09:16:05 -07:00
Lorenzo Pieralisi
a008873740 irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
While parsing the msi-parent property to chase up the IRQ domain
a given device belongs to, the index into the msi-parent tuple should
be incremented to ensure all properties entries are taken into account.

Current code missed the index update so the parsing loop does not work
in case multiple msi-parent phandles are present and may turn into
an infinite loop in of_pmsi_get_dev_id() if phandle at index 0 does
not correspond to the domain we are actually looking-up.

Fix the code by updating the phandle index at each iteration in
of_pmsi_get_dev_id().

Fixes: deac7fc1c8 ("irqchip/gic-v3-its: Parse new version of msi-parent property")
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-10 16:26:54 +01:00
Hanjun Guo
fdf6e7a8c9 irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
When enabling ITS NUMA support on D05, I got the boot log:

[    0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 1 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 2 -> Node 0
[    0.000000] SRAT: PXM 1 -> ITS 3 -> Node 1
[    0.000000] SRAT: ITS affinity exceeding max count[4]

This is wrong on D05 as we have 8 ITSs with 4 NUMA nodes.

So dynamically alloc the memory needed instead of using
its_srat_maps[MAX_NUMNODES], which count the number of
ITS entry(ies) in SRAT and alloc its_srat_maps as needed,
then build the mapping of numa node to ITS ID. Of course,
its_srat_maps will be freed after ITS probing because
we don't need that after boot.

After doing this, I got what I wanted:

[    0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 1 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 2 -> Node 0
[    0.000000] SRAT: PXM 1 -> ITS 3 -> Node 1
[    0.000000] SRAT: PXM 2 -> ITS 4 -> Node 2
[    0.000000] SRAT: PXM 2 -> ITS 5 -> Node 2
[    0.000000] SRAT: PXM 2 -> ITS 6 -> Node 2
[    0.000000] SRAT: PXM 3 -> ITS 7 -> Node 3

Fixes: dbd2b82672 ("irqchip/gic-v3-its: Add ACPI NUMA node mapping")
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-10 16:22:50 +01:00
Vitaly Kuznetsov
10e66760fa x86/smpboot: Unbreak CPU0 hotplug
A hang on CPU0 onlining after a preceding offlining is observed. Trace
shows that CPU0 is stuck in check_tsc_sync_target() waiting for source
CPU to run check_tsc_sync_source() but this never happens. Source CPU,
in its turn, is stuck on synchronize_sched() which is called from
native_cpu_up() -> do_boot_cpu() -> unregister_nmi_handler().

So it's a classic ABBA deadlock, due to the use of synchronize_sched() in
unregister_nmi_handler().

Fix the bug by moving unregister_nmi_handler() from do_boot_cpu() to
native_cpu_up() after cpu onlining is done.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170803105818.9934-1-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 17:02:47 +02:00
Gustavo A. R. Silva
5c23a558a6 clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
Propagate the return values of platform_get_irq and devm_request_irq on
failure.

Cc: Frans Klaver <fransklaver@gmail.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-10 14:48:18 +02:00
Matthias Kaehlcke
d197f79887 clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
The loop to find the best memory frame in arch_timer_mem_acpi_init()
initializes the loop counter with itself ('i = i'), which is suspicious
in the first place and pointed out by clang. The loop condition is
'i < timer_count' and a prior for loop exits when 'i' reaches
'timer_count', therefore the second loop is never executed.

Initialize the loop counter with 0 to iterate over all timers, which
supposedly was the intention before the typo monster attacked.

Fixes: c2743a3676 ("clocksource: arm_arch_timer: add GTDT support for memory-mapped timer")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-10 14:48:17 +02:00
Andy Lutomirski
e93c17301a x86/asm/64: Clear AC on NMI entries
This closes a hole in our SMAP implementation.

This patch comes from grsecurity. Good catch!

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/314cc9f294e8f14ed85485727556ad4f15bb1659.1502159503.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 13:13:15 +02:00
Peter Zijlstra
9b231d9f47 perf/core: Fix time on IOC_ENABLE
Vince reported that when we do IOC_ENABLE/IOC_DISABLE while the task
is SIGSTOP'ed state the timestamps go wobbly.

It turns out we indeed fail to correctly account time while in 'OFF'
state and doing IOC_ENABLE without getting scheduled in exposes the
problem.

Further thinking about this problem, it occurred to me that we can
suffer a similar fate when we migrate an uncore event between CPUs.
The perf_event_install() on the 'new' CPU will do add_event_to_ctx()
which will reset all the time stamp, resulting in a subsequent
update_event_times() to overwrite the total_time_* fields with smaller
values.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:01:09 +02:00
Peter Zijlstra
bfe334924c perf/x86: Fix RDPMC vs. mm_struct tracking
Vince reported the following rdpmc() testcase failure:

 > Failing test case:
 >
 >	fd=perf_event_open();
 >	addr=mmap(fd);
 >	exec()  // without closing or unmapping the event
 >	fd=perf_event_open();
 >	addr=mmap(fd);
 >	rdpmc()	// GPFs due to rdpmc being disabled

The problem is of course that exec() plays tricks with what is
current->mm, only destroying the old mappings after having
installed the new mm.

Fix this confusion by passing along vma->vm_mm instead of relying on
current->mm.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 1e0fb9ec67 ("perf: Add pmu callbacks to track event mapping and unmapping")
Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:01:08 +02:00
Max Gurtovoy
1c78f7735b nvme-pci: fix CMB sysfs file removal in reset path
Currently we create the sysfs entry even if we fail mapping
it. In that case, the unmapping will not remove the sysfs created
file. There is no good reason to create a sysfs entry for a non
working CMB and show his characteristics.

Fixes: f63572dff ("nvme: unmap CMB and remove sysfs file in reset path")
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-10 11:19:06 +02:00
James Smart
5073842093 lpfc: support nvmet_fc defer_rcv callback
Currently, calls to nvmet_fc_rcv_fcp_req() always copied the
FC-NVME cmd iu to a temporary buffer before returning, allowing
the driver to immediately repost the buffer to the hardware.

To address timing conditions on queue element structures vs async
command reception, the nvmet_fc transport occasionally may need to
hold on to the command iu buffer for a short period. In these cases,
the nvmet_fc_rcv_fcp_req() will return a special return code
(-EOVERFLOW). In these cases, the LLDD must delay until the new
defer_rcv lldd callback is called before recycling the buffer back
to the hw.

This patch adds support for the new nvmet_fc transport defer_rcv
callback and recognition of the new error code when passing commands
to the transport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-10 11:19:05 +02:00
James Smart
0fb228d30b nvmet_fc: add defer_req callback for deferment of cmd buffer return
At queue creation, the transport allocates a local job struct
(struct nvmet_fc_fcp_iod) for each possible element of the queue.
When a new CMD is received from the wire, a jobs struct is allocated
from the queue and then used for the duration of the command.
The job struct contains buffer space for the wire command iu. Thus,
upon allocation of the job struct, the cmd iu buffer is copied to
the job struct and the LLDD may immediately free/reuse the CMD IU
buffer passed in the call.

However, in some circumstances, due to the packetized nature of FC
and the api of the FC LLDD which may issue a hw command to send the
wire response, but the LLDD may not get the hw completion for the
command and upcall the nvmet_fc layer before a new command may be
asynchronously received on the wire. In other words, its possible
for the initiator to get the response from the wire, thus believe a
command slot free, and send a new command iu. The new command iu
may be received by the LLDD and passed to the transport before the
LLDD had serviced the hw completion and made the teardown calls for
the original job struct. As such, there is no available job struct
available for the new io. E.g. it appears like the host sent more
queue elements than the queue size. It didn't based on it's
understanding.

Rather than treat this as a hard connection failure queue the new
request until the job struct does free up. As the buffer isn't
copied as there's no job struct, a special return value must be
returned to the LLDD to signify to hold off on recycling the cmd
iu buffer.  And later, when a job struct is allocated and the
buffer copied, a new LLDD callback is introduced to notify the
LLDD and allow it to recycle it's command iu buffer.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-10 11:06:38 +02:00
Martin Wilck
758f373558 nvme: strip trailing 0-bytes in wwid_show
Some broken controllers (such as earlier Linux targets) pad model or
serial fields with 0-bytes rather than spaces. The NVMe spec disallows
0 bytes in "ASCII" fields.  Thus strip trailing 0-bytes, too. Also make
sure that we get no underflow for pathological input.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-10 10:43:31 +02:00
Icenowy Zheng
56a9155074 arm64: allwinner: a64: sopine: add missing ethernet0 alias
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: 96219b0048 ("arm64: allwinner: a64: add device tree for SoPine
		      with baseboard")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-10 14:02:38 +08:00
Icenowy Zheng
dff751c689 arm64: allwinner: a64: pine64: add missing ethernet0 alias
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: 9702394374 ("arm64: allwinner: pine64: Enable dwmac-sun8i")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-10 14:01:38 +08:00
Icenowy Zheng
7dc88d2afb arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: e729549990 ("arm64: allwinner: bananapi-m64: Enable dwmac-sun8i")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-10 14:00:10 +08:00
David S. Miller
7e1ecbcf79 Merge branch 'dsa-flow-dissection'
John Crispin says:

====================
net-next: dsa: fix flow dissection

RPS and probably other kernel features are currently broken on some if not
all DSA devices. The root cause of this is that skb_hash will call the
flow_dissector. At this point the skb still contains the magic switch
header and the skb->protocol field is not set up to the correct 802.3
value yet. By the time the tag specific code is called, removing the header
and properly setting the protocol an invalid hash is already set. In the
case of the mt7530 this will result in all flows always having the same
hash.

Changes since RFC:
* use a callback instead of static values
* add cover letter
====================

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:53:40 -07:00
John Crispin
43e665287f net-next: dsa: fix flow dissection
RPS and probably other kernel features are currently broken on some if not
all DSA devices. The root cause of this is that skb_hash will call the
flow_dissector. At this point the skb still contains the magic switch
header and the skb->protocol field is not set up to the correct 802.3
value yet. By the time the tag specific code is called, removing the header
and properly setting the protocol an invalid hash is already set. In the
case of the mt7530 this will result in all flows always having the same
hash.

Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
Signed-off-by: John Crispin <john@phrozen.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:51:47 -07:00
John Crispin
2dd592b274 net-next: tag_mtk: add flow_dissect callback to the ops struct
The MT7530 inserts the 4 magic header in between the 802.3 address and
protocol field. The patch implements the callback that can be called by
the flow dissector to figure out the real protocol and offset of the
network header. With this patch applied we can properly parse the packet
and thus make hashing function properly.

Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
Signed-off-by: John Crispin <john@phrozen.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:51:47 -07:00
John Crispin
598a968011 net-next: dsa: add flow_dissect callback to struct dsa_device_ops
When the flow dissector first sees packets coming in on a DSA devices the
802.3 header wont be located where the code expects it to be as the tag
is still present. Adding this new callback allows a DSA device to provide a
new function that the flow_dissector can use to get the correct protocol
and offset of the network header.

Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
Signed-off-by: John Crispin <john@phrozen.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:51:47 -07:00
John Crispin
68277a2c9d net-next: dsa: move struct dsa_device_ops to the global header file
We need to access this struct from within the flow_dissector to fix
dissection for packets coming in on DSA devices.

Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:51:47 -07:00
Xin Long
96d9703050 net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target
Commit 55917a21d0 ("netfilter: x_tables: add context to know if
extension runs from nft_compat") introduced a member nft_compat to
xt_tgchk_param structure.

But it didn't set it's value for ipt_init_target. With unexpected
value in par.nft_compat, it may return unexpected result in some
target's checkentry.

This patch is to set all it's fields as 0 and only initialize the
non-zero fields in ipt_init_target.

v1->v2:
  As Wang Cong's suggestion, fix it by setting all it's fields as
  0 and only initializing the non-zero fields.

Fixes: 55917a21d0 ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:46:44 -07:00
Nikolay Borisov
1714020e42 igmp: Fix regression caused by igmp sysctl namespace code.
Commit dcd87999d4 ("igmp: net: Move igmp namespace init to correct file")
moved the igmp sysctls initialization from tcp_sk_init to igmp_net_init. This
function is only called as part of per-namespace initialization, only if
CONFIG_IP_MULTICAST is defined, otherwise igmp_mc_init() call in ip_init is
compiled out, casuing the igmp pernet ops to not be registerd and those sysctl
being left initialized with 0. However, there are certain functions, such as
ip_mc_join_group which are always compiled and make use of some of those
sysctls. Let's do a partial revert of the aforementioned commit and move the
sysctl initialization into inet_init_net, that way they will always have
sane values.

Fixes: dcd87999d4 ("igmp: net: Move igmp namespace init to correct file")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196595
Reported-by: Gerardo Exequiel Pozzi <vmlinuz386@gmail.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:46:44 -07:00
David S. Miller
098b8d7e50 Merge branch 'mediatek-bring-up-QDMA-RX-ring-0'
John Crispin says:

====================
net-next: mediatek: bring up QDMA RX ring 0

The MT7623 has several DMA rings. Inside the SW path, the core will use
the PDMA when receiving traffic. While bringing up the HW path we noticed
that the PPE requires the QDMA RX to also be brought up as it uses this
ring internally for its flow scheduling.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:45:36 -07:00
John Crispin
6427dc1da5 net-next: mediatek: bring up QDMA RX ring 0
This patch is in preparation for adding HW flow and QoS offloading. For
those features to work, the driver needs to bring up the first QDMA RX
ring. This ring is used by the PPE offloading HW.

Signed-off-by: John Crisp in <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:45:36 -07:00
John Crispin
0c07ce7f1a net-next: mediatek: fix typos inside the header file
Trivial patch fixing 2 typos.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:45:36 -07:00
Bhumika Goyal
800bb47e71 net: atm: make atmdev_ops const
Make these const as they are only stored in the ops field of a atm_dev
structure, which is const.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:43:50 -07:00
Bhumika Goyal
46c4b7a569 atm: make atmdev_ops const
Make these structures const as they are either passed to the function
atm_dev_register having the corresponding argument as const or stored in
the ops field of a atm_dev structure, which is also const.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:43:50 -07:00
Bhumika Goyal
d78d6776bc net: dsa: make dsa_switch_ops const
Make these structures const as they are only stored in the ops field of
a dsa_switch structure, which is const.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:42:49 -07:00
Intiyaz Basha
42013e9038 liquidio: napi cleanup
Disable napi when interface is going down.
Delete napi when destroying the interface.

Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:42:07 -07:00
Girish Moodalbail
04db70d9fe geneve: maximum value of VNI cannot be used
Geneve's Virtual Network Identifier (VNI) is 24 bit long, so the range
of values for it would be from 0 to 16777215 (2^24 -1).  However, one
cannot create a geneve device with VNI set to 16777215. This patch fixes
this issue.

Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:41:04 -07:00
David Ahern
6eb7939371 net: ipv6: lower ndisc notifier priority below addrconf
ndisc_notify is used to send unsolicited neighbor advertisements
(e.g., on a link up). Currently, the ndisc notifier is run before the
addrconf notifer which means NA's are not sent for link-local addresses
which are added by the addrconf notifier.

Fix by lowering the priority of the ndisc notifier. Setting the priority
to ADDRCONF_NOTIFY_PRIORITY - 5 means it runs after addrconf and before
the route notifier which is ADDRCONF_NOTIFY_PRIORITY - 10.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:40:04 -07:00
Florian Fainelli
50ddfbafcd net: systemport: Fix software statistics for SYSTEMPORT Lite
With SYSTEMPORT Lite we have holes in our statistics layout that make us
skip over the hardware MIB counters, bcm_sysport_get_stats() was not
taking that into account resulting in reporting 0 for all SW-maintained
statistics, fix this by skipping accordingly.

Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:39:17 -07:00
Jon Paul Maloy
ed43594aed tipc: remove premature ESTABLISH FSM event at link synchronization
When a link between two nodes come up, both endpoints will initially
send out a STATE message to the peer, to increase the probability that
the peer endpoint also is up when the first traffic message arrives.
Thereafter, if the establishing link is the second link between two
nodes, this first "traffic" message is a TUNNEL_PROTOCOL/SYNCH message,
helping the peer to perform initial synchronization between the two
links.

However, the initial STATE message may be lost, in which case the SYNCH
message will be the first one arriving at the peer. This should also
work, as the SYNCH message itself will be used to take up the link
endpoint before  initializing synchronization.

Unfortunately the code for this case is broken. Currently, the link is
brought up through a tipc_link_fsm_evt(ESTABLISHED) when a SYNCH
arrives, whereupon __tipc_node_link_up() is called to distribute the
link slots and take the link into traffic. But, __tipc_node_link_up() is
itself starting with a test for whether the link is up, and if true,
returns without action. Clearly, the tipc_link_fsm_evt(ESTABLISHED) call
is unnecessary, since tipc_node_link_up() is itself issuing such an
event, but also harmful, since it inhibits tipc_node_link_up() to
perform the test of its tasks, and the link endpoint in question hence
is never taken into traffic.

This problem has been exposed when we set up dual links between pre-
and post-4.4 kernels, because the former ones don't send out the
initial STATE message described above.

We fix this by removing the unnecessary event call.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:38:06 -07:00
Nathan Fontenot
16587c210c ibmvnic: Correct 'unused variable' warning in build.
Commit a248878d7a ("ibmvnic: Check for transport event on driver resume")
removed the loop to kick irqs on driver resume but didn't remove the now
unused loop variable 'i'.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:30:52 -07:00
Nathan Fontenot
d1cf33d931 ibmvnic: Add netdev_dbg output for debugging
To ease debugging of the ibmvnic driver add a series of netdev_dbg()
statements to track driver status, especially during initialization,
removal, and resetting of the driver.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:30:52 -07:00
Nathan Fontenot
7c1885ae9a ibmvnic: Clean up resources on probe failure
Ensure that any resources allocated during probe are released if the
probe of the driver fails.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:30:52 -07:00
Jim Quigley
3ee70591d6 sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain
Using mpgroup to define multiple paths for a virtual disk causes multiple
virtual-device-port ports to be created for that virtual device.
Each virtual-device-port port then gets a vdisk created for it by the Linux
sunvdc driver. As mpgroup is not supported by the Linux sunvdc driver it
cannot handle multiple ports for a single vdisk, leading to a kernel panic
at startup.

This fix prevents more than one vdisk per virtual-device-port being created
until full virtual disk multipathing (mpgroup) support is implemented.

Signed-off-by: Jim Quigley <Jim.Quigley@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Reviewed-by: Aaron Young <aaron.young@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 22:22:32 -07:00
Nicholas Bellinger
6f48655fac target: Fix node_acl demo-mode + uncached dynamic shutdown regression
This patch fixes a generate_node_acls = 1 + cache_dynamic_acls = 0
regression, that was introduced by

  commit 01d4d67355
  Author: Nicholas Bellinger <nab@linux-iscsi.org>
  Date:   Wed Dec 7 12:55:54 2016 -0800

which originally had the proper list_del_init() usage, but was
dropped during list review as it was thought unnecessary by HCH.

However, list_del_init() usage is required during the special
generate_node_acls = 1 + cache_dynamic_acls = 0 case when
transport_free_session() does a list_del(&se_nacl->acl_list),
followed by target_complete_nacl() doing the same thing.

This was manifesting as a general protection fault as reported
by Justin:

kernel: general protection fault: 0000 [#1] SMP
kernel: Modules linked in:
kernel: CPU: 0 PID: 11047 Comm: iscsi_ttx Not tainted 4.13.0-rc2.x86_64.1+ #20
kernel: Hardware name: Intel Corporation S5500BC/S5500BC, BIOS S5500.86B.01.00.0064.050520141428 05/05/2014
kernel: task: ffff88026939e800 task.stack: ffffc90007884000
kernel: RIP: 0010:target_put_nacl+0x49/0xb0
kernel: RSP: 0018:ffffc90007887d70 EFLAGS: 00010246
kernel: RAX: dead000000000200 RBX: ffff8802556ca000 RCX: 0000000000000000
kernel: RDX: dead000000000100 RSI: 0000000000000246 RDI: ffff8802556ce028
kernel: RBP: ffffc90007887d88 R08: 0000000000000001 R09: 0000000000000000
kernel: R10: ffffc90007887df8 R11: ffffea0009986900 R12: ffff8802556ce020
kernel: R13: ffff8802556ce028 R14: ffff8802556ce028 R15: ffffffff88d85540
kernel: FS:  0000000000000000(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fffe36f5f94 CR3: 0000000009209000 CR4: 00000000003406f0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel:  transport_free_session+0x67/0x140
kernel:  transport_deregister_session+0x7a/0xc0
kernel:  iscsit_close_session+0x92/0x210
kernel:  iscsit_close_connection+0x5f9/0x840
kernel:  iscsit_take_action_for_connection_exit+0xfe/0x110
kernel:  iscsi_target_tx_thread+0x140/0x1e0
kernel:  ? wait_woken+0x90/0x90
kernel:  kthread+0x124/0x160
kernel:  ? iscsit_thread_get_cpumask+0x90/0x90
kernel:  ? kthread_create_on_node+0x40/0x40
kernel:  ret_from_fork+0x22/0x30
kernel: Code: 00 48 89 fb 4c 8b a7 48 01 00 00 74 68 4d 8d 6c 24 08 4c
89 ef e8 e8 28 43 00 48 8b 93 20 04 00 00 48 8b 83 28 04 00 00 4c 89
ef <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 20
kernel: RIP: target_put_nacl+0x49/0xb0 RSP: ffffc90007887d70
kernel: ---[ end trace f12821adbfd46fed ]---

To address this, go ahead and use proper list_del_list() for all
cases of se_nacl->acl_list deletion.

Reported-by: Justin Maggard <jmaggard01@gmail.com>
Tested-by: Justin Maggard <jmaggard01@gmail.com>
Cc: Justin Maggard <jmaggard01@gmail.com>
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-08-09 20:55:19 -07:00