Commit Graph

636652 Commits

Author SHA1 Message Date
memeka
c1907fa376 hperf_hmp: code fixes for kernel 4.9 2017-02-23 11:27:09 +09:00
Arseniy Krasnov
7c1b2a8b55 hperf_hmp: cpufreq routines.
when governor changes frequency, it calls callback from this patch. Frequency of
each CPU is used for imbalance calculation.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
e912ef3d90 hperf_hmp: rest of logic.
calculation during enqueue/dequeue task from runqueue and affinity mask
change callback for fair scheduling class.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
a35492a605 hperf_hmp: task CPU selection logic.
or it is not a WF_SYNC wakeup, the idlest CPU from both clusters is selected. Otherwise,
the default wakeup logic is used ('want_affine'). If it fails, the idlest CPU from both
clusters is selected.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
77c044cf30 hperf_hmp: idle pull function.
from another cluster when it is overloaded. Also, an A7 core can't pull a lone task from
an A15 core, but an A15 core can pull one from an A7 core. The task for migration is chosen in the same
way as in other HMP migration cases, using the 'druntime' metric. The only difference
is that the migrated task doesn't need to have run for 5 ms on its cluster before migration.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
d538498ae9 hperf_hmp: one way balancing function.
to/from another cluster. Called when balancing between clusters is broken and we
need to fix it.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
66bc70c6d3 hperf_hmp: swap tasks function.
cluster. It scans two runqueues looking for tasks using the 'druntime' metric. When
both tasks are found, it pulls the task from the other cluster and pushes the task from the
current CPU.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
a029b845cd hperf_hmp: migration auxiliary functions.
cluster for migration process, searching task to migrate from runqueue mentioned
above and function to move task from one CPU to another.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
62f3b069e4 hperf_hmp: is_hmp_imbalance introduced.
cases are possible: balancing from/to one of the clusters, task swap (when clusters
are balanced), or skipping the rebalance. The function calculates the load difference between the two
clusters (cluster load / cluster power) and the threshold at which balancing is needed.
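
As a rough illustration of the comparison described above (not the patch's code), the per-cluster ratio of load to power can be compared against a threshold. The enum, the function name, and the fixed-point scale of 1024 are assumptions made for this sketch, and the "skip rebalance" case is left out because its condition is not stated here:

```c
#include <stdint.h>

/* Hypothetical decision codes; the names are not taken from the patch. */
enum demo_hmp_decision {
    DEMO_HMP_BALANCED,       /* within threshold: a task swap may be considered */
    DEMO_HMP_MOVE_TO_BIG,    /* little cluster is relatively more loaded */
    DEMO_HMP_MOVE_TO_LITTLE  /* big cluster is relatively more loaded */
};

/*
 * Compare load/power of both clusters in 1024-based fixed point.
 * Assumes both power values are non-zero.
 */
enum demo_hmp_decision demo_hmp_decide(uint64_t big_load, uint64_t big_power,
                                       uint64_t little_load, uint64_t little_power,
                                       uint64_t threshold)
{
    uint64_t big_ratio = big_load * 1024 / big_power;
    uint64_t little_ratio = little_load * 1024 / little_power;

    if (big_ratio > little_ratio + threshold)
        return DEMO_HMP_MOVE_TO_LITTLE;
    if (little_ratio > big_ratio + threshold)
        return DEMO_HMP_MOVE_TO_BIG;
    return DEMO_HMP_BALANCED;
}
```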

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
9cfcccae3d hperf_hmp: introduce druntime metric.
This patch adds a special per-task metric used to look for candidates for
migration between HMP domains (clusters). 'druntime' grows when the task runs on
the A7 cluster and shrinks on the A15 cluster. druntime is also scaled according to the load
on the little cluster in order to align its value with the big cluster's total druntime.
For migration from the big/little cluster to the little/big cluster, the task with the lowest/highest
'druntime' is chosen. 'druntime' is used to execute each task on each cluster
for approximately the same amount of time. 'druntime' is calculated on each call of the default
'update_curr' function.
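
The bookkeeping described above can be sketched in plain C. This is only an illustration under stated assumptions: the struct, the function names, and the use of a raw execution-time delta are hypothetical, and the load-based scaling mentioned in the message is omitted because its formula is not given here:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task state; the real patch keeps this in scheduler structures. */
struct demo_task {
    int64_t druntime;
};

/* Grow druntime while running on the little cluster, shrink it on the big one. */
void demo_druntime_update(struct demo_task *t, int64_t delta_exec, int on_little)
{
    if (on_little)
        t->druntime += delta_exec;
    else
        t->druntime -= delta_exec;
}

/*
 * Pick a migration candidate: highest druntime when moving to the big
 * cluster, lowest when moving to the little cluster.
 * Returns the index of the chosen task, or -1 if there are no tasks.
 */
int demo_pick_candidate(const struct demo_task *tasks, size_t n, int to_big)
{
    size_t best = 0;

    if (n == 0)
        return -1;
    for (size_t i = 1; i < n; i++) {
        if (to_big ? tasks[i].druntime > tasks[best].druntime
                   : tasks[i].druntime < tasks[best].druntime)
            best = i;
    }
    return (int)best;
}
```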

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
d97de18389 hperf_hmp: scheduler initialization routines.
setup, which initializes some HMP scheduler variables: big and little cluster
masks. They are read from the kernel config (if set); otherwise default values are used.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
29894de8f1 hperf_hmp: add sched domains initialization.
has two pointers to A15 and A7 scheduling groups(struct sched_group).

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
b8ac462153 hperf_hmp: introduce new domain flag.
scheduler as HMP domain. HPERF_HMP logic works between two HMP domains, the
default CFS logic, in turn, works inside the HMP domain.

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
Arseniy Krasnov
57c56bce47 hperf_hmp: add new config for arm and arm64.
adds the following options:
'HPERF_HMP_DEBUG': enables extra runtime checks of balancing parameters.
'HMP_FAST_CPU_MASK': CPU mask of the A15 cluster (as a hex string).
'HMP_SLOW_CPU_MASK': CPU mask of the A7 cluster (as a hex string).
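
To make the hex-string masks concrete, here is a small userspace sketch of how such a string maps to CPU numbers. The function names are hypothetical, the parsing uses the C library rather than the kernel's own cpumask helpers, and the example values in the note below are assumptions, not defaults from the patch:

```c
#include <limits.h>
#include <stdlib.h>

/* Parse a hex string such as "f0" into a bitmask (bit N set means CPU N). */
unsigned long demo_parse_cpu_mask(const char *hex)
{
    return strtoul(hex, NULL, 16);
}

/* Return 1 if the given CPU number is part of the mask, 0 otherwise. */
int demo_cpu_in_mask(unsigned long mask, unsigned int cpu)
{
    if (cpu >= sizeof(mask) * CHAR_BIT)
        return 0;
    return (int)((mask >> cpu) & 1UL);
}
```

With a hypothetical slow mask of "0f" and fast mask of "f0", CPUs 0-3 would belong to the A7 cluster and CPUs 4-7 to the A15 cluster.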

Signed-off-by: Tarek Dakhran <t.dakhran <at> samsung.com>
Signed-off-by: Sergey Dyasly <s.dyasly <at> samsung.com>
Signed-off-by: Dmitriy Safonov <d.safonov <at> partner.samsung.com>
Signed-off-by: Arseniy Krasnov <a.krasnov <at> samsung.com>
Signed-off-by: Ilya Maximets <i.maximets <at> samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
2017-02-23 11:27:09 +09:00
DarkBahamut
c20fbc62da usb: host: xhci-plat: Bring USB phy up earlier to ensure
streams supported
jenkins-deb_kernel_5422_4.9-9
2017-02-23 10:50:30 +09:00
Mauro (mdrjr) Ribeiro
821fe3147f defconfig: enable more filesystems support
Change-Id: Ie60ded021122ceb99329a8110adb3d50b7b101d1
2017-02-23 10:44:29 +09:00
Mauro (mdrjr) Ribeiro
bd9d57788d Merge branch 'odroidxu4-4.9.y' of 192.168.2.249:root/linux into odroidxu4-4.9.y
Change-Id: I2fa5679fa73b82c4e95b62b9094a72127771102a
jenkins-deb_kernel_5422_4.9-8 jenkins-deb_kernel_5422_4.9-7
2017-02-22 15:14:16 +09:00
Mauro (mdrjr) Ribeiro
062743eb81 Merge tag 'v4.9.11' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into odroidxu3-4.9.y
This is the 4.9.11 stable release
2017-02-22 15:02:12 +09:00
Mauro (mdrjr) Ribeiro
a1a07e0283 ODROID: defconfig: enable several options for release
Change-Id: I3cdffe22e01e94822c8d7619cfb4d340e86d1121
2017-02-22 14:59:39 +09:00
Mauro (mdrjr) Ribeiro
42aaddbb35 ODROID: defconfig: enable several options for release
Change-Id: I3cdffe22e01e94822c8d7619cfb4d340e86d1121
jenkins-deb_kernel_5422_4.9-6 jenkins-deb_kernel_5422_4.9-5 jenkins-deb_kernel_5422_4.9-4 jenkins-deb_kernel_5422_4.9-3 jenkins-deb_kernel_5422_4.9-1
2017-02-21 20:37:56 +09:00
Mauro (mdrjr) Ribeiro
38fe8c408f Merge branch 'odroidxu3-4.9.y' 2017-02-21 19:05:21 +09:00
charles.park
9d5e17cecf ODROID-XU4 : RTL8723 Driver enable (WIFI/BLUETOOTH)
Change-Id: Id9fccd668f4ddc68c7deef77c124f3c93f5d5ded
2017-02-21 18:16:15 +09:00
charles.park
b49fa801ad ODROID-XU4 : WIFI,Bluetooth(RTL8723) Debug Message disable.
Change-Id: I6677c7ee89125895b99cc071a80c8059433fa6fb
2017-02-21 18:14:25 +09:00
charles.park
3d6cbf3602 ODROID-XU4 : bluetooth 8723DU compile error fix.
Change-Id: Ia44ef12c7ba99535942319edfcc565ebe86996e7
2017-02-21 17:35:02 +09:00
charles.park
abeeb41432 ODROID-XU4 : 8723DU Bluetooth driver add.
Change-Id: Ia8faca72f27b0b3c78913396f1778f50e3c931fb
2017-02-21 17:07:28 +09:00
Greg Kroah-Hartman
eee1550b3e Linux 4.9.11 2017-02-18 15:11:56 +01:00
Yu-cheng Yu
724aedaa5c x86/fpu/xstate: Fix xcomp_bv in XSAVES header
commit dffba9a31c upstream.

The compacted-format XSAVES area is determined at boot time and
never changed after.  The field xsave.header.xcomp_bv indicates
which components are in the fixed XSAVES format.

In fpstate_init() we did not set xcomp_bv to reflect the XSAVES
format since at the time there is no valid data.

However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(),
as in commit:

  b22cbe404a x86/fpu: Fix invalid FPU ptrace state after execve()

and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode
app, a #GP occurs.  This can be easily triggered by doing valgrind on
a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and
others:

	https://bugzilla.kernel.org/show_bug.cgi?id=190061

Fix it by setting xcomp_bv correctly.

This patch also moves the xcomp_bv initialization to the proper
place, which was in copyin_to_xsaves() as of:

  4c833368f0 x86/fpu: Set the xcomp_bv when we fake up a XSAVES area

which fixed the bug too, but it's more efficient and cleaner to
initialize things once per boot, not for every signal handling
operation.

Reported-by: Kevin Hao <haokexin@gmail.com>
Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: haokexin@gmail.com
Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com
[ Combined it with 4c833368f0. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:44 +01:00
Pablo Neira
0d4c19ee68 tcp: don't annotate mark on control socket from tcp_v6_send_response()
commit 92e55f412c upstream.

Unlike ipv4, this control socket is shared by all cpus so we cannot use
it as scratchpad area to annotate the mark that we pass to ip6_xmit().

Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
family caches the flowi6 structure in the sctp_transport structure, so
we cannot use to carry the mark unless we later on reset it back, which
I discarded since it looks ugly to me.

Fixes: bf99b4ded5 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:44 +01:00
Mark Bloch
0e0751cdfa net/mlx5: Don't unlock fte while still using it
commit 0fd758d611 upstream.

When adding a new rule to an fte, we need to hold the fte lock
until we add that rule to the fte and increase the fte ref count.

Fixes: 0c56b97503 ("net/mlx5_core: Introduce flow steering API")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Pau Espin Pedrol
7c4c32a297 tcp: fix mark propagation with fwmark_reflect enabled
commit bf99b4ded5 upstream.

Otherwise, RST packets generated by the TCP stack for non-existing
sockets always have mark 0.
The mark from the original packet is assigned to the netns_ipv4/6
socket used to send the response so that it can get copied into the
response skb when the socket sends it.

Fixes: e110861f86 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Hangbin Liu
16a3fbe523 igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()
[ Upstream commit 9c8bb163ae ]

In function igmpv3/mld_add_delrec() we allocate pmc and put it in
idev->mc_tomb, so we should free it when we don't need it in del_delrec().
But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.

Fixes: 24803f38a5 ("igmp: do not remove igmp souce list info when ...")
Fixes: 1666d49e1d ("mld: do not remove mld souce list info when ...")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Hangbin Liu
53a76d633b mld: do not remove mld souce list info when set link down
[ Upstream commit 1666d49e1d ]

This is an IPv6 version of commit 24803f38a5 ("igmp: do not remove igmp
souce list..."). In mld_del_delrec(), we will restore back all source filter
info instead of flushing it.

Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
we should not remove source list info when set link down. Remove
igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have called it in
ipv6_mc_down().

Also clear all source info after igmp6_group_dropped() instead of in it
because ipv6_mc_down() will call igmp6_group_dropped().

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Eric Dumazet
5b1bb4cbd7 l2tp: do not use udp_ioctl()
[ Upstream commit 72fb96e7bd ]

udp_ioctl(), as its name suggests, is used by UDP protocols,
but is also used by L2TP :(

L2TP should use its own handler, because it really does not
look the same.

SIOCINQ for instance should not assume UDP checksum or headers.

Thanks to Andrey and syzkaller team for providing the report
and a nice reproducer.

While crashes only happen on recent kernels (after commit
7c13f97ffd ("udp: do fwd memory scheduling on dequeue")), this
probably needs to be backported to older kernels.

Fixes: 7c13f97ffd ("udp: do fwd memory scheduling on dequeue")
Fixes: 8558467201 ("udp: Fix udp_poll() and ioctl()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Florian Fainelli
12758a2824 net: dsa: Do not destroy invalid network devices
[ Upstream commit 382e1eea2d ]

dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
for the network device not being NULL before attempting to destroy it. We were
not setting the slave network device as NULL if dsa_slave_create() failed, so
we would later on be calling dsa_slave_destroy() on a now free'd and
uninitialized network device, causing crashes in dsa_slave_destroy().

Fixes: 83c0afaec7 ("net: dsa: Add new binding implementation")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
WANG Cong
a700cf26a3 ping: fix a null pointer dereference
[ Upstream commit 73d2c6678e ]

Andrey reported a kernel crash:

  general protection fault: 0000 [#1] SMP KASAN
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Modules linked in:
  CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  task: ffff880060048040 task.stack: ffff880069be8000
  RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
  RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
  RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206
  RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000
  RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2
  RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000
  R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0
  R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000
  FS:  00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0
  Call Trace:
   inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
   sock_sendmsg_nosec net/socket.c:635 [inline]
   sock_sendmsg+0xca/0x110 net/socket.c:645
   SYSC_sendto+0x660/0x810 net/socket.c:1687
   SyS_sendto+0x40/0x50 net/socket.c:1655
   entry_SYSCALL_64_fastpath+0x1f/0xc2

This is because we miss a check for NULL pointer for skb_peek() when
the queue is empty. Other places already have the same check.
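
The pattern behind the fix can be shown with a minimal userspace analogue. The types and function names below are hypothetical stand-ins for the socket buffer and its queue, and the -1 return value is an assumption of this sketch rather than the value the kernel code returns:

```c
#include <stddef.h>

/* Hypothetical stand-ins for a packet buffer and its queue. */
struct demo_pkt {
    struct demo_pkt *next;
    int len;
};

struct demo_queue {
    struct demo_pkt *head; /* NULL when the queue is empty */
};

/* Peek: return the first packet without removing it, or NULL if empty. */
static struct demo_pkt *demo_peek(const struct demo_queue *q)
{
    return q->head;
}

/* Return the length of the first pending packet, or -1 if none is queued. */
int demo_first_len(const struct demo_queue *q)
{
    const struct demo_pkt *p = demo_peek(q);

    if (p == NULL) /* without this check, p->len dereferences NULL */
        return -1;
    return p->len;
}
```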

Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Willem de Bruijn
8284954189 packet: round up linear to header len
[ Upstream commit 57031eb794 ]

Link layer protocols may unconditionally pull headers, as Ethernet
does in eth_type_trans. Ensure that the entire link layer header
always lies in the skb linear segment. tpacket_snd has such a check.
Extend this to packet_snd.

Variable length link layer headers complicate the computation
somewhat. Here skb->len may be smaller than dev->hard_header_len.

Round up the linear length to be at least as long as the smallest of
the two.
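
The rounding rule reads as "linear length is at least min(packet length, header length)". A sketch with hypothetical helper names, using plain size_t values instead of the kernel's structures:

```c
#include <stddef.h>

static size_t demo_min(size_t a, size_t b) { return a < b ? a : b; }
static size_t demo_max(size_t a, size_t b) { return a > b ? a : b; }

/*
 * Ensure the linear part covers the link layer header, but never ask
 * for more bytes than the packet actually contains.
 */
size_t demo_round_up_linear(size_t linear, size_t len, size_t hard_header_len)
{
    return demo_max(linear, demo_min(len, hard_header_len));
}
```

For a 100-byte packet on a device with a 14-byte header, a requested linear length of 0 becomes 14; for a 10-byte packet it becomes 10.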

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
Willem de Bruijn
6ebde312a8 net: introduce device min_header_len
[ Upstream commit 217e6fa24c ]

The stack must not pass packets to device drivers that are shorter
than the minimum link layer header length.

Previously, packet sockets would drop packets smaller than or equal
to dev->hard_header_len, but this has false positives. Zero length
payload is used over Ethernet. Other link layer protocols support
variable length headers. Support for validation of these protocols
removed the min length check for all protocols.

Introduce an explicit dev->min_header_len parameter and drop all
packets below this value. Initially, set it to non-zero only for
Ethernet and loopback. Other protocols can follow in a patch to
net-next.
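
The difference between the old and new checks can be sketched as two predicates. The function names are hypothetical; the 14-byte figure used in the note is the standard Ethernet header length:

```c
#include <stddef.h>

/* Old behaviour as described above: packets of length <= hard_header_len are dropped. */
int demo_old_len_ok(size_t len, size_t hard_header_len)
{
    return len > hard_header_len;
}

/* New behaviour: only packets shorter than min_header_len are dropped. */
int demo_new_len_ok(size_t len, size_t min_header_len)
{
    return len >= min_header_len;
}
```

A 14-byte Ethernet frame with a zero-length payload fails the old check but passes the new one, which is the false positive the message refers to.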

Fixes: 9ed988cd59 ("packet: validate variable length ll headers")
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
WANG Cong
4cd0362114 sit: fix a double free on error path
[ Upstream commit d7426c69a1 ]

Dmitry reported a double free in sit_init_net():

  kernel BUG at mm/percpu.c:689!
  invalid opcode: 0000 [#1] SMP KASAN
  Dumping ftrace buffer:
     (ftrace buffer empty)
  Modules linked in:
  CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1
  Hardware name: Google Google Compute Engine/Google Compute Engine,
  BIOS Google 01/01/2011
  task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000
  RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
  RSP: 0018:ffff88017d1df488 EFLAGS: 00010046
  RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000
  RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94
  RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd
  R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80
  R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0
  FS:  00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0
  DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
  Call Trace:
   free_percpu+0x212/0x520 mm/percpu.c:1264
   ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
   sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
   ops_init+0x10a/0x530 net/core/net_namespace.c:115
   setup_net+0x2ed/0x690 net/core/net_namespace.c:291
   copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
   create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
   unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
   SYSC_unshare kernel/fork.c:2281 [inline]
   SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
   entry_SYSCALL_64_fastpath+0x1f/0xc2

This is because when tunnel->dst_cache init fails, we free dev->tstats
once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
redundant but its ndo_uinit() does not seem enough to clean up everything
here. So avoid this by setting dev->tstats to NULL after the first free,
at least for -net.
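
The "set the pointer to NULL after the first free" idea can be shown in userspace, where free(NULL) is defined to do nothing. The struct and function names are hypothetical, and malloc/free stand in for the kernel's per-CPU allocator:

```c
#include <stdlib.h>

/* Hypothetical device with separately allocated statistics. */
struct demo_dev {
    long *tstats;
};

/* Free the stats and clear the pointer so a second call is harmless. */
void demo_free_stats(struct demo_dev *dev)
{
    free(dev->tstats);   /* free(NULL) is a no-op */
    dev->tstats = NULL;  /* prevents a double free on later cleanup paths */
}
```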

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:43 +01:00
David Ahern
2b7f50d67f lwtunnel: valid encap attr check should return 0 when lwtunnel is disabled
[ Upstream commit 2bd137de53 ]

An error was reported upgrading to 4.9.8:
    root@Typhoon:~# ip route add default table 210 nexthop dev eth0 via 10.68.64.1
    weight 1 nexthop dev eth0 via 10.68.64.2 weight 1
    RTNETLINK answers: Operation not supported

The problem occurs when CONFIG_LWTUNNEL is not enabled and a multipath
route is submitted.

The point of lwtunnel_valid_encap_type_attr is to catch modules that
need to be loaded before any references are taken with rtnl held. With
CONFIG_LWTUNNEL disabled, there will be no modules to load so the
lwtunnel_valid_encap_type_attr stub should just return 0.

Fixes: 9ed59592e3 ("lwtunnel: fix autoload of lwt modules")
Reported-by: pupilla@libero.it
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Marcelo Ricardo Leitner
00eff2ebbd sctp: avoid BUG_ON on sctp_wait_for_sndbuf
[ Upstream commit 2dcab59848 ]

Alexander Popov reported that an application may trigger a BUG_ON in
sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
waiting on it to queue more data and meanwhile another thread peels off
the association being used by the first thread.

This patch replaces the BUG_ON call with a proper error handling. It
will return -EPIPE to the original sendmsg call, similarly to what would
have been done if the association wasn't found in the first place.
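
A minimal sketch of trading an assertion for an error return is shown below. The types, the function name, and the exact condition are assumptions made for illustration; only the -EPIPE result comes from the message above:

```c
#include <errno.h>

/* Hypothetical stand-ins for a socket and an association bound to one. */
struct demo_sock {
    int id;
};

struct demo_assoc {
    const struct demo_sock *sk; /* socket that currently owns the association */
};

/*
 * If the association was peeled off to another socket while the caller
 * was waiting, report -EPIPE instead of crashing.
 */
int demo_wait_for_sndbuf(const struct demo_assoc *asoc,
                         const struct demo_sock *waiter_sk)
{
    if (asoc->sk != waiter_sk)
        return -EPIPE;
    return 0;
}
```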

Acked-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Benjamin Poirier
4400acce68 mlx4: Invoke softirqs after napi_reschedule
[ Upstream commit bd4ce941c8 ]

mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run
in a deterministic time frame and the following message may be logged:
NOHZ: local_softirq_pending 08

The problem is the same as what was described in commit ec13ee8014
("virtio_net: invoke softirqs after __napi_schedule") and this patch
applies the same fix to mlx4.

Fixes: 07841f9d94 ("net/mlx4_en: Schedule napi when RX buffers allocation fails")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Ben Hutchings
970390fd5d catc: Use heap buffer for memory size test
[ Upstream commit 2d6a0e9de0 ]

Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Ben Hutchings
61bf9f381c catc: Combine failure cleanup code in catc_probe()
[ Upstream commit d41149145f ]

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Ben Hutchings
e898f6f008 rtl8150: Use heap buffers for all register access
[ Upstream commit 7926aff5c5 ]

Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Ben Hutchings
878b015bcc pegasus: Use heap buffers for all register access
[ Upstream commit 5593523f96 ]

Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
References: https://bugs.debian.org/852556
Reported-by: Lisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
Tested-by: Lisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Willem de Bruijn
b90cb484c0 macvtap: read vnet_hdr_size once
[ Upstream commit 837585a537 ]

When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
Data length is verified to be greater than or equal to expected header
length tun->vnet_hdr_sz before copying.

Macvtap functions read the value once, but unless READ_ONCE is used,
the compiler may ignore this and read multiple times. Enforce a single
read and locally cached value to avoid updates between test and use.
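
The single-read pattern can be approximated in userspace C with a volatile access, which forces exactly one load where the kernel would use READ_ONCE. The struct, the function name, and the -EINVAL return value are assumptions of this sketch:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical queue state; the field may be changed by another thread. */
struct demo_queue_cfg {
    int vnet_hdr_sz;
};

/* Returns 0 and stores the header size if len can hold it, else -EINVAL. */
int demo_check_hdr(const struct demo_queue_cfg *q, size_t len, size_t *hdr_sz)
{
    /* Volatile access: the compiler must perform exactly one load here. */
    int sz = *(const volatile int *)&q->vnet_hdr_sz;

    if (sz < 0 || len < (size_t)sz)
        return -EINVAL;
    *hdr_sz = (size_t)sz; /* test and use share the same cached value */
    return 0;
}
```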

Signed-off-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Willem de Bruijn
26989c9d99 tun: read vnet_hdr_sz once
[ Upstream commit e1edab87fa ]

When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
Data length is verified to be greater than or equal to expected header
length tun->vnet_hdr_sz before copying.

Read this value once and cache locally, as it can be updated between
the test and use (TOCTOU).

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
CC: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Eric Dumazet
0f895f51a8 tcp: avoid infinite loop in tcp_splice_read()
[ Upstream commit ccf7abb93a ]

Splicing from TCP socket is vulnerable when a packet with URG flag is
received and stored into receive queue.

__tcp_splice_read() returns 0, and sk_wait_data() immediately
returns since there is the problematic skb in queue.

This is a nice way to burn cpu (aka infinite loop) and trigger
soft lockups.

Again, this gem was found by syzkaller tool.

Fixes: 9c55e01c0c ("[TCP]: Splice receive support.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov  <dvyukov@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:42 +01:00
Eric Dumazet
1e340bb22a ipv6: tcp: add a missing tcp_v6_restore_cb()
[ Upstream commit ebf6c9cb23 ]

Dmitry reported use-after-free in ip6_datagram_recv_specific_ctl()

A similar bug was fixed in commit 8ce48623f0 ("ipv6: tcp: restore
IP6CB for pktoptions skbs"), but I missed another spot.

tcp_v6_syn_recv_sock() can indeed set np->pktoptions from ireq->pktopts

Fixes: 971f10eca1 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:41 +01:00
Eric Dumazet
ae1768bbbc ip6_gre: fix ip6gre_err() invalid reads
[ Upstream commit 7892032cfe ]

Andrey Konovalov reported out of bound accesses in ip6gre_err()

If GRE flags contains GRE_KEY, the following expression
*(((__be32 *)p) + (grehlen / 4) - 1)

accesses data ~40 bytes after the expected point, since
grehlen includes the size of IPv6 headers.

Let's use a "struct gre_base_hdr *greh" pointer to make this
code more readable.

p[1] becomes greh->protocol.
grehlen is the GRE header length.
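
The size of the misread follows from simple arithmetic on the quoted expression, which addresses the last 32-bit word of a header of grehlen bytes. The sketch below uses hypothetical names and assumes a GRE header carrying only the base fields and a key (8 bytes) behind a 40-byte IPv6 header:

```c
#include <stddef.h>

#define DEMO_IPV6_HDR_LEN 40u /* fixed IPv6 header size in bytes */
#define DEMO_GRE_BASE_LEN 4u  /* GRE flags (2 bytes) + protocol (2 bytes) */
#define DEMO_GRE_KEY_LEN  4u  /* optional key field */

/*
 * Byte offset of the last 32-bit word of a header of hdr_len bytes,
 * mirroring ((__be32 *)p) + (hdr_len / 4) - 1. Assumes hdr_len >= 4.
 */
size_t demo_last_word_offset(size_t hdr_len)
{
    return (hdr_len / 4 - 1) * 4;
}
```

With the GRE length alone (8 bytes) the key is found at offset 4; when the IPv6 header length is included (48 bytes) the read lands at offset 44, which is 40 bytes too far.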

Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-18 15:11:41 +01:00