Commit Graph

858756 Commits

Author SHA1 Message Date
Paul Mackerras
c395fe1d8e KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions
Currently, kvmppc_xive_release() and kvmppc_xive_native_release() clear
kvm->arch.mmu_ready and call kick_all_cpus_sync() as a way of ensuring
that no vcpus are executing in the guest.  However, future patches will
change the mutex associated with kvm->arch.mmu_ready to a new mutex that
nests inside the vcpu mutexes, making it difficult to continue to use
this method.

In fact, taking the vcpu mutex for a vcpu excludes execution of that
vcpu, and we already take the vcpu mutex around the call to
kvmppc_xive_[native_]cleanup_vcpu().  Once the cleanup function is
done and we release the vcpu mutex, the vcpu can execute once again,
but because we have cleared vcpu->arch.xive_vcpu, vcpu->arch.irq_type,
vcpu->arch.xive_esc_vaddr and vcpu->arch.xive_esc_raddr, that vcpu will
not be going into XIVE code any more.  Thus, once we have cleaned up
all of the vcpus, we are safe to clean up the rest of the XIVE state,
and we don't need to use kvm->arch.mmu_ready to hold off vcpu execution.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-05-29 13:44:36 +10:00
Eduardo Valentin
ca657468a0 Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled"
This reverts commit 3e6a8fb330.

Cc: Andy Gross <agross@kernel.org>
Cc: David Brown <david.brown@linaro.org>
Cc: Amit Kucheria <amit.kucheria@linaro.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Suggested-by: Amit Kucheria <amit.kucheria@linaro.org>
Reported-by: Andy Gross <andygro@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-28 19:30:33 -07:00
Saeed Mahameed
c0194e2d0e net/mlx5e: Disable rxhash when CQE compress is enabled
When CQE compression is enabled (Multi-host systems), compressed CQEs
might arrive to the driver rx, compressed CQEs don't have a valid hash
offload and the driver already reports a hash value of 0 and invalid hash
type on the skb for compressed CQEs, but this is not good enough.

On a congested PCIe, where CQE compression will kick in aggressively,
gro will deliver lots of out of order packets due to the invalid hash
and this might cause a serious performance drop.

The only valid solution, is to disable rxhash offload at all when CQE
compression is favorable (Multi-host systems).

Fixes: 7219ab34f1 ("net/mlx5e: CQE compression")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28 18:25:42 -07:00
wenxu
24bcd210e2 net/mlx5e: restrict the real_dev of vlan device is the same as uplink device
When register indr block for vlan device, it should check the real_dev
of vlan device is same as uplink device. Or it will set offload rule
to mlx5e which will never hit.

Fixes: 35a605db16 ("net/mlx5e: Offload TC e-switch rules with ingress VLAN device")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28 18:25:42 -07:00
Parav Pandit
25fa506b70 net/mlx5: Allocate root ns memory using kzalloc to match kfree
root ns is yet another fs core node which is freed using kfree() by
tree_put_node().
Rest of the other fs core objects are also allocated using kmalloc
variants.

However, root ns memory is allocated using kvzalloc().
Hence allocate root ns memory using kzalloc().

Fixes: 2530236303 ("net/mlx5_core: Flow steering tree initialization")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28 18:25:42 -07:00
Parav Pandit
9414277a5d net/mlx5: Avoid double free in fs init error unwinding path
In below code flow, for ingress acl table root ns memory leads
to double free.

mlx5_init_fs
  init_ingress_acls_root_ns()
    init_ingress_acl_root_ns
       kfree(steering->esw_ingress_root_ns);
       /* steering->esw_ingress_root_ns is not marked NULL */
  mlx5_cleanup_fs
    cleanup_ingress_acls_root_ns
       steering->esw_ingress_root_ns non NULL check passes.
       kfree(steering->esw_ingress_root_ns);
       /* double free */

Similar issue exist for other tables.

Hence zero out the pointers to not process the table again.

Fixes: 9b93ab981e ("net/mlx5: Separate ingress/egress namespaces for each vport")
Fixes: 40c3eebb49e51 ("net/mlx5: Add support in RDMA RX steering")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28 18:25:41 -07:00
Parav Pandit
905f6bd30b net/mlx5: Avoid double free of root ns in the error flow path
When root ns setup for rdma, sniffer tx and sniffer rx fails,
such root ns cleanup is done by the error unwinding path of
mlx5_cleanup_fs().
Below call graph shows an example for sniffer_rx_root_ns.

mlx5_init_fs()
  init_sniffer_rx_root_ns()
    cleanup_root_ns(steering->sniffer_rx_root_ns);
mlx5_cleanup_fs()
  cleanup_root_ns(steering->sniffer_rx_root_ns);
  /* double free of sniffer_rx_root_ns */

Hence, use the existing cleanup_fs to cleanup.

Fixes: d83eb50e29 ("net/mlx5: Add support in RDMA RX steering")
Fixes: 87d22483ce ("net/mlx5: Add sniffer namespaces")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28 18:25:41 -07:00
Saeed Mahameed
8788392995 net/mlx5: Fix error handling in mlx5_load()
In case mlx5_core_set_hca_defaults fails, it should jump to
mlx5_cleanup_fs, fix that.

Fixes: c85023e153 ("IB/mlx5: Add raw ethernet local loopback support")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28 18:25:41 -07:00
David S. Miller
602e0f295a Merge branch 'hns3-next'
Huazhong Tan says:

====================
code optimizations & bugfixes for HNS3 driver

This patch-set includes code optimizations and bugfixes for the HNS3
ethernet controller driver.

[patch 1/12] fixes a compile warning reported by kbuild test robot.

[patch 2/12] fixes HNS3_RXD_GRO_SIZE_M macro definition error.

[patch 3/12] adds a debugfs command to dump firmware information.

[patch 4/12 - 10/12] adds some code optimizaions and cleanups for
reset and driver unloading.

[patch 11/12 - 12/12] adds two bugfixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
49f971bd30 net: hns3: fix a memory leak issue for hclge_map_unmap_ring_to_vf_vector
When hclge_bind_ring_with_vector() fails,
hclge_map_unmap_ring_to_vf_vector() returns the error
directly, so nobody will free the memory allocated by
hclge_get_ring_chain_from_mbx().

So hclge_free_vector_ring_chain() should be called no matter
hclge_bind_ring_with_vector() fails or not.

Fixes: 84e095d64e ("net: hns3: Change PF to add ring-vect binding & resetQ to mailbox")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
0d2f68c7bc net: hns3: adjust hns3_uninit_phy()'s location in the hns3_client_uninit()
hns3_uninit_phy() should be called before checking
HNS3_NIC_STATE_INITED flags, otherwise when this checking fails,
there is nobody to call hns3_uninit_phy().

Fixes: c8a8045b2d ("net: hns3: Fix NULL deref when unloading driver")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
acfc3d55b7 net: hns3: stop schedule reset service while unloading driver
When unloading driver, the reset task should not be scheduled
anymore. If disable IRQ before cancel ongoing reset task,
the IRQ may be re-enabled by the reset task.

This patch uses HCLGE_STATE_REMOVING/HCLGEVF_STATE_REMOVING
flag to indicate that the driver is unloading, and we should
stop new coming reset service to be scheduled, otherwise,
reset service will access some resource which has been freed
by unloading.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
ada13ee3db net: hns3: add handshake with hardware while doing reset
When reset happens, the hardware reset should begin after the
driver has finished its preparatory work, otherwise it may cause
some hardware error.

Before Hardware's reset, it will wait for the driver to write
bit HCLGE_NIC_CMQ_ENABLE of register HCLGE_NIC_CSQ_DEPTH_REG
to 1, while the driver finishes its preparatory work will do that.
BTW, since some cases this register will be cleared, so it needs
some sync time before driver's writing.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
1db58f8697 net: hns3: modify hclgevf_init_client_instance()
hclgevf_init_client_instance() is a little bloated and there is
some duplicated code. This patch adds some cleanup for it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
994e04f16e net: hns3: modify hclge_init_client_instance()
hclge_init_client_instance() is a little bloated and there is
some duplicated code. This patch adds some cleanup for it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
25d1817c4e net: hns3: use HCLGEVF_STATE_NIC_REGISTERED to indicate VF NIC client has registered
When VF NIC client's init_instance() succeeds, it means this client
has been registered successfully, so we use HCLGEVF_STATE_NIC_REGISTERED
to indicate that. And before calling VF NIC client's uninit_instance(),
we clear this state.

So any operation of VF NIC client from HCLGEVF is not allowed if this
state is not set.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
2a0bfc3618 net: hns3: use HCLGE_STATE_ROCE_REGISTERED to indicate PF ROCE client has registered
When PF ROCE client's init_instance() succeeds, it means this client
has been registered successfully, so we use HCLGE_STATE_ROCE_REGISTERED
to indicate that. And before calling PF ROCE client's uninit_instance(),
we clear this state.

So any operation of the ROCE client from HCLGE is not allowed if this
state is not set.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Huazhong Tan
bd9109c9b1 net: hns3: use HCLGE_STATE_NIC_REGISTERED to indicate PF NIC client has registered
When PF NIC client's init_instance() succeeds, it means this client
has been registered successfully, so we use HCLGE_STATE_NIC_REGISTERED
to indicate that. And before calling PF NIC client's uninit_instance(),
we clear this state.

So any operation of PF NIC client from HCLGE is not allowed if this
state is not set.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Zhongzhu Liu
33a90e2f20 net: hns3: add support for dump firmware statistics by debugfs
This patch prints firmware statistics information.

debugfs command:
echo dump m7 info > cmd

estuary:/dbg/hns3/0000:7d:00.0$ echo dump m7 info > cmd
[  172.577240] hns3 0000:7d:00.0: 0x00000000  0x00000000  0x00000000
[  172.583471] hns3 0000:7d:00.0: 0x00000000  0x00000000  0x00000000
[  172.589552] hns3 0000:7d:00.0: 0x00000030  0x00000000  0x00000000
[  172.595632] hns3 0000:7d:00.0: 0x00000000  0x00000000  0x00000000
estuary:/dbg/hns3/0000:7d:00.0$

Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:01 -07:00
Yunsheng Lin
eff858c178 net: hns3: fix for HNS3_RXD_GRO_SIZE_M macro
According to hardware user menual, the GRO_SIZE is 14 bits width,
the HNS3_RXD_GRO_SIZE_M is 10 bits width now, which may cause
hardware GRO received packet error problem.

Fixes: a6d53b97a2 ("net: hns3: Adds GRO params to SKB for the stack")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:00 -07:00
Jian Shen
4c1522765c net: hns3: fix compile warning without CONFIG_RFS_ACCEL
The ifdef condition of function hclge_add_fd_entry_by_arfs() is
unnecessary. It may cause compile warning when CONFIG_RFS_ACCEL
is not chosen. This patch fixes it by removing the ifdef condition.

Fixes: d93ed94fbe ("net: hns3: add aRFS support for PF")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:39:00 -07:00
Florian Fainelli
a6cd0d2d49 Documentation: net-sysfs: Remove duplicate PHY device documentation
Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same
duplication information. There is not currently any MDIO bus specific
attribute, but there are PHY device (struct phy_device) specific
attributes. Use the more precise description from sysfs-bus-mdio and
carry that over to sysfs-class-net-phydev.

Fixes: 86f22d04df ("net: sysfs: Document PHY device sysfs attributes")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:26:44 -07:00
Eric Dumazet
8fb44d60d4 llc: fix skb leak in llc_build_and_send_ui_pkt()
If llc_mac_hdr_init() returns an error, we must drop the skb
since no llc_build_and_send_ui_pkt() caller will take care of this.

BUG: memory leak
unreferenced object 0xffff8881202b6800 (size 2048):
  comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
  backtrace:
    [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline]
    [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline]
    [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669
    [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline]
    [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608
    [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662
    [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950
    [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173
    [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430
    [<000000008bdec225>] sock_create net/socket.c:1481 [inline]
    [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523
    [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline]
    [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline]
    [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530
    [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
    [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0xffff88811d750d00 (size 224):
  comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff  ...$.....h+ ....
  backtrace:
    [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline]
    [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579
    [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198
    [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline]
    [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327
    [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225
    [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242
    [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933
    [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline]
    [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671
    [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964
    [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline]
    [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline]
    [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972
    [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
    [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:25:23 -07:00
Xue Chaojing
66350023d5 hinic: fix a bug in set rx mode
in set_rx_mode, __dev_mc_sync and netdev_for_each_mc_addr will
repeatedly set the multicast mac address. so we delete this loop.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:24:42 -07:00
David S. Miller
2e56571d82 Merge branch 'inet-frags-followup'
Eric Dumazet says:

====================
inet: frags: followup to 'inet-frags-avoid-possible-races-at-netns-dismantle'

Latest patch series ('inet-frags-avoid-possible-races-at-netns-dismantle')
brought another syzbot report shown in the third patch changelog.

While fixing the issue, I had to call inet_frags_fini() later
in IPv6 and ilowpan.

Also I believe a completion is needed to ensure proper dismantle
at module removal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:22:15 -07:00
Eric Dumazet
dc93f46bc4 inet: frags: fix use-after-free read in inet_frag_destroy_rcu
As caught by syzbot [1], the rcu grace period that is respected
before fqdir_rwork_fn() proceeds and frees fqdir is not enough
to prevent inet_frag_destroy_rcu() being run after the freeing.

We need a proper rcu_barrier() synchronization to replace
the one we had in inet_frags_fini()

We also have to fix a potential problem at module removal :
inet_frags_fini() needs to make sure that all queued work queues
(fqdir_rwork_fn) have completed, otherwise we might
call kmem_cache_destroy() too soon and get another use-after-free.

[1]
BUG: KASAN: use-after-free in inet_frag_destroy_rcu+0xd9/0xe0 net/ipv4/inet_fragment.c:201
Read of size 8 at addr ffff88806ed47a18 by task swapper/1/0

CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.2.0-rc1+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
 inet_frag_destroy_rcu+0xd9/0xe0 net/ipv4/inet_fragment.c:201
 __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
 rcu_do_batch kernel/rcu/tree.c:2092 [inline]
 invoke_rcu_callbacks kernel/rcu/tree.c:2310 [inline]
 rcu_core+0xba5/0x1500 kernel/rcu/tree.c:2291
 __do_softirq+0x25c/0x94c kernel/softirq.c:293
 invoke_softirq kernel/softirq.c:374 [inline]
 irq_exit+0x180/0x1d0 kernel/softirq.c:414
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0x13b/0x550 arch/x86/kernel/apic/apic.c:1068
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806
 </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: ff ff 48 89 df e8 f2 95 8c fa eb 82 e9 07 00 00 00 0f 00 2d e4 45 4b 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d d4 45 4b 00 fb f4 <c3> 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 8e 18 42 fa e8 99
RSP: 0018:ffff8880a98e7d78 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: 1ffffffff1164e11 RBX: ffff8880a98d4340 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffff8880a98d4bbc
RBP: ffff8880a98e7da8 R08: ffff8880a98d4340 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffffffff88b27078 R14: 0000000000000001 R15: 0000000000000000
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571
 default_idle_call+0x36/0x90 kernel/sched/idle.c:94
 cpuidle_idle_call kernel/sched/idle.c:154 [inline]
 do_idle+0x377/0x560 kernel/sched/idle.c:263
 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354
 start_secondary+0x34e/0x4c0 arch/x86/kernel/smpboot.c:267
 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

Allocated by task 8877:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_kmalloc mm/kasan/common.c:489 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
 kmem_cache_alloc_trace+0x151/0x750 mm/slab.c:3555
 kmalloc include/linux/slab.h:547 [inline]
 kzalloc include/linux/slab.h:742 [inline]
 fqdir_init include/net/inet_frag.h:115 [inline]
 ipv6_frags_init_net+0x48/0x460 net/ipv6/reassembly.c:513
 ops_init+0xb3/0x410 net/core/net_namespace.c:130
 setup_net+0x2d3/0x740 net/core/net_namespace.c:316
 copy_net_ns+0x1df/0x340 net/core/net_namespace.c:439
 create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:107
 unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:206
 ksys_unshare+0x440/0x980 kernel/fork.c:2692
 __do_sys_unshare kernel/fork.c:2760 [inline]
 __se_sys_unshare kernel/fork.c:2758 [inline]
 __x64_sys_unshare+0x31/0x40 kernel/fork.c:2758
 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 17:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
 __cache_free mm/slab.c:3432 [inline]
 kfree+0xcf/0x220 mm/slab.c:3755
 fqdir_rwork_fn+0x33/0x40 net/ipv4/inet_fragment.c:154
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

The buggy address belongs to the object at ffff88806ed47a00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 24 bytes inside of
 512-byte region [ffff88806ed47a00, ffff88806ed47c00)
The buggy address belongs to the page:
page:ffffea0001bb51c0 refcount:1 mapcount:0 mapping:ffff8880aa400940 index:0x0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea000282a788 ffffea0001bb53c8 ffff8880aa400940
raw: 0000000000000000 ffff88806ed47000 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88806ed47900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88806ed47980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88806ed47a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
 ffff88806ed47a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88806ed47b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 3c8fc87820 ("inet: frags: rework rhashtable dismantle")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:22:15 -07:00
Eric Dumazet
ae7352d384 inet: frags: call inet_frags_fini() after unregister_pernet_subsys()
Both IPv6 and 6lowpan are calling inet_frags_fini() too soon.

inet_frags_fini() is dismantling a kmem_cache, that might be needed
later when unregister_pernet_subsys() eventually has to remove
frags queues from hash tables and free them.

This fixes potential use-after-free, and is a prereq for the following patch.

Fixes: d4ad4d22e7 ("inet: frags: use kmem_cache for inet_frag_queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:22:15 -07:00
Eric Dumazet
6b73d19711 inet: frags: uninline fqdir_init()
fqdir_init() is not fast path and is getting bigger.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:22:15 -07:00
Willem de Bruijn
3fb321fde2 selftests/net: ipv6 flowlabel
Test the IPv6 flowlabel control and datapath interfaces:

Acquire and release the right to use flowlabels with socket option
IPV6_FLOWLABEL_MGR.

Then configure flowlabels on send and read them on recv with cmsg
IPV6_FLOWINFO. Also verify auto-flowlabel if not explicitly set.

This helped identify the issue fixed in commit 95c169251b ("ipv6:
invert flowlabel sharing check in process and user mode")

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:18:53 -07:00
Stefano Brivio
73f51d151e selftests: pmtu: Fix encapsulating device in pmtu_vti6_link_change_mtu
In the pmtu_vti6_link_change_mtu test, both local and remote addresses
for the vti6 tunnel are assigned to the same address given to the dummy
interface that we use as encapsulating device with a known MTU.

This works as long as the dummy interface is actually selected, via
rt6_lookup(), as encapsulating device. But if the remote address of the
tunnel is a local address too, the loopback interface could also be
selected, and there's nothing wrong with it.

This is what some older -stable kernels do (3.18.z, at least), and
nothing prevents us from subtly changing FIB implementation to revert
back to that behaviour in the future.

Define an IPv6 prefix instead, and use two separate addresses as local
and remote for vti6, so that the encapsulating device can't be a
loopback interface.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 1fad59ea1c ("selftests: pmtu: Add pmtu_vti6_link_change_mtu test")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:15:06 -07:00
Gen Zhang
50fbc13dc1 dfs_cache: fix a wrong use of kfree in flush_cache_ent()
In flush_cache_ent(), 'ce->ce_path' is allocated by kstrdup_const().
It should be freed by kfree_const(), rather than kfree().

Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-28 19:13:58 -05:00
Murphy Zhou
6457c20e33 fs/cifs/smb2pdu.c: fix buffer free in SMB2_ioctl_free
The 2nd buffer could be NULL even if iov_len is not zero. This can
trigger a panic when handling symlinks. It's easy to reproduce with
LTP fs_racer scripts[1] which are randomly craete/delete/link files
and dirs. Fix this panic by checking if the 2nd buffer is padding
before kfree, like what we do in SMB2_open_free.

[1] https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/fs/racer

Fixes: 2c87d6a94d ("cifs: Allocate memory for all iovs in smb2_ioctl")
Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>
2019-05-28 19:11:35 -05:00
Colin Ian King
210782038b cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case
Currently in the case where SMB2_ioctl returns the -EOPNOTSUPP error
there is a memory leak of pneg_inbuf. Fix this by returning via
the out_free_inbuf exit path that will perform the relevant kfree.

Addresses-Coverity: ("Resource leak")
Fixes: 969ae8e8d4 ("cifs: Accept validate negotiate if server return NT_STATUS_NOT_SUPPORTED")
CC: Stable <stable@vger.kernel.org> # v5.1+
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-28 19:11:35 -05:00
Camelia Groza
cbe9e83594 enetc: Enable TC offloading with mqprio
Add support to configure multiple prioritized TX traffic
classes with mqprio.

Configure one BD ring per TC for the moment, one netdev
queue per TC.

Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:11:02 -07:00
David S. Miller
7f3343234c Merge branch 'stmmac-SPDX'
Neil Armstrong says:

====================
net: stmmac: dwmac-meson: update with SPDX Licence identifier

Update the SPDX Licence identifier for the Amlogic Meson6 and Meson8 dwmac
glue drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:09:15 -07:00
Neil Armstrong
56aaa114f0 net: stmmac: dwmac-meson8b: update with SPDX Licence identifier
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:09:15 -07:00
Neil Armstrong
f87845cf0f net: stmmac: dwmac-meson: update with SPDX Licence identifier
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:09:15 -07:00
Sasha Neftin
62a5b8429e igc: Cleanup the redundant code
The default flow control settings for the i225 device is both
'rx' and 'tx' pause frames. There is no depend on the NVM value.
This patch comes to fix this and clean up the driver code.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 16:15:55 -07:00
Sasha Neftin
0373ad4d05 igc: Add flow control support
This change adds flow control settings. This is required to
enable the legacy flow control support.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 16:13:57 -07:00
Konstantin Khlebnikov
d17ba0f616 e1000e: start network tx queue only when link is up
Driver does not want to keep packets in Tx queue when link is lost.
But present code only reset NIC to flush them, but does not prevent
queuing new packets. Moreover reset sequence itself could generate
new packets via netconsole and NIC falls into endless reset loop.

This patch wakes Tx queue only when NIC is ready to send packets.

This is proper fix for problem addressed by commit 0f9e980bf5
("e1000e: fix cyclic resets at link up with active tx").

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Tested-by: Joseph Yasi <joe.yasi@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 16:08:43 -07:00
Konstantin Khlebnikov
caff422ea8 Revert "e1000e: fix cyclic resets at link up with active tx"
This reverts commit 0f9e980bf5.

That change cased false-positive warning about hardware hang:

e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
   TDH                  <0>
   TDT                  <1>
   next_to_use          <1>
   next_to_clean        <0>
buffer_info[next_to_clean]:
   time_stamp           <fffba7a7>
   next_to_watch        <0>
   jiffies              <fffbb140>
   next_to_watch.status <0>
MAC Status             <40080080>
PHY Status             <7949>
PHY 1000BASE-T Status  <0>
PHY Extended Status    <3000>
PCI Status             <10>
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

Besides warning everything works fine.
Original issue will be fixed property in following patch.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reported-by: Joseph Yasi <joe.yasi@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
Tested-by: Joseph Yasi <joe.yasi@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 16:01:01 -07:00
Sasha Neftin
16ecd8d9af igc: Remove the obsolete workaround
Enables a resend request after the completion timeout workaround is not
relevant for i225 device. This patch is clean code relevant this
workaround.
Minor cosmetic fixes, replace the 'spaces' with 'tabs'

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 15:57:59 -07:00
Sasha Neftin
796bfb1035 igc: Clean up unused pointers
Few function pointers from phy_operations structure were unused.
This patch cleans those.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 15:56:11 -07:00
Sasha Neftin
ae586f0b39 igc: Fix double definitions
Collision threshold and threshold's shift has been defined twice.
This patch comes to fix that.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 15:54:39 -07:00
Gustavo A. R. Silva
42277cedba igb: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

This patch fixes the following warning:

drivers/net/ethernet/intel/igb/e1000_82575.c: In function ‘igb_get_invariants_82575’:
drivers/net/ethernet/intel/igb/e1000_82575.c:636:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (igb_sgmii_uses_mdio_82575(hw)) {
      ^
drivers/net/ethernet/intel/igb/e1000_82575.c:642:2: note: here
  case E1000_CTRL_EXT_LINK_MODE_PCIE_SERDES:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 15:52:37 -07:00
Gustavo A. R. Silva
b7b3ad7aaf igb: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

This patch fixes the following warning:

drivers/net/ethernet/intel/igb/igb_main.c: In function ‘__igb_notify_dca’:
drivers/net/ethernet/intel/igb/igb_main.c:6694:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (dca_add_requester(dev) == 0) {
      ^
drivers/net/ethernet/intel/igb/igb_main.c:6701:2: note: here
  case DCA_PROVIDER_REMOVE:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 15:48:42 -07:00
Feng Tang
47e16692b2 igb/igc: warn when fatal read failure happens
Failed in read the HW register is very serious for igb/igc driver,
as its hw_addr will be set to NULL and cause the adapter be seen as
"REMOVED".

We saw the error only a few times in the MTBF test for suspend/resume,
but can hardly get any useful info to debug.

Adding WARN() so that we can get the necessary information about
where and how it happens, and use it for root causing and fixing
this "PCIe link lost issue"

This affects igb, igc.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-28 15:42:32 -07:00
Richard Guy Briggs
0223fad3c9 audit: enforce op for string fields
The field operator is ignored on several string fields.  WATCH, DIR,
PERM and FILETYPE field operators are completely ignored and meaningless
since the op is not referenced in audit_filter_rules().  Range and
bitwise operators are already addressed in ghak73.

Honour the operator for WATCH, DIR, PERM, FILETYPE fields as is done in
the EXE field.

Please see github issue
https://github.com/linux-audit/audit-kernel/issues/114

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-05-28 17:46:43 -04:00
Adrian Hunter
14f1cfd4f7 perf intel-pt: Rationalize intel_pt_sync_switch()'s use of next_tid
Returning 1 from intel_pt_sync_switch() causes the current tid to be
set. That negates the need to keep next_tid anymore. Rationalize the
code to that effect.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190412113830.4126-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:45 -03:00
Adrian Hunter
c7b4f15ff7 perf intel-pt: Improve sync_switch by processing PERF_RECORD_SWITCH* in events
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.

Improve it by processing "context switch in" events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190412113830.4126-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:45 -03:00