Commit Graph

856310 Commits

Author SHA1 Message Date
Wesley Terpstra
251a448881 riscv: include generic support for MSI irqdomains
Some RISC-V systems include PCIe host controllers that support PCIe
message-signaled interrupts.  For this to work on Linux, we need to
enable PCI_MSI_IRQ_DOMAIN and define struct msi_alloc_info.  Support
for the latter is enabled by including the architecture-generic msi.h
include.

Signed-off-by: Wesley Terpstra <wesley@sifive.com>
[paul.walmsley@sifive.com: split initial patch into one arch/riscv
 patch and one drivers/pci patch]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-07-22 13:06:07 -07:00
Qian Cai
bbb6fc43f1 drm: silence variable 'conn' set but not used
The "struct drm_connector" iteration cursor from
"for_each_new_connector_in_state" is never used in atomic_remove_fb()
which generates a compilation warning,

drivers/gpu/drm/drm_framebuffer.c: In function 'atomic_remove_fb':
drivers/gpu/drm/drm_framebuffer.c:838:24: warning: variable 'conn' set
but not used [-Wunused-but-set-variable]

Silence it by marking "conn" __maybe_unused.

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1563822886-13570-1-git-send-email-cai@lca.pw
2019-07-22 16:04:53 -04:00
Palmer Dabbelt
f4da5d074c MAINTAINERS: Add Paul as a RISC-V maintainer
The RISC-V port has grown significantly over the past year.  Paul's been
helping out for a while ago.  We agreed in person that he'd take over
collecting the patches and submitting the PRs, but it looks like I
forgot to make it official.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-07-22 13:03:57 -07:00
Jonathan Corbet
48ffc3d12b Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental into mauro
Bring in a set of post-thrashup fixes from Mauro.
2019-07-22 13:51:20 -06:00
Federico Vaga
143134ba49 doc:it_IT: rephrase statement
The statement sounds more like a literal translation

Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-07-22 13:44:31 -06:00
Federico Vaga
5adcce34f8 doc:it_IT: align translation to mainline
The patch translates the following patches in Italian:

d9d7c0c497 docs: Note that :c:func: should no longer be used
83e8b971f8 sphinx.rst: Add note about code snippets embedded in the text
cca5e0b8a4 Documentation: PGP: update for newer HW devices

Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-07-22 13:44:17 -06:00
Jonathan Corbet
e27a24210a Merge tag 'v5.3-rc1' into docs-next
Pull in all of the massive docs changes from elsewhere.
2019-07-22 13:42:10 -06:00
Sven Eckelmann
f7af86ccf1 batman-adv: Fix deletion of RTR(4|6) mcast list entries
The multicast code uses the lists bat_priv->mcast.want_all_rtr*_list to
store all all originator nodes which don't have the flag no-RTR4 or no-RTR6
set. When an originator is purged, it has to be removed from these lists.

Since all entries without the BATADV_MCAST_WANT_NO_RTR4/6 are stored in
these lists, they have to be handled like entries which have these flags
set to force the update routines to remove them from the lists when purging
the originator.

Not doing so will leave a pointer to a freed memory region inside the list.
Trying to operate on these lists will then cause an use-after-free error:

  BUG: KASAN: use-after-free in batadv_mcast_want_rtr4_update+0x335/0x3a0 [batman_adv]
  Write of size 8 at addr ffff888007b41a38 by task swapper/0/0

Fixes: 61caf3d109 ("batman-adv: mcast: detect, distribute and maintain multicast router presence")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-07-22 21:34:58 +02:00
Sven Eckelmann
fa3a03da54 batman-adv: Fix netlink dumping of all mcast_flags buckets
The bucket variable is only updated outside the loop over the mcast_flags
buckets. It will only be updated during a dumping run when the dumping has
to be interrupted and a new message has to be started.

This could result in repeated or missing entries when the multicast flags
are dumped to userspace.

Fixes: d2d489b7d8 ("batman-adv: Add inconsistent multicast netlink dump detection")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-07-22 21:34:58 +02:00
Thomas Falcon
12185dfe44 bonding: Force slave speed check after link state recovery for 802.3ad
The following scenario was encountered during testing of logical
partition mobility on pseries partitions with bonded ibmvnic
adapters in LACP mode.

1. Driver receives a signal that the device has been
   swapped, and it needs to reset to initialize the new
   device.

2. Driver reports loss of carrier and begins initialization.

3. Bonding driver receives NETDEV_CHANGE notifier and checks
   the slave's current speed and duplex settings. Because these
   are unknown at the time, the bond sets its link state to
   BOND_LINK_FAIL and handles the speed update, clearing
   AD_PORT_LACP_ENABLE.

4. Driver finishes recovery and reports that the carrier is on.

5. Bond receives a new notification and checks the speed again.
   The speeds are valid but miimon has not altered the link
   state yet.  AD_PORT_LACP_ENABLE remains off.

Because the slave's link state is still BOND_LINK_FAIL,
no further port checks are made when it recovers. Though
the slave devices are operational and have valid speed
and duplex settings, the bond will not send LACPDU's. The
simplest fix I can see is to force another speed check
in bond_miimon_commit. This way the bond will update
AD_PORT_LACP_ENABLE if needed when transitioning from
BOND_LINK_FAIL to BOND_LINK_UP.

CC: Jarod Wilson <jarod@redhat.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-22 12:09:32 -07:00
Rob Clark
0036bc73cc drm/msm: stop abusing dma_map/unmap for cache
Recently splats like this started showing up:

   WARNING: CPU: 4 PID: 251 at drivers/iommu/dma-iommu.c:451 __iommu_dma_unmap+0xb8/0xc0
   Modules linked in: ath10k_snoc ath10k_core fuse msm ath mac80211 uvcvideo cfg80211 videobuf2_vmalloc videobuf2_memops vide
   CPU: 4 PID: 251 Comm: kworker/u16:4 Tainted: G        W         5.2.0-rc5-next-20190619+ #2317
   Hardware name: LENOVO 81JL/LNVNB161216, BIOS 9UCN23WW(V1.06) 10/25/2018
   Workqueue: msm msm_gem_free_work [msm]
   pstate: 80c00005 (Nzcv daif +PAN +UAO)
   pc : __iommu_dma_unmap+0xb8/0xc0
   lr : __iommu_dma_unmap+0x54/0xc0
   sp : ffff0000119abce0
   x29: ffff0000119abce0 x28: 0000000000000000
   x27: ffff8001f9946648 x26: ffff8001ec271068
   x25: 0000000000000000 x24: ffff8001ea3580a8
   x23: ffff8001f95ba010 x22: ffff80018e83ba88
   x21: ffff8001e548f000 x20: fffffffffffff000
   x19: 0000000000001000 x18: 00000000c00001fe
   x17: 0000000000000000 x16: 0000000000000000
   x15: ffff000015b70068 x14: 0000000000000005
   x13: 0003142cc1be1768 x12: 0000000000000001
   x11: ffff8001f6de9100 x10: 0000000000000009
   x9 : ffff000015b78000 x8 : 0000000000000000
   x7 : 0000000000000001 x6 : fffffffffffff000
   x5 : 0000000000000fff x4 : ffff00001065dbc8
   x3 : 000000000000000d x2 : 0000000000001000
   x1 : fffffffffffff000 x0 : 0000000000000000
   Call trace:
    __iommu_dma_unmap+0xb8/0xc0
    iommu_dma_unmap_sg+0x98/0xb8
    put_pages+0x5c/0xf0 [msm]
    msm_gem_free_work+0x10c/0x150 [msm]
    process_one_work+0x1e0/0x330
    worker_thread+0x40/0x438
    kthread+0x12c/0x130
    ret_from_fork+0x10/0x18
   ---[ end trace afc0dc5ab81a06bf ]---

Not quite sure what triggered that, but we really shouldn't be abusing
dma_{map,unmap}_sg() for cache maint.

Cc: Stephen Boyd <sboyd@kernel.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190630124735.27786-1-robdclark@gmail.com
2019-07-22 14:39:26 -04:00
Mao Wenan
af0653d566 RDMA/siw: Remove set but not used variables 'rv'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/infiniband/sw/siw/siw_cm.c: In function siw_cep_set_inuse:
drivers/infiniband/sw/siw/siw_cm.c:223:6: warning: variable rv set but not used [-Wunused-but-set-variable]

Fixes: 6c52fdc244 ("rdma/siw: connection management")
Link: https://lore.kernel.org/r/20190719012938.100628-1-maowenan@huawei.com
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 15:22:46 -03:00
Chuhong Yuan
b7f406bb88 IB/mlx5: Replace kfree with kvfree
Memory allocated by kvzalloc should not be freed by kfree(), use kvfree()
instead.

Fixes: 813e90b1ae ("IB/mlx5: Add advise_mr() support")
Link: https://lore.kernel.org/r/20190717082101.14196-1-hslester96@gmail.com
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 15:19:40 -03:00
Selvin Xavier
c56b593d2a RDMA/bnxt_re: Honor vlan_id in GID entry comparison
A GID entry consists of GID, vlan, netdev and smac.  Extend GID duplicate
check comparisons to consider vlan_id as well to support IPv6 VLAN based
link local addresses. Introduce a new structure (bnxt_qplib_gid_info) to
hold gid and vlan_id information.

The issue is discussed in the following thread
https://lore.kernel.org/r/AM0PR05MB4866CFEDCDF3CDA1D7D18AA5D1F20@AM0PR05MB4866.eurprd05.prod.outlook.com

Fixes: 823b23da71 ("IB/core: Allow vlan link local address based RoCE GIDs")
Cc: <stable@vger.kernel.org> # v5.2+
Link: https://lore.kernel.org/r/20190715091913.15726-1-selvin.xavier@broadcom.com
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Co-developed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 15:04:04 -03:00
Kaike Wan
f4d46119f2 IB/hfi1: Drop all TID RDMA READ RESP packets after r_next_psn
When a TID sequence error occurs while receiving TID RDMA READ RESP
packets, all packets after flow->flow_state.r_next_psn should be dropped,
including those response packets for subsequent segments.

The current implementation will drop the subsequent response packets for
the segment to complete next, but may accept packets for subsequent
segments and therefore mistakenly advance the r_next_psn fields for the
corresponding software flows. This may result in failures to complete
subsequent segments after the current segment is completed.

The fix is to only use the flow pointed by req->clear_tail for checking
KDETH PSN instead of finding a flow from the request's flow array.

Fixes: b885d5be9c ("IB/hfi1: Unify the software PSN check for TID RDMA READ/WRITE")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715164540.74174.54702.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 14:57:55 -03:00
Kaike Wan
dc25b239eb IB/hfi1: Field not zero-ed when allocating TID flow memory
The field flow->resync_npkts is added for TID RDMA WRITE request and
zero-ed when a TID RDMA WRITE RESP packet is received by the requester.
This field is used to rewind a request during retry in the function
hfi1_tid_rdma_restart_req() shared by both TID RDMA WRITE and TID RDMA
READ requests. Therefore, when a TID RDMA READ request is retried, this
field may not be initialized at all, which causes the retry to start at an
incorrect psn, leading to the drop of the retry request by the responder.

This patch fixes the problem by zeroing out the field when the flow memory
is allocated.

Fixes: 838b6fd2d9 ("IB/hfi1: TID RDMA RcvArray programming and TID allocation")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190715164534.74174.6177.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 14:57:55 -03:00
Kaike Wan
2b74c878b0 IB/hfi1: Unreserve a flushed OPFN request
When an OPFN request is flushed, the request is completed without
unreserving itself from the send queue. Subsequently, when a new
request is post sent, the following warning will be triggered:

WARNING: CPU: 4 PID: 8130 at rdmavt/qp.c:1761 rvt_post_send+0x72a/0x880 [rdmavt]
Call Trace:
[<ffffffffbbb61e41>] dump_stack+0x19/0x1b
[<ffffffffbb497688>] __warn+0xd8/0x100
[<ffffffffbb4977cd>] warn_slowpath_null+0x1d/0x20
[<ffffffffc01c941a>] rvt_post_send+0x72a/0x880 [rdmavt]
[<ffffffffbb4dcabe>] ? account_entity_dequeue+0xae/0xd0
[<ffffffffbb61d645>] ? __kmalloc+0x55/0x230
[<ffffffffc04e1a4c>] ib_uverbs_post_send+0x37c/0x5d0 [ib_uverbs]
[<ffffffffc04e5e36>] ? rdma_lookup_put_uobject+0x26/0x60 [ib_uverbs]
[<ffffffffc04dbce6>] ib_uverbs_write+0x286/0x460 [ib_uverbs]
[<ffffffffbb6f9457>] ? security_file_permission+0x27/0xa0
[<ffffffffbb641650>] vfs_write+0xc0/0x1f0
[<ffffffffbb64246f>] SyS_write+0x7f/0xf0
[<ffffffffbbb74ddb>] system_call_fastpath+0x22/0x27

This patch fixes the problem by moving rvt_qp_wqe_unreserve() into
rvt_qp_complete_swqe() to simplify the code and make it less
error-prone.

Fixes: ca95f802ef ("IB/hfi1: Unreserve a reserved request when it is completed")
Link: https://lore.kernel.org/r/20190715164528.74174.31364.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 14:57:55 -03:00
John Fleck
cd48a82087 IB/hfi1: Check for error on call to alloc_rsm_map_table
The call to alloc_rsm_map_table does not check if the kmalloc fails.
Check for a NULL on alloc, and bail if it fails.

Fixes: 372cc85a13 ("IB/hfi1: Extract RSM map table init from QOS")
Link: https://lore.kernel.org/r/20190715164521.74174.27047.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: John Fleck <john.fleck@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 14:57:54 -03:00
Shubhashree Dhar
2e7b801ead drm/msm/dpu: Correct dpu encoder spinlock initialization
dpu encoder spinlock should be initialized during dpu encoder
init instead of dpu encoder setup which is part of modeset init.

Signed-off-by: Shubhashree Dhar <dhar@codeaurora.org>
[seanpaul resolved conflict in old init removal and revised the commit message]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1561357632-15361-1-git-send-email-dhar@codeaurora.org
2019-07-22 13:44:17 -04:00
Xi Wang
60c3becfd1 RDMA/hns: Fix sg offset non-zero issue
When run perftest in many times, the system will report a BUG as follows:

   BUG: Bad rss-counter state mm:(____ptrval____) idx:0 val:-1
   BUG: Bad rss-counter state mm:(____ptrval____) idx:1 val:1

We tested with different kernel version and found it started from the the
following commit:

commit d10bcf947a ("RDMA/umem: Combine contiguous PAGE_SIZE regions in
SGEs")

In this commit, the sg->offset is always 0 when sg_set_page() is called in
ib_umem_get() and the drivers are not allowed to change the sgl, otherwise
it will get bad page descriptor when unfolding SGEs in __ib_umem_release()
as sg_page_count() will get wrong result while sgl->offset is not 0.

However, there is a weird sgl usage in the current hns driver, the driver
modified sg->offset after calling ib_umem_get(), which caused we iterate
past the wrong number of pages in for_each_sg_page iterator.

This patch fixes it by correcting the non-standard sgl usage found in the
hns_roce_db_map_user() function.

Fixes: d10bcf947a ("RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs")
Fixes: 0425e3e6e0 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Link: https://lore.kernel.org/r/1562808737-45723-1-git-send-email-oulijun@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 14:44:08 -03:00
Brian Masney
7af5cdb158 drm/msm: correct NULL pointer dereference in context_init
Correct attempted NULL pointer dereference in context_init() when
running without an IOMMU.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Brian Masney <masneyb@onstation.org>
Fixes: 295b22ae59 ("drm/msm: Pass the MMU domain index in struct msm_file_private")
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190627020515.5660-1-masneyb@onstation.org
2019-07-22 13:40:14 -04:00
Wei Yongjun
d5121ffebc RDMA/siw: Fix error return code in siw_init_module()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: bdcf26bf9b ("rdma/siw: network and RDMA core interface")
Link: https://lore.kernel.org/r/20190718092710.85709-1-weiyongjun1@huawei.com
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 14:28:17 -03:00
Linus Torvalds
7b5cf701ea Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull preemption Kconfig fix from Thomas Gleixner:
 "The PREEMPT_RT stub config renamed PREEMPT to PREEMPT_LL and defined
  PREEMPT outside of the menu and made it selectable by both PREEMPT_LL
  and PREEMPT_RT.

  Stupid me missed that 114 defconfigs select CONFIG_PREEMPT which
  obviously can't work anymore. oldconfig builds are affected as well,
  but it's more obvious as the user gets asked. [old]defconfig silently
  fixes it up and selects PREEMPT_NONE.

  Unbreak it by undoing the rename and adding a intermediate config
  symbol which is selected by both PREEMPT and PREEMPT_RT. That requires
  to chase down a few #ifdefs, but it's better than tweaking 114
  defconfigs and annoying users"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y
2019-07-22 09:30:34 -07:00
Linus Torvalds
44b912cd0b Merge tag 'for-linus-20190722' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd polling fix from Christian Brauner:
 "A fix for pidfd polling. It ensures that the task's exit state is
  visible to all waiters"

* tag 'for-linus-20190722' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  pidfd: fix a poll race when setting exit_state
2019-07-22 09:14:19 -07:00
Linus Torvalds
21c730d734 Merge tag 'for-5.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fixes for leaks caused by recently merged patches

 - one build fix

 - a fix to prevent mixing of incompatible features

* tag 'for-5.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: don't leak extent_map in btrfs_get_io_geometry()
  btrfs: free checksum hash on in close_ctree
  btrfs: Fix build error while LIBCRC32C is module
  btrfs: inode: Don't compress if NODATASUM or NODATACOW set
2019-07-22 09:08:38 -07:00
Thomas Gleixner
b8d3349803 sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y
The merge of the CONFIG_PREEMPT_RT stub renamed CONFIG_PREEMPT to
CONFIG_PREEMPT_LL which causes all defconfigs which have CONFIG_PREEMPT=y
set to fall back to CONFIG_PREEMPT_NONE because CONFIG_PREEMPT depends on
the preemption mode choice wich defaults to NONE. This also affects
oldconfig builds.

So rather than changing 114 defconfig files and being an annoyance to
users, revert the rename and select a new config symbol PREEMPTION. That
keeps everything working smoothly and the revelant ifdef's are going to be
fixed up step by step.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: a50a3f4b6a ("sched/rt, Kconfig: Introduce CONFIG_PREEMPT_RT")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-22 18:05:11 +02:00
Linus Torvalds
c92f038067 Merge tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "For two regressions in media core:

   - v4l2-subdev: fix regression in check_pad()

   - videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already
     in use"

* tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use
  media: v4l2-subdev: fix regression in check_pad()
2019-07-22 09:01:47 -07:00
Sai Praneeth Prakhya
7f6cade5b6 iommu/vt-d: Print pasid table entries MSB to LSB in debugfs
Commit dd5142ca5d ("iommu/vt-d: Add debugfs support to show scalable mode
DMAR table internals") prints content of pasid table entries from LSB to
MSB where as other entries are printed MSB to LSB. So, to maintain
uniformity among all entries and to not confuse the user, print MSB first.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Fixes: dd5142ca5d ("iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 17:52:57 +02:00
Jean-Philippe Brucker
ae24fb49d0 iommu/virtio: Update to most recent specification
Following specification review a few things were changed in v8 of the
virtio-iommu series [1], but have been omitted when merging the base
driver. Add them now:

* Remove the EXEC flag.
* Add feature bit for the MMIO flag.
* Change domain_bits to domain_range.
* Add NOMEM status flag.

[1] https://lore.kernel.org/linux-iommu/20190530170929.19366-1-jean-philippe.brucker@arm.com/

Fixes: edcd69ab9a ("iommu: Add virtio-iommu driver")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 11:52:27 -04:00
Chris Wilson
9eed17d37c iommu/iova: Remove stale cached32_node
Since the cached32_node is allowed to be advanced above dma_32bit_pfn
(to provide a shortcut into the limited range), we need to be careful to
remove the to be freed node if it is the cached32_node.

[   48.477773] BUG: KASAN: use-after-free in __cached_rbnode_delete_update+0x68/0x110
[   48.477812] Read of size 8 at addr ffff88870fc19020 by task kworker/u8:1/37
[   48.477843]
[   48.477879] CPU: 1 PID: 37 Comm: kworker/u8:1 Tainted: G     U            5.2.0+ #735
[   48.477915] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[   48.478047] Workqueue: i915 __i915_gem_free_work [i915]
[   48.478075] Call Trace:
[   48.478111]  dump_stack+0x5b/0x90
[   48.478137]  print_address_description+0x67/0x237
[   48.478178]  ? __cached_rbnode_delete_update+0x68/0x110
[   48.478212]  __kasan_report.cold.3+0x1c/0x38
[   48.478240]  ? __cached_rbnode_delete_update+0x68/0x110
[   48.478280]  ? __cached_rbnode_delete_update+0x68/0x110
[   48.478308]  __cached_rbnode_delete_update+0x68/0x110
[   48.478344]  private_free_iova+0x2b/0x60
[   48.478378]  iova_magazine_free_pfns+0x46/0xa0
[   48.478403]  free_iova_fast+0x277/0x340
[   48.478443]  fq_ring_free+0x15a/0x1a0
[   48.478473]  queue_iova+0x19c/0x1f0
[   48.478597]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
[   48.478712]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
[   48.478826]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
[   48.478940]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
[   48.479053]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
[   48.479081]  ? __sg_free_table+0x9e/0xf0
[   48.479116]  ? kfree+0x7f/0x150
[   48.479234]  i915_vma_unbind+0x1e2/0x240 [i915]
[   48.479352]  i915_vma_destroy+0x3a/0x280 [i915]
[   48.479465]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
[   48.479579]  __i915_gem_free_work+0x41/0xa0 [i915]
[   48.479607]  process_one_work+0x495/0x710
[   48.479642]  worker_thread+0x4c7/0x6f0
[   48.479687]  ? process_one_work+0x710/0x710
[   48.479724]  kthread+0x1b2/0x1d0
[   48.479774]  ? kthread_create_worker_on_cpu+0xa0/0xa0
[   48.479820]  ret_from_fork+0x1f/0x30
[   48.479864]
[   48.479907] Allocated by task 631:
[   48.479944]  save_stack+0x19/0x80
[   48.479994]  __kasan_kmalloc.constprop.6+0xc1/0xd0
[   48.480038]  kmem_cache_alloc+0x91/0xf0
[   48.480082]  alloc_iova+0x2b/0x1e0
[   48.480125]  alloc_iova_fast+0x58/0x376
[   48.480166]  intel_alloc_iova+0x90/0xc0
[   48.480214]  intel_map_sg+0xde/0x1f0
[   48.480343]  i915_gem_gtt_prepare_pages+0xb8/0x170 [i915]
[   48.480465]  huge_get_pages+0x232/0x2b0 [i915]
[   48.480590]  ____i915_gem_object_get_pages+0x40/0xb0 [i915]
[   48.480712]  __i915_gem_object_get_pages+0x90/0xa0 [i915]
[   48.480834]  i915_gem_object_prepare_write+0x2d6/0x330 [i915]
[   48.480955]  create_test_object.isra.54+0x1a9/0x3e0 [i915]
[   48.481075]  igt_shared_ctx_exec+0x365/0x3c0 [i915]
[   48.481210]  __i915_subtests.cold.4+0x30/0x92 [i915]
[   48.481341]  __run_selftests.cold.3+0xa9/0x119 [i915]
[   48.481466]  i915_live_selftests+0x3c/0x70 [i915]
[   48.481583]  i915_pci_probe+0xe7/0x220 [i915]
[   48.481620]  pci_device_probe+0xe0/0x180
[   48.481665]  really_probe+0x163/0x4e0
[   48.481710]  device_driver_attach+0x85/0x90
[   48.481750]  __driver_attach+0xa5/0x180
[   48.481796]  bus_for_each_dev+0xda/0x130
[   48.481831]  bus_add_driver+0x205/0x2e0
[   48.481882]  driver_register+0xca/0x140
[   48.481927]  do_one_initcall+0x6c/0x1af
[   48.481970]  do_init_module+0x106/0x350
[   48.482010]  load_module+0x3d2c/0x3ea0
[   48.482058]  __do_sys_finit_module+0x110/0x180
[   48.482102]  do_syscall_64+0x62/0x1f0
[   48.482147]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   48.482190]
[   48.482224] Freed by task 37:
[   48.482273]  save_stack+0x19/0x80
[   48.482318]  __kasan_slab_free+0x12e/0x180
[   48.482363]  kmem_cache_free+0x70/0x140
[   48.482406]  __free_iova+0x1d/0x30
[   48.482445]  fq_ring_free+0x15a/0x1a0
[   48.482490]  queue_iova+0x19c/0x1f0
[   48.482624]  cleanup_page_dma.isra.64+0x62/0xb0 [i915]
[   48.482749]  __gen8_ppgtt_cleanup+0x63/0x80 [i915]
[   48.482873]  __gen8_ppgtt_cleanup+0x42/0x80 [i915]
[   48.482999]  __gen8_ppgtt_clear+0x433/0x4b0 [i915]
[   48.483123]  __gen8_ppgtt_clear+0x462/0x4b0 [i915]
[   48.483250]  i915_vma_unbind+0x1e2/0x240 [i915]
[   48.483378]  i915_vma_destroy+0x3a/0x280 [i915]
[   48.483500]  __i915_gem_free_objects+0xf0/0x2d0 [i915]
[   48.483622]  __i915_gem_free_work+0x41/0xa0 [i915]
[   48.483659]  process_one_work+0x495/0x710
[   48.483704]  worker_thread+0x4c7/0x6f0
[   48.483748]  kthread+0x1b2/0x1d0
[   48.483787]  ret_from_fork+0x1f/0x30
[   48.483831]
[   48.483868] The buggy address belongs to the object at ffff88870fc19000
[   48.483868]  which belongs to the cache iommu_iova of size 40
[   48.483920] The buggy address is located 32 bytes inside of
[   48.483920]  40-byte region [ffff88870fc19000, ffff88870fc19028)
[   48.483964] The buggy address belongs to the page:
[   48.484006] page:ffffea001c3f0600 refcount:1 mapcount:0 mapping:ffff8888181a91c0 index:0x0 compound_mapcount: 0
[   48.484045] flags: 0x8000000000010200(slab|head)
[   48.484096] raw: 8000000000010200 ffffea001c421a08 ffffea001c447e88 ffff8888181a91c0
[   48.484141] raw: 0000000000000000 0000000000120012 00000001ffffffff 0000000000000000
[   48.484188] page dumped because: kasan: bad access detected
[   48.484230]
[   48.484265] Memory state around the buggy address:
[   48.484314]  ffff88870fc18f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   48.484361]  ffff88870fc18f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   48.484406] >ffff88870fc19000: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
[   48.484451]                                ^
[   48.484494]  ffff88870fc19080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   48.484530]  ffff88870fc19100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108602
Fixes: e60aa7b538 ("iommu/iova: Extend rbtree node caching")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: <stable@vger.kernel.org> # v4.15+
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 17:50:49 +02:00
Linus Torvalds
83768245a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Several netfilter fixes including a nfnetlink deadlock fix from
    Florian Westphal and fix for dropping VRF packets from Miaohe Lin.

 2) Flow offload fixes from Pablo Neira Ayuso including a fix to restore
    proper block sharing.

 3) Fix r8169 PHY init from Thomas Voegtle.

 4) Fix memory leak in mac80211, from Lorenzo Bianconi.

 5) Missing NULL check on object allocation in cxgb4, from Navid
    Emamdoost.

 6) Fix scaling of RX power in sfp phy driver, from Andrew Lunn.

 7) Check that there is actually an ip header to access in skb->data in
    VRF, from Peter Kosyh.

 8) Remove spurious rcu unlock in hv_netvsc, from Haiyang Zhang.

 9) One more tweak the the TCP fragmentation memory limit changes, to be
    less harmful to applications setting small SO_SNDBUF values. From
    Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
  tcp: be more careful in tcp_fragment()
  hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()
  vrf: make sure skb->data contains ip header to make routing
  connector: remove redundant input callback from cn_dev
  qed: Prefer pcie_capability_read_word()
  igc: Prefer pcie_capability_read_word()
  cxgb4: Prefer pcie_capability_read_word()
  be2net: Synchronize be_update_queues with dev_watchdog
  bnx2x: Prevent load reordering in tx completion processing
  net: phy: sfp: hwmon: Fix scaling of RX power
  net: sched: verify that q!=NULL before setting q->flags
  chelsio: Fix a typo in a function name
  allocate_flower_entry: should check for null deref
  net: hns3: typo in the name of a constant
  kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.
  tipc: Fix a typo
  mac80211: don't warn about CW params when not using them
  mac80211: fix possible memory leak in ieee80211_assign_beacon
  nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN
  nl80211: fix VENDOR_CMD_RAW_DATA
  ...
2019-07-22 08:49:22 -07:00
Dmitry Safonov
3ee9eca760 iommu/vt-d: Check if domain->pgd was allocated
There is a couple of places where on domain_init() failure domain_exit()
is called. While currently domain_init() can fail only if
alloc_pgtable_page() has failed.

Make domain_exit() check if domain->pgd present, before calling
domain_unmap(), as it theoretically should crash on clearing pte entries
in dma_pte_clear_level().

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 17:43:06 +02:00
Dmitry Safonov
effa467870 iommu/vt-d: Don't queue_iova() if there is no flush queue
Intel VT-d driver was reworked to use common deferred flushing
implementation. Previously there was one global per-cpu flush queue,
afterwards - one per domain.

Before deferring a flush, the queue should be allocated and initialized.

Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush
queue. It's probably worth to init it for static or unmanaged domains
too, but it may be arguable - I'm leaving it to iommu folks.

Prevent queuing an iova flush if the domain doesn't have a queue.
The defensive check seems to be worth to keep even if queue would be
initialized for all kinds of domains. And is easy backportable.

On 4.19.43 stable kernel it has a user-visible effect: previously for
devices in si domain there were crashes, on sata devices:

 BUG: spinlock bad magic on CPU#6, swapper/0/1
  lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1
 Call Trace:
  <IRQ>
  dump_stack+0x61/0x7e
  spin_bug+0x9d/0xa3
  do_raw_spin_lock+0x22/0x8e
  _raw_spin_lock_irqsave+0x32/0x3a
  queue_iova+0x45/0x115
  intel_unmap+0x107/0x113
  intel_unmap_sg+0x6b/0x76
  __ata_qc_complete+0x7f/0x103
  ata_qc_complete+0x9b/0x26a
  ata_qc_complete_multiple+0xd0/0xe3
  ahci_handle_port_interrupt+0x3ee/0x48a
  ahci_handle_port_intr+0x73/0xa9
  ahci_single_level_irq_intr+0x40/0x60
  __handle_irq_event_percpu+0x7f/0x19a
  handle_irq_event_percpu+0x32/0x72
  handle_irq_event+0x38/0x56
  handle_edge_irq+0x102/0x121
  handle_irq+0x147/0x15c
  do_IRQ+0x66/0xf2
  common_interrupt+0xf/0xf
 RIP: 0010:__do_softirq+0x8c/0x2df

The same for usb devices that use ehci-pci:
 BUG: spinlock bad magic on CPU#0, swapper/0/1
  lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4
 Call Trace:
  <IRQ>
  dump_stack+0x61/0x7e
  spin_bug+0x9d/0xa3
  do_raw_spin_lock+0x22/0x8e
  _raw_spin_lock_irqsave+0x32/0x3a
  queue_iova+0x77/0x145
  intel_unmap+0x107/0x113
  intel_unmap_page+0xe/0x10
  usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d
  usb_hcd_unmap_urb_for_dma+0x17/0x100
  unmap_urb_for_dma+0x22/0x24
  __usb_hcd_giveback_urb+0x51/0xc3
  usb_giveback_urb_bh+0x97/0xde
  tasklet_action_common.isra.4+0x5f/0xa1
  tasklet_action+0x2d/0x30
  __do_softirq+0x138/0x2df
  irq_exit+0x7d/0x8b
  smp_apic_timer_interrupt+0x10f/0x151
  apic_timer_interrupt+0xf/0x20
  </IRQ>
 RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: <stable@vger.kernel.org> # 4.14+
Fixes: 13cf017446 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 17:43:06 +02:00
James Morse
40ca0ce56d arm64: entry: SP Alignment Fault doesn't write to FAR_EL1
Comparing the arm-arm's  pseudocode for AArch64.PCAlignmentFault() with
AArch64.SPAlignmentFault() shows that SP faults don't copy the faulty-SP
to FAR_EL1, but this is where we read from, and the address we provide
to user-space with the BUS_ADRALN signal.

For user-space this value will be UNKNOWN due to the previous ERET to
user-space. If the last value is preserved, on systems with KASLR or KPTI
this will be the user-space link-register left in FAR_EL1 by tramp_exit().
Fix this to retrieve the original sp_el0 value, and pass this to
do_sp_pc_fault().

SP alignment faults from EL1 will cause us to take the fault again when
trying to store the pt_regs. This eventually takes us to the overflow
stack. Remove the ESR_ELx_EC_SP_ALIGN check as we will never make it
this far.

Fixes: 60ffc30d56 ("arm64: Exception handling")
Signed-off-by: James Morse <james.morse@arm.com>
[will: change label name and fleshed out comment]
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 16:22:34 +01:00
Michael S. Tsirkin
cfe61801b0 balloon: fix up comments
Lots of comments bitrotted. Fix them up.

Fixes: 418a3ab1e7 (mm/balloon_compaction: List interfaces)
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Nadav Amit <namit@vmware.com>
2019-07-22 11:19:26 -04:00
Wei Wang
dd42290679 mm/balloon_compaction: avoid duplicate page removal
A #GP is reported in the guest when requesting balloon inflation via
virtio-balloon. The reason is that the virtio-balloon driver has
removed the page from its internal page list (via balloon_page_pop),
but balloon_page_enqueue_one also calls "list_del"  to do the removal.
This is necessary when it's used from balloon_page_enqueue_list, but
not from balloon_page_enqueue.

Move list_del to balloon_page_enqueue, and update comments accordingly.

Fixes: 418a3ab1e7 (mm/balloon_compaction: List interfaces)
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-22 11:19:26 -04:00
Lu Baolu
557529494d iommu/vt-d: Avoid duplicated pci dma alias consideration
As we have abandoned the home-made lazy domain allocation
and delegated the DMA domain life cycle up to the default
domain mechanism defined in the generic iommu layer, we
needn't consider pci alias anymore when mapping/unmapping
the context entries. Without this fix, we see kernel NULL
pointer dereference during pci device hot-plug test.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Fixes: fa954e6831 ("iommu/vt-d: Delegate the dma domain to upper layer")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 17:16:24 +02:00
Marc Zyngier
cbdf8a189a arm64: Force SSBS on context switch
On a CPU that doesn't support SSBS, PSTATE[12] is RES0.  In a system
where only some of the CPUs implement SSBS, we end-up losing track of
the SSBS bit across task migration.

To address this issue, let's force the SSBS bit on context switch.

Fixes: 8f04e8e6e2 ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[will: inverted logic and added comments]
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 15:24:16 +01:00
Joerg Roedel
301e7ee1de Revert "iommu/vt-d: Consolidate domain_init() to avoid duplication"
This reverts commit 123b2ffc37.

This commit reportedly caused boot failures on some systems
and needs to be reverted for now.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 16:21:17 +02:00
Ilya Leoshkevich
c8eee4135a selftests/bpf: fix sendmsg6_prog on s390
"sendmsg6: rewrite IP & port (C)" fails on s390, because the code in
sendmsg_v6_prog() assumes that (ctx->user_ip6[0] & 0xFFFF) refers to
leading IPv6 address digits, which is not the case on big-endian
machines.

Since checking bitwise operations doesn't seem to be the point of the
test, replace two short comparisons with a single int comparison.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:19:06 +02:00
Wei Yongjun
5d9e06d60e bcache: fix possible memory leak in bch_cached_dev_run()
memory malloced in bch_cached_dev_run() and should be freed before
leaving from the error handling cases, otherwise it will cause
memory leak.

Fixes: 0b13efecf5 ("bcache: add return value check to bch_cached_dev_run()")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-22 08:15:17 -06:00
Arnaldo Carvalho de Melo
4be6e05c4d libbpf: Avoid designated initializers for unnamed union members
As it fails to build in some systems with:

  libbpf.c: In function 'perf_buffer__new':
  libbpf.c:4515: error: unknown field 'sample_period' specified in initializer
  libbpf.c:4516: error: unknown field 'wakeup_events' specified in initializer

Doing as:

    attr.sample_period = 1;

I.e. not as a designated initializer makes it build everywhere.

Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: fb84b82246 ("libbpf: add perf buffer API")
Link: https://lkml.kernel.org/n/tip-hnlmch8qit1ieksfppmr32si@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:14:43 +02:00
Arnaldo Carvalho de Melo
cdb2f92071 libbpf: Fix endianness macro usage for some compilers
Using endian.h and its endianness macros makes this code build in a
wider range of compilers, as some don't have those macros
(__BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__, __ORDER_BIG_ENDIAN__),
so use instead endian.h's macros (__BYTE_ORDER, __LITTLE_ENDIAN,
__BIG_ENDIAN) which makes this code even shorter :-)

Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 12ef5634a8 ("libbpf: simplify endianness check")
Fixes: e6c64855fd ("libbpf: add btf__parse_elf API to load .BTF and .BTF.ext")
Link: https://lkml.kernel.org/n/tip-eep5n8vgwcdphw3uc058k03u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:14:43 +02:00
Daniel Borkmann
57ebc6230f Merge branch 'bpf-sockmap-tls-fixes'
Jakub Kicinski says:

====================
John says:

Resolve a series of splats discovered by syzbot and an unhash
TLS issue noted by Eric Dumazet.

The main issues revolved around interaction between TLS and
sockmap tear down. TLS and sockmap could both reset sk->prot
ops creating a condition where a close or unhash op could be
called forever. A rare race condition resulting from a missing
rcu sync operation was causing a use after free. Then on the
TLS side dropping the sock lock and re-acquiring it during the
close op could hang. Finally, sockmap must be deployed before
tls for current stack assumptions to be met. This is enforced
now. A feature series can enable it.

To fix this first refactor TLS code so the lock is held for the
entire teardown operation. Then add an unhash callback to ensure
TLS can not transition from ESTABLISHED to LISTEN state. This
transition is a similar bug to the one found and fixed previously
in sockmap. Then apply three fixes to sockmap to fix up races
on tear down around map free and close. Finally, if sockmap
is destroyed before TLS we add a new ULP op update to inform
the TLS stack it should not call sockmap ops. This last one
appears to be the most commonly found issue from syzbot.

v4:
 - fix some use after frees;
 - disable disconnect work for offload (ctx lifetime is much
   more complex);
 - remove some of the dead code which made it hard to understand
   (for me) that things work correctly (e.g. the checks TLS is
   the top ULP);
 - add selftets.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:18 +02:00
Jakub Kicinski
d4d34185e7 selftests/tls: add shutdown tests
Add test for killing the connection via shutdown.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
Jakub Kicinski
8051bb7f2c selftests/tls: close the socket with open record
Add test which sends some data with MSG_MORE and then
closes the socket (never calling send without MSG_MORE).
This should make sure we clean up open records correctly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
Jakub Kicinski
65d41fb317 selftests/tls: add a bidirectional test
Add a simple test which installs the TLS state for both directions,
sends and receives data on both sockets.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
Jakub Kicinski
78b5dc3d68 selftests/tls: test error codes around TLS ULP installation
Test the error codes returned when TCP connection is not
in ESTABLISHED state.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
Jakub Kicinski
cf32526c88 selftests/tls: add a test for ULP but no keys
Make sure we test the TLS_BASE/TLS_BASE case both with data
and the tear down/clean up path.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
John Fastabend
95fa145479 bpf: sockmap/tls, close can race with map free
When a map free is called and in parallel a socket is closed we
have two paths that can potentially reset the socket prot ops, the
bpf close() path and the map free path. This creates a problem
with which prot ops should be used from the socket closed side.

If the map_free side completes first then we want to call the
original lowest level ops. However, if the tls path runs first
we want to call the sockmap ops. Additionally there was no locking
around prot updates in TLS code paths so the prot ops could
be changed multiple times once from TLS path and again from sockmap
side potentially leaving ops pointed at either TLS or sockmap
when psock and/or tls context have already been destroyed.

To fix this race first only update ops inside callback lock
so that TLS, sockmap and lowest level all agree on prot state.
Second and a ULP callback update() so that lower layers can
inform the upper layer when they are being removed allowing the
upper layer to reset prot ops.

This gets us close to allowing sockmap and tls to be stacked
in arbitrary order but will save that patch for *next trees.

v4:
 - make sure we don't free things for device;
 - remove the checks which swap the callbacks back
   only if TLS is at the top.

Reported-by: syzbot+06537213db7ba2745c4a@syzkaller.appspotmail.com
Fixes: 02c558b2d5 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00