Commit Graph

82392 Commits

Author SHA1 Message Date
Radim Krčmář
dd1a4cc1fb KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick
AVIC has a use for kvm_vcpu_wake_up.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18 18:04:27 +02:00
Christian Borntraeger
3491caf275 KVM: halt_polling: provide a way to qualify wakeups during poll
Some wakeups should not be considered a sucessful poll. For example on
s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
would be considered runnable - letting all vCPUs poll all the time for
transactional like workload, even if one vCPU would be enough.
This can result in huge CPU usage for large guests.
This patch lets architectures provide a way to qualify wakeups if they
should be considered a good/bad wakeups in regard to polls.

For s390 the implementation will fence of halt polling for anything but
known good, single vCPU events. The s390 implementation for floating
interrupts does a wakeup for one vCPU, but the interrupt will be delivered
by whatever CPU checks first for a pending interrupt. We prefer the
woken up CPU by marking the poll of this CPU as "good" poll.
This code will also mark several other wakeup reasons like IPI or
expired timers as "good". This will of course also mark some events as
not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
we prefer to not poll, unless we are really sure, though.

This patch successfully limits the CPU usage for cases like uperf 1byte
transactional ping pong workload or wakeup heavy workload like OLTP
while still providing a proper speedup.

This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
wakeups that are considered not good for polling.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-13 17:29:23 +02:00
Alex Williamson
14717e2031 kvm: Conditionally register IRQ bypass consumer
If we don't support a mechanism for bypassing IRQs, don't register as
a consumer.  This eliminates meaningless dev_info()s when the connect
fails between producer and consumer, such as on AMD systems where
kvm_x86_ops->update_pi_irte is not implemented

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:55 +02:00
Alex Williamson
b52f3ed022 irqbypass: Disallow NULL token
A NULL token is meaningless and can only lead to unintended problems.
Error on registration with a NULL token, ignore de-registrations with
a NULL token.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:54 +02:00
Greg Kurz
0b1b1dfd52 kvm: introduce KVM_MAX_VCPU_ID
The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
also the upper limit for vCPU ids. This is okay for all archs except PowerPC
which can have higher ids, depending on the cpu/core/thread topology. In the
worst case (single threaded guest, host with 8 threads per core), it limits
the maximum number of vCPUS to KVM_MAX_VCPUS / 8.

This patch separates the vCPU numbering from the total number of vCPUs, with
the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
plus one.

The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
before passing them to KVM_CREATE_VCPU.

This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
Other archs continue to return KVM_MAX_VCPUS instead.

Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:54 +02:00
Greg Kurz
9b9e3fc4d5 KVM: remove NULL return path for vcpu ids >= KVM_MAX_VCPUS
Commit c896939f7c ("KVM: use heuristic for fast VCPU lookup by id") added
a return path that prevents vcpu ids to exceed KVM_MAX_VCPUS. This is a
problem for powerpc where vcpu ids can grow up to 8*KVM_MAX_VCPUS.

This patch simply reverses the logic so that we only try fast path if the
vcpu id can be tried as an index in kvm->vcpus[]. The slow path is not
affected by the change.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:53 +02:00
Paolo Bonzini
bdb4094eb5 Merge tag 'kvm-arm-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM Changes for Linux v4.7

Reworks our stage 2 page table handling to have page table manipulation
macros separate from those of the host systems as the underlying
hardware page tables can be configured to be noticably different in
layout from the stage 1 page tables used by the host.

Adds 16K page size support based on the above.

Adds a generic firmware probing layer for the timer and GIC so that KVM
initializes using the same logic based on both ACPI and FDT.

Finally adds support for hardware updating of the access flag.
2016-05-11 22:37:37 +02:00
Julien Grall
a53d892dfb clocksource: arm_arch_timer: Remove arch_timer_get_timecounter
The only call of arch_timer_get_timecounter (in KVM) has been removed.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall
503a62862e KVM: arm/arm64: vgic: Rely on the GIC driver to parse the firmware tables
Currently, the firmware tables are parsed 2 times: once in the GIC
drivers, the other time when initializing the vGIC. It means code
duplication and make more tedious to add the support for another
firmware table (like ACPI).

Use the recently introduced helper gic_get_kvm_info() to get
information about the virtual GIC.

With this change, the virtual GIC becomes agnostic to the firmware
table and KVM will be able to initialize the vGIC on ACPI.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall
1839e57696 irqchip/gic-v3: Parse and export virtual GIC information
Fill up the recently introduced gic_kvm_info with the hardware
information used for virtualization.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall
502d6df11a irqchip/gic-v2: Parse and export virtual GIC information
For now, the firmware tables are parsed 2 times: once in the GIC
drivers, the other timer when initializing the vGIC. It means code
duplication and make more tedious to add the support for another
firmware table (like ACPI).

Introduce a new structure and set of helpers to get/set the virtual GIC
information. Also fill up the structure for GICv2.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall
d9b5e41591 clocksource: arm_arch_timer: Extend arch_timer_kvm_info to get the virtual IRQ
Currently, the firmware table is parsed by the virtual timer code in
order to retrieve the virtual timer interrupt. However, this is already
done by the arch timer driver.

To avoid code duplication, extend arch_timer_kvm_info to get the virtual
IRQ.

Note that the KVM code will be modified in a subsequent patch.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall
b4d6ce9776 clocksource: arm_arch_timer: Gather KVM specific information in a structure
Introduce a structure which are filled up by the arch timer driver and
used by the virtual timer in KVM.

The first member of this structure will be the timecounter. More members
will be added later.

A stub for the new helper isn't introduced because KVM requires the arch
timer for both ARM64 and ARM32.

The function arch_timer_get_timecounter is kept for the time being and
will be dropped in a subsequent patch.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:20 +02:00
Paolo Bonzini
2e4682ba2e KVM: add missing memory barrier in kvm_{make,check}_request
kvm_make_request and kvm_check_request imply a producer-consumer
relationship; add implicit memory barriers to them.  There was indeed
already a place that was adding an explicit smp_mb() to order between
kvm_check_request and the processing of the request.  That memory
barrier can be removed (as an added benefit, kvm_check_request can use
smp_mb__after_atomic which is free on x86).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-20 15:29:17 +02:00
Linus Torvalds
ffb927d1dc Merge tag 'usb-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some USB fixes and new device ids for 4.6-rc3.

  Nothing major, the normal USB gadget fixes and usb-serial driver ids,
  along with some other fixes mixed in.  All except the USB serial ids
  have been tested in linux-next, the id additions should be fine as
  they are 'trivial'"

* tag 'usb-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
  USB: option: add "D-Link DWM-221 B1" device id
  USB: serial: cp210x: Adding GE Healthcare Device ID
  USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
  usb: dwc3: keystone: drop dma_mask configuration
  usb: gadget: udc-core: remove manual dma configuration
  usb: dwc3: pci: add ID for one more Intel Broxton platform
  usb: renesas_usbhs: fix to avoid using a disabled ep in usbhsg_queue_done()
  usb: dwc2: do not override forced dr_mode in gadget setup
  usb: gadget: f_midi: unlock on error
  USB: digi_acceleport: do sanity checking for the number of ports
  USB: cypress_m8: add endpoint sanity check
  USB: mct_u232: add sanity checking in probe
  usb: fix regression in SuperSpeed endpoint descriptor parsing
  USB: usbip: fix potential out-of-bounds write
  usb: renesas_usbhs: disable TX IRQ before starting TX DMAC transfer
  usb: renesas_usbhs: avoid NULL pointer derefernce in usbhsf_pkt_handler()
  usb: gadget: f_midi: Fixed a bug when buflen was smaller than wMaxPacketSize
  usb: phy: qcom-8x16: fix regulator API abuse
  usb: ch9: Fix SSP Device Cap wFunctionalitySupport type
  usb: gadget: composite: Access SSP Dev Cap fields properly
  ...
2016-04-09 12:23:02 -07:00
Linus Torvalds
fb41b4be00 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "This is a set of eight fixes.

  Two are trivial gcc-6 updates (brace additions and unused variable
  removal).  There's a couple of cxlflash regressions, a correction for
  sd being overly chatty on revalidation (causing excess log increases).
  A VPD issue which could crash USB devices because they seem very
  intolerant to VPD inquiries, an ALUA deadlock fix and a mpt3sas buffer
  overrun fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: Do not attach VPD to devices that don't support it
  sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
  scsi_dh_alua: Fix a recently introduced deadlock
  scsi: Declare local symbols static
  cxlflash: Move to exponential back-off when cmd_room is not available
  cxlflash: Fix regression issue with re-ordering patch
  mpt3sas: Don't overreach ioc->reply_post[] during initialization
  aacraid: add missing curly braces
2016-04-09 12:00:42 -07:00
Linus Torvalds
9ef11ceb0d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Stale SKB data pointer access across pskb_may_pull() calls in L2TP,
    from Haishuang Yan.

 2) Fix multicast frame handling in mac80211 AP code, from Felix
    Fietkau.

 3) mac80211 station hashtable insert errors not handled properly, fix
    from Johannes Berg.

 4) Fix TX descriptor count limit handling in e1000, from Alexander
    Duyck.

 5) Revert a buggy netdev refcount fix in netpoll, from Bjorn Helgaas.

 6) Must assign rtnl_link_ops of the device before registering it, fix
    in ip6_tunnel from Thadeu Lima de Souza Cascardo.

 7) Memory leak fix in tc action net exit, from WANG Cong.

 8) Add missing AF_KCM entries to name tables, from Dexuan Cui.

 9) Fix regression in GRE handling of csums wrt.  FOU, from Alexander
    Duyck.

10) Fix memory allocation alignment and congestion map corruption in
    RDS, from Shamir Rabinovitch.

11) Fix default qdisc regression in tuntap driver, from Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  bridge, netem: mark mailing lists as moderated
  tuntap: restore default qdisc
  mpls: find_outdev: check for err ptr in addition to NULL check
  ipv6: Count in extension headers in skb->network_header
  RDS: fix congestion map corruption for PAGE_SIZE > 4k
  RDS: memory allocated must be align to 8
  GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
  net: add the AF_KCM entries to family name tables
  MAINTAINERS: intel-wired-lan list is moderated
  lib/test_bpf: Add additional BPF_ADD tests
  lib/test_bpf: Add test to check for result of 32-bit add that overflows
  lib/test_bpf: Add tests for unsigned BPF_JGT
  lib/test_bpf: Fix JMP_JSET tests
  VSOCK: Detach QP check should filter out non matching QPs.
  stmmac: fix adjust link call in case of a switch is attached
  af_packet: tone down the Tx-ring unsupported spew.
  net_sched: fix a memory leak in tc action
  samples/bpf: Enable powerpc support
  samples/bpf: Use llc in PATH, rather than a hardcoded value
  samples/bpf: Fix build breakage with map_perf_test_user.c
  ...
2016-04-09 10:50:44 -07:00
Linus Torvalds
839a3f7657 Merge branch 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "These are bug fixes, including a really old fsync bug, and a few trace
  points to help us track down problems in the quota code"

* 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix file/data loss caused by fsync after rename and new inode
  btrfs: Reset IO error counters before start of device replacing
  btrfs: Add qgroup tracing
  Btrfs: don't use src fd for printk
  btrfs: fallback to vmalloc in btrfs_compare_tree
  btrfs: handle non-fatal errors in btrfs_qgroup_inherit()
  btrfs: Output more info for enospc_debug mount option
  Btrfs: fix invalid reference in replace_path
  Btrfs: Improve FL_KEEP_SIZE handling in fallocate
2016-04-09 10:41:34 -07:00
Linus Torvalds
1a59c53920 Merge tag 'iommu-fixes-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:

 - compile-time fixes (warnings and failures)

 - a bug in iommu core code which could cause the group->domain pointer
   to be falsly cleared

 - fix in scatterlist handling of the ARM common DMA-API code

 - stall detection fix for the Rockchip IOMMU driver

* tag 'iommu-fixes-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Silence an uninitialized variable warning
  iommu/rockchip: Fix "is stall active" check
  iommu: Don't overwrite domain pointer when there is no default_domain
  iommu/dma: Restore scatterlist offsets correctly
  iommu: provide of_xlate pointer unconditionally
2016-04-09 10:23:45 -07:00
Greg Kroah-Hartman
636c8a8d85 Merge tag 'usb-serial-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for v4.6-rc3

Here are some new device ids.

Signed-off-by: Johan Hovold <johan@kernel.org>
2016-04-08 15:41:58 -07:00
David S. Miller
30d237a6c2 Merge tag 'mac80211-for-davem-2016-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
For the current RC series, we have the following fixes:
 * TDLS fixes from Arik and Ilan
 * rhashtable fixes from Ben and myself
 * documentation fixes from Luis
 * U-APSD fixes from Emmanuel
 * a TXQ fix from Felix
 * and a compiler warning suppression from Jeff
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 16:41:28 -04:00
Linus Torvalds
93061f390f Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
 "These changes contains a fix for overlayfs interacting with some
  (badly behaved) dentry code in various file systems.  These have been
  reviewed by Al and the respective file system mtinainers and are going
  through the ext4 tree for convenience.

  This also has a few ext4 encryption bug fixes that were discovered in
  Android testing (yes, we will need to get these sync'ed up with the
  fs/crypto code; I'll take care of that).  It also has some bug fixes
  and a change to ignore the legacy quota options to allow for xfstests
  regression testing of ext4's internal quota feature and to be more
  consistent with how xfs handles this case"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: ignore quota mount options if the quota feature is enabled
  ext4 crypto: fix some error handling
  ext4: avoid calling dquot_get_next_id() if quota is not enabled
  ext4: retry block allocation for failed DIO and DAX writes
  ext4: add lockdep annotations for i_data_sem
  ext4: allow readdir()'s of large empty directories to be interrupted
  btrfs: fix crash/invalid memory access on fsync when using overlayfs
  ext4 crypto: use dget_parent() in ext4_d_revalidate()
  ext4: use file_dentry()
  ext4: use dget_parent() in ext4_file_open()
  nfs: use file_dentry()
  fs: add file_dentry()
  ext4 crypto: don't let data integrity writebacks fail with ENOMEM
  ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()
2016-04-07 17:22:20 -07:00
Linus Torvalds
d3436a1d03 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/qemu fixes from Michael S Tsirkin:
 "A couple of fixes for virtio and for the new QEMU fw cfg driver"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: add VIRTIO_CONFIG_S_NEEDS_RESET device status bit
  MAINTAINERS: add entry for QEMU
  firmware: qemu_fw_cfg.c: hold ACPI global lock during device access
  virtio: virtio 1.0 cs04 spec compliance for reset
  qemu_fw_cfg: don't leak kobj on init error
2016-04-07 16:20:40 -07:00
Alexander Duyck
a0ca153f98 GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
This patch fixes an issue I found in which we were dropping frames if we
had enabled checksums on GRE headers that were encapsulated by either FOU
or GUE.  Without this patch I was barely able to get 1 Gb/s of throughput.
With this patch applied I am now at least getting around 6 Gb/s.

The issue is due to the fact that with FOU or GUE applied we do not provide
a transport offset pointing to the GRE header, nor do we offload it in
software as the GRE header is completely skipped by GSO and treated like a
VXLAN or GENEVE type header.  As such we need to prevent the stack from
generating it and also prevent GRE from generating it via any interface we
create.

Fixes: c3483384ee ("gro: Allow tunnel stacking in the case of FOU/GUE")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:56:33 -04:00
Stefan Hajnoczi
c00bbcf862 virtio: add VIRTIO_CONFIG_S_NEEDS_RESET device status bit
The VIRTIO 1.0 specification added the DEVICE_NEEDS_RESET device status
bit in "VIRTIO-98: Add DEVICE_NEEDS_RESET".  This patch defines the
device status bit in the uapi header file so that both the kernel and
userspace applications can use it.

The bit is currently unused by the virtio guest drivers and vhost.
According to the spec "a good implementation will try to recover by
issuing a reset".  This is not attempted here because it requires
auditing the virtio drivers to ensure there are no resource leaks or
crashes if the device needs to be reset mid-operation.

See "2.1 Device Status Field" in the VIRTIO 1.0 specification for
details.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07 15:16:41 +03:00
Dave Airlie
fd8c61ebd4 Merge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Lots of misc bug fixes for radeon and amdgpu and one for ttm.
- fix vram info fetching on Fiji and unposted boards
- additional vblank fixes from the conversion to drm_vblank_on/off
- UVD dGPU suspend and resume fixes
- lots of powerplay fixes
- fix a fence leak in the pageflip code
- ttm fix for platforms where CPU is 32 bit, but physical addresses are >32bits

* 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux: (21 commits)
  drm/amdgpu: total vram size also reduces pin size
  drm/amd/powerplay: add uvd/vce dpm enabling flag default.
  drm/amd/powerplay: fix issue that resume back, dpm can't work on FIJI.
  drm/amdgpu: save and restore the firwmware cache part when suspend resume
  drm/amdgpu: save and restore UVD context with suspend and resume
  drm/ttm: use phys_addr_t for ttm_bus_placement
  drm/radeon: Only call drm_vblank_on/off between drm_vblank_init/cleanup
  drm/amdgpu: fence wait old rcu slot
  drm/amdgpu: fix leaking fence in the pageflip code
  drm/amdgpu: print vram type rather than just DDR
  drm/amdgpu/gmc: use proper register for vram type on Fiji
  drm/amdgpu/gmc: move vram type fetching into sw_init
  drm/amdgpu: Set vblank_disable_allowed = true
  drm/radeon: Set vblank_disable_allowed = true
  drm/amd/powerplay: Need to change boot to performance state in resume.
  drm/amd/powerplay: add new Fiji function for not setting same ps.
  drm/amdgpu: check dpm state before pm system fs initialized.
  drm/amd/powerplay: notify amdgpu whether dpm is enabled or not.
  drm/amdgpu: Not support disable dpm in powerplay.
  drm/amdgpu: add an cgs interface to notify amdgpu the dpm state.
  ...
2016-04-07 07:08:46 +10:00
WANG Cong
18fcf49f87 net_sched: fix a memory leak in tc action
Fixes: ddf97ccdd7 ("net_sched: add network namespace support for tc actions")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 16:04:29 -04:00
Dave Airlie
915e846d4b Merge tag 'imx-drm-next-2016-04-01' of git://git.pengutronix.de/git/pza/linux into drm-fixes
imx-drm: stricter plane parameter checking, dw_hdmi-imx and dmfc fixes

- Check whether plane parameters comply with IPU IDMAC limitations and
  fix planar YUV 4:2:0 U/V offsets and stride
- Cleanup encoder in dw_hdmi-imx bind error path and
  remove a superfluous platform_set_drvdata in dw_hdmi-imx
- DMFC setup fixes: lock the ipu_dmfc_init_channel function against
  concurrent use, rename it to ipu_dmfc_config_wait4eot, and call
  it after the FIFO size has been determined.

* tag 'imx-drm-next-2016-04-01' of git://git.pengutronix.de/git/pza/linux:
  drm/imx: Don't set a gamma table size
  drm/imx: ipuv3-plane: Configure DMFC wait4eot bit after slots are determined
  gpu: ipu-v3: ipu-dmfc: Rename ipu_dmfc_init_channel to ipu_dmfc_config_wait4eot
  gpu: ipu-v3: ipu-dmfc: Make function ipu_dmfc_init_channel() return void
  gpu: ipu-v3: ipu-dmfc: Protect function ipu_dmfc_init_channel() with mutex
  drm/imx: dw_hdmi: Don't call platform_set_drvdata()
  drm/imx: dw_hdmi: Call drm_encoder_cleanup() in error path
  drm/imx: ipuv3-plane: fix planar YUV 4:2:0 support
  drm/imx: ipuv3-plane: Add more thorough checks for plane parameter limitations
  gpu: ipu-cpmem: modify ipu_cpmem_set_yuv_planar_full for better control
2016-04-06 09:48:53 +10:00
Linus Torvalds
541d8f4d59 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "Miscellaneous bugfixes.

  The ARM and s390 fixes are for new regressions from the merge window,
  others are usual stable material"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  compiler-gcc: disable -ftracer for __noclone functions
  kvm: x86: make lapic hrtimer pinned
  s390/mm/kvm: fix mis-merge in gmap handling
  kvm: set page dirty only if page has been writable
  KVM: x86: reduce default value of halt_poll_ns parameter
  KVM: Hyper-V: do not do hypercall userspace exits if SynIC is disabled
  KVM: x86: Inject pending interrupt even if pending nmi exist
  arm64: KVM: Register CPU notifiers when the kernel runs at HYP
  arm64: kvm: 4.6-rc1: Fix VTCR_EL2 VS setting
2016-04-05 16:16:00 -07:00
Marcelo Ricardo Leitner
eb8e97715f sctp: use list_* in sctp_list_dequeue
Use list_* helpers in sctp_list_dequeue, more readable.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-05 15:44:08 -04:00
Paolo Bonzini
95272c2937 compiler-gcc: disable -ftracer for __noclone functions
-ftracer can duplicate asm blocks causing compilation to fail in
noclone functions.  For example, KVM declares a global variable
in an asm like

    asm("2: ... \n
         .pushsection data \n
         .global vmx_return \n
         vmx_return: .long 2b");

and -ftracer causes a double declaration.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: stable@vger.kernel.org
Cc: kvm@vger.kernel.org
Reported-by: Linda Walsh <lkml@tlinx.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-05 14:19:08 +02:00
Arnd Bergmann
b70bb98448 iommu: provide of_xlate pointer unconditionally
iommu drivers that support the standard DT bindings use a of_xlate
callback pointer, but that is only part of struct iommu_ops when
CONFIG_OF_IOMMU is enabled, leading to build errors in randconfig
builds when that is not provided:

drivers/iommu/mtk_iommu.c:497:2: error: unknown field 'of_xlate' specified in initializer
  .of_xlate = mtk_iommu_of_xlate,
  ^
drivers/iommu/mtk_iommu.c:497:14: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
  .of_xlate = mtk_iommu_of_xlate,
              ^~~~~~~~~~~~~~~~~~
drivers/iommu/mtk_iommu.c:497:14: note: (near initialization for 'mtk_iommu_ops.domain_get_attr')

We can work around it by adding more #ifdefs in each driver, but
it seems nicer to just allow setting the pointer even if it is
unused. This makes the driver code look nicer, and it gives better
compile-time coverage when test building on other architectures.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0df4fabe20 ("iommu/mediatek: Add mt8173 IOMMU driver")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-04-05 13:25:12 +02:00
James Bottomley
6ea7e3873e Merge branch 'fixes-base' into fixes 2016-04-05 06:56:47 -04:00
Hannes Reinecke
5ddfe0858e scsi: Do not attach VPD to devices that don't support it
The patch "scsi: rescan VPD attributes" introduced a regression in which
devices that don't support VPD were being scanned for VPD attributes
anyway.  This could cause issues for some devices and should be avoided
so the check for scsi_level has been moved out of scsi_add_lun and into
scsi_attach_vpd so that all callers will not scan VPD for devices that
don't support it.

[mkp: Merge fix]

Fixes: 09e2b0b146 ("scsi: rescan VPD attributes")
Cc: <stable@vger.kernel.org> #v4.5+
Suggested-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-05 06:56:40 -04:00
Luis de Bethencourt
84ea3a18c0 mac80211: add doc for RX_FLAG_DUP_VALIDATED flag
Add documentation for the flag for duplication check.

Fixes the following warning when running make htmldocs:
warning: Enum value 'RX_FLAG_DUP_VALIDATED' not described in enum 'mac80211_rx_flags'

Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
[fix description]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-05 11:10:59 +02:00
Alex Deucher
749b48faaf drm/ttm: use phys_addr_t for ttm_bus_placement
Fixes ttm on platforms like PPC460 where the CPU
is in 32-bit mode, but the physical addresses are
>32 bits.

Extracted from a patch by Hans Verkuil.

Tested-by: Julian Margetson <runaway@candw.ms>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Julian Margetson <runaway@candw.ms>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-04-04 17:00:01 -04:00
Linus Torvalds
4a2d057e4f Merge branch 'PAGE_CACHE_SIZE-removal'
Merge PAGE_CACHE_SIZE removal patches from Kirill Shutemov:
 "PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
  ago with promise that one day it will be possible to implement page
  cache with bigger chunks than PAGE_SIZE.

  This promise never materialized.  And unlikely will.

  Let's stop pretending that pages in page cache are special.  They are
  not.

  The first patch with most changes has been done with coccinelle.  The
  second is manual fixups on top.

  The third patch removes macros definition"

[ I was planning to apply this just before rc2, but then I spaced out,
  so here it is right _after_ rc2 instead.

  As Kirill suggested as a possibility, I could have decided to only
  merge the first two patches, and leave the old interfaces for
  compatibility, but I'd rather get it all done and any out-of-tree
  modules and patches can trivially do the converstion while still also
  working with older kernels, so there is little reason to try to
  maintain the redundant legacy model.    - Linus ]

* PAGE_CACHE_SIZE-removal:
  mm: drop PAGE_CACHE_* and page_cache_{get,release} definition
  mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage
  mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
2016-04-04 10:50:24 -07:00
Kirill A. Shutemov
1fa64f198b mm: drop PAGE_CACHE_* and page_cache_{get,release} definition
All users gone.  We can remove these macros.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Kirill A. Shutemov
ea1754a084 mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage
Mostly direct substitution with occasional adjustment or removing
outdated comments.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Kirill A. Shutemov
09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Mark Fasheh
0f5dcf8de9 btrfs: Add qgroup tracing
This patch adds tracepoints to the qgroup code on both the reporting side
(insert_dirty_extents) and the accounting side. Taken together it allows us
to see what qgroup operations have happened, and what their result was.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-04 16:29:22 +02:00
Linus Torvalds
7b367f5dba Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core kernel fixes from Ingo Molnar:
 "This contains the nohz/atomic cleanup/fix for the fetch_or() ugliness
  you noted during the original nohz pull request, plus there's also
  misc fixes:

   - fix liblockdep build bug
   - fix uapi header build bug
   - print more lockdep hash collision info to help debug recent reports
     of hash collisions
   - update MAINTAINERS email address"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Update my email address
  locking/lockdep: Print chain_key collision information
  uapi/linux/stddef.h: Provide __always_inline to userspace headers
  tools/lib/lockdep: Fix unsupported 'basename -s' in run_tests.sh
  locking/atomic, sched: Unexport fetch_or()
  timers/nohz: Convert tick dependency mask to atomic_t
  locking/atomic: Introduce atomic_fetch_or()
2016-04-03 07:06:53 -05:00
Linus Torvalds
d6c24df082 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "This includes fixes from HCH for -rc1 configfs default_groups
  conversion changes that ended up breaking some iscsi-target
  default_groups, along with Sagi's ib_drain_qp() conversion for
  iser-target to use the common caller now available to RDMA kernel
  consumers in v4.6+ code"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: add a new add_wwn_groups fabrics method
  target: initialize the nacl base CIT begfore init_nodeacl
  target: remove ->fabric_cleanup_nodeacl
  iser-target: Use ib_drain_qp
2016-04-02 18:48:37 -05:00
Linus Torvalds
264800b5ec Merge tag 'configfs-for-linus-2' of git://git.infradead.org/users/hch/configfs
Pull configfs fix from Christoph Hellwig:
 "A trivial fix to the recently introduced binary attribute helper
  macros"

* tag 'configfs-for-linus-2' of git://git.infradead.org/users/hch/configfs:
  configfs: fix CONFIGFS_BIN_ATTR_[RW]O definitions
2016-04-02 16:46:56 -05:00
Linus Torvalds
05cf8077e5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Missing device reference in IPSEC input path results in crashes
    during device unregistration.  From Subash Abhinov Kasiviswanathan.

 2) Per-queue ISR register writes not being done properly in macb
    driver, from Cyrille Pitchen.

 3) Stats accounting bugs in bcmgenet, from Patri Gynther.

 4) Lightweight tunnel's TTL and TOS were swapped in netlink dumps, from
    Quentin Armitage.

 5) SXGBE driver has off-by-one in probe error paths, from Rasmus
    Villemoes.

 6) Fix race in save/swap/delete options in netfilter ipset, from
    Vishwanath Pai.

 7) Ageing time of bridge not set properly when not operating over a
    switchdev device.  Fix from Haishuang Yan.

 8) Fix GRO regression wrt nested FOU/GUE based tunnels, from Alexander
    Duyck.

 9) IPV6 UDP code bumps wrong stats, from Eric Dumazet.

10) FEC driver should only access registers that actually exist on the
    given chipset, fix from Fabio Estevam.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  net: mvneta: fix changing MTU when using per-cpu processing
  stmmac: fix MDIO settings
  Revert "stmmac: Fix 'eth0: No PHY found' regression"
  stmmac: fix TX normal DESC
  net: mvneta: use cache_line_size() to get cacheline size
  net: mvpp2: use cache_line_size() to get cacheline size
  net: mvpp2: fix maybe-uninitialized warning
  tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter
  net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card
  rtnl: fix msg size calculation in if_nlmsg_size()
  fec: Do not access unexisting register in Coldfire
  net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
  net: mvpp2: replace MVPP2_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
  net: dsa: mv88e6xxx: Clear the PDOWN bit on setup
  net: dsa: mv88e6xxx: Introduce _mv88e6xxx_phy_page_{read, write}
  bpf: make padding in bpf_tunnel_key explicit
  ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates
  bnxt_en: Fix ethtool -a reporting.
  bnxt_en: Fix typo in bnxt_hwrm_set_pause_common().
  bnxt_en: Implement proper firmware message padding.
  ...
2016-04-01 20:03:33 -05:00
Lucas Stach
bbe3de2560 mm/page_isolation: fix tracepoint to mirror check function behavior
Page isolation has not failed if the fin pfn extends beyond the end pfn
and test_pages_isolated checks this correctly.  Fix the tracepoint to
report the same result as the actual check function.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 17:03:37 -05:00
Chen Gang
969e8d7e47 include/linux/huge_mm.h: return NULL instead of false for pmd_trans_huge_lock()
The return value of pmd_trans_huge_lock() is a pointer, not a boolean
value, so use NULL instead of false as the return value.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 17:03:37 -05:00
Giuseppe CAVALLARO
a7657f128c stmmac: fix MDIO settings
Initially the phy_bus_name was added to manipulate the
driver name but it was recently just used to manage the
fixed-link and then to take some decision at run-time.
So the patch uses the is_pseudo_fixed_link and removes
the phy_bus_name variable not necessary anymore.

The driver can manage the mdio registration by using phy-handle,
dwmac-mdio and own parameter e.g. snps,phy-addr.
This patch takes care about all these possible configurations
and fixes the mdio registration in case of there is a real
transceiver or a switch (that needs to be managed by using
fixed-link).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Tested-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Cc: Dinh Nguyen <dinh.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Phil Reid <preid@electromag.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-01 14:38:59 -04:00
Giuseppe CAVALLARO
d7e944c8dd Revert "stmmac: Fix 'eth0: No PHY found' regression"
This reverts commit 88f8b1bb41.
due to problems on GeekBox and Banana Pi M1 board when
connected to a real transceiver instead of a switch via
fixed-link.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Cc: Dinh Nguyen <dinh.linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-01 14:38:58 -04:00
Daniel Borkmann
5a5abb1fa3 tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter
Sasha Levin reported a suspicious rcu_dereference_protected() warning
found while fuzzing with trinity that is similar to this one:

  [   52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
  [   52.765688] other info that might help us debug this:
  [   52.765695] rcu_scheduler_active = 1, debug_locks = 1
  [   52.765701] 1 lock held by a.out/1525:
  [   52.765704]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
  [   52.765721] stack backtrace:
  [   52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
  [...]
  [   52.765768] Call Trace:
  [   52.765775]  [<ffffffff813e488d>] dump_stack+0x85/0xc8
  [   52.765784]  [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
  [   52.765792]  [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
  [   52.765801]  [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
  [   52.765810]  [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
  [   52.765818]  [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
  [   52.765827]  [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
  [   52.765834]  [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
  [   52.765843]  [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
  [   52.765850]  [<ffffffff81261519>] SyS_ioctl+0x79/0x90
  [   52.765858]  [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
  [   52.765866]  [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25

Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.

Since the fix in f91ff5b9ff ("net: sk_{detach|attach}_filter() rcu
fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
filter with rcu_dereference_protected(), checking whether socket lock
is held in control path.

Since its introduction in 9940516259 ("tun: socket filter support"),
tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
sock_owned_by_user(sk) doesn't apply in this specific case and therefore
triggers the false positive.

Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
that is used by tap filters and pass in lockdep_rtnl_is_held() for the
rcu_dereference_protected() checks instead.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-01 14:33:46 -04:00