linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-07 19:30:30 +09:00

Author	SHA1	Message	Date
Jason Gunthorpe	20fef4ef84	nouveau: use mmu_interval_notifier instead of hmm_mirror Remove the hmm_mirror object and use the mmu_interval_notifier API instead for the range, and use the normal mmu_notifier API for the general invalidation callback. While here re-organize the pagefault path so the locking pattern is clear. nouveau is the only driver that uses a temporary range object and instead forwards nearly every invalidation range directly to the HW. While this is not how the mmu_interval_notifier was intended to be used, the overheads on the pagefaulting path are similar to the existing hmm_mirror version. Particularly since the interval tree will be small. Link: https://lore.kernel.org/r/20191112202231.3856-10-jgg@ziepe.ca Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Jason Gunthorpe	c625c274ee	nouveau: use mmu_notifier directly for invalidate_range_start There is no reason to get the invalidate_range_start() callback via an indirection through hmm_mirror, just register a normal notifier directly. Link: https://lore.kernel.org/r/20191112202231.3856-9-jgg@ziepe.ca Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Jason Gunthorpe	3506ff69c3	drm/radeon: use mmu_interval_notifier_insert The new API is an exact match for the needs of radeon. For some reason radeon tries to remove overlapping ranges from the interval tree, but interval trees (and mmu_interval_notifier_insert()) support overlapping ranges directly. Simply delete all this code. Since this driver is missing an invalidate_range_end callback, but still calls get_user_pages(), it cannot be correct against all races. Link: https://lore.kernel.org/r/20191112202231.3856-8-jgg@ziepe.ca Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Jason Gunthorpe	3889551db2	RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv This converts one of the two users of mmu_notifiers to use the new API. The conversion is fairly straightforward, however the existing use of notifiers here seems to be racy. Link: https://lore.kernel.org/r/20191112202231.3856-7-jgg@ziepe.ca Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Jason Gunthorpe	f25a546e65	RDMA/odp: Use mmu_interval_notifier_insert() Replace the internal interval tree based mmu notifier with the new common mmu_interval_notifier_insert() API. This removes a lot of code and fixes a deadlock that can be triggered in ODP: zap_page_range() mmu_notifier_invalidate_range_start() [..] ib_umem_notifier_invalidate_range_start() down_read(&per_mm->umem_rwsem) unmap_single_vma() [..] __split_huge_page_pmd() mmu_notifier_invalidate_range_start() [..] ib_umem_notifier_invalidate_range_start() down_read(&per_mm->umem_rwsem) // DEADLOCK mmu_notifier_invalidate_range_end() up_read(&per_mm->umem_rwsem) mmu_notifier_invalidate_range_end() up_read(&per_mm->umem_rwsem) The umem_rwsem is held across the range_start/end as the ODP algorithm for invalidate_range_end cannot tolerate changes to the interval tree. However, due to the nested invalidation regions the second down_read() can deadlock if there are competing writers. The new core code provides an alternative scheme to solve this problem. Fixes: `ca748c39ea` ("RDMA/umem: Get rid of per_mm->notifier_count") Link: https://lore.kernel.org/r/20191112202231.3856-6-jgg@ziepe.ca Tested-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Jason Gunthorpe	107e899874	mm/hmm: define the pre-processor related parts of hmm.h even if disabled Only the function calls are stubbed out with static inlines that always fail. This is the standard way to write a header for an optional component and makes it easier for drivers that only optionally need HMM_MIRROR. Link: https://lore.kernel.org/r/20191112202231.3856-5-jgg@ziepe.ca Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Jason Gunthorpe	04ec32fbc2	mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror hmm_mirror's handling of ranges does not use a sequence count which results in this bug: CPU0 CPU1 hmm_range_wait_until_valid(range) valid == true hmm_range_fault(range) hmm_invalidate_range_start() range->valid = false hmm_invalidate_range_end() range->valid = true hmm_range_valid(range) valid == true Where the hmm_range_valid() should not have succeeded. Adding the required sequence count would make it nearly identical to the new mmu_interval_notifier. Instead replace the hmm_mirror stuff with mmu_interval_notifier. Co-existence of the two APIs is the first step. Link: https://lore.kernel.org/r/20191112202231.3856-4-jgg@ziepe.ca Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
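The race described in commit 04ec32fbc2 above can be replayed deterministically in a single thread. The following is a minimal user-space sketch, not kernel code: every `toy_*` name is hypothetical and none of it is the hmm or mmu_interval_notifier API. It only illustrates why a validity flag alone misses an invalidation that starts and ends between two checks, while a sequence count sampled up front catches it.

```c
#include <stdbool.h>

/* Toy model of a mirrored VA range; hypothetical, not a kernel structure. */
struct toy_range {
	bool valid;        /* flag-only scheme, as in the buggy path */
	unsigned long seq; /* bumped on every invalidation */
};

/* Invalidation side: start clears the flag and advances the sequence. */
static void toy_invalidate_start(struct toy_range *r)
{
	r->valid = false;
	r->seq++;
}

/* Invalidation side: end sets the flag again. */
static void toy_invalidate_end(struct toy_range *r)
{
	r->valid = true;
}

/* Reader side: sample the sequence before faulting pages in. */
static unsigned long toy_read_begin(const struct toy_range *r)
{
	return r->seq;
}

/* Flag-only check: passes whenever the flag happens to be set again. */
static bool toy_flag_check(const struct toy_range *r)
{
	return r->valid;
}

/* Sequence check: passes only if no invalidation ran since toy_read_begin(). */
static bool toy_seq_check(const struct toy_range *r, unsigned long start_seq)
{
	return r->valid && r->seq == start_seq;
}

/*
 * Replay the CPU0/CPU1 interleaving from the commit message.
 * Returns true when the flag-only check wrongly passes while the
 * sequence check detects the collision.
 */
static bool toy_replay_race(void)
{
	struct toy_range r = { .valid = true, .seq = 0 };
	unsigned long start = toy_read_begin(&r); /* CPU0: range seen as valid */

	/* CPU0 would fault pages in here. */
	toy_invalidate_start(&r);                 /* CPU1 */
	toy_invalidate_end(&r);                   /* CPU1 */

	return toy_flag_check(&r) && !toy_seq_check(&r, start);
}
```

In this model a reader that fails `toy_seq_check()` would simply sample the sequence again and retry, which is the collision-retry idea the commit message refers to.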
Jason Gunthorpe	99cb252f5e	mm/mmu_notifier: add an interval tree notifier Of the 13 users of mmu_notifiers, 8 of them use only invalidate_range_start/end() and immediately intersect the mmu_notifier_range with some kind of internal list of VAs. 4 use an interval tree (i915_gem, radeon_mn, umem_odp, hfi1). 4 use a linked list of some kind (scif_dma, vhost, gntdev, hmm) And the remaining 5 either don't use invalidate_range_start() or do some special thing with it. It turns out that building a correct scheme with an interval tree is pretty complicated, particularly if the use case is synchronizing against another thread doing get_user_pages(). Many of these implementations have various subtle and difficult to fix races. This approach puts the interval tree as common code at the top of the mmu notifier call tree and implements a shareable locking scheme. It includes: - An interval tree tracking VA ranges, with per-range callbacks - A read/write locking scheme for the interval tree that avoids sleeping in the notifier path (for OOM killer) - A sequence counter based collision-retry locking scheme to tell device page fault that a VA range is being concurrently invalidated. This is based on various ideas: - hmm accumulates invalidated VA ranges and releases them when all invalidates are done, via active_invalidate_ranges count. This approach avoids having to intersect the interval tree twice (as umem_odp does) at the potential cost of a longer device page fault. - kvm/umem_odp use a sequence counter to drive the collision retry, via invalidate_seq - a deferred work todo list on unlock scheme like RTNL, via deferred_list. This makes adding/removing interval tree members more deterministic - seqlock, except this version makes the seqlock idea multi-holder on the write side by protecting it with active_invalidate_ranges and a spinlock To minimize MM overhead when only the interval tree is being used, the entire SRCU and hlist overheads are dropped using some simple branches. 
Similarly the interval tree overhead is dropped when in hlist mode. The overhead from the mandatory spinlock is broadly the same as most of existing users which already had a lock (or two) of some sort on the invalidation path. Link: https://lore.kernel.org/r/20191112202231.3856-3-jgg@ziepe.ca Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-11-23 19:56:44 -04:00
Thomas Bogendoerfer	a8d0f11ee5	MIPS: SGI-IP27: Enable ethernet phy on second Origin 200 module PROM only enables ethernet PHY on first Origin 200 module, so we must do it ourselves for the second module. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: David S. Miller <davem@davemloft.net> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-rtc@vger.kernel.org Cc: linux-serial@vger.kernel.org	2019-11-23 14:20:30 -08:00
Thomas Bogendoerfer	29b261ff6f	MIPS: PCI: Fix fake subdevice ID for IOC3 Generation of fake subdevice ID had vendor and device ID swapped. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: David S. Miller <davem@davemloft.net> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-rtc@vger.kernel.org Cc: linux-serial@vger.kernel.org	2019-11-23 14:20:18 -08:00
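For context on commit 29b261ff6f above: in PCI configuration space the Subsystem Vendor ID sits at offset 0x2C and the Subsystem ID at offset 0x2E, so a 32-bit read at 0x2C carries the vendor in the low half and the device in the high half. The sketch below only illustrates that packing; the `toy_*` helpers are hypothetical and how the kernel's IOC3 emulation actually builds the value is not shown in the commit message.

```c
#include <stdint.h>

/*
 * Pack a subsystem vendor/device pair the way a 32-bit read of PCI
 * config offset 0x2C returns it: vendor in bits 15:0, device in
 * bits 31:16. Swapping the two arguments yields a different dword.
 */
static uint32_t toy_subsys_dword(uint16_t subsys_vendor, uint16_t subsys_device)
{
	return ((uint32_t)subsys_device << 16) | subsys_vendor;
}

/* Extract the subsystem vendor ID (low 16 bits). */
static uint16_t toy_subsys_vendor(uint32_t dword)
{
	return (uint16_t)(dword & 0xffffu);
}

/* Extract the subsystem device ID (high 16 bits). */
static uint16_t toy_subsys_device(uint32_t dword)
{
	return (uint16_t)(dword >> 16);
}
```

For example, with the SGI vendor ID 0x10a9 and an arbitrary illustrative subsystem ID of 0x0003, `toy_subsys_dword(0x10a9, 0x0003)` gives 0x000310a9, whereas the swapped call gives 0x10a90003.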
Linus Torvalds	6b8a794678	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull last minute virtio bugfixes from Michael Tsirkin: "Minor bugfixes all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_balloon: fix shrinker count virtio_balloon: fix shrinker scan number of pages virtio_console: allocate inbufs in add_port() only if it is needed virtio_ring: fix return code on DMA mapping fails	2019-11-23 13:02:18 -08:00
Taehee Yoo	ab818362c9	net: use rhashtable_lookup() instead of rhashtable_lookup_fast() rhashtable_lookup_fast() internally calls rcu_read_lock() then, calls rhashtable_lookup(). So if rcu_read_lock() is already held, rhashtable_lookup() is enough. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 12:15:01 -08:00
Jakub Kicinski	3a06ee3396	Merge tag 'wireless-drivers-next-2019-11-22' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.5 Last set of patches for v5.5. Major features here 802.11ax support for qtnfmac and airtime fairness support to mt76. And naturally smaller fixes and improvements all over. Major changes: qtnfmac * add 802.11ax support in AP mode * enable offload bridging support iwlwifi * support TX/RX antennas reporting mt76 * mt7615 smart carrier sense support * aggregation statistics via debugfs * airtime fairness (ATF) support * mt76x0 OF mac address support ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 12:00:54 -08:00
Jakub Kicinski	72a2707a87	Merge branch 'nfc-convert-from-txt-to-rst' Robert Schwebel says: ==================== here is v2 of the series converting the NFC documentation from txt to rst. Thanks to Jonathan and Dave for the input. Changes since (implicit) v1: * replace code-block by more compact :: syntax * really add the rst file to the index ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 11:03:26 -08:00
Robert Schwebel	4791d77a08	docs: networking: nfc: change to rst format Now that the sphinx syntax has been fixed, change the document from txt to rst and add it to the index. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 11:00:19 -08:00
Robert Schwebel	bf0b2511e8	docs: networking: nfc: fix code block syntax Silence this warning: Documentation/networking/nfc.rst:113: WARNING: Definition list ends without a blank line; unexpected unindent. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 11:00:19 -08:00
Robert Schwebel	f67b7c0874	docs: networking: nfc: fix bullet list syntax Fix this warning: Documentation/networking/nfc.rst:87: WARNING: Bullet list ends without a blank line; unexpected unindent. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 11:00:19 -08:00
Robert Schwebel	c0b96e8f9f	docs: networking: nfc: change block diagram to sphinx syntax Change the block diagram to match the sphinx syntax. This will make it possible to switch this file to rst in the future. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 11:00:19 -08:00
Robert Schwebel	66ac53a8c5	docs: networking: nfc: change headlines to sphinx syntax The headlines in this file are not in the standard kernel documentation headline format. Change them, so this file can be switched to rst in the future. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 11:00:19 -08:00
Russell King	a5d66f8100	net: phy: initialise phydev speed and duplex sanely When a phydev is created, the speed and duplex are set to zero and -1 respectively, rather than using the predefined SPEED_UNKNOWN and DUPLEX_UNKNOWN constants. There is a window at initialisation time where we may report link down using the 0/-1 values. Tidy this up and use the predefined constants, so debug doesn't complain with: "Unsupported (update phy-core.c)/Unsupported (update phy-core.c)" when the speed and duplex settings are printed. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 10:46:41 -08:00
Russell King	e3cf8b3668	net: phy: remove phy_ethtool_sset() There are no users of phy_ethtool_sset() in the kernel anymore, and as of commit `3c1bcc8614` ("net: ethernet: Convert phydev advertize and supported from u32 to link mode"), the implementation is slightly buggy - it doesn't correctly check the masked advertising mask as it used to. Remove it, and update the phy documentation to refer to its replacement function. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 10:46:26 -08:00
Jakub Kicinski	84bb46cd62	Revert "bpf: Emit audit messages upon successful prog load and unload" This commit reverts commit `91e6015b08` ("bpf: Emit audit messages upon successful prog load and unload") and its follow up commit `7599a896f2` ("audit: Move audit_log_task declaration under CONFIG_AUDITSYSCALL") as requested by Paul Moore. The change needs close review on linux-audit, tests etc. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-23 09:56:02 -08:00
Dmitry Torokhov	b2b2dd71e0	tty: vt: keyboard: reject invalid keycodes Do not try to handle keycodes that are too big, otherwise we risk doing out-of-bounds writes: BUG: KASAN: global-out-of-bounds in clear_bit include/asm-generic/bitops-instrumented.h:56 [inline] BUG: KASAN: global-out-of-bounds in kbd_keycode drivers/tty/vt/keyboard.c:1411 [inline] BUG: KASAN: global-out-of-bounds in kbd_event+0xe6b/0x3790 drivers/tty/vt/keyboard.c:1495 Write of size 8 at addr ffffffff89a1b2d8 by task syz-executor108/1722 ... kbd_keycode drivers/tty/vt/keyboard.c:1411 [inline] kbd_event+0xe6b/0x3790 drivers/tty/vt/keyboard.c:1495 input_to_handler+0x3b6/0x4c0 drivers/input/input.c:118 input_pass_values.part.0+0x2e3/0x720 drivers/input/input.c:145 input_pass_values drivers/input/input.c:949 [inline] input_set_keycode+0x290/0x320 drivers/input/input.c:954 evdev_handle_set_keycode_v2+0xc4/0x120 drivers/input/evdev.c:882 evdev_do_ioctl drivers/input/evdev.c:1150 [inline] In this case we were dealing with a fuzzed HID device that declared over 12K buttons, and while HID layer should not be reporting to us such big keycodes, we should also be defensive and reject invalid data ourselves as well. Reported-by: syzbot+19340dff067c2d3835c0@syzkaller.appspotmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191122204220.GA129459@dtor-ws Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-11-23 18:31:07 +01:00
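The defensive check described in commit b2b2dd71e0 above amounts to validating an index before using it to address a bitmap. A minimal user-space sketch of that pattern follows; `TOY_NR_KEYS`, `toy_key_down` and the `toy_*` functions are hypothetical stand-ins and do not reproduce the actual code in drivers/tty/vt/keyboard.c.

```c
#include <limits.h>
#include <stdbool.h>

#define TOY_NR_KEYS 256 /* hypothetical number of supported keycodes */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS ((TOY_NR_KEYS + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* One bit per keycode: set while the key is held down. */
static unsigned long toy_key_down[TOY_WORDS];

/*
 * Record a key press or release. Keycodes outside the table are
 * rejected up front, so the bitmap write below can never land
 * out of bounds. Returns false when the keycode was rejected.
 */
static bool toy_set_key(unsigned int keycode, bool down)
{
	if (keycode >= TOY_NR_KEYS)
		return false;

	unsigned long mask = 1UL << (keycode % TOY_BITS_PER_LONG);
	unsigned long *word = &toy_key_down[keycode / TOY_BITS_PER_LONG];

	if (down)
		*word |= mask;
	else
		*word &= ~mask;
	return true;
}

/* Query a key state; out-of-range keycodes are reported as not pressed. */
static bool toy_key_is_down(unsigned int keycode)
{
	if (keycode >= TOY_NR_KEYS)
		return false;
	return (toy_key_down[keycode / TOY_BITS_PER_LONG] >>
		(keycode % TOY_BITS_PER_LONG)) & 1UL;
}
```

With this check in place, a call such as `toy_set_key(12000, true)` returns false instead of writing past the end of the bitmap, which is the class of bug the KASAN report in the commit message shows.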
Jim Mattson	85c9aae9ac	kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints Commit `37e4c997da` ("KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROL") broke the KVM_SET_MSRS ABI by instituting new constraints on the data values that kvm would accept for the guest MSR, IA32_FEATURE_CONTROL. Perhaps these constraints should have been opt-in via a new KVM capability, but they were applied indiscriminately, breaking at least one existing hypervisor. Relax the constraints to allow either or both of FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX and FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX to be set when nVMX is enabled. This change is sufficient to fix the aforementioned breakage. Fixes: `37e4c997da` ("KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROL") Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-23 11:34:46 +01:00
Sean Christopherson	ad5996d9a0	KVM: x86: Grab KVM's srcu lock when setting nested state Acquire kvm->srcu for the duration of ->set_nested_state() to fix a bug where nVMX dereferences ->memslots without holding ->srcu or ->slots_lock. The other half of nested migration, ->get_nested_state(), does not need to acquire ->srcu as it is purely a dump of internal KVM (and CPU) state to userspace. Detected as an RCU lockdep splat that is 100% reproducible by running KVM's state_test selftest with CONFIG_PROVE_LOCKING=y. Note that the failing function, kvm_is_visible_gfn(), is only checking the validity of a gfn, it's not actually accessing guest memory (which is more or less unsupported during vmx_set_nested_state() due to incorrect MMU state), i.e. vmx_set_nested_state() itself isn't fundamentally broken. In any case, setting nested state isn't a fast path so there's no reason to go out of our way to avoid taking ->srcu. ============================= WARNING: suspicious RCU usage 5.4.0-rc7+ #94 Not tainted ----------------------------- include/linux/kvm_host.h:626 suspicious rcu_dereference_check() usage! 
other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by evmcs_test/10939: #0: ffff88826ffcb800 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x630 [kvm] stack backtrace: CPU: 1 PID: 10939 Comm: evmcs_test Not tainted 5.4.0-rc7+ #94 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: dump_stack+0x68/0x9b kvm_is_visible_gfn+0x179/0x180 [kvm] mmu_check_root+0x11/0x30 [kvm] fast_cr3_switch+0x40/0x120 [kvm] kvm_mmu_new_cr3+0x34/0x60 [kvm] nested_vmx_load_cr3+0xbd/0x1f0 [kvm_intel] nested_vmx_enter_non_root_mode+0xab8/0x1d60 [kvm_intel] vmx_set_nested_state+0x256/0x340 [kvm_intel] kvm_arch_vcpu_ioctl+0x491/0x11a0 [kvm] kvm_vcpu_ioctl+0xde/0x630 [kvm] do_vfs_ioctl+0xa2/0x6c0 ksys_ioctl+0x66/0x70 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x54/0x200 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f59a2b95f47 Fixes: `8fcc4b5923` ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-23 11:30:15 +01:00
Sean Christopherson	05c19c2fe1	KVM: x86: Open code shared_msr_update() in its only caller Fold shared_msr_update() into its sole user to eliminate its pointless bounds check, its godawful printk, its misleading comment (it's called under a global lock), and its woefully inaccurate name. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-23 11:29:38 +01:00
Miaohe Lin	faf0be2216	KVM: Fix jump label out_free_* in kvm_init() The jump labels out_free_1 and out_free_2 deal with the same stuff, so get rid of one and rename the label out_free_0a to retain the label name order. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-23 11:29:17 +01:00
Sean Christopherson	24885d1d79	KVM: x86: Remove a spurious export of a static function A recent change inadvertently exported a static function, which results in modpost throwing a warning. Fix it. Fixes: `cbbaa2727a` ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-23 11:28:59 +01:00
Ingo Molnar	8cacac6ecd	Merge tag 'perf-core-for-mingo-5.5-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf report: Jin Yao: - Allow entering the annotation view (symbol source/assembly + overhead/cycles/etc column) from the 'perf report --total-cycles' interface. E.g.: # perf record --all-cpus --branch-any --all-kernel ^C[ perf record: Woken up 5 times to write data ] # # perf evlist -v cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|CPU\|PERIOD\|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY # # perf report --total-cycles # # Samples: 78762 of event 'cycles' Sampled Sampled Avg Avg Cycles% Cycles Cycles% Cycles [Program Block Range] Shared Object 1.72% 95.8K 0.00% 254 [msr.h:105 -> msr.h:166] [kernel.vmlinux] 1.56% 107.6K 0.00% 618 [compiler.h:199 -> common.c:301] [kernel.vmlinux] 0.83% 46.3K 0.00% 409 [entry_64.S:153 -> entry_64.S:175] [kernel.vmlinux] 0.83% 46.1K 0.00% 83 [jump_label.h:41 -> tsc.c:230] [kernel.vmlinux] 0.64% 36.9K 0.01% 1.4K [hda_intel.c:904 -> hda_intel.c:916] [snd_hda_intel] 0.57% 30.2K 0.00% 282 [file.c:710 -> file.c:730] [kernel.vmlinux] 0.48% 25.8K 0.00% 82 [spinlock.c:158 -> spinlock.c:160] [kernel.vmlinux] 0.45% 23.7K 0.00% 369 [tick-broadcast.c:585 -> tick-broadcast.c:586] [kernel.vmlinux] 0.44% 24.4K 0.00% 73 [msr.h:236 -> tsc.c:1088] [kernel.vmlinux] 0.43% 22.7K 0.00% 144 [cpuidle.c:229 -> cpuidle.c:232] [kernel.vmlinux] Then press 'A' or Enter on one of those lines, just like with 'perf top', say the top one: [msr.h:105 -> msr.h:166], then this shows up: Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762 native_write_msr /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period] 
Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%) │ │ Disassembly of section .text: │ │ ffffffff8106c480 <native_write_msr>: │ __wrmsr(): │ return EAX_EDX_VAL(val, low, high); │ } │ │ static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high) │ { │ asm volatile("1: wrmsr\n" 49.16 │0.02 mov %edi,%ecx │0.02 mov %esi,%eax │0.02 wrmsr │ arch_static_branch(): │ #include <linux/stringify.h> │ #include <linux/types.h> │ │ static __always_inline bool arch_static_branch(struct static_key *key, bool branch) │ { │ asm_volatile_goto("1:" 0.79 │0.02 nop │ native_write_msr(): │ { │ __wrmsr(msr, low, high); │ │ if (msr_tracepoint_active(__tracepoint_write_msr)) │ do_trace_write_msr(msr, ((u64)high << 32 \| low), 0); │ } 50.05 │0.02 254 ← retq │ do_trace_write_msr(msr, ((u64)high << 32 \| low), 0); │ shl $0x20,%rdx │ mov %esi,%esi │ or %rdx,%rsi │ xor %edx,%edx │ → jmpq do_trace_write_msr We need to improve this to show the source code line numbers in the annotation view, so one can go from that program block to the annotation view and see those source code line numbers straight away. auxtrace/Intel PT: Adrian Hunter: - Add support for AUX area sampling, requires new functionality that will land in 5.5, its already in tip. This includes kernel capability querying so that it fails gracefully with older kernels, duimping aux area samples in 'perf report -D' and 'perf script'. perf.data: Alexey Budankov: - Fix decompression of PERF_RECORD_COMPRESSED records. core: Arnaldo Carvalho de Melo: - Use the 'dcacheline' cmp routine to find the right DSOs taking into account the 'maj', 'min', 'ino' and 'ino_generation', that got moved from 'struct map' to 'struct dso', where it belongs. This further reduces the size of 'struct map', there is still more work to do to maybe get it to max one cacheline. libtraceevent: Hewenliang: - Fix memory leakage in copy_filter_type(). Sudip Mukherjee: - Fix header installation. 
perf parse: Ian Rogers : - Fix potential memory leak when handling tracepoint errors, found using LLVM's libFuzzer. perf probe: Colin Ian King: - Fix spelling mistake "addrees" -> "address". Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-23 09:00:13 +01:00
Masahiro Yamada	b1fbfcb4a2	kbuild: make single target builds even faster Commit `2dffd23f81` ("kbuild: make single target builds much faster") made the situation much better. To improve it even more, apply the similar idea to the top Makefile. Trim unrelated directories from build-dirs. The single build code must be moved above the 'descend' target. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Jens Axboe <axboe@kernel.dk>	2019-11-23 15:46:42 +09:00
Masahiro Yamada	7ef9ab3b32	modpost: respect the previous export when 'exported twice' is warned When 'exported twice' is warned, let sym_add_exported() return without updating the symbol info. This respects the previous export, which is ordered first in modules.order. This simplifies the code too. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-11-23 15:46:42 +09:00
Masahiro Yamada	e4b26c9f75	modpost: do not set ->preloaded for symbols from Module.symvers Now there is no overlap between symbols from ELF files and ones from Module.symvers, so the 'exported twice' warning should be reported irrespective of where the symbol in question came from. The exceptional case is external module; in some cases, we build an external module to provide a different version/variant of the corresponding in-kernel module, overriding the same set of exported symbols. You can see this use-case in upstream; tools/testing/nvdimm/libnvdimm.ko replaces drivers/nvdimm/libnvdimm.ko in order to link it against mocked version of core kernel symbols. So, let's relax the 'exported twice' warning when building external modules. The multiple export from external modules is warned only when the previous one is from vmlinux or itself. With this refactoring, the ugly preloading goes away. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-11-23 15:46:42 +09:00
Masahiro Yamada	1743694eb2	modpost: stop symbol preloading for modversion CRC It is complicated to add mocked-up symbols for pre-handling CRC. Handle CRC after all the export symbols in the relevant module are registered. Call handle_modversion() after the handle_symbol() iteration. In some cases, I see stand-alone __crc_* without __ksymtab_. For example, ARCH=arm allyesconfig produces __crc_ccitt_veneer and __crc_itu_t_veneer. I guess they come from crc_ccitt, crc_itu_t, respectively. Since ___veneer are auto-generated symbols, just ignore them. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-11-23 15:46:38 +09:00
Giovanni Mascellani	4a1288f1c1	dell-smm-hwmon: Add documentation Part of the documentation is taken from the README of the userspace utils (https://github.com/vitorafsr/i8kutils). The license is GPL-2+ and the author Massimo Dal Zotto is already credited as author of the module. Therefore there should be no copyright problem. I also added a paragraph with specific information on the experimental support for automatic BIOS fan control. Signed-off-by: Giovanni Mascellani <gio@debian.org> Link: https://lore.kernel.org/r/20191122101519.1246458-2-gio@debian.org [groeck: Fixed some of the documentation warnings] Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2019-11-22 20:47:43 -08:00
Giovanni Mascellani	afe45277ad	hwmon: (dell-smm) Add support for disabling automatic BIOS fan control This patch exports standard hwmon pwmX_enable sysfs attribute for enabling or disabling automatic fan control by BIOS. Standard value "1" is for disabling automatic BIOS fan control and value "2" for enabling. By default BIOS auto mode is enabled by laptop firmware. When BIOS auto mode is enabled, custom fan speed value (set via hwmon pwmX sysfs attribute) is overwritten by SMM in a few seconds and therefore any custom settings are without effect. This is the reason why an option for disabling BIOS auto mode is needed. So finally this patch allows kernel to set and control fan speed on laptops, but it can be dangerous (like setting speed of other fans). The SMM commands to enable or disable automatic fan control are not documented and are not the same on all Dell laptops. Therefore a whitelist is used to send the correct codes only on laptops for which they are known. This patch was originally developed by Pali Rohár; later Giovanni Mascellani implemented the whitelist. Signed-off-by: Giovanni Mascellani <gio@debian.org> Co-Developed-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Link: https://lore.kernel.org/r/20191122101519.1246458-1-gio@debian.org [groeck: Fixed checkpatch warnings] Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2019-11-22 20:08:58 -08:00
Masahiro Yamada	9bd2a099d7	modpost: rename handle_modversions() to handle_symbol() This function handles not only modversions, but also unresolved symbols, export symbols, etc. Rename it to a more proper function name. While I was here, I also added the 'const' qualifier to *sym. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-11-23 12:44:24 +09:00
Masahiro Yamada	e84f9fbbec	modpost: refactor namespace_from_kstrtabns() to not hard-code section name Currently, namespace_from_kstrtabns() relies on the fact that namespace strings are recorded in the __ksymtab_strings section. Actually, it is coded in include/linux/export.h, but modpost does not need to hard-code the section name. Elf_Sym::st_shndx holds the index of the relevant section. Using it is a more portable way to get the namespace string. Make namespace_from_kstrtabns() simply call sym_get_data(), and delete the info->ksymtab_strings . While I was here, I added more 'const' qualifiers to pointers. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-11-23 12:44:24 +09:00
Masahiro Yamada	afa0459daa	modpost: add a helper to get data pointed by a symbol When CONFIG_MODULE_REL_CRCS is enabled, the value of __crc_* is not an absolute value, but the address to the CRC data embedded in the .rodata section. Getting the data pointed by the symbol value is somewhat complex. Split it out into a new helper, sym_get_data(). I will reuse it to refactor namespace_from_kstrtabns() in the next commit. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-11-23 12:44:24 +09:00
Paul Walmsley	1646220a6d	Merge branch 'next/defconfig-add-debug' into for-next	2019-11-22 18:59:23 -08:00
Paul Walmsley	8eace9fb39	Merge branch 'next/misc2' into for-next	2019-11-22 18:59:17 -08:00
Paul Walmsley	5ba9aa56e6	Merge branch 'next/nommu' into for-next Conflicts: arch/riscv/boot/Makefile arch/riscv/include/asm/sbi.h	2019-11-22 18:59:09 -08:00
Paul Walmsley	4a979862dd	Merge branch 'next/misc' into for-next	2019-11-22 18:58:34 -08:00
Paul Walmsley	e8cad25b7e	Merge branch 'next/tlb-opt' into for-next	2019-11-22 18:58:32 -08:00
Paul Walmsley	9acfd6f538	Merge branch 'next/isa-string' into for-next	2019-11-22 18:58:29 -08:00
Paul Walmsley	69049d523f	Merge branch 'next/seccomp' into for-next	2019-11-22 18:58:26 -08:00
Yash Shah	2cc6c4a0da	RISC-V: Add address map dumper Add support for dumping the kernel address space layout to the console. User can enable CONFIG_DEBUG_VM to dump the virtual memory region into dmesg buffer during boot-up. Signed-off-by: Yash Shah <yash.shah@sifive.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Anup Patel <anup@brainfault.org> [paul.walmsley@sifive.com: dropped .init/.text/.data/.bss prints; added PCI legacy I/O region display] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-11-22 18:53:26 -08:00
Paul Walmsley	2e06b27175	riscv: defconfigs: enable more debugging options Enable more debugging options in the RISC-V defconfigs to help kernel developers catch problems with patches earlier in the development cycle. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>	2019-11-22 18:50:58 -08:00
Jakub Kicinski	8dcdc9524c	Merge branch 'sfc-ARFS-expiry-improvements' Edward Cree says: ==================== A series of changes to how we check filters for expiry, manage how much of that work to do & when, etc. Prompted by some pathological behaviour under heavy load, which was Reported-by: David Ahern <dahern@digitalocean.com> ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-22 17:56:29 -08:00
Edward Cree	6fbc05e591	sfc: do ARFS expiry work occasionally even without NAPI poll If there's no traffic on a channel, its ARFS expiry work will never get scheduled by efx_poll() as that isn't being run. So make efx_filter_rfs_expire() reschedule itself to run after 30 seconds. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-22 17:50:55 -08:00
Edward Cree	ca70bd423f	sfc: add statistics for ARFS Report the number of successful and failed insertions, and also the current count of filters, to aid in tuning e.g. rps_flow_cnt. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>	2019-11-22 17:50:55 -08:00

887794 Commits