linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 11:50:43 +09:00

Author	SHA1	Message	Date
Vasily Gorbik	f44fa79b10	s390/test_unwind: require that unwinding ended successfully Currently unwinder test passes if unwinding results contain unwindme_func2 and unwindme_func1 functions. Now that unwinder reports success upon reaching task pt_regs, check that unwinding ended successfully in every test. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Ilya Leoshkevich	badbf39790	s390/unwind: add a test for the internal API unwind_for_each_frame can take at least 8 different sets of parameters. Add a test to make sure they all are handled in a sane way. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Co-developed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Vasily Gorbik	adcfb8cdc9	s390/unwind: always inline get_stack_pointer Always inline get_stack_pointer() to avoid potential problems due to compiler inlining decisions, i.e. getting stack pointer of get_stack_pointer() itself which is later reused. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Niklas Schnelle	d497b7ec83	s390/pci: add error message on device number limit The config option CONFIG_PCI_NR_FUNCTIONS sets a limit on the number of PCI functions we can support. Previously on reaching this limit there was no indication why newly attached devices are not recognized by Linux which could be quite confusing. Thus this patch adds a pr_err() for this case. Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Niklas Schnelle	794b8846dc	s390/pci: add error message for UID collision When UID checking was turned off during runtime in the underlying hypervisor, a PCI device may be attached with the same UID. This is already detected but happens silently. Add an error message so it can more easily be understood why a device was not added. Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	247f265fa5	s390/cpum_sf: Check for SDBT and SDB consistency Each SBDT is located at a 4KB page and contains 512 entries. Each entry of a SDBT points to a SDB, a 4KB page containing sampled data. The last entry is a link to another SDBT page. When an event is created the function sequence executed is: __hw_perf_event_init() +--> allocate_buffers() +--> realloc_sampling_buffers() +---> alloc_sample_data_block() Both functions realloc_sampling_buffers() and alloc_sample_data_block() allocate pages and the allocation can fail. This is handled correctly and all allocated pages are freed and error -ENOMEM is returned to the top calling function. Finally the event is not created. Once the event has been created, the amount of initially allocated SDBT and SDB can be too low. This is detected during measurement interrupt handling, where the amount of lost samples is calculated. If the number of lost samples is too high considering sampling frequency and already allocated SBDs, the number of SDBs is enlarged during the next execution of cpumsf_pmu_enable(). If more SBDs need to be allocated, functions realloc_sampling_buffers() +---> alloc-sample_data_block() are called to allocate more pages. Page allocation may fail and the returned error is ignored. A SDBT and SDB setup already exists. However the modified SDBTs and SDBs might end up in a situation where the first entry of an SDBT does not point to an SDB, but another SDBT, basicly an SBDT without payload. This can not be handled by the interrupt handler, where an SDBT must have at least one entry pointing to an SBD. Add a check to avoid SDBTs with out payload (SDBs) when enlarging the buffer setup. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	7dd6b199df	s390/cpum_sf: Use TEAR_REG macro consistantly The macro TEAR_REG() saves the last used SDBT address in the perf_hw_event structure. This is also done by function hw_reset_registers() which is a one-liner and simply uses macro TEAR_REG(). Remove function hw_reset_registers(), which is only used one time and use macro TEAR_REG() instead. This macro is used throughout the code anyway. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	c17a7c6ee8	s390/cpum_sf: Remove unnecessary check for pending SDBs In interrupt handling the function extend_sampling_buffer() is called after checking for a possibly extension. This check is not necessary as the called function itself performs this check again. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	532da3de70	s390/cpum_sf: Replace function name in debug statements Replace hard coded function names in debug statements by the "%s ...", __func__ construct suggested by checkpatch.pl script. Use consistent debug print format of the form variable blank value. Also add leading 0x for all hex values. Print allocated page addresses consistantly as hex numbers with leading 0x. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Gerald Schaefer	a9f2f6865d	s390/kaslr: store KASLR offset for early dumps The KASLR offset is added to vmcoreinfo in arch_crash_save_vmcoreinfo(), so that it can be found by crash when processing kernel dumps. However, arch_crash_save_vmcoreinfo() is called during a subsys_initcall, so if the kernel crashes before that, we have no vmcoreinfo and no KASLR offset. Fix this by storing the KASLR offset in the lowcore, where the vmcore_info pointer will be stored, and where it can be found by crash. In order to make it distinguishable from a real vmcore_info pointer, mark it as uneven (KASLR offset itself is aligned to THREAD_SIZE). When arch_crash_save_vmcoreinfo() stores the real vmcore_info pointer in the lowcore, it overwrites the KASLR offset. At that point, the KASLR offset is not yet added to vmcoreinfo, so we also need to move the mem_assign_absolute() behind the vmcoreinfo_append_str(). Fixes: `b2d24b97b2` ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	e76e69611e	s390/unwind: stop gracefully at task pt_regs Consider reaching task pt_regs graceful unwinder termination. Task pt_regs itself never contains a valid state to which a task might return within the kernel context (user task pt_regs is a special case). Since we already avoid printing user task pt_regs and in most cases we don't even bother filling task pt_regs psw and r15 with something reasonable simply skip task pt_regs altogether. With this change unwind_error() now accurately represent whether unwinder reached task pt_regs successfully or failed along the way. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	cb7948e8c3	s390/head64: correct init_task stack setup Add missing allocation of pt_regs at the bottom of the stack. This makes it consistent with other stack setup cases and also what stack unwinder expects. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	97806dfb6f	s390/unwind: make reuse_sp default when unwinding pt_regs Currently unwinder yields 2 entries when pt_regs are met: sp="address of pt_regs itself" ip=pt_regs->psw sp=pt_regs->gprs[15] ip="r14 from stack frame pointed by pt_regs->gprs[15]" And neither of those 2 states (combination of sp and ip) ever happened. reuse_sp has been introduced by commit `a1d863ac3e` ("s390/unwind: fix mixing regs and sp"). reuse_sp=true makes unwinder keen to produce the following result, when pt_regs are given (as an arg to unwind_start): sp=pt_regs->gprs[15] ip=pt_regs->psw sp=pt_regs->gprs[15] ip="r14 from stack frame pointed by pt_regs->gprs[15]" The first state is an actual state in which a task was when pt_regs were collected. The second state is marked unreliable and is for debugging purposes to cover the case when a task has been interrupted in between stack frame allocation and writing back_chain - in this case r14 might show an actual caller. Make unwinder behaviour enabled via reuse_sp=true default and drop the special case handling. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	67f5593419	s390/unwind: report an error if pt_regs are not on stack If unwinder is looking at pt_regs which is not on stack then something went wrong and an error has to be reported rather than successful unwinding termination. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	7bcaad1f9f	s390: avoid misusing CALL_ON_STACK for task stack setup CALL_ON_STACK is intended to be used for temporary stack switching with potential return to the caller. When CALL_ON_STACK is misused to switch from nodat stack to task stack back_chain information would later lead stack unwinder from task stack into (per cpu) nodat stack which is reused for other purposes. This would yield confusing unwinding result or errors. To avoid that introduce CALL_ON_STACK_NORETURN to be used instead. It makes sure that back_chain is zeroed and unwinder finishes gracefully ending up at task pt_regs. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	7579425777	s390: correct CALL_ON_STACK back_chain saving Currently CALL_ON_STACK saves r15 as back_chain in the first stack frame of the stack we about to switch to. But if a function which uses CALL_ON_STACK calls other function it allocates a stack frame for a callee. In this case r15 is pointing to a callee stack frame and not a stack frame of function itself. This results in dummy unwinding entry with random sp and ip values. Introduce and utilize current_frame_address macro to get an address of actual function stack frame. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	103b4cca60	s390/unwind: unify task is current checks Avoid mixture of task == NULL and task == current meaning the same thing and simply always initialize task with current in unwind_start. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	7f28dad395	s390: disable preemption when switching to nodat stack with CALL_ON_STACK Make sure preemption is disabled when temporary switching to nodat stack with CALL_ON_STACK helper, because nodat stack is per cpu. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	c2e06e15ad	s390: always inline disabled_wait disabled_wait uses _THIS_IP_ and assumes that compiler would inline it. Make sure this assumption is always correct by utilizing __always_inline. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Heiko Carstens	5a5525b048	s390/vdso: fix getcpu getcpu reads the required values for cpu and node with two instructions. This might lead to an inconsistent result if user space gets preempted and migrated to a different CPU between the two instructions. Fix this by using just a single instruction to read both values at once. This is currently rather a theoretical bug, since there is no real NUMA support available (except for NUMA emulation). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Heiko Carstens	a2308c11ec	s390/smp,vdso: fix ASCE handling When a secondary CPU is brought up it must initialize its control registers. CPU A which triggers that a secondary CPU B is brought up stores its control register contents into the lowcore of new CPU B, which then loads these values on startup. This is problematic in various ways: the control register which contains the home space ASCE will correctly contain the kernel ASCE; however control registers for primary and secondary ASCEs are initialized with whatever values were present in CPU A. Typically: - the primary ASCE will contain the user process ASCE of the process that triggered onlining of CPU B. - the secondary ASCE will contain the percpu VDSO ASCE of CPU A. Due to lazy ASCE handling we may also end up with other combinations. When then CPU B switches to a different process (!= idle) it will fixup the primary ASCE. However the problem is that the (wrong) ASCE from CPU A was loaded into control register 1: as soon as an ASCE is attached (aka loaded) a CPU is free to generate TLB entries using that address space. Even though it is very unlikey that CPU B will actually generate such entries, this could result in TLB entries of the address space of the process that ran on CPU A. These entries shouldn't exist at all and could cause problems later on. Furthermore the secondary ASCE of CPU B will not be updated correctly. This means that processes may see wrong results or even crash if they access VDSO data on CPU B. The correct VDSO ASCE will eventually be loaded on return to user space as soon as the kernel executed a call to strnlen_user or an atomic futex operation on CPU B. Fix both issues by intializing the to be loaded control register contents with the correct ASCEs and also enforce (re-)loading of the ASCEs upon first context switch and return to user space. Fixes: `0aaba41b58` ("s390: remove all code using the access register mode") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Harald Freudenberger	6733775a92	s390/zcrypt: handle new reply code FILTERED_BY_HYPERVISOR This patch introduces support for a new architectured reply code 0x8B indicating that a hypervisor layer (if any) has rejected an ap message. Linux may run as a guest on top of a hypervisor like zVM or KVM. So the crypto hardware seen by the ap bus may be restricted by the hypervisor for example only a subset like only clear key crypto requests may be supported. Other requests will be filtered out - rejected by the hypervisor. The new reply code 0x8B will appear in such cases and needs to get recognized by the ap bus and zcrypt device driver zoo. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Ilya Leoshkevich	914d52e464	s390: implement perf_arch_fetch_caller_regs On s390 bpf_get_stack_raw_tp() returns 0 entries for both kernel and user stacks. While there is no practical unwinding solution for userspace on s390 at this moment, there certainly is a kernel unwinder. However, it is not properly integrated with BPF. In order to start unwinding, bpf_get_stack_raw_tp() obtains the current kernel register values using perf_fetch_caller_regs(), which is not implemented for s390. The actual unwinding then happens by passing those registers to perf_callchain_kernel(). Implement perf_arch_fetch_caller_regs() for s390, where __builtin_frame_address(0) points to back_chain. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Max Filippov	9d9043f6a8	xtensa: clean up system_call/xtensa_rt_sigreturn interaction system_call assembly code always pushes pointer to struct pt_regs as the last additional parameter for all system calls. The only user of this feature is xtensa_rt_sigreturn. Avoid this special case. Define xtensa_rt_sigreturn as accepting no argiments. Use current_pt_regs to get pointer to struct pt_regs in xtensa_rt_sigreturn. Don't pass additional parameter from system_call code. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-11-29 19:37:12 -08:00
Max Filippov	02ce94c229	xtensa: fix system_call interaction with ptrace Don't overwrite return value if system call was cancelled at entry by ptrace. Return status code from do_syscall_trace_enter so that pt_regs::syscall doesn't need to be changed to skip syscall. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-11-29 15:47:54 -08:00
Max Filippov	ba9c1d6599	xtensa: rearrange syscall tracing system_call saves and restores syscall number across system call to make clone and execv entry and exit tracing match. This complicates things when syscall code may be changed by ptrace. Preserve syscall code in copy_thread and start_thread directly instead of doing tricks in system_call. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-11-29 14:39:12 -08:00
Thadeu Lima de Souza Cascardo	2745aea675	selftests: pmtu: use -oneline for ip route list cache Some versions of iproute2 will output more than one line per entry, which will cause the test to fail, like: TEST: ipv6: list and flush cached exceptions [FAIL] can't list cached exceptions That happens, for example, with iproute2 4.15.0. When using the -oneline option, this will work just fine: TEST: ipv6: list and flush cached exceptions [ OK ] This also works just fine with a more recent version of iproute2, like 5.4.0. For some reason, two lines are printed for the IPv4 test no matter what version of iproute2 is used. Use the same -oneline parameter there instead of counting the lines twice. Fixes: `b964641e99` ("selftests: pmtu: Make list_flush_ipv6_exception test more demanding") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-29 12:29:55 -08:00
Jiri Kosina	d8d0470875	Merge branch 'for-5.5/whiskers' into for-linus - robustification of tablet mode support in google-whiskers driver (Dmitry Torokhov)	2019-11-29 20:39:21 +01:00
Jiri Kosina	a820e45039	Merge branch 'for-5.5/logitech' into for-linus - Support for Logitech G15 (Hans de Goede) - silencing of non-informative error flow in dmesg from logitechi-hiddpp (Hans de Goede)	2019-11-29 20:37:55 +01:00
Jiri Kosina	2fa55328f1	Merge branch 'for-5.5/ish' into for-linus - typo fix (Geert Uytterhoeven)	2019-11-29 20:37:10 +01:00
Jiri Kosina	b49b511f41	Merge branch 'for-5.5/i2c' into for-linus - removal of superfluous delay (You-Sheng Yang)	2019-11-29 20:36:45 +01:00
Jiri Kosina	09f5429ddf	Merge branch 'for-5.5/hidraw' into for-linus - printk() -> pr_*() cleanup (Rishi Gupta)	2019-11-29 20:36:00 +01:00
Jiri Kosina	b746a1a286	Merge branch 'for-5.5/core' into for-linus - hid_have_special_driver[] cleanup for LED devices (Heiner Kallweit) - HID parser improvements (Blaž Hrastnik, Candle Sun)	2019-11-29 20:34:28 +01:00
Arnaldo Carvalho de Melo	77b91c1a52	perf machine: Fill map_symbol->maps in append_inlines() to fix segfault I forgot to fill in the map_symbol->maps field in append_inlines() which then makes code down the line segfault when trying to deref it. It doesn't make any sense to have an addr_location with its 'map' member not NULL while its 'maps' is NULL, after all al->maps is where al->map is in. It is done that way so that we don't have to have in each 'struct map' a pointer to the 'struct maps' it is in, as we had in the past when we would have 'map->mg', before 'struct maps' was combined with 'struct map_groups', because there was always a one-to-one relationship for these structs. This fixes a segfault when processing DWARF callgraphs in 'perf report'. Reported-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: `08f6680e62` ("perf tools: Add a 'struct map_groups' pointer to 'struct map_symbol'") Link: http://lore.kernel.org/lkml/20191129160631.GD26963@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 16:11:06 -03:00
Paolo Bonzini	3525d0ccd9	Merge tag 'kvm-ppc-uvmem-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD KVM: Add support for secure guests under the Protected Execution Framework (PEF) Ultravisor on POWER. This enables secure memory to be represented as device memory, which provides a way for the host to keep track of which pages of a secure guest have been moved into secure memory managed by the ultravisor and are no longer accessible by the host, and manage movement of pages between secure and normal memory.	2019-11-29 19:20:08 +01:00
Wainer dos Santos Moschetta	80b10aa924	Documentation: kvm: Fix mention to number of ioctls classes In api.txt it is said that KVM ioctls belong to three classes but in reality it is four. Fixed this, but do not count categories anymore to avoid such as outdated information in the future. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-29 19:18:51 +01:00
Jens Axboe	aa4c396775	io_uring: fix missing kmap() declaration on powerpc Christophe reports that current master fails building on powerpc with this error: CC fs/io_uring.o fs/io_uring.c: In function ‘loop_rw_iter’: fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’ [-Werror=implicit-function-declaration] iovec.iov_base = kmap(iter->bvec->bv_page) ^ fs/io_uring.c:1628:19: warning: assignment makes pointer from integer without a cast [-Wint-conversion] iovec.iov_base = kmap(iter->bvec->bv_page) ^ fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’ [-Werror=implicit-function-declaration] kunmap(iter->bvec->bv_page); ^ which is caused by a missing highmem.h include. Fix it by including it. Fixes: `311ae9e159` ("io_uring: fix dead-hung for non-iter fixed rw") Reported-by: Christophe Leroy <christophe.leroy@c-s.fr> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-29 10:14:00 -07:00
Ian Rogers	fa7f7e7354	perf jit: Move test functionality in to a test Adds a test for minimal jit_write_elf functionality. Committer testing: # perf test jit 61: Test jit_write_elf : Ok # # perf test -v jit 61: Test jit_write_elf : --- start --- test child forked, pid 10460 Writing jit code to: /tmp/perf-test-KqxURR test child finished with 0 ---- end ---- Test jit_write_elf: Ok # Committer notes: Fix up the case where HAVE_JITDUMP is no defined. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lore.kernel.org/lkml/20191126235913.41855-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	704e2f5b70	perf stat: Use affinity for enabling/disabling events Restructure event enabling/disabling to use affinity, which minimizes the number of IPIs needed. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 54.65 1.899986 22 84812 660 ioctl after: 39.21 0.930451 10 84796 644 ioctl Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-13-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	363fb12189	perf evsel: Add functions to enable/disable for a specific CPU Refactor the existing functions to use these functions internally. Used in the next patch. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-12-andi@firstfloor.org Link: http://lore.kernel.org/lkml/20191127232657.GL84886@tassilo.jf.intel.com # Fix Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	4b49ab708d	perf stat: Use affinity for reading Restructure event reading to use affinity to minimize the number of IPIs needed. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 3.16 0.106079 4 22082 read After: 3.43 0.081295 3 22082 read Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-11-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	4804e01116	perf stat: Use affinity for opening events Restructure the event opening in perf stat to cycle through the events by CPU after setting affinity to that CPU. This eliminates IPI overhead in the perf API. We have to loop through the CPU in the outter builtin-stat code instead of leaving that to low level functions. It has to change the weak group fallback strategy slightly. Since we cannot easily undo the opens for other CPUs move the weak group retry to a separate loop. Before with a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 42.75 4.050910 67 60046 110 perf_event_open After: 26.86 0.944396 16 58069 110 perf_event_open (the number changes slightly because the weak group retries work differently and the test case relies on weak groups) Committer notes: Added one of the hunks in a patch provided by Andi after I noticed that the "event times" 'perf test' entry was segfaulting. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-10-andi@firstfloor.org Link: http://lore.kernel.org/lkml/20191127232657.GL84886@tassilo.jf.intel.com # Fix Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	e0e6a6ca3a	perf stat: Factor out open error handling Factor out the open error handling into a separate function. This is useful for followon patches who need to duplicate this. No behavior change intended. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-9-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	7736627b86	perf stat: Use affinity for closing file descriptors Closing a perf fd can also trigger an IPI to the target CPU. Use the same affinity technique as we use for reading/enabling events to closing to optimize the CPU transitions. Before on a large test case with 94 CPUs: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 32.56 3.085463 50 61483 close After: 10.54 0.735704 11 61485 close Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-8-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	99d6141d67	perf evsel: Add functions to close evsel on a CPU Refactor the existing all CPU function to use the per CPU close internally. Export APIs to close per CPU. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-7-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	a8cbe40fe9	perf evsel: Add iterator to iterate over events ordered by CPU Add some common code that is needed to iterate over all events in CPU order. Used in followon patches Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-6-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	a2408a7036	perf evlist: Maintain evlist->all_cpus Maintain a cpumap in the evlist that is the union of all the cpus of the events. This needs a cpumap merge operation, which is added together with tests. v2: Add tests for cpu map merge Fix handling of duplicates Rename _update to _merge Factor out sorting. Fix handling of NULL maps in merge v3: Add comments and empty lines to _merge Committer testing: # perf test "Merge cpu map" 52: Merge cpu map : Ok # Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20191121001522.180827-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Andi Kleen	7074674e73	perf cpumap: Maintain cpumaps ordered and without dups Enforce this in _trim() Needed for followon change. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20191121001522.180827-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-29 12:20:45 -03:00
Jaroslav Kysela	d2cd795c4e	ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen The auto-parser assigns the bass speaker to DAC3 (NID 0x06) which is without the volume control. I do not see a reason to use DAC2, because the shared output to all speakers produces the sufficient and well balanced sound. The stereo support is enough for this purpose (laptop). Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20191129144027.14765-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>	2019-11-29 15:43:19 +01:00
Kai Vehmanen	609f548534	ALSA: hda: hdmi - preserve non-MST PCM routing for Intel platforms Commit `5398e94fb7` ("ALSA: hda - Add DP-MST support for NVIDIA codecs") introduced a slight change of behaviour how non-MST monitors are assigned to PCMs on Intel platforms. In the drm_audio_component.h interface, the third parameter to pin_eld_notify() is pipe number. On Intel platforms, this value is -1 for MST. On other platforms, a non-zero pipe id is used to signal MST use. This difference leads to some subtle differences in hdmi_find_pcm_slot() with regards to how non-MST monitors are assigned to PCMs. This patch restores the original behaviour on Intel platforms while keeping the new allocation policy on other platforms. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20191129143756.23941-2-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2019-11-29 15:42:16 +01:00

... 40 41 42 43 44 ...

888315 Commits