linux

mirror of https://github.com/hardkernel/linux.git synced 2026-04-02 11:13:02 +09:00

Author	SHA1	Message	Date
Victor Wan	2c95ea743b	Merge branch 'android-4.9' into amlogic-4.9-dev Conflicts: arch/arm/configs/omap2plus_defconfig drivers/Makefile drivers/android/binder.c	2018-01-08 18:44:19 +08:00
jianxin.pan	872270857a	module: skip sublevel and crc when ver check PD#157069: skip SUBLEVEL and crc during the version check at insmod. When CONFIG_MODVERSIONS is enabled, vermagic and crc are checked during insmod. Change-Id: I6eb7bdda5b771afa754f7b783a7bbfe1be7cedd1 Signed-off-by: jianxin.pan <jianxin.pan@amlogic.com>	2017-12-19 04:11:48 -07:00
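One way to picture the relaxed check in the commit above is a comparison that looks only at the VERSION and PATCHLEVEL fields of two kernel release strings. The sketch below is a user-space illustration under that assumption. `same_version_ignoring_sublevel()` is a hypothetical helper, not the function touched by the patch, and real vermagic strings carry further fields (SMP, preempt and so on) that this toy ignores.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: treat two release strings as compatible when
 * VERSION and PATCHLEVEL match, ignoring SUBLEVEL ("4.9.68" vs "4.9.40").
 */
static bool same_version_ignoring_sublevel(const char *a, const char *b)
{
    int a_ver, a_patch, b_ver, b_patch;

    /* Only the first two numeric fields are parsed; the rest is skipped. */
    if (sscanf(a, "%d.%d", &a_ver, &a_patch) != 2)
        return false;
    if (sscanf(b, "%d.%d", &b_ver, &b_patch) != 2)
        return false;

    return a_ver == b_ver && a_patch == b_patch;
}
```

With this rule a module built against 4.9.40 would pass the version comparison on a 4.9.68 kernel, while one built against 4.14.x would not.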
Greg Kroah-Hartman	fdeec8fdb7	Merge 4.9.68 into android-4.9 Changes in 4.9.68 bcache: only permit to recovery read error when cache device is clean bcache: recover data from backing when data is clean drm/fsl-dcu: avoid disabling pixel clock twice on suspend drm/fsl-dcu: enable IRQ before drm_atomic_helper_resume() Revert "crypto: caam - get rid of tasklet" mm, oom_reaper: gather each vma to prevent leaking TLB entry uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub serial: 8250_pci: Add Amazon PCI serial device ID s390/runtime instrumentation: simplify task exit handling USB: serial: option: add Quectel BG96 id ima: fix hash algorithm initialization s390/pci: do not require AIS facility selftests/x86/ldt_get: Add a few additional tests for limits staging: greybus: loopback: Fix iteration count on async path m68k: fix ColdFire node shift size calculation serial: 8250_fintek: Fix rs485 disablement on invalid ioctl() staging: rtl8188eu: avoid a null dereference on pmlmepriv spi: sh-msiof: Fix DMA transfer size check spi: spi-axi: fix potential use-after-free after deregistration mmc: sdhci-msm: fix issue with power irq usb: phy: tahvo: fix error handling in tahvo_usb_probe() serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt() EDAC, sb_edac: Fix missing break in switch sysrq : fix Show Regs call trace on ARM usbip: tools: Install all headers needed for libusbip development perf test attr: Fix ignored test case result kprobes/x86: Disable preemption in ftrace-based jprobes tools include: Do not use poison with C++ iio: adc: ti-ads1015: add 10% to conversion wait time dax: Avoid page invalidation races and unnecessary radix tree traversals net/mlx4_en: Fix type mismatch for 32-bit systems l2tp: take remote address into account in l2tp_ip and l2tp_ip6 socket lookups dmaengine: stm32-dma: Set correct args number for DMA request from DT dmaengine: 
stm32-dma: Fix null pointer dereference in stm32_dma_tx_status usb: gadget: f_fs: Fix ExtCompat descriptor validation libcxgb: fix error check for ip6_route_output() net: systemport: Utilize skb_put_padto() net: systemport: Pad packet before inserting TSB ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate ARM: OMAP1: DMA: Correct the number of logical channels vti6: fix device register to report IFLA_INFO_KIND be2net: fix accesses to unicast list be2net: fix unicast list filling net/appletalk: Fix kernel memory disclosure libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount net: qrtr: Mark 'buf' as little endian mm: fix remote numa hits statistics mac80211: calculate min channel width correctly ravb: Remove Rx overflow log messages nfs: Don't take a reference on fl->fl_file for LOCK operation drm/exynos/decon5433: update shadow registers iff there are active windows drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled KVM: arm/arm64: Fix occasional warning from the timer work function mac80211: prevent skb/txq mismatch NFSv4: Fix client recovery when server reboots multiple times perf/x86/intel: Account interrupts for PEBS errors powerpc/mm: Fix memory hotplug BUG() on radix qla2xxx: Fix wrong IOCB type assumption drm/amdgpu: fix bug set incorrect value to vce register drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement net: sctp: fix array overrun read on sctp_timer_tbl x86/fpu: Set the xcomp_bv when we fake up a XSAVES area drm/amdgpu: fix unload driver issue for virtual display mac80211: don't try to sleep in rate_control_rate_init() RDMA/qedr: Return success when not changing QP state RDMA/qedr: Fix RDMA CM loopback tipc: fix nametbl_lock soft lockup at module exit tipc: fix cleanup at module unload dmaengine: pl330: fix double lock tcp: correct memory barrier usage in tcp_check_space() i2c: i2c-cadence: Initialize configuration before probing devices nvmet: cancel fatal error and flush async work before free 
controller gtp: clear DF bit on GTP packet tx gtp: fix cross netns recv on gtp socket net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause net: thunderx: avoid dereferencing xcv when NULL be2net: fix initial MAC setting vfio/spapr: Fix missing mutex unlock when creating a window mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers xen-netfront: Improve error handling during initialization cec: initiator should be the same as the destination for, poll xen-netback: vif counters from int/long to u64 net: fec: fix multicast filtering hardware setup dma-buf/dma-fence: Extract __dma_fence_is_later() dma-buf/sw-sync: Fix the is-signaled test to handle u32 wraparound dma-buf/sw-sync: Prevent user overflow on timeline advance dma-buf/sw-sync: Reduce irqsave/irqrestore from known context dma-buf/sw-sync: sync_pt is private and of fixed size dma-buf/sw-sync: Fix locking around sync_timeline lists dma-buf/sw-sync: Use an rbtree to sort fences in the timeline dma-buf/sw_sync: move timeline_fence_ops around dma-buf/sw_sync: clean up list before signaling the fence dma-fence: Clear fence->status during dma_fence_init() dma-fence: Wrap querying the fence->status dma-fence: Introduce drm_fence_set_error() helper dma-buf/sw_sync: force signal all unsignaled fences on dying timeline dma-buf/sync_file: hold reference to fence when creating sync_file dma-buf: Update kerneldoc for sync_file_create usb: hub: Cycle HUB power when initialization fails usb: xhci: fix panic in xhci_free_virt_devices_depth_first USB: core: Add type-specific length check of BOS descriptors USB: Increase usbfs transfer limit USB: devio: Prevent integer overflow in proc_do_submiturb() USB: usbfs: Filter flags passed in from user space usb: host: fix incorrect updating of offset xen-netfront: avoid crashing on resume after a failure in talk_to_netback() Linux 4.9.68 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-12-10 17:13:13 +01:00
Jiri Olsa	a88ff235e8	perf/x86/intel: Account interrupts for PEBS errors [ Upstream commit `475113d937` ] It's possible to set up PEBS events to get only errors and not any data, like on SNB-X (model 45) and IVB-EP (model 62) via 2 perf commands running simultaneously: taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10 This leads to a soft lock up, because the error path of the intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt for error PEBS interrupts, so in case you're getting ONLY errors you don't have a way to stop the event when it's over the max_samples_per_tick limit: NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816] ... RIP: 0010:[<ffffffff81159232>] [<ffffffff81159232>] smp_call_function_single+0xe2/0x140 ... Call Trace: ? trace_hardirqs_on_caller+0xf5/0x1b0 ? perf_cgroup_attach+0x70/0x70 perf_install_in_context+0x199/0x1b0 ? ctx_resched+0x90/0x90 SYSC_perf_event_open+0x641/0xf90 SyS_perf_event_open+0x9/0x10 do_syscall_64+0x6c/0x1f0 entry_SYSCALL64_slow_path+0x25/0x25 Add perf_event_account_interrupt() which does the interrupt and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s error path. We keep the pending_kill and pending_wakeup logic only in the __perf_event_overflow() path, because they make sense only if there's any data to deliver. 
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vince@deater.net> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-09 22:01:52 +01:00
Viresh Kumar	87cdf4eda5	BACKPORT: schedutil: Reset cached freq if it is not in sync with next_freq 'cached_raw_freq' is used to get the next frequency quickly but should always be in sync with sg_policy->next_freq. There are cases where it is not and in such cases it should be reset to avoid switching to incorrect frequencies. Consider this case for example: - policy->cur is 1.2 GHz (Max) - A new request comes for 780 MHz and we store that in cached_raw_freq. - Based on 780 MHz, we calculate the effective frequency as 800 MHz. - We then decide not to update the frequency as sugov_up_down_rate_limit() returns true. - Here cached_raw_freq is 780 MHz and sg_policy->next_freq is 1.2 GHz. - Now if the utilization doesn't change in the next request, then the next target frequency will still be 780 MHz and it will match cached_raw_freq, so we will directly return 1.2 GHz instead of 800 MHz. BACKPORT of upstream commit `07458f6a51` ("cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq"). This also updates sugov_update_commit() for handling up/down tunables, which aren't present in mainline. Change-Id: Ie86465231e7cb265e5b4c26f59d6faf8d9630b0a Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-12-04 16:23:03 +00:00
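The 1.2 GHz / 780 MHz scenario in the commit above can be replayed with a small user-space model. This is only a sketch: `struct policy_model`, `resolve_freq()`, `get_next_freq()` and `commit_freq()` are illustrative names, the resolver that rounds up to the next 100 MHz step is an assumption standing in for the driver's frequency table, and the `reset_on_skip` flag exists only so that the behaviour with and without the reset can be compared.

```c
#include <stdbool.h>

/* Toy stand-in for the governor's per-policy state (frequencies in kHz). */
struct policy_model {
    unsigned int next_freq;        /* last frequency actually committed */
    unsigned int cached_raw_freq;  /* last raw request seen */
};

/* Assumed resolver: round the raw request up to a 100 MHz step. */
static unsigned int resolve_freq(unsigned int raw_khz)
{
    return (raw_khz + 99999) / 100000 * 100000;
}

/* Fast path: an unchanged raw request reuses next_freq without resolving. */
static unsigned int get_next_freq(struct policy_model *p, unsigned int raw_khz)
{
    if (raw_khz == p->cached_raw_freq)
        return p->next_freq;
    p->cached_raw_freq = raw_khz;
    return resolve_freq(raw_khz);
}

/*
 * Commit step: when the rate limit blocks the update, next_freq keeps its
 * old value. With reset_on_skip set, the cache is invalidated so the next
 * identical request is resolved again instead of returning a stale value.
 */
static bool commit_freq(struct policy_model *p, unsigned int freq,
                        bool rate_limited, bool reset_on_skip)
{
    if (rate_limited) {
        if (reset_on_skip)
            p->cached_raw_freq = 0;
        return false;
    }
    p->next_freq = freq;
    return true;
}
```

Starting from `next_freq = 1200000`, a rate-limited request for 780000 followed by the same request again yields 1200000 without the reset and 800000 with it, which matches the sequence listed in the commit message.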
Victor Wan	1e547d2935	Merge branch 'android-4.9' into amlogic-4.9-dev	2017-12-02 16:52:23 +08:00
Ke Wang	b763480947	sched: EAS/WALT: Don't take into account of running task's util When upmigrating a misfit running task, the currently running task's util has already been counted in cpu_util(). As a result __cpu_overutilized(), which currently adds the task's util a second time, overestimates utilization. Change-Id: I5326f4c736a55679009d2e7293f8792311c04294 Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>	2017-12-01 16:14:01 +00:00
Joonwoo Park	b41e1ca689	sched: EAS/WALT: take into account of waking task's load WALT's function cpu_util(cpu) reports the CPU's load without taking the waking task's load into account. Thus cpu_overutilized() currently underestimates the load on the previous CPU of the waking task. Take the task's load into account when determining whether the previous CPU is overutilized, so that we can bail out early without running energy_diff(), which is expensive. Change-Id: I30f146984a880ad2cc1b8a4ce35bd239a8c9a607 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (minor rebase conflicts) Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit `94e5c96507`) [trivial cherry-pick issues] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:11:57 +00:00
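The effect of adding the waking task's load to the overutilization test can be shown with a short arithmetic sketch. Everything here is an assumption made for illustration: `cpu_overutilized_model()` is a hypothetical name, the comparison `capacity * 1024 < (cpu_util + task_util) * margin` is one common way such a threshold is written, and the margin of 1280 (roughly an 80% threshold) is an example value rather than one taken from this tree.

```c
#include <stdbool.h>

#define CAPACITY_SCALE 1024UL

/*
 * Hypothetical model: a CPU counts as overutilized when its utilization,
 * plus the utilization of the task about to wake on it, exceeds the
 * capacity scaled down by the margin.
 */
static bool cpu_overutilized_model(unsigned long capacity,
                                   unsigned long cpu_util,
                                   unsigned long task_util,
                                   unsigned long margin)
{
    return capacity * CAPACITY_SCALE < (cpu_util + task_util) * margin;
}
```

For a CPU with capacity 1024 and utilization 700, the test is false when the waking task is ignored (`task_util = 0`) and true once a task with utilization 200 is included, so the early bail-out described above triggers only in the second case.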
Joonwoo Park	4f0693ad08	sched: EAS: upmigrate misfit current task Upmigrate misfit current task upon scheduler tick with stopper. We can kick a random (not necessarily big CPU) NOHZ idle CPU when a CPU bound task is in need of upmigration. But that is not efficient, as it needs the following unnecessary wakeups: 1. Busy little CPU A kicks idle B 2. B runs idle balancer and enqueues migration/A 3. B goes idle 4. A runs migration/A, enqueues busy task on B. 5. B wakes up again. This change makes active upmigration more efficient by doing: 1. Busy little CPU A finds target CPU B upon tick. 2. CPU A enqueues migration/A. Change-Id: Ie865738054ea3296f28e6ba01710635efa7193c0 [joonwoop: The original version had logic to reserve CPU. The logic is omitted in this version.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> (cherry picked from commit `9e293db052`) [trivial cherry-pick issues] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:11:45 +00:00
Prasad Sodagudi	d345372a29	sched: avoid pushing tasks to an offline CPU Currently active_load_balance_cpu_stop is run by cpu stopper and it pushes running tasks off the busiest CPU onto idle target CPU. But there is no check to see whether target cpu is offline or not before pushing the tasks. With the introduction of active migration in the scheduler tick path (see check_for_migration()) there have been instances of attempts to migrate tasks to offline CPUs. Add a check as to whether the target cpu is online or not to prevent scheduling on offline CPUs. Change-Id: Ib8ac7f8aeabd3ca7365f3eae977075952dab4f21 Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> (cherry picked from commit `dc626b28ee`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:11:35 +00:00
Srivatsa Vaddagiri	70e14af60c	sched: Extend active balance to accept 'push_task' argument Active balance currently picks one task to migrate from busy cpu to a chosen cpu (push_cpu). This patch extends active load balance to recognize a particular task ('push_task') that needs to be migrated to 'push_cpu'. This capability will be leveraged by HMP-aware task placement in a subsequent patch. Change-Id: If31320111e6cc7044e617b5c3fd6d8e0c0e16952 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> (cherry picked from commit `2da014c0d8`) [ trivial cherry-pick issues, fix "may be used uninitialized" warning for push_task ] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:11:25 +00:00
Vikram Mulukutla	44310bf8ab	sched: walt: Correct WALT window size initialization It is preferable that WALT window rollover occurs just before a tick, since the tick is an opportune moment to record a complete window's statistics, as well as report those stats to the cpu frequency governor. When CONFIG_HZ results in a TICK_NSEC that isn't an integral number, this requirement may be violated. Account for this by reducing the WALT window size to the nearest multiple of TICK_NSEC. Commit `d368c6faa1` ("sched: walt: fix window misalignment when HZ=300") attempted to do this, but WALT isn't using MIN_SCHED_RAVG_WINDOW as the window size, so that patch had no effect. Also, change the type of 'walt_disabled' to bool and warn if an invalid window size causes WALT to be disabled. Change-Id: Ie3dcfc21a3df4408254ca1165a355bbe391ed5c7 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> (cherry picked from commit `e79f447a97`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:11:15 +00:00
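The window adjustment in the commit above is plain integer arithmetic, sketched below in user space. The helper names `tick_nsec()` and `align_window_to_tick()` are illustrative, the rounded tick length `(NSEC_PER_SEC + HZ / 2) / HZ` is an assumption about how TICK_NSEC is derived, and the 20 ms window used in the example is an assumed value rather than one stated in the commit.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Tick length in nanoseconds for a given HZ, rounded to the nearest ns. */
static uint64_t tick_nsec(unsigned int hz)
{
    return (NSEC_PER_SEC + hz / 2) / hz;
}

/* Shrink the window to the largest multiple of the tick that fits in it. */
static uint64_t align_window_to_tick(uint64_t window_ns, unsigned int hz)
{
    uint64_t tick = tick_nsec(hz);

    return (window_ns / tick) * tick;
}
```

With HZ=300 the tick is 3333333 ns, so a 20000000 ns window shrinks to 6 ticks, or 19999998 ns. With HZ=250 the tick is 4000000 ns and the same window is already a whole number of ticks, so it stays at 20000000 ns.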
Joonwoo Park	7f17fff119	sched: WALT: account cumulative window demand Energy cost estimation has been a long-standing challenge for WALT because WALT guides CPU frequency based on the CPU utilization of the previous window. Consequently it's not possible to know a newly waking task's energy cost until the end of the current WALT window. WALT already tracks 'Previous Runnable Sum' (prev_runnable_sum) and 'Cumulative Runnable Average' (cr_avg). They are designed for CPU frequency guidance and task placement, but unfortunately neither is suitable for energy cost estimation. Using prev_runnable_sum for energy cost calculation would account CPU and task energy solely based on activity in the previous window, so, for example, any task that had no activity in the previous window would be accounted as a 'zero energy cost' task. Energy estimation with cr_avg is what energy_diff() relies on at present. However cr_avg can only represent an instantaneous picture of energy cost. For example, if a CPU was fully occupied for an entire WALT window and became idle just before the window boundary, and there is then a wake-up, energy_diff() treats that CPU as a 'zero energy cost' CPU. As a result, introduce a new accounting unit, 'Cumulative Window Demand'. The cumulative window demand tracks the demands of all tasks seen in the current window, which is neither instantaneous nor actual execution time. Because task demand represents the estimated scaled execution time when the task runs for a full window, the accumulation of all the demands represents the predicted CPU load at the end of the window. Thus we can estimate the CPU's frequency at the end of the current WALT window with the cumulative window demand. The use of prev_runnable_sum for CPU frequency guidance and cr_avg for task placement has not changed, and both will continue to be used for those purposes, while this patch adds an additional statistic. 
Change-Id: I9908c77ead9973a26dea2b36c001c2baf944d4f5 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `43bd960dfe`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:11:06 +00:00
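The difference between cr_avg and the cumulative window demand in the commit above can be seen in a toy model of a single run queue. This is a simplified sketch, not the patch's accounting: `struct rq_model` and its helpers are hypothetical, a task that is enqueued twice in one window would be counted twice here, and resetting the cumulative value to the current cr_avg at rollover is an assumption.

```c
/* Toy per-CPU accounting state, in arbitrary demand units. */
struct rq_model {
    unsigned long cr_avg;             /* demand of currently runnable tasks */
    unsigned long cum_window_demand;  /* demand of all tasks seen this window */
};

/* A task becoming runnable contributes to both statistics. */
static void enqueue_task_model(struct rq_model *rq, unsigned long demand)
{
    rq->cr_avg += demand;
    rq->cum_window_demand += demand;
}

/* A task going to sleep leaves cr_avg but stays in the window total. */
static void dequeue_task_model(struct rq_model *rq, unsigned long demand)
{
    rq->cr_avg -= demand;
}

/* At window rollover only the still-runnable tasks carry over (assumed). */
static void window_rollover_model(struct rq_model *rq)
{
    rq->cum_window_demand = rq->cr_avg;
}
```

If a task with demand 800 runs for most of a window and sleeps just before the boundary, cr_avg drops to 0 while cum_window_demand still reads 800, so an estimate based on the latter no longer sees the CPU as having zero energy cost.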
Joonwoo Park	6cb3bed2d0	sched: EAS/WALT: finish accounting prior to task_tick In order to set rq->misfit_task in time, call update_task_ravg() prior to task_tick. This reduces upmigration delay by 1 scheduler window. Change-Id: I7cc80badd423f2e7684125fbfd853b0a3610f0e8 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> (cherry picked from commit `ed9e749668`) [Trivial cherry-pick issue] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:10:56 +00:00
Joonwoo Park	a8611936de	sched/fair: prevent meaningless active migration At present need_active_balance() determines whether an active upmigration is needed by using capacity_of(). A CPU's capacity may be reduced by RT pressure, and therefore distinguishing capability differences with capacity_of() may lead to suboptimal active migrations to less capable CPUs. Use capacity_orig_of to distinguish differently capable CPUs in addition to capacity_of(), thus avoiding placing tasks on less capable CPUs due to instantaneous RT pressure. Change-Id: I3e1435246a8edc3ad618ef98a34866cfbd8c16a5 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> [markivx: Reworked the commit text a bit] Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> (cherry picked from commit `7ab48e4c8d`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:10:47 +00:00
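The case the commit above guards against can be sketched with two capacity values per CPU. The struct and function names are hypothetical, and the real need_active_balance() has further conditions that are left out here. Only the comparison between current capacity (which RT pressure can shrink) and original capacity is modelled.

```c
#include <stdbool.h>

/* Toy view of a CPU: its original capacity and what is left after RT load. */
struct cpu_cap_model {
    unsigned long capacity_orig;  /* what capacity_orig_of() would report */
    unsigned long capacity;       /* what capacity_of() would report */
};

/* Check based on current capacity only: can be fooled by RT pressure. */
static bool upmigrate_by_capacity_only(const struct cpu_cap_model *src,
                                       const struct cpu_cap_model *dst)
{
    return src->capacity < dst->capacity;
}

/* Check that also requires the destination to be the more capable CPU. */
static bool upmigrate_with_orig_check(const struct cpu_cap_model *src,
                                      const struct cpu_cap_model *dst)
{
    return src->capacity < dst->capacity &&
           src->capacity_orig < dst->capacity_orig;
}
```

Take a big CPU with original capacity 1024 that RT tasks have momentarily reduced to 300, and a little CPU at 430 for both values. The first check would move a task from the big CPU to the little one, while the second check rejects that migration.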
Vikram Mulukutla	b28cab9cd2	sched: walt: Leverage existing helper APIs to apply invariance There's no need for a separate hierarchy of notifiers, APIs and variables in walt.c for the purpose of applying frequency and IPC invariance. Let's just use capacity_curr_of and get rid of a lot of the infrastructure relating to capacity, load_scale_factor etc. Change-Id: Ia220e2c896373fa535db05bff60f9aa33aefc978 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> (cherry picked from commit `be832f69a9`) [Trivial cherry pick issues] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-12-01 16:10:25 +00:00
Greg Kroah-Hartman	c1a286429a	Merge 4.9.66 into android-4.9 Changes in 4.9.66 s390: fix transactional execution control register handling s390/runtime instrumention: fix possible memory corruption s390/disassembler: add missing end marker for e7 table s390/disassembler: increase show_code buffer size ACPI / EC: Fix regression related to triggering source of EC event handling x86/mm: fix use-after-free of vma during userfaultfd fault ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER vsock: use new wait API for vsock_stream_sendmsg() sched: Make resched_cpu() unconditional lib/mpi: call cond_resched() from mpi_powm() loop x86/decoder: Add new TEST instruction pattern x86/entry/64: Add missing irqflags tracing to native_load_gs_index() arm64: Implement arch-specific pte_access_permitted() ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE ARM: 8721/1: mm: dump: check hardware RO bit for LPAE MIPS: ralink: Fix MT7628 pinmux MIPS: ralink: Fix typo in mt7628 pinmux function PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF ALSA: hda: Add Raven PCI ID dm bufio: fix integer overflow when limiting maximum cache size dm: allocate struct mapped_device with kvzalloc MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver dm: fix race between dm_get_from_kobject() and __dm_destroy() MIPS: Fix odd fp register warnings with MIPS64r2 MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry MIPS: Fix an n32 core file generation regset support regression MIPS: BCM47XX: Fix LED inversion for WRT54GSv1 rt2x00usb: mark device removed when get ENOENT usb error autofs: don't fail mount for transient error nilfs2: fix race condition that causes file system corruption eCryptfs: use after free in ecryptfs_release_messaging() libceph: don't WARN() if user tries to add invalid key bcache: check ca->alloc_thread initialized before wake up it isofs: fix timestamps beyond 2027 NFS: Fix typo in nomigration mount option nfs: Fix ugly referral 
attributes NFS: Avoid RCU usage in tracepoints nfsd: deal with revoked delegations appropriately rtlwifi: rtl8192ee: Fix memory leak when loading firmware rtlwifi: fix uninitialized rtlhal->last_suspend_sec time ata: fixes kernel crash while tracing ata_eh_link_autopsy event ext4: fix interaction between i_size, fallocate, and delalloc after a crash ALSA: pcm: update tstamp only if audio_tstamp changed ALSA: usb-audio: Add sanity checks to FE parser ALSA: usb-audio: Fix potential out-of-bound access at parsing SU ALSA: usb-audio: Add sanity checks in v2 clock parsers ALSA: timer: Remove kernel warning at compat ioctl error paths ALSA: hda: Fix too short HDMI/DP chmap reporting ALSA: hda/realtek - Fix ALC700 family no sound issue fix a page leak in vhost_scsi_iov_to_sgl() error recovery fs/9p: Compare qid.path in v9fs_test_inode iscsi-target: Fix non-immediate TMR reference leak target: Fix QUEUE_FULL + SCSI task attribute handling mtd: nand: omap2: Fix subpage write mtd: nand: Fix writing mtdoops to nand flash. 
mtd: nand: mtk: fix infinite ECC decode IRQ issue p54: don't unregister leds when they are not initialized block: Fix a race between blk_cleanup_queue() and timeout handling irqchip/gic-v3: Fix ppi-partitions lookup lockd: double unregister of inetaddr notifiers KVM: nVMX: set IDTR and GDTR limits when loading L1 host state KVM: SVM: obey guest PAT SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status clk: ti: dra7-atl-clock: fix child-node lookups libnvdimm, pfn: make 'resource' attribute only readable by root libnvdimm, namespace: fix label initialization to use valid seq numbers libnvdimm, namespace: make 'resource' attribute only readable by root IB/srpt: Do not accept invalid initiator port names IB/srp: Avoid that a cable pull can trigger a kernel crash NFC: fix device-allocation error return i40e: Use smp_rmb rather than read_barrier_depends igb: Use smp_rmb rather than read_barrier_depends igbvf: Use smp_rmb rather than read_barrier_depends ixgbevf: Use smp_rmb rather than read_barrier_depends i40evf: Use smp_rmb rather than read_barrier_depends fm10k: Use smp_rmb rather than read_barrier_depends ixgbe: Fix skb list corruption on Power systems parisc: Fix validity check of pointer size argument in new CAS implementation powerpc/signal: Properly handle return value from uprobe_deny_signal() media: Don't do DMA on stack for firmware upload in the AS102 driver media: rc: check for integer overflow cx231xx-cards: fix NULL-deref on missing association descriptor media: v4l2-ctrl: Fix flags field on Control events sched/rt: Simplify the IPI based RT balancing logic fscrypt: lock mutex before checking for bounce page pool net/9p: Switch to wait_event_killable() PM / OPP: Add missing of_node_put(np) Revert "drm/i915: Do not rely on wm preservation for ILK watermarks" e1000e: Fix error path in link detection e1000e: Fix return value test e1000e: Separate signaling for link check/link up e1000e: Avoid receiver overrun interrupt bursts RDS: make 
message size limit compliant with spec RDS: RDMA: return appropriate error on rdma map failures RDS: RDMA: fix the ib_map_mr_sg_zbva() argument PCI: Apply _HPX settings only to relevant devices drm/sun4i: Fix a return value in case of error clk: sunxi-ng: A31: Fix spdif clock register clk: sunxi-ng: fix PLL_CPUX adjusting on A33 dmaengine: zx: set DMA_CYCLIC cap_mask bit fscrypt: use ENOKEY when file cannot be created w/o key fscrypt: use ENOTDIR when setting encryption policy on nondirectory net: Allow IP_MULTICAST_IF to set index to L3 slave net: 3com: typhoon: typhoon_init_one: make return values more specific net: 3com: typhoon: typhoon_init_one: fix incorrect return values drm/armada: Fix compile fail rt2800: set minimum MPDU and PSDU lengths to sane values adm80211: return an error if adm8211_alloc_rings() fails mwifiex: sdio: fix use after free issue for save_adapter ath10k: fix incorrect txpower set by P2P_DEVICE interface ath10k: ignore configuring the incorrect board_id ath10k: fix potential memory leak in ath10k_wmi_tlv_op_pull_fw_stats() pinctrl: sirf: atlas7: Add missing 'of_node_put()' bnxt_en: Set default completion ring for async events. 
ath10k: set CTS protection VDEV param only if VDEV is up ALSA: hda - Apply ALC269_FIXUP_NO_SHUTUP on HDA_FIXUP_ACT_PROBE gpio: mockup: dynamically allocate memory for chip name drm: Apply range restriction after color adjustment when allocation clk: qcom: ipq4019: Add all the frequencies for apss cpu drm/mediatek: don't use drm_put_dev mac80211: Remove invalid flag operations in mesh TSF synchronization mac80211: Suppress NEW_PEER_CANDIDATE event if no room adm80211: add checks for dma mapping errors iio: light: fix improper return value staging: iio: cdc: fix improper return value spi: SPI_FSL_DSPI should depend on HAS_DMA netfilter: nft_queue: use raw_smp_processor_id() netfilter: nf_tables: fix oob access ASoC: rsnd: don't double free kctrl crypto: marvell - Copy IVDIG before launching partial DMA ahash requests btrfs: return the actual error value from from btrfs_uuid_tree_iterate ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data s390/kbuild: enable modversions for symbols exported from asm cec: when canceling a message, don't overwrite old status info cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2 cec: update log_addr[] before finishing configuration nvmet: fix KATO offset in Set Features xen: xenbus driver must not accept invalid transaction ids Linux 4.9.66 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-11-30 16:24:14 +00:00
Steven Rostedt (Red Hat)	1c37ff7829	sched/rt: Simplify the IPI based RT balancing logic commit `4bdced5c9a` upstream. When a CPU lowers its priority (schedules out a high priority task for a lower priority one), a check is made to see if any other CPU has overloaded RT tasks (more than one). It checks the rto_mask to determine this and if so it will request to pull one of those tasks to itself if the non running RT task is of higher priority than the new priority of the next task to run on the current CPU. When we deal with large number of CPUs, the original pull logic suffered from large lock contention on a single CPU run queue, which caused a huge latency across all CPUs. This was caused by only having one CPU having overloaded RT tasks and a bunch of other CPUs lowering their priority. To solve this issue, commit: `b6366f048e` ("sched/rt: Use IPI to trigger RT task push migration instead of pulling") changed the way to request a pull. Instead of grabbing the lock of the overloaded CPU's runqueue, it simply sent an IPI to that CPU to do the work. Although the IPI logic worked very well in removing the large latency build up, it still could suffer from a large number of IPIs being sent to a single CPU. On a 80 CPU box, I measured over 200us of processing IPIs. Worse yet, when I tested this on a 120 CPU box, with a stress test that had lots of RT tasks scheduling on all CPUs, it actually triggered the hard lockup detector! One CPU had so many IPIs sent to it, and due to the restart mechanism that is triggered when the source run queue has a priority status change, the CPU spent minutes! processing the IPIs. Thinking about this further, I realized there's no reason for each run queue to send its own IPI. 
As all CPUs with overloaded tasks must be scanned regardless if there's one or many CPUs lowering their priority, because there's no current way to find the CPU with the highest priority task that can schedule to one of these CPUs, there really only needs to be one IPI being sent around at a time. This greatly simplifies the code! The new approach is to have each root domain have its own irq work, as the rto_mask is per root domain. The root domain has the following fields attached to it: rto_push_work - the irq work to process each CPU set in rto_mask rto_lock - the lock to protect some of the other rto fields rto_loop_start - an atomic that keeps contention down on rto_lock the first CPU scheduling in a lower priority task is the one to kick off the process. rto_loop_next - an atomic that gets incremented for each CPU that schedules in a lower priority task. rto_loop - a variable protected by rto_lock that is used to compare against rto_loop_next rto_cpu - The cpu to send the next IPI to, also protected by the rto_lock. When a CPU schedules in a lower priority task and wants to make sure overloaded CPUs know about it. It increments the rto_loop_next. Then it atomically sets rto_loop_start with a cmpxchg. If the old value is not "0", then it is done, as another CPU is kicking off the IPI loop. If the old value is "0", then it will take the rto_lock to synchronize with a possible IPI being sent around to the overloaded CPUs. If rto_cpu is greater than or equal to nr_cpu_ids, then there's either no IPI being sent around, or one is about to finish. Then rto_cpu is set to the first CPU in rto_mask and an IPI is sent to that CPU. If there's no CPUs set in rto_mask, then there's nothing to be done. When the CPU receives the IPI, it will first try to push any RT tasks that is queued on the CPU but can't run because a higher priority RT task is currently running on that CPU. Then it takes the rto_lock and looks for the next CPU in the rto_mask. 
If it finds one, it simply sends an IPI to that CPU and the process continues. If there's no more CPUs in the rto_mask, then rto_loop is compared with rto_loop_next. If they match, everything is done and the process is over. If they do not match, then a CPU scheduled in a lower priority task as the IPI was being passed around, and the process needs to start again. The first CPU in rto_mask is sent the IPI. This change removes this duplication of work in the IPI logic, and greatly lowers the latency caused by the IPIs. This removed the lockup happening on the 120 CPU machine. It also simplifies the code tremendously. What else could anyone ask for? Thanks to Peter Zijlstra for simplifying the rto_loop_start atomic logic and supplying me with the rto_start_trylock() and rto_start_unlock() helper functions. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Clark Williams <williams@redhat.com> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott Wood <swood@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170424114732.1aac6dc4@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-30 08:39:09 +00:00
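The walk over rto_mask described in the commit above can be followed in a single-threaded model. This is a sketch under several assumptions: there are no real IPIs, locks or atomics, the mask is a plain 8-bit-wide integer, "no IPI in flight" is represented by `rto_cpu == -1` rather than a value at or above nr_cpu_ids, and `struct rd_model` with its helpers is illustrative rather than the kernel's root_domain code.

```c
#define NR_CPUS_MODEL 8

/* Toy root domain state mirroring the fields named in the commit message. */
struct rd_model {
    unsigned int rto_mask;  /* bit n set: CPU n has overloaded RT tasks */
    int rto_cpu;            /* CPU that last received the IPI, -1 if none */
    int rto_loop;           /* generation the walk has caught up with */
    int rto_loop_next;      /* bumped whenever a CPU lowers its priority */
};

/* Next CPU set in the mask after 'after', or NR_CPUS_MODEL if none. */
static int next_set_cpu(unsigned int mask, int after)
{
    for (int cpu = after + 1; cpu < NR_CPUS_MODEL; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS_MODEL;
}

/*
 * Pick the CPU that should receive the IPI next. When the end of the mask
 * is reached, restart from the first CPU if rto_loop_next moved during the
 * walk; otherwise report that the walk is finished by returning -1.
 */
static int rto_next_cpu_model(struct rd_model *rd)
{
    for (;;) {
        int cpu = next_set_cpu(rd->rto_mask, rd->rto_cpu);

        if (cpu < NR_CPUS_MODEL) {
            rd->rto_cpu = cpu;
            return cpu;
        }
        rd->rto_cpu = -1;
        if (rd->rto_loop == rd->rto_loop_next)
            return -1;
        rd->rto_loop = rd->rto_loop_next;
    }
}
```

With CPUs 2 and 4 set in the mask, successive calls return 2, 4 and then -1. If rto_loop_next is incremented after the first call, the sequence becomes 2, 4, 2, 4, -1: the single IPI makes one more pass instead of a second IPI chain being started.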
Paul E. McKenney	fb8bd56e35	sched: Make resched_cpu() unconditional commit `7c2102e56a` upstream. The current implementation of synchronize_sched_expedited() incorrectly assumes that resched_cpu() is unconditional, which it is not. This means that synchronize_sched_expedited() can hang when resched_cpu()'s trylock fails as follows (analysis by Neeraj Upadhyay): o CPU1 is waiting for expedited wait to complete: sync_rcu_exp_select_cpus rdp->exp_dynticks_snap & 0x1 // returns 1 for CPU5 IPI sent to CPU5 synchronize_sched_expedited_wait ret = swait_event_timeout(rsp->expedited_wq, sync_rcu_preempt_exp_done(rnp_root), jiffies_stall); expmask = 0x20, CPU 5 in idle path (in cpuidle_enter()) o CPU5 handles IPI and fails to acquire rq lock. Handles IPI sync_sched_exp_handler resched_cpu returns while failing to try lock acquire rq->lock need_resched is not set o CPU5 calls rcu_idle_enter() and as need_resched is not set, goes to idle (schedule() is not called). o CPU 1 reports RCU stall. Given that resched_cpu() is now used only by RCU, this commit fixes the assumption by making resched_cpu() unconditional. Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org> Suggested-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-30 08:39:01 +00:00
John Stultz	5311c740c0	UPSTREAM: time: Clean up CLOCK_MONOTONIC_RAW time handling (cherry pick from commit `fc6eead7c1`) Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW, remove the duplicative tk->raw_time.tv_nsec, which can be stored in tk->tkr_raw.xtime_nsec (similarly to how it's handled for monotonic time). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Mentz <danielmentz@google.com> Tested-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Bug: 20045882 Bug: 63737556 Change-Id: I243827d21b08703a09d2d2fe738a9258be224582	2017-11-29 13:44:16 -08:00
Greg Kroah-Hartman	a6d71ba679	Merge 4.9.62 into android-4.9 Changes in 4.9.62 adv7604: Initialize drive strength to default when using DT video: fbdev: pmag-ba-fb: Remove bad `__init' annotation PCI: mvebu: Handle changes to the bridge windows while enabled sched/core: Add missing update_rq_clock() call in sched_move_task() xen/netback: set default upper limit of tx/rx queues to 8 ARM: dts: imx53-qsb-common: fix FEC pinmux config dt-bindings: clockgen: Add compatible string for LS1012A EDAC, amd64: Add x86cpuid sanity check during init PM / OPP: Error out on failing to add static OPPs for v1 bindings clk: samsung: exynos5433: Add IDs for PHYCLK_MIPIDPHY0_* clocks drm: drm_minor_register(): Clean up debugfs on failure KVM: PPC: Book 3S: XICS: correct the real mode ICP rejecting counter iommu/arm-smmu-v3: Clear prior settings when updating STEs pinctrl: baytrail: Fix debugfs offset output powerpc/corenet: explicitly disable the SDHC controller on kmcoge4 cxl: Force psl data-cache flush during device shutdown ARM: omap2plus_defconfig: Fix probe errors on UARTs 5 and 6 arm64: dma-mapping: Only swizzle DMA ops for IOMMU_DOMAIN_DMA crypto: vmx - disable preemption to enable vsx in aes_ctr.c drm: mali-dp: fix Lx_CONTROL register fields clobber iio: trigger: free trigger resource correctly iio: pressure: ms5611: claim direct mode during oversampling changes iio: magnetometer: mag3110: claim direct mode during raw writes iio: proximity: sx9500: claim direct mode during raw proximity reads dt-bindings: Add LEGO MINDSTORMS EV3 compatible specification dt-bindings: Add vendor prefix for LEGO phy: increase size of MII_BUS_ID_SIZE and bus_id serial: sh-sci: Fix register offsets for the IRDA serial port libertas: fix improper return value usb: hcd: initialize hcd->flags to 0 when rm hcd netfilter: nft_meta: deal with PACKET_LOOPBACK in netdev family brcmfmac: setup wiphy bands after registering it first rt2800usb: mark tx failure on timeout apparmor: fix undefined reference to 
`aa_g_hash_policy' IPsec: do not ignore crypto err in ah4 input EDAC, amd64: Save and return err code from probe_one_instance() s390/topology: make "topology=off" parameter work Input: mpr121 - handle multiple bits change of status register Input: mpr121 - set missing event capability sched/cputime, powerpc32: Fix stale scaled stime on context switch IB/ipoib: Change list_del to list_del_init in the tx object ARM: dts: STiH410-family: fix wrong parent clock frequency s390/qeth: fix retrieval of vipa and proxy-arp addresses s390/qeth: issue STARTLAN as first IPA command wcn36xx: Don't use the destroyed hal_mutex IB/rxe: Fix reference leaks in memory key invalidation code clk: mvebu: adjust AP806 CPU clock frequencies to production chip net: dsa: select NET_SWITCHDEV platform/x86: hp-wmi: Fix detection for dock and tablet mode cdc_ncm: Set NTB format again after altsetting switch for Huawei devices KEYS: trusted: sanitize all key material KEYS: trusted: fix writing past end of buffer in trusted_read() platform/x86: hp-wmi: Fix error value for hp_wmi_tablet_state platform/x86: hp-wmi: Do not shadow error values x86/uaccess, sched/preempt: Verify access_ok() context workqueue: Fix NULL pointer dereference crypto: ccm - preserve the IV buffer crypto: x86/sha1-mb - fix panic due to unaligned access crypto: x86/sha256-mb - fix panic due to unaligned access KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2] ARM: 8720/1: ensure dump_instr() checks addr_limit ALSA: seq: Fix OSS sysex delivery in OSS emulation ALSA: seq: Avoid invalid lockdep class warning drm/i915: Do not rely on wm preservation for ILK watermarks MIPS: microMIPS: Fix incorrect mask in insn_table_MM MIPS: Fix CM region target definitions MIPS: SMP: Use a completion event to signal CPU up MIPS: Fix race on setting and getting cpu_online_mask MIPS: SMP: Fix deadlock & online race selftests: firmware: send expected errors to /dev/null tools: firmware: check for distro fallback udev cancel rule 
ASoC: sun4i-spdif: remove legacy dapm components MIPS: BMIPS: Fix missing cbr address MIPS: AR7: Defer registration of GPIO MIPS: AR7: Ensure that serial ports are properly set up Input: elan_i2c - add ELAN060C to the ACPI table rbd: use GFP_NOIO for parent stat and data requests drm/vmwgfx: Fix Ubuntu 17.10 Wayland black screen issue drm/bridge: adv7511: Rework adv7511_power_on/off() so they can be reused internally drm/bridge: adv7511: Reuse __adv7511_power_on/off() when probing EDID drm/bridge: adv7511: Re-write the i2c address before EDID probing can: sun4i: handle overrun in RX FIFO can: ifi: Fix transmitter delay calculation can: c_can: don't indicate triple sampling support for D_CAN x86/smpboot: Make optimization of delay calibration work correctly x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context Linux 4.9.62 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2017-11-15 16:13:49 +01:00
Li Bin	46f15501c5	workqueue: Fix NULL pointer dereference commit `cef572ad9b` upstream. When queue_work() is called from irq context (not task context), a NULL pointer dereference can be triggered: ---------------------------------------------------------------- worker_thread() \|-spin_lock_irq() \|-process_one_work() \|-worker->current_pwq = pwq \|-spin_unlock_irq() \|-worker->current_func(work) \|-spin_lock_irq() \|-worker->current_pwq = NULL \|-spin_unlock_irq() //interrupt here \|-irq_handler \|-__queue_work() //assuming that the wq is draining \|-is_chained_work(wq) \|-current_wq_worker() //Here, 'current' is the interrupted worker! \|-current->current_pwq is NULL here! \|-schedule() ---------------------------------------------------------------- Avoid it by checking for task context in current_wq_worker(); when not in task context, we shouldn't use 'current' to check the condition. Reported-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Li Bin <huawei.libin@huawei.com> Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `8d03ecfe47` ("workqueue: reimplement is_chained_work() using current_wq_worker()") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-15 15:53:17 +01:00
Peter Zijlstra	6da1c989cc	sched/core: Add missing update_rq_clock() call in sched_move_task() [ Upstream commit `1b1d62254d` ] Bug was noticed via this warning: WARNING: CPU: 6 PID: 1 at kernel/sched/sched.h:804 detach_task_cfs_rq+0x8e8/0xb80 rq->clock_update_flags < RQCF_ACT_SKIP Modules linked in: CPU: 6 PID: 1 Comm: systemd Not tainted 4.10.0-rc5-00140-g0874170baf55-dirty #1 Hardware name: Supermicro SYS-4048B-TRFT/X10QBi, BIOS 1.0 04/11/2014 Call Trace: dump_stack+0x4d/0x65 __warn+0xcb/0xf0 warn_slowpath_fmt+0x5f/0x80 detach_task_cfs_rq+0x8e8/0xb80 ? allocate_cgrp_cset_links+0x59/0x80 task_change_group_fair+0x27/0x150 sched_change_group+0x48/0xf0 sched_move_task+0x53/0x150 cpu_cgroup_attach+0x36/0x70 cgroup_taskset_migrate+0x175/0x300 cgroup_migrate+0xab/0xd0 cgroup_attach_task+0xf0/0x190 __cgroup_procs_write+0x1ed/0x2f0 cgroup_procs_write+0x14/0x20 cgroup_file_write+0x3f/0x100 kernfs_fop_write+0x104/0x180 __vfs_write+0x37/0x140 vfs_write+0xb8/0x1b0 SyS_write+0x55/0xc0 do_syscall_64+0x61/0x170 entry_SYSCALL64_slow_path+0x25/0x25 Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-15 15:53:11 +01:00
Joonwoo Park	cc1034265e	sched: WALT: fix potential overflow Task demand and CPU util are in u64. Change-Id: If7ec1623e723026d3346201122aab0303a6d2ba2 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `c8b8c92bbc`) [trivial cherry-pick issues] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:41:51 +00:00
Olav Haugan	77ba2b9f0b	sched: Update task->on_rq when tasks are moving between runqueues Task->on_rq has three states: 0 - Task is not on runqueue (rq); 1 (TASK_ON_RQ_QUEUED) - Task is on rq; 2 (TASK_ON_RQ_MIGRATING) - Task is on rq but in the process of being migrated to another rq. When a task is moving between rqs, the task->on_rq state should be TASK_ON_RQ_MIGRATING in order for WALT to account the rq's cumulative runnable average correctly. Without such state marking for all the classes, WALT's update_history() would try to fix up a task's demand that was never contributed to any CPU during migration. Change-Id: I65e74a8f176c3ed4b8577577f6da8897ecda7bb8 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [joonwoop: Reinforced changelog to explain why this is needed by WALT. Fixed conflicts in deadline.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit 0f8791c90a99f718c43ab8214076c0c671a36667) [fixed cherry-pick issue due to missing `fab5cc59bf`] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:39:00 +00:00
Joonwoo Park	98a5fa3b95	sched: WALT: fix window mis-alignment The initial window start needs to be close to ktime ns = 0 to be aligned with scheduler tick. Change-Id: Ia91f74efce2f910106622a054a6fcd507e763ca5 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `0caf1df0c5`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:38:50 +00:00
Joonwoo Park	cd76b21535	sched: EAS: kill incorrect nohz idle cpu kick EAS won't allow the NOHZ idle balancer until the CPU is overutilized. However, nohz_kick_needed() can still return true, which wakes up an idle CPU for nothing. Change-Id: I6e548442e29e4f85cda695e4c7101dd591b12fe6 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `3989a247e2`) [trivial cherry-pick issues] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:38:38 +00:00
Joonwoo Park	d7a6b8be91	sched: EAS: fix incorrect energy delta calculation due to rounding error In order to calculate the energy difference we currently iterate over the CPUs under the same sched domain to accumulate the total energy cost and compare before and after: for_each_domain(cpu) total_energy_before += (cpu_util * power) >> SCHED_CAPACITY_SHIFT; for_each_domain(cpu) total_energy_after += (cpu_util * power) >> SCHED_CAPACITY_SHIFT; Doing so can incorrectly calculate and report abs(delta) > 0 when there is actually no energy delta between before and after, because the same total accumulated cpu_util of all the CPUs can be distributed differently before and after, which causes a different amount of rounding error. Fix this by shifting just once, on the accumulated total_energy. Change-Id: I82f1e2e358367058960938b4ef81714f57e921cf Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (moved part to another commit) Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit `11b618a0b2`) [trivial cherry-pick issues] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:38:23 +00:00
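Note on commit d7a6b8be91: the rounding effect is easy to reproduce in isolation. The sketch below is not the EAS code; the function names and the flat `util`/`power` arrays are illustrative assumptions, and only the shift-per-term versus shift-once contrast is taken from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10

/* Old scheme: every per-CPU term is truncated before it is added. */
static uint64_t energy_shift_each(const uint64_t *util,
                                  const uint64_t *power, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += (util[i] * power[i]) >> SCHED_CAPACITY_SHIFT;
    return total;
}

/* Fixed scheme: accumulate at full precision, truncate exactly once. */
static uint64_t energy_shift_once(const uint64_t *util,
                                  const uint64_t *power, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += util[i] * power[i];
    return total >> SCHED_CAPACITY_SHIFT;
}
```

With two CPUs of power 3, moving 512 units of utilization from one CPU to the other leaves the true total unchanged (`{512, 512}` versus `{1024, 0}`), yet the per-term version reports 2 before and 3 after, while the shift-once version reports 3 both times.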
Joonwoo Park	8b34bba642	sched: EAS/WALT: use cr_avg instead of prev_runnable_sum WALT accounts two major statistics: CPU load and cumulative task demand. CPU load, the accumulation of each CPU's absolute execution time, is for CPU frequency guidance, whereas cumulative task demand, each CPU's instantaneous load at a given time, is for task placement decisions. Use cumulative task demand for cpu_util() for task placement and introduce cpu_util_freq() for frequency guidance. Change-Id: Id928f01dbc8cb2a617cdadc584c1f658022565c5 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `ee4cebd75e`) [removed schedfreq dependency] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:35:34 +00:00
Joonwoo Park	8b1a1ce14f	sched: WALT: fix broken cumulative runnable average accounting When a running task's ravg.demand changes, update_history() adjusts rq->cumulative_runnable_avg to reflect the change in CPU load. Currently this fixup is broken: it accumulates the task's new demand without subtracting the task's old demand. Fix the fixup logic to subtract the task's old demand. Change-Id: I61beb32a4850879ccb39b733f5564251e465bfeb Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `48f67ea85d`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:34:26 +00:00
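Note on commit 8b1a1ce14f: the corrected fixup amounts to replacing a task's old contribution with its new one. A minimal sketch, assuming a hypothetical `struct rq_sketch` that holds only the cumulative value (the real runqueue and WALT helpers are not reproduced here):

```c
#include <stdint.h>

/* Hypothetical stand-in for the one runqueue field involved. */
struct rq_sketch {
    uint64_t cumulative_runnable_avg;
};

/*
 * Replace a queued task's old demand with its new demand.
 * Assumes old_demand was previously added to the rq, so the
 * subtraction cannot underflow.
 */
static void fixup_cumulative_runnable_avg(struct rq_sketch *rq,
                                          uint64_t old_demand,
                                          uint64_t new_demand)
{
    rq->cumulative_runnable_avg -= old_demand;
    rq->cumulative_runnable_avg += new_demand;
}
```

Adding `new_demand` without the subtraction, as the commit message describes for the broken version, would count the task twice after every demand change.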
Joonwoo Park	241a319ae7	sched: deadline: WALT: account cumulative runnable avg Account cumulative runnable average for WALT CPU utilization accounting. Change-Id: I56934894e626dec183740eeaf89a57d2ef638143 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> (cherry picked from commit `26b37261ea`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-14 16:33:12 +00:00
Victor Wan	ee46236755	Merge branch 'android-4.9' into amlogic-4.9-dev	2017-11-14 17:18:44 +08:00
Quentin Perret	904c79c425	sched: compute task utilisation with WALT consistently Using WALT, the utilisation of a task is computed with a resolution scaling factor that has been used inconsistently in the code with either hardcoded values or macros (NICE_0_LOAD_SHIFT in this case). Changes in these macros (as the 32 to 64 bits resolution shift of `2159197d66`) happened to break the utilisation calculation wherever they have been used whilst results remained correct in other places. This commit fixes this issue by using SCHED_CAPACITY_SCALE as resolution scaling factor consistently. Change-Id: Ic5418f8a5dfc455a22bafbebb4142b4665b61c6f Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-10 14:16:08 +00:00
Jiamin Ma	8e96d0d385	printk: a fix for log output disordering PD#154008: the log output is in disorder The definition of a continuation line in printk became much stricter between 3.14 and 4.9. In kernel 3.14, if the first fragment does not end with a newline and the next fragment does not start with a LOG_PREFIX (KERN_ALERT, KERN_ERR and so on), then the two form one continuous line, e.g. pr_err("foo "); printk("bar\n"); and pr_err("foo "); pr_cont("bar\n"); both print one continuous line. In kernel 4.9, the two form one continuous line only if the first fragment does not end with a newline and the next fragment starts with LOG_CONT, e.g. pr_err("foo "); printk("bar\n"); does not print one continuous line, while pr_err("foo "); pr_cont("bar\n"); does. However, in the crash info dumping code path in kernel 4.9.y, not all continuation-line printing has been switched to the 4.9 way (i.e. calling pr_cont); that was only completed in kernel 4.13.y. So in this commit we loosen the definition of a continuation line back to the 3.14 behavior to solve the current issue; this needs to be reverted when we sync with upstream kernel 4.13.y. Change-Id: I64403d3a18531ceb832b41d96dff4b79a6d7fb5a Signed-off-by: Jiamin Ma <jiamin.ma@amlogic.com>	2017-11-07 22:43:05 -07:00
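Note on commit 8e96d0d385: the two continuation rules it contrasts can be written as small predicates. This is a simplified model of the rules as stated in the commit message, not the printk implementation; `enum frag_kind` and both function names are made up for illustration.

```c
#include <stdbool.h>

/* How the next fragment starts (hypothetical classification). */
enum frag_kind {
    FRAG_PLAIN, /* printk("...") with no prefix at all */
    FRAG_LEVEL, /* explicit level such as KERN_ERR (pr_err) */
    FRAG_CONT   /* continuation marker (pr_cont) */
};

/*
 * 3.14-style rule: an unterminated line is continued by anything
 * that does not carry an explicit log level.
 */
static bool continues_3_14(bool prev_unterminated, enum frag_kind next)
{
    return prev_unterminated && next != FRAG_LEVEL;
}

/*
 * 4.9-style rule: an unterminated line is continued only by a
 * fragment explicitly marked as a continuation.
 */
static bool continues_4_9(bool prev_unterminated, enum frag_kind next)
{
    return prev_unterminated && next == FRAG_CONT;
}
```

For `pr_err("foo "); printk("bar\n");` the next fragment is `FRAG_PLAIN`, so the two rules disagree; for `pr_err("foo "); pr_cont("bar\n");` they agree.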
Chenbo Feng	0521e0b3fc	UPSTREAM: selinux: bpf: Add addtional check for bpf object file receive Introduce a bpf object related check when sending and receiving files through unix domain sockets as well as binder. It checks whether the receiving process has the privilege to read/write the bpf map or use the bpf program. This check is necessary because bpf maps and programs use an anonymous inode as their shared inode, so the normal way of checking files and sockets when they are passed between processes cannot work properly on eBPF objects. This check only works when BPF_SYSCALL is configured. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry-pick from net-next: `f66e448cfd`) Bug: 30950746 Change-Id: I5b2cf4ccb4eab7eda91ddd7091d6aa3e7ed9f2cd	2017-11-07 12:59:54 -08:00
Chenbo Feng	f3ad3766a9	BACKPORT: security: bpf: Add LSM hooks for bpf object related syscall Introduce several LSM hooks for the syscalls that allow userspace to access eBPF objects such as eBPF programs and eBPF maps. The security check aims to enforce per-object security protection for eBPF objects, so that only processes with the right privileges can read/write a specific map or use a specific eBPF program. Besides that, a general security hook is added before the multiplexer of the bpf syscall to check the cmd and the attribute used for the command. The actual security module can decide which commands need to be checked and how each cmd should be checked. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Added the LIST_HEAD_INIT call for security hooks; it no longer exists in upstream code. (cherry-pick from net-next: `afdb09c720`) Bug: 30950746 Change-Id: Ieb3ac74392f531735fc7c949b83346a5f587a77b	2017-11-07 12:59:20 -08:00
Chenbo Feng	4672ded3ec	BACKPORT: bpf: Add file mode configuration into bpf maps Introduce the map read/write flags to the eBPF syscalls that return the map fd. The flags are used to set up the file mode when constructing a new file descriptor for bpf maps. To preserve backward compatibility, f_flags is set to O_RDWR if the flag passed by the syscall is 0. Otherwise it should be O_RDONLY or O_WRONLY. When userspace wants to modify or read the map content, the file mode is checked to see whether the access is allowed. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Deleted the file mode configuration code in unsupported map types and removed the file mode check in non-existing helper functions. (cherry-pick from net-next: `6e71b04a82`) Bug: 30950746 Change-Id: Icfad20f1abb77f91068d244fb0d87fa40824dd1b	2017-11-07 12:47:56 -08:00
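Note on commit 4672ded3ec: the flag-to-file-mode mapping it describes can be sketched as a small userspace function. The flag names and bit values below (`SKETCH_F_RDONLY`, `SKETCH_F_WRONLY`) are placeholders rather than the kernel's definitions, and rejecting a request that sets both flags is an assumption; the commit message only states that 0 maps to O_RDWR and that the mode is otherwise O_RDONLY or O_WRONLY.

```c
#include <errno.h>
#include <fcntl.h>

/* Placeholder flag bits, chosen only for this sketch. */
#define SKETCH_F_RDONLY (1U << 0)
#define SKETCH_F_WRONLY (1U << 1)

/*
 * Map the read/write flags passed at map creation to the file mode
 * of the new fd. Returns a negative errno value for a contradictory
 * request (assumed behavior).
 */
static int map_file_mode(unsigned int flags)
{
    if ((flags & SKETCH_F_RDONLY) && (flags & SKETCH_F_WRONLY))
        return -EINVAL;
    if (flags & SKETCH_F_RDONLY)
        return O_RDONLY;
    if (flags & SKETCH_F_WRONLY)
        return O_WRONLY;
    return O_RDWR; /* flags == 0: keep the pre-existing behavior */
}
```

Defaulting to `O_RDWR` when no flag is given means callers that predate the flags see no change in behavior.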
Viresh Kumar	e0907557ef	cpufreq: Drop schedfreq governor We all should be using (and improving) the schedutil governor now. Get rid of the non-upstream governor. Tested on Hikey. Change-Id: I2104558b03118b0a9c5f099c23c42cd9a6c2a963 Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-11-07 10:31:04 +05:30
Chris Redpath	dfe0a9bcfc	Merge branch 'ack/android-4.9-eas-dev' into ack/android_4.9/merge_eas_dev_r1.4 Merge in the EAS r1.4 patches from eas-dev to 4.9 common branch. There is one patch in android-4.9-eas-dev which is not part of the 1.4 patches ANDROID: sched/fair: Select correct capacity state for energy_diff but we have already merged it into android-4.4 so in the interests of keeping aligned, let's include that in the merge. Merge Log: * ack/android-4.9-eas-dev: sched: EAS: Fix the condition to distinguish energy before/after sched: EAS: update trg_cpu to backup_cpu if no energy saving for target_cpu sched/fair: consider task utilization in group_max_util() sched/fair: consider task utilization in group_norm_util() sched/fair: enforce EAS mode sched/fair: ignore backup CPU when not valid sched/fair: trace energy_diff for non boosted tasks UPSTREAM: sched/fair: Sync task util before slow-path wakeup UPSTREAM: sched/core: Add missing update_rq_clock() call in set_user_nice() UPSTREAM: sched/core: Add missing update_rq_clock() call for task_hot() UPSTREAM: sched/core: Add missing update_rq_clock() in detach_task_cfs_rq() UPSTREAM: sched/core: Add missing update_rq_clock() in post_init_entity_util_avg() UPSTREAM: sched/fair: Fix task group initialization cpufreq/sched: Consider max cpu capacity when choosing frequencies cpufreq/sched: Use cpu max freq rather than policy max sched/fair: remove erroneous RCU_LOCKDEP_WARN from start_cpu() ANDROID: sched/fair: Select correct capacity state for energy_diff UPSTREAM: sched/fair: Fix usage of find_idlest_group() when the local group is idlest UPSTREAM: sched/fair: Fix usage of find_idlest_group() when no groups are allowed UPSTREAM: sched/fair: Fix find_idlest_group() when local group is not allowed UPSTREAM: sched/fair: Remove unnecessary comparison with -1 UPSTREAM: sched/fair: Move select_task_rq_fair() slow-path into its own function UPSTREAM: sched/fair: Force balancing on NOHZ balance if local group has capacity 
UPSTREAM: sched: use load_avg for selecting idlest group UPSTREAM: sched: fix find_idlest_group for fork Change-Id: I57bc516f9c804bfc7144a6a5bcf70572d82f7321 Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-11-03 13:51:48 +00:00
Ke Wang	c409b20240	sched: EAS: Fix the condition to distinguish energy before/after Before commit `5f8b3a757d` ("sched/fair: consider task utilization in group_norm_util()"), eenv->util_delta is used to distinguish energy before and energy after in sched_group_energy(). After that commit, eenv->util_delta can no longer do that. In this commit, use trg_cpu to distinguish energy before/after in sched_group_energy(). Before applying this commit, cap_before/cap_delta are not correct: <idle>-0 [001] 147504.608920: sched_energy_diff: pid=7 comm=rcu_preempt src_cpu=1 dst_cpu=3 usage_delta=7 nrg_before=250 nrg_after=250 nrg_diff=0 cap_before=0 cap_after=528 cap_delta=1056 nrg_delta=0 nrg_payoff=0 After applying this commit, cap_before/cap_delta return to normal: <idle>-0 [001] 220.494011: sched_energy_diff: pid=7 comm=rcu_preempt src_cpu=1 dst_cpu=2 usage_delta=3 nrg_before=248 nrg_after=248 nrg_diff=0 cap_before=528 cap_after=528 cap_delta=0 nrg_delta=0 nrg_payoff=0 Change-Id: I7b5f7ccce56e93af7ea4e87d8e0ea6e2405f9c27 Signed-off-by: Ke Wang <ke.wang@spreadtrum.com> (cherry picked from commit 0da783a605cd20d5a37c2a840e8a1fa641c09768) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:24:24 +00:00
Ke Wang	ece6d3b76e	sched: EAS: update trg_cpu to backup_cpu if no energy saving for target_cpu If there is no energy saving for target_cpu in the calculation of energy_diff(), backup_cpu will be set as the new dst_cpu for the next calculation. At this point, we also need to update trg_cpu to backup_cpu to make sure the subsequent calculation of energy_diff() is correct. Change-Id: If3b35b6dc54865f1cb4b1603134102d4422227d5 Signed-off-by: Ke Wang <ke.wang@spreadtrum.com> (cherry picked from commit b1923e22f4eca3e537e015d6ea3dce1187edea37) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:24:04 +00:00
Patrick Bellasi	06d637c9f9	sched/fair: consider task utilization in group_max_util() The group_max_util() function is used to compute the maximum utilization across the CPUs of a certain energy_env configuration. Its main client is the energy_diff function when it needs to compute the SG capacity for one of the before/after scheduling candidates. Currently, the energy_diff function sets util_delta = 0 when it wants to compute the energy corresponding to the scheduling candidate where the task runs on the previous CPU. This implies that, for the task waking up on the previous CPU, we consider only its blocked load tracked by the CPU RQ. However, in the case of a medium-big task which is waking up on a long-idle CPU, this blocked load may already be completely decayed. More generally, the current approach is biased towards under-estimating the capacity requirements for the "before" scheduling candidate. This patch fixes this by: - always using cpu_util_wake() to properly get the utilization of a CPU without any (partially decayed) contribution of the waking up task; - adding the task utilization to the cpu_util_wake just for the target cpu. The "target CPU" is defined by the energy_env to be either the src_cpu or the dst_cpu, depending on which scheduling candidate we are considering. Finally, since this update removes the last usage of calc_util_delta() this function is now safely removed. Change-Id: I20ee1bcf40cee6bf6e265fb2d32ef79061ad6ced Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit 52d70152fade678e304683bd1a5842af95e83558) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:23:46 +00:00
Chris Redpath	4530ed9a46	sched/fair: consider task utilization in group_norm_util() The group_norm_util() function is used to compute the normalized utilization of a SG given a certain energy_env configuration. The main client of this function is the energy_diff function when it needs to compute the SG energy for one of the before/after scheduling candidates. Currently, the energy_diff function sets util_delta = 0 when it wants to compute the energy corresponding to the scheduling candidate where the task runs on the previous CPU. This implies that, for the task waking up on the previous CPU, we consider only its blocked load tracked by the CPU RQ. However, in the case of a medium-big task which is waking up on a long-idle CPU, this blocked load may already be completely decayed. More generally, the current approach is biased towards under-estimating the energy consumption for the "before" scheduling candidate. This patch fixes this by: - always using cpu_util_wake() to properly get the utilization of a CPU without any (partially decayed) contribution of the waking up task; - adding the task utilization to the cpu_util_wake just for the target cpu. The "target CPU" is defined by the energy_env to be either the src_cpu or the dst_cpu, depending on which scheduling candidate we are considering. This patch also updates the definition of __cpu_norm_util(), which is currently called just by the group_norm_util() function. This simplifies the code by using this function just to normalize a specified utilization with respect to a given capacity. This update completely removes any dependency of group_norm_util() on calc_util_delta(). Change-Id: I3b6ec50ce8decb1521faae660e326ab3319d3c82 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit ef34ea830347ca175ba8a1baf05357ed98d5728c) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:23:26 +00:00
Patrick Bellasi	1e58674375	sched/fair: enforce EAS mode For non latency sensitive tasks the goal is to optimize for energy efficiency. Thus, we should try our best to avoid moving a task to a CPU which would then be marked as overutilized. Let's use the capacity_margin metric to verify whether a candidate target CPU can be considered without the risk of bailing out of EAS mode. Change-Id: Ib3697106f4073aedf4a6c6ce42bd5d000fa8c007 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit f95753da4ba7e1c1eee0500ffe41b3e5fa68b347) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:23:06 +00:00
Patrick Bellasi	78ff98b3aa	sched/fair: ignore backup CPU when not valid The find_best_target function can sometimes fail to return a valid backup CPU, either because it cannot find one or just because it returns prev_cpu as a backup. In these cases we should skip the energy_diff evaluation for the backup CPU. Change-Id: I3787dbdfe74122348dd7a7485b88c4679051bd32 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit d9bcf5b88594a0225b51878236e49305f272eadc) [trivial cherry-pick issue] Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:22:44 +00:00
Patrick Bellasi	6abf18bda9	sched/fair: trace energy_diff for non boosted tasks In systems where SchedTune is enabled, we do not report energy diff for non boosted tasks. Let's fix this by always generating an energy_diff event where, however: nrg.delta = 0, since we skip energy normalization; payoff = nrg.diff, since the payoff is defined just by the energy difference. Change-Id: I9a11ec19b6f56da04147f5ae5b47daf1dd180445 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> (cherry picked from commit 13e2e3c7f7d09f619444631a0962fd8020660b8d) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:22:26 +00:00
Brendan Jackman	32ea775082	UPSTREAM: sched/fair: Sync task util before slow-path wakeup We use task_util() in find_idlest_group() via capacity_spare_wake(). This task_util() is updated in wake_cap(). However wake_cap() is not the only reason for ending up in find_idlest_group() - we could have been sent there by wake_wide(). So explicitly sync the task util with prev_cpu when we are about to head to find_idlest_group(). We could simply do this at the beginning of select_task_rq_fair() (i.e. irrespective of whether we're heading to select_idle_sibling() or find_idlest_group() & co), but I didn't want to slow down the select_idle_sibling() path more than necessary. Don't do this during fork balancing: we won't need the task_util and we'd just clobber the last_update_time, which is supposed to be 0. Change-Id: I56113d8d67cf338f3fb1a692422289cf659399a3 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andres Oportus <andresoportus@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Guittot <vincent.guittot@linaro.org> Link: http://lkml.kernel.org/r/20170808095519.10077-1-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `ea16f0ea6c` in tip:sched/core) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:22:03 +00:00
Peter Zijlstra	350e127dae	UPSTREAM: sched/core: Add missing update_rq_clock() call in set_user_nice() Address this rq-clock update bug: WARNING: CPU: 30 PID: 195 at ../kernel/sched/sched.h:797 set_next_entity() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: dump_stack() __warn() warn_slowpath_fmt() set_next_entity() ? _raw_spin_lock() set_curr_task_fair() set_user_nice.part.85() set_user_nice() create_worker() worker_thread() kthread() ret_from_fork() Change-Id: I8fb2653b2d9cb3bbc1637c1bcbf1c0645752ef12 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `2fb8d36787`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:21:38 +00:00
Peter Zijlstra	0aed57e79e	UPSTREAM: sched/core: Add missing update_rq_clock() call for task_hot() Add the update_rq_clock() call at the top of the callstack instead of at the bottom where we find it missing, this to aid later effort to minimize the number of update_rq_clock() calls. WARNING: CPU: 30 PID: 194 at ../kernel/sched/sched.h:797 assert_clock_updated() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: dump_stack() __warn() warn_slowpath_fmt() assert_clock_updated.isra.63.part.64() can_migrate_task() load_balance() pick_next_task_fair() __schedule() schedule() worker_thread() kthread() Change-Id: I17716141789f9fac709495951cd2e079cf49d6d8 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `3bed5e2166`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:21:15 +00:00
Peter Zijlstra	7c4e0f0832	UPSTREAM: sched/core: Add missing update_rq_clock() in detach_task_cfs_rq() Instead of adding the update_rq_clock() all the way at the bottom of the callstack, add one at the top, this to aid later effort to minimize update_rq_clock() calls. WARNING: CPU: 0 PID: 1 at ../kernel/sched/sched.h:797 detach_task_cfs_rq() rq->clock_update_flags < RQCF_ACT_SKIP Call Trace: dump_stack() __warn() warn_slowpath_fmt() detach_task_cfs_rq() switched_from_fair() __sched_setscheduler() _sched_setscheduler() sched_set_stop_task() cpu_stop_create() __smpboot_create_thread.part.2() smpboot_register_percpu_thread_cpumask() cpu_stop_init() do_one_initcall() ? print_cpu_info() kernel_init_freeable() ? rest_init() kernel_init() ret_from_fork() Change-Id: Iee08c2ed3303ae8f0c527658f13646b02a412cad Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `80f5c1b84b`) Signed-off-by: Quentin Perret <quentin.perret@arm.com>	2017-11-02 18:20:28 +00:00

24000 Commits