Note that savedefconfig dropped CONFIG_SCHED_TUNE, which is not present
on this branch (yet).
Test: boot cuttlefish on arm64
Signed-off-by: Tri Vo <trong@google.com>
Pull perf fixes from Ingo Molnar:
"I'd like to apologize for this very late pull request: I was dithering
through the week whether to send the fixes, and then yesterday Jiri's
crash fix for a regression introduced in this cycle clearly marked
perf/urgent as 'must merge now'.
Most of the commits are tooling fixes, plus there's three kernel fixes
via four commits:
- race fix in the Intel PEBS code
- fix an AUX bug and roll back a previous attempt
- fix AMD family 17h generic HW cache-event perf counters
The largest diffstat contribution comes from the AMD fix - a new event
table is introduced, which is a fairly low risk change but has a large
linecount"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix race in intel_pmu_disable_event()
perf/x86/intel/pt: Remove software double buffering PMU capability
perf/ring_buffer: Fix AUX software double buffering
perf tools: Remove needless asm/unistd.h include fixing build in some places
tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv
tools build: Add -ldl to the disassembler-four-args feature test
perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet
perf cs-etm: Don't check cs_etm_queue::prev_packet validity
perf report: Report OOM in status line in the GTK UI
perf bench numa: Add define for RUSAGE_THREAD if not present
tools lib traceevent: Change tag string for error
perf annotate: Fix build on 32 bit for BPF annotation
tools uapi x86: Sync vmx.h with the kernel
perf bpf: Return value with unlocking in perf_env__find_btf()
MAINTAINERS: Include vendor specific files under arch/*/events/*
perf/x86/amd: Update generic hardware cache events for Family 17h
Pull scheduler fix from Ingo Molnar:
"Fix a kobject memory leak in the cpufreq code"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/cpufreq: Fix kobject memleak
Pull x86 fix from Ingo Molnar:
"Disable function tracing during early SME setup to fix a boot crash on
SME-enabled kernels running distro kernels (some of which have
function tracing enabled)"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
Pull vfs fixes from Al Viro:
- a couple of ->i_link use-after-free fixes
- regression fix for wrong errno on absent device name in mount(2)
(this cycle stuff)
- ancient UFS braino in large GID handling on Solaris UFS images (bogus
cut'n'paste from large UID handling; wrong field checked to decide
whether we should look at old (16bit) or new (32bit) field)
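A minimal sketch of the kind of mistake described above, not the actual ufs code. The field names and the 0xFFFF "look at the 32-bit field" sentinel are assumptions made for illustration only.

```python
# Assumed sentinel: a 16-bit id of 0xFFFF means "use the 32-bit field instead".
LONG_ID_SENTINEL = 0xFFFF


def get_gid_buggy(short_uid, short_gid, long_gid):
    # Cut'n'paste bug: the UID field decides which GID field is read.
    if short_uid == LONG_ID_SENTINEL:
        return long_gid
    return short_gid


def get_gid_fixed(short_uid, short_gid, long_gid):
    # The 16-bit GID field itself decides whether the 32-bit GID is used.
    if short_gid == LONG_ID_SENTINEL:
        return long_gid
    return short_gid
```

With a small UID and a large GID, the buggy variant returns the sentinel rather than the real GID.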
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
Abort file_remove_privs() for non-reg. files
[fix] get rid of checking for absent device name in vfs_get_tree()
apparmorfs: fix use-after-free on symlink traversal
securityfs: fix use-after-free on symlink traversal
A new race in x86_pmu_stop() was introduced by replacing the
atomic __test_and_clear_bit() of cpuc->active_mask by separate
test_bit() and __clear_bit() calls in the following commit:
3966c3feca ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
The race causes panic for PEBS events with enabled callchains:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
...
RIP: 0010:perf_prepare_sample+0x8c/0x530
Call Trace:
<NMI>
perf_event_output_forward+0x2a/0x80
__perf_event_overflow+0x51/0xe0
handle_pmi_common+0x19e/0x240
intel_pmu_handle_irq+0xad/0x170
perf_event_nmi_handler+0x2e/0x50
nmi_handle+0x69/0x110
default_do_nmi+0x3e/0x100
do_nmi+0x11a/0x180
end_repeat_nmi+0x16/0x1a
RIP: 0010:native_write_msr+0x6/0x20
...
</NMI>
intel_pmu_disable_event+0x98/0xf0
x86_pmu_stop+0x6e/0xb0
x86_pmu_del+0x46/0x140
event_sched_out.isra.97+0x7e/0x160
...
The event is configured to make samples from PEBS drain code,
but when it's disabled, we'll go through NMI path instead,
where data->callchain will not get allocated and we'll crash:
  x86_pmu_stop
    test_bit(hwc->idx, cpuc->active_mask)
    intel_pmu_disable_event(event)
    {
      ...
      intel_pmu_pebs_disable(event);
      ...

EVENT OVERFLOW ->  <NMI>
                     intel_pmu_handle_irq
                       handle_pmi_common
   TEST PASSES ->        test_bit(bit, cpuc->active_mask))
                         perf_event_overflow
                           perf_prepare_sample
                           {
                             ...
                             if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                               data->callchain = perf_callchain(event, regs);

         CRASH ->            size += data->callchain->nr;
                           }
                   </NMI>
      ...
      x86_pmu_disable_event(event)
    }

    __clear_bit(hwc->idx, cpuc->active_mask);
Fix this by disabling the event itself before clearing
the PEBS bit.
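The ordering argument can be illustrated with a toy model. This is a sketch under simplifying assumptions, not kernel code: the names `disable_event` and `nmi` are hypothetical, and an overflow NMI is injected after every step of the disable sequence.

```python
def nmi(hw):
    """Outcome of an overflow NMI arriving with the given hardware state."""
    if not hw["counter"]:
        return "no overflow"  # a disabled counter cannot raise an overflow
    # The active_mask bit is still set here: it is cleared only after the
    # whole disable sequence has finished, so the handler's test passes.
    if hw["pebs"]:
        return "pebs drain"   # samples come from the PEBS drain code
    return "crash"            # NMI path: callchain was never allocated


def disable_event(event_first):
    """Run the two disable steps and probe with an NMI after each one."""
    hw = {"pebs": True, "counter": True}
    order = ["counter", "pebs"] if event_first else ["pebs", "counter"]
    outcomes = []
    for step in order:
        hw[step] = False
        outcomes.append(nmi(hw))
    return outcomes
```

In this model the old order (PEBS first) leaves a window in which the counter can still overflow with PEBS already off, while the new order closes that window.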
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Arcari <darcari@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Lendacky Thomas <Thomas.Lendacky@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 3966c3feca ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull powerpc fix from Michael Ellerman:
"One regression fix.
Changes we merged to STRICT_KERNEL_RWX on 32-bit were causing crashes
under load on some machines depending on memory layout.
Thanks to Christophe Leroy"
* tag 'powerpc-5.1-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/32s: Fix BATs setting with CONFIG_STRICT_KERNEL_RWX
Pull KVM fixes from Paolo Bonzini:
- PPC and ARM bugfixes from submaintainers
- Fix old Windows versions on AMD (recent regression)
- Fix old Linux versions on processors without EPT
- Fixes for LAPIC timer optimizations
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: nVMX: Fix size checks in vmx_set_nested_state
KVM: selftests: make hyperv_cpuid test pass on AMD
KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer
KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size
x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE
KVM: x86: Whitelist port 0x7e for pre-incrementing %rip
Documentation: kvm: fix dirty log ioctl arch lists
KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit
KVM: arm/arm64: Don't emulate virtual timers on userspace ioctls
kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP
KVM: arm/arm64: Ensure vcpu target is unset on reset failure
KVM: lapic: Convert guest TSC to host time domain if necessary
KVM: lapic: Allow user to disable adaptive tuning of timer advancement
KVM: lapic: Track lapic timer advance per vCPU
KVM: lapic: Disable timer advancement if adaptive tuning goes haywire
x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
KVM: x86: Consider LAPIC TSC-Deadline timer expired if deadline too short
KVM: PPC: Book3S: Protect memslots while validating user address
KVM: PPC: Book3S HV: Perserve PSSCR FAKE_SUSPEND bit on guest exit
KVM: arm/arm64: vgic-v3: Retire pending interrupts on disabling LPIs
...
Pull i2c fixes from Wolfram Sang:
"I2C driver bugfixes and a MAINTAINERS update for you"
* 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: Prevent runtime suspend of adapter when Host Notify is required
i2c: synquacer: fix enumeration of slave devices
MAINTAINERS: friendly takeover of i2c-gpio driver
i2c: designware: ratelimit 'transfer when suspended' errors
i2c: imx: correct the method of getting private data in notifier_call
Invalid frequency checks are a bottleneck in reading
/proc/uid_time_in_state, but there's no reason to include invalid
frequencies in our local copies of frequency tables. Revise
cpufreq_times_create_policy() to only copy valid frequencies, and
eliminate all the checks this change makes unnecessary.
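The idea can be sketched in userspace terms as follows. This is an illustration only: `copy_valid_freqs` is a hypothetical helper, and the invalid marker is assumed to mirror the kernel's CPUFREQ_ENTRY_INVALID (~0u).

```python
CPUFREQ_ENTRY_INVALID = 0xFFFFFFFF  # assumed to mirror the kernel's ~0u marker


def copy_valid_freqs(table):
    """Copy only valid entries, so later readers need no per-entry check."""
    return [freq for freq in table if freq != CPUFREQ_ENTRY_INVALID]
```

Filtering once at copy time moves the cost out of the read path, which is where the checks were a bottleneck.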
Bug: 111216804
Test: cat /proc/uid_time_in_state & confirm values & format are sane
Test: /proc/uid_time_in_state read times reduced by ~40%
Change-Id: I506420a6ac5fe8a6c87d01b16ad267b192d43f1d
Signed-off-by: Connor O'Brien <connoro@google.com>
Based on https://www.redhat.com/archives/dm-devel/2019-March/msg00025.html
Third version of dm-bow. Key changes:
- Free list added
- Support for block sizes other than 4k
- Handles writes during trim phase, and overlapping trims
- Integer overflow error fixed
- Support trims even if underlying device doesn't
- Numerous small bug fixes
bow == backup on write
USE CASE:
dm-bow takes a snapshot of an existing file system before mounting.
The user may, before removing the device, commit the snapshot.
Alternatively the user may remove the device and then run a command
line utility to restore the device to its original state.
dm-bow does not require an external device
dm-bow efficiently uses all the available free space on the file system.
IMPLEMENTATION:
dm-bow can be in one of three states.
In state one, the free blocks on the device are identified by issuing
an FSTRIM to the filesystem.
In state two, any writes cause the overwritten data to be backed up
to the available free space. While in this state, the device can be
restored by unmounting the filesystem, removing the dm-bow device
and running a usermode tool over the underlying device.
In state three, the changes are committed, dm-bow is in pass-through
mode and the drive can no longer be restored.
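The three states can be modelled with a small in-memory sketch. This is a toy under stated assumptions, not the driver's design: `BowModel`, its bookkeeping (`free`, `handled`, `log`) and the copy-based backup are hypothetical, and restore is done in-process rather than by a usermode tool.

```python
from enum import Enum


class State(Enum):
    TRIM = 1        # state one: free blocks are being identified
    CHECKPOINT = 2  # state two: overwritten data is backed up
    COMMITTED = 3   # state three: pass-through, no restore possible


class BowModel:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.state = State.TRIM
        self.free = set()     # blocks reported free by trim
        self.handled = set()  # blocks already written since the checkpoint
        self.log = []         # (original, backup) index pairs

    def trim(self, index):
        if self.state is State.TRIM:
            self.free.add(index)

    def start_checkpoint(self):
        self.state = State.CHECKPOINT

    def write(self, index, data):
        if self.state is State.CHECKPOINT and index not in self.handled:
            self.handled.add(index)
            if index in self.free:
                self.free.discard(index)  # nothing worth preserving
            else:
                if not self.free:
                    raise RuntimeError("no free block left for backup")
                backup = min(self.free)
                self.free.discard(backup)
                self.blocks[backup] = self.blocks[index]
                self.log.append((index, backup))
        self.blocks[index] = data

    def commit(self):
        self.state = State.COMMITTED
        self.log.clear()

    def restore(self):
        if self.state is not State.CHECKPOINT:
            raise RuntimeError("nothing to restore")
        # Undo in reverse so chained backups unwind correctly.
        for original, backup in reversed(self.log):
            self.blocks[original] = self.blocks[backup]
        self.log.clear()
```

Only the first write to a block after the checkpoint triggers a backup; once committed, the log is gone and the model behaves as plain pass-through.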
It is planned to use this driver to enable restoration of a failed
update attempt on Android devices using ext4.
Test: Can boot Android with userdata mounted on this device. Can commit
userdata after SUW has run. Can then reboot, make changes and roll back.
Known issues:
Mutex is held around entire flush operation, including lengthy I/O. Plan
is to convert to state machine with pending queues.
Interaction with block encryption is unknown, especially with respect
to sector 0.
Bug: 119769411
Bug: 129280212
Test: Dogfooded on Wahoo.
Ran under Cuttlefish, running VtsKernelBowTest &
VtsKernelCheckpointTest tests against 4.19, 4.14 & 4.9 kernels
Change-Id: Id70988bbd797ebe3e76fc175094388b423c8da8c
Signed-off-by: Paul Lawrence <paullawrence@google.com>
This is to mirror:
https://android-review.googlesource.com/c/kernel/configs/+/920741
Require CONFIG_USB_RTL8152 != n if we have host usb support.
Generated via:
echo 'CONFIG_USB_RTL8152=y' >> arch/x86/configs/x86_64_cuttlefish_defconfig
echo 'CONFIG_USB_RTL8152=y' >> arch/arm64/configs/cuttlefish_defconfig
make ARCH=x86_64 x86_64_cuttlefish_defconfig
make ARCH=x86_64 savedefconfig
cat defconfig > arch/x86/configs/x86_64_cuttlefish_defconfig
make ARCH=arm64 cuttlefish_defconfig
make ARCH=arm64 savedefconfig
cat defconfig > arch/arm64/configs/cuttlefish_defconfig
rm defconfig
Bug: 110755806
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Iadca2a11b76296532aa42dde999ee73f58e0bd97
Enable driver support for the ac97 emulation provided by QEMU and
crosvm. This is for the older 'ac97' soundhw, not 'hda'.
Bug: 120439617
Bug: 126955561
Test: local build and test of sound from cuttlefish
Change-Id: I6c29e352e0be161e2a1dc35fde50b888b7dbf86e
Signed-off-by: Alistair Strachan <astrachan@google.com>
The majority of the time spent reading /proc/uid_time_in_state is due
to seq_printf calls. Use the faster seq_put_* variations instead.
Also skip empty hash buckets in uid_seq_next for a further performance
improvement.
Bug: 111216804
Bug: 127641090
Test: Read /proc/uid_time_in_state and confirm output is sane
Test: Compare read times to confirm performance improvement
Change-Id: If8783b498ed73d2ddb186a49438af41ac5ab9957
Signed-off-by: Connor O'Brien <connoro@google.com>
cpufreq_times_record_transition() is not called when fast switch is
enabled, leading /proc/uid_time_in_state to attribute all time on a
cluster to a single frequency. To fix this, add a call to
cpufreq_times_record_transition() in the fast switch path.
Also revise cpufreq_times_record_transition() to simplify the new call
and more closely align with cpufreq_stats_record_transition().
Bug: 121287027
Bug: 127641090
Test: /proc/uid_time_in_state shows times for more than one freq per
cluster
Change-Id: Ib63d19006878fafb88475e401ef243bdd8b11979
Signed-off-by: Connor O'Brien <connoro@google.com>
In order to model the energy used by the cpu, we need to expose cpu
time information to userspace. We can use this information to blame
uids for the energy they are using.
This patch adds 2 files:
/proc/uid_concurrent_active_time outputs the time each uid spent
running with 1 to num_possible_cpus online and not idle.
For instance, if 4 cores are online and active for 50ms and the uid is
running on one of the cores, the uid should be blamed for 1/4 of the
base energy used by the cpu when 1 or more cores is active.
/proc/uid_concurrent_policy_time outputs the time each uid spent
running on a group of cpus with the same policy with 1 to
num_related_cpus online and not idle.
For instance, if 2 cores that are a part of the same policy are online
and active for 50ms and the uid is running on one of the cores, the
uid should be blamed for 1/2 of the energy that is used every time one
or more cpus in the same policy is active.
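The blame arithmetic in the two examples above can be written out as a short sketch. The function name, the list layout and the `base_power_mw` parameter are assumptions for illustration; the on-disk format of the proc files is not modelled here.

```python
def blamed_base_energy_uj(concurrent_ms, base_power_mw):
    """Energy (microjoules) blamed on one uid.

    concurrent_ms[i] is the time in ms the uid spent running while
    exactly i + 1 cpus were online and not idle; each such interval is
    charged 1 / (i + 1) of the base power. mW * ms gives microjoules.
    """
    return sum(
        base_power_mw * ms / (active_cpus + 1)
        for active_cpus, ms in enumerate(concurrent_ms)
    )
```

For the first example, 50ms with 4 active cores at a hypothetical 100 mW base power yields 100 * 50 / 4 microjoules for the uid.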
This patch is based on commit c89e69136fec ("ANDROID: cpufreq:
uid_concurrent_active_time") and commit 989212536842 ("ANDROID:
cpufreq: uid_concurrent_policy_time") in the android-msm-wahoo-4.4
kernel by Marissa Wall.
Bug: 111216804
Bug: 127641090
Test: cat files on hikey960 and confirm output is reasonable
Change-Id: I1a342361af5c04ecee58d1ab667c91c1bce42445
Signed-off-by: Connor O'Brien <connoro@google.com>
Add per-uid files that report the data in binary format rather than
text, to allow faster reading & parsing by userspace.
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug: 72339335
Bug: 127641090
Test: compare values to those reported in /proc/uid_time_in_state
Change-Id: I463039ea7f17b842be4c70024fe772539fe2ce02
Add support for reporting per-uid information through procfs, roughly
following the approach used for per-tid and per-tgid directories in
fs/proc/base.c.
This also entails some new tracking of which uids have been used, to
avoid losing information when the last task with a given uid exits.
Bug: 72339335
Bug: 127641090
Test: ls /proc/uid/; compare with UIDs in /proc/uid_time_in_state
Change-Id: I0908f0c04438b11ceb673d860e58441bf503d478
Signed-off-by: Connor O'Brien <connoro@google.com>
[AmitP: Fix proc_fill_cache() now that upstream commit
0168b9e38c ("procfs: switch instantiate_t to d_splice_alias()"),
switched instantiate() callback to d_splice_alias()]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[astrachan: Folded 97b7790f505e ("ANDROID: proc: fix undefined behavior
in proc_uid_base_readdir") into this change]
Signed-off-by: Alistair Strachan <astrachan@google.com>
Add time in state data to task structs, and create
/proc/<pid>/time_in_state files to show how long each individual task
has run at each frequency.
Create a CONFIG_CPU_FREQ_TIMES option to enable/disable this tracking.
Bug: 72339335
Bug: 127641090
Test: Read /proc/<pid>/time_in_state
Change-Id: Ia6456754f4cb1e83b2bc35efa8fbe9f8696febc8
Signed-off-by: Connor O'Brien <connoro@google.com>
[astrachan: Folded the following changes into this patch:
a6d3de6a7fba ("ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES")
b89ada5d9c09 ("ANDROID: Fix massive cpufreq_times memory leaks")]
Signed-off-by: Alistair Strachan <astrachan@google.com>
Once xt_qtaguid module is deprecated, the netd strictController which
uses owner match to filter egress traffic will not work because
xt_qtaguid masquerades as (and implements/extends) the "owner" module on
android devices. It can be resolved by turning upstream xt_owner module
back on since strictController only targets egress traffic and the
upstream xt_owner module works fine in this case.
Signed-off-by: Chenbo Feng <fengc@google.com>
Bug: 79938294
Test: manual cherry-pick and compile
Change-Id: Ia099db025f17f6042384c9f0caf7b941a40b8b84
This configuration is required for the VTS test
VtsKernelApiSysfsTest#testRtcHctosys to pass.
Bug: 123860857
Test: run vts-kernel -m VtsKernelApiSysfsTest
Signed-off-by: Matthias Maennich <maennich@google.com>
Change-Id: Icae17c74460bcd2aef4cf4e3ec5381de9ea0a66c
This is to mirror
https://android-review.googlesource.com/c/kernel/configs/+/870517
android-4.9+: add CONFIG_NET_CLS_BPF to base
Generated via:
echo 'CONFIG_NET_CLS_BPF=y' >> arch/x86/configs/x86_64_cuttlefish_defconfig
echo 'CONFIG_NET_CLS_BPF=y' >> arch/arm64/configs/cuttlefish_defconfig
make ARCH=x86_64 x86_64_cuttlefish_defconfig
make ARCH=x86_64 savedefconfig
cat defconfig > arch/x86/configs/x86_64_cuttlefish_defconfig
make ARCH=arm64 cuttlefish_defconfig
make ARCH=arm64 savedefconfig
cat defconfig > arch/arm64/configs/cuttlefish_defconfig
Bug: 65674744
Change-Id: I8e4dfe7a99d38fd5942001a1aab83b7ac9df30dd
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Enabling this was previously blocked by a lack of support for this
feature in clang, but that problem has been resolved in a newer version
of the compiler.
Bug: 120439617
Change-Id: I0f5fd2439c5a71ee0988648970576b46b2c4d20b
Signed-off-by: Alistair Strachan <astrachan@google.com>
Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This causes problems for connection managers
that want to support multiple simultaneous network connections
and want control over which one is used by default (e.g., wifi
and wired).
To work around this, connection managers typically take the routes
they prefer and copy them to static routes with low metrics in
the main table. This puts the burden on the connection manager
to watch netlink to see if the routes have changed, delete the
routes when their lifetime expires, etc.
Instead, this patch adds a per-interface sysctl to have the
kernel put autoconf routes into different tables. This allows
each interface to have its own autoconf table, and choosing the
default interface (or using different interfaces at the same
time for different types of traffic) can be done using
appropriate ip rules.
The sysctl behaves as follows:
- = 0: default. Put routes into RT6_TABLE_MAIN as before.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
device's ifindex, and use that table.
The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution could set it to, say, -100 on boot, and
thereafter just use IP rules.
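The three sysctl modes map to a routing table as in the following sketch. `ra_route_table` is a hypothetical helper name; the value 254 for RT6_TABLE_MAIN is taken to match the main routing table id.

```python
RT6_TABLE_MAIN = 254  # id of the main routing table


def ra_route_table(sysctl, ifindex):
    """Pick the table for autoconf routes on one interface."""
    if sysctl == 0:
        return RT6_TABLE_MAIN      # default: behave as before
    if sysctl > 0:
        return sysctl              # manual: use the specified table
    return ifindex + (-sysctl)     # automatic: |sysctl| + ifindex
```

With the default set to -100, an interface with ifindex 7 would get its autoconf routes in table 107.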
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
[AmitP: Refactored original changes to align with
the changes introduced by upstream commits
830218c1ad ("net: ipv6: Fix processing of RAs in presence of VRF"),
8d1c802b28 ("net/ipv6: Flip FIB entries to fib6_info").
Also folded following android-4.9 commit changes into this patch
be65fb01da ("ANDROID: net: ipv6: remove unused variable ifindex in")]
Bug: 120445791
Change-Id: I82d16e3737d9cdfa6489e649e247894d0d60cbb1
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
When kernel.perf_event_paranoid is set to 3 (or greater), disallow all
access to performance events by users without CAP_SYS_ADMIN.
Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
makes this value the default.
This is based on a similar feature in grsecurity
(CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making
the variable read-only. It also allows enabling further restriction
at run-time regardless of whether the default is changed.
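A minimal sketch of the new check, assuming the level is the perf paranoia sysctl (exposed on mainline kernels as /proc/sys/kernel/perf_event_paranoid). The function name is hypothetical, and the finer-grained checks applied at lower levels are deliberately not modelled.

```python
def blocked_by_full_restriction(paranoid_level, has_cap_sys_admin):
    """True if the level-3 rule alone denies access to perf events."""
    return paranoid_level >= 3 and not has_cap_sys_admin
```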
https://lkml.org/lkml/2016/1/11/587
Bug: 29054680
Bug: 120445712
Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8
[jeffv: Upstream doesn't want it https://lkml.org/lkml/2016/6/17/101]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Send notifications when the label becomes active after an idle period.
Send netlink message notifications in addition to sysfs notifications,
using a uevent with
  subsystem=xt_idletimer
  INTERFACE=...
  STATE={active,inactive}
This is a backport from common android-3.0
commit: beb914e987
with uevent support instead of a new netlink message type.
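A listener could pick these fields out of a uevent payload roughly as follows. This is a sketch under assumptions: the payload is taken to be the usual NUL-separated KEY=VALUE list, keys are upper-cased before matching, and the sample device path and interface name are made up.

```python
def parse_uevent(payload):
    """Split a NUL-separated uevent payload into a KEY -> value dict."""
    fields = {}
    for part in payload.split(b"\0"):
        key, sep, value = part.partition(b"=")
        if sep:  # skip the "action@devpath" header and empty parts
            fields[key.decode().upper()] = value.decode()
    return fields


def idletimer_state(payload):
    """Return (interface, state) for xt_idletimer events, else None."""
    fields = parse_uevent(payload)
    if fields.get("SUBSYSTEM") != "xt_idletimer":
        return None
    return fields.get("INTERFACE"), fields.get("STATE")
```

For instance, a hypothetical payload carrying INTERFACE=wlan0 and STATE=active would yield the pair ("wlan0", "active").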
Bug: 120445672
Change-Id: I31677ef00c94b5f82c8457e5bf9e5e584c23c523
Signed-off-by: Ashish Sharma <ashishsharma@google.com>
Signed-off-by: JP Abgrall <jpa@google.com>
[astrachan: Folded the following changes into this patch:
ee0b238fada5 ("netfilter: xt_IDLETIMER: time-stamp and suspend/resume handling.")
728c058a495e ("netfilter: xt_IDLETIMER: Adds the uid field in the msg")
5ebea489d44c ("netfilter: xt_IDLETIMER: Fix use after free condition during work")
5ab69d7ba2c5 ("netfilter: xt_IDLETIMER: Use fullsock when querying uid")]
Signed-off-by: Alistair Strachan <astrachan@google.com>