The majority of the time spent reading /proc/uid_time_in_state is due
to seq_printf calls. Use the faster seq_put_* variations instead.
Also skip empty hash buckets in uid_seq_next for a further performance
improvement.
Bug: 111216804
Bug: 127641090
Test: Read /proc/uid_time_in_state and confirm output is sane
Test: Compare read times to confirm performance improvement
Change-Id: If8783b498ed73d2ddb186a49438af41ac5ab9957
Signed-off-by: Connor O'Brien <connoro@google.com>
cpufreq_times_record_transition() is not called when fast switch is
enabled, leading /proc/uid_time_in_state to attribute all time on a
cluster to a single frequency. To fix this, add a call to
cpufreq_times_record_transition() in the fast switch path.
Also revise cpufreq_times_record_transition() to simplify the new call
and more closely align with cpufreq_stats_record_transition().
Bug: 121287027
Bug: 127641090
Test: /proc/uid_time_in_state shows times for more than one freq per
cluster
Change-Id: Ib63d19006878fafb88475e401ef243bdd8b11979
Signed-off-by: Connor O'Brien <connoro@google.com>
In order to model the energy used by the cpu, we need to expose cpu
time information to userspace. We can use this information to blame
uids for the energy they are using.
This patch adds 2 files:
/proc/uid_concurrent_active_time outputs the time each uid spent
running with 1 to num_possible_cpus online and not idle.
For instance, if 4 cores are online and active for 50ms and the uid is
running on one of the cores, the uid should be blamed for 1/4 of the
base energy used by the cpu when 1 or more cores is active.
/proc/uid_concurrent_policy_time outputs the time each uid spent
running on group of cpu's with the same policy with 1 to
num_related_cpus online and not idle.
For instance, if 2 cores that are a part of the same policy are online
and active for 50ms and the uid is running on one of the cores, the
uid should be blamed for 1/2 of the energy that is used everytime one
or more cpus in the same policy is active.
This patch is based on commit c89e69136fec ("ANDROID: cpufreq:
uid_concurrent_active_time") and commit 989212536842 ("ANDROID:
cpufreq: uid_concurrent_policy_time") in the android-msm-wahoo-4.4
kernel by Marissa Wall.
Bug: 111216804
Bug: 127641090
Test: cat files on hikey960 and confirm output is reasonable
Change-Id: I1a342361af5c04ecee58d1ab667c91c1bce42445
Signed-off-by: Connor O'Brien <connoro@google.com>
Add per-uid files that report the data in binary format rather than
text, to allow faster reading & parsing by userspace.
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug: 72339335
Bug: 127641090
Test: compare values to those reported in /proc/uid_time_in_state
Change-Id: I463039ea7f17b842be4c70024fe772539fe2ce02
Add support for reporting per-uid information through procfs, roughly
following the approach used for per-tid and per-tgid directories in
fs/proc/base.c.
This also entails some new tracking of which uids have been used, to
avoid losing information when the last task with a given uid exits.
Bug: 72339335
Bug: 127641090
Test: ls /proc/uid/; compare with UIDs in /proc/uid_time_in_state
Change-Id: I0908f0c04438b11ceb673d860e58441bf503d478
Signed-off-by: Connor O'Brien <connoro@google.com>
[AmitP: Fix proc_fill_cache() now that upstream commit
0168b9e38c ("procfs: switch instantiate_t to d_splice_alias()"),
switched instantiate() callback to d_splice_alias()]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[astrachan: Folded 97b7790f505e ("ANDROID: proc: fix undefined behavior
in proc_uid_base_readdir") into this change]
Signed-off-by: Alistair Strachan <astrachan@google.com>
Add time in state data to task structs, and create
/proc/<pid>/time_in_state files to show how long each individual task
has run at each frequency.
Create a CONFIG_CPU_FREQ_TIMES option to enable/disable this tracking.
Bug: 72339335
Bug: 127641090
Test: Read /proc/<pid>/time_in_state
Change-Id: Ia6456754f4cb1e83b2bc35efa8fbe9f8696febc8
Signed-off-by: Connor O'Brien <connoro@google.com>
[astrachan: Folded the following changes into this patch:
a6d3de6a7fba ("ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES")
b89ada5d9c09 ("ANDROID: Fix massive cpufreq_times memory leaks")]
Signed-off-by: Alistair Strachan <astrachan@google.com>
By default, all access to the upper, lower and work directories is the
recorded mounter's MAC and DAC credentials. The incoming accesses are
checked against the caller's credentials.
If the principles of least privilege are applied, the mounter's
credentials might not overlap the credentials of the caller's when
accessing the overlayfs filesystem. For example, a file that a lower
DAC privileged caller can execute, is MAC denied to the generally
higher DAC privileged mounter, to prevent an attack vector.
We add the option to turn off override_creds in the mount options; all
subsequent operations after mount on the filesystem will be only the
caller's credentials. The module boolean parameter and mount option
override_creds is also added as a presence check for this "feature",
existence of /sys/module/overlay/parameters/override_creds.
It was not always this way. Circa 4.6 there was no recorded mounter's
credentials, instead privileged access to upper or work directories
were temporarily increased to perform the operations. The MAC
(selinux) policies were caller's in all cases. override_creds=off
partially returns us to this older access model minus the insecure
temporary credential increases. This is to permit use in a system
with non-overlapping security models for each executable including
the agent that mounts the overlayfs filesystem. In Android
this is the case since init, which performs the mount operations,
has a minimal MAC set of privileges to reduce any attack surface,
and services that use the content have a different set of MAC
privileges (eg: read, for vendor labelled configuration, execute for
vendor libraries and modules). The caveats are not a problem in
the Android usage model, however they should be fixed for
completeness and for general use in time.
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v9:
- Add to the caveats
v8:
- drop pr_warn message after straw poll to remove it.
- added a use case in the commit message
v7:
- change name of internal parameter to ovl_override_creds_def
- report override_creds only if different than default
v6:
- Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
- Do better with the documentation.
- pr_warn message adjusted to report consequences.
v5:
- beefed up the caveats in the Documentation
- Is dependent on
"overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
"overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
- Added prwarn when override_creds=off
v4:
- spelling and grammar errors in text
v3:
- Change name from caller_credentials / creator_credentials to the
boolean override_creds.
- Changed from creator to mounter credentials.
- Updated and fortified the documentation.
- Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS
v2:
- Forward port changed attr to stat, resulting in a build error.
- altered commit message.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
(cherry picked from https://lore.kernel.org/patchwork/patch/1009299)
Bug: 109821005
Bug: 112955896
Bug: 127298877
Change-Id: Ie43b0c7dfd64f4cfc27dfe5e1622ea01f3b000cf
Changes in 4.19.27
irq/matrix: Split out the CPU selection code into a helper
irq/matrix: Spread managed interrupts on allocation
genirq/matrix: Improve target CPU selection for managed interrupts.
mac80211: Change default tx_sk_pacing_shift to 7
scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached
drm/msm: Unblock writer if reader closes file
ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field
ALSA: compress: prevent potential divide by zero bugs
ASoC: Variable "val" in function rt274_i2c_probe() could be uninitialized
clk: tegra: dfll: Fix a potential Oop in remove()
clk: sysfs: fix invalid JSON in clk_dump
clk: vc5: Abort clock configuration without upstream clock
thermal: int340x_thermal: Fix a NULL vs IS_ERR() check
usb: dwc3: gadget: synchronize_irq dwc irq in suspend
usb: dwc3: gadget: Fix the uninitialized link_state when udc starts
usb: gadget: Potential NULL dereference on allocation error
selftests: rtc: rtctest: fix alarm tests
selftests: rtc: rtctest: add alarm test on minute boundary
genirq: Make sure the initial affinity is not empty
x86/mm/mem_encrypt: Fix erroneous sizeof()
ASoC: rt5682: Fix PLL source register definitions
ASoC: dapm: change snprintf to scnprintf for possible overflow
ASoC: imx-audmux: change snprintf to scnprintf for possible overflow
selftests/vm/gup_benchmark.c: match gup struct to kernel
phy: ath79-usb: Fix the power on error path
phy: ath79-usb: Fix the main reset name to match the DT binding
selftests: seccomp: use LDLIBS instead of LDFLAGS
selftests: gpio-mockup-chardev: Check asprintf() for error
irqchip/gic-v3-mbi: Fix uninitialized mbi_lock
ARC: fix __ffs return value to avoid build warnings
ARC: show_regs: lockdep: avoid page allocator...
drivers: thermal: int340x_thermal: Fix sysfs race condition
staging: rtl8723bs: Fix build error with Clang when inlining is disabled
mac80211: fix miscounting of ttl-dropped frames
sched/wait: Fix rcuwait_wake_up() ordering
sched/wake_q: Fix wakeup ordering for wake_q
futex: Fix (possible) missed wakeup
locking/rwsem: Fix (possible) missed wakeup
drm/amd/powerplay: OD setting fix on Vega10
tty: serial: qcom_geni_serial: Allow mctrl when flow control is disabled
serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling
drm/sun4i: hdmi: Fix usage of TMDS clock
staging: android: ion: Support cpu access during dma_buf_detach
direct-io: allow direct writes to empty inodes
writeback: synchronize sync(2) against cgroup writeback membership switches
scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport
scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport
scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()
net: altera_tse: fix connect_local_phy error path
hv_netvsc: Fix ethtool change hash key error
hv_netvsc: Refactor assignments of struct netvsc_device_info
hv_netvsc: Fix hash key value reset after other ops
nvme-rdma: fix timeout handler
nvme-multipath: drop optimization for static ANA group IDs
drm/msm: Fix A6XX support for opp-level
net: usb: asix: ax88772_bind return error when hw_reset fail
net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
ibmveth: Do not process frames after calling napi_reschedule
mac80211: don't initiate TDLS connection if station is not associated to AP
mac80211: Add attribute aligned(2) to struct 'action'
cfg80211: extend range deviation for DMG
svm: Fix AVIC incomplete IPI emulation
KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
kvm: selftests: Fix region overlap check in kvm_util
mmc: spi: Fix card detection during probe
mmc: tmio_mmc_core: don't claim spurious interrupts
mmc: tmio: fix access width of Block Count Register
mmc: core: Fix NULL ptr crash from mmc_should_fail_request
mmc: cqhci: fix space allocated for transfer descriptor
mmc: cqhci: Fix a tiny potential memory leak on error condition
mmc: sdhci-esdhc-imx: correct the fix of ERR004536
mm: enforce min addr even if capable() in expand_downwards()
drm: Block fb changes for async plane updates
hugetlbfs: fix races and page leaks during migration
MIPS: fix truncation in __cmpxchg_small for short values
MIPS: BCM63XX: provide DMA masks for ethernet devices
MIPS: eBPF: Fix icache flush end address
x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
Linux 4.19.27
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 2a418cf3f5 upstream.
When calling __put_user(foo(), ptr), the __put_user() macro would call
foo() in between __uaccess_begin() and __uaccess_end(). If that code
were buggy, then those bugs would be run without SMAP protection.
Fortunately, there seem to be few instances of the problem in the
kernel. Nevertheless, __put_user() should be fixed to avoid doing this.
Therefore, evaluate __put_user()'s argument before setting AC.
This issue was noticed when an objtool hack by Peter Zijlstra complained
about genregs_get() and I compared the assembly output to the C source.
[ bp: Massage commit message and fixed up whitespace. ]
Fixes: 11f1a4b975 ("x86: reorganize SMAP handling in user space accesses")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190225125231.845656645@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1a2930d8a upstream.
The MIPS eBPF JIT calls flush_icache_range() in order to ensure the
icache observes the code that we just wrote. Unfortunately it gets the
end address calculation wrong due to some bad pointer arithmetic.
The struct jit_ctx target field is of type pointer to u32, and as such
adding one to it will increment the address being pointed to by 4 bytes.
Therefore in order to find the address of the end of the code we simply
need to add the number of 4 byte instructions emitted, but we mistakenly
add the number of instructions multiplied by 4. This results in the call
to flush_icache_range() operating on a memory region 4x larger than
intended, which is always wasteful and can cause crashes if we overrun
into an unmapped page.
Fix this by correcting the pointer arithmetic to remove the bogus
multiplication, and use braces to remove the need for a set of brackets
whilst also making it obvious that the target field is a pointer.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: b6bd53f9c4 ("MIPS: Add missing file for eBPF JIT.")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94ee12b507 upstream.
__cmpxchg_small erroneously uses u8 for load comparison which can
be either char or short. This patch changes the local variable to
u32 which is sufficiently sized, as the loaded value is already
masked and shifted appropriately. Using an integer size avoids
any unnecessary canonicalization from use of non native widths.
This patch is part of a series that adapts the MIPS small word
atomics code for xchg and cmpxchg on short and char to RISC-V.
Cc: RISC-V Patches <patches@groups.riscv.org>
Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
Cc: Linux MIPS <linux-mips@linux-mips.org>
Signed-off-by: Michael Clark <michaeljclark@mac.com>
[paul.burton@mips.com:
- Fix varialble typo per Jonas Gorski.
- Consolidate load variable with other declarations.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 3ba7f44d2b ("MIPS: cmpxchg: Implement 1 byte & 2 byte cmpxchg()")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb6acd01e2 upstream.
hugetlb pages should only be migrated if they are 'active'. The
routines set/clear_page_huge_active() modify the active state of hugetlb
pages.
When a new hugetlb page is allocated at fault time, set_page_huge_active
is called before the page is locked. Therefore, another thread could
race and migrate the page while it is being added to page table by the
fault code. This race is somewhat hard to trigger, but can be seen by
strategically adding udelay to simulate worst case scheduling behavior.
Depending on 'how' the code races, various BUG()s could be triggered.
To address this issue, simply delay the set_page_huge_active call until
after the page is successfully added to the page table.
Hugetlb pages can also be leaked at migration time if the pages are
associated with a file in an explicitly mounted hugetlbfs filesystem.
For example, consider a two node system with 4GB worth of huge pages
available. A program mmaps a 2G file in a hugetlbfs filesystem. It
then migrates the pages associated with the file from one node to
another. When the program exits, huge page counts are as follows:
node0
1024 free_hugepages
1024 nr_hugepages
node1
0 free_hugepages
1024 nr_hugepages
Filesystem Size Used Avail Use% Mounted on
nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool
That is as expected. 2G of huge pages are taken from the free_hugepages
counts, and 2G is the size of the file in the explicitly mounted
filesystem. If the file is then removed, the counts become:
node0
1024 free_hugepages
1024 nr_hugepages
node1
1024 free_hugepages
1024 nr_hugepages
Filesystem Size Used Avail Use% Mounted on
nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool
Note that the filesystem still shows 2G of pages used, while there
actually are no huge pages in use. The only way to 'fix' the filesystem
accounting is to unmount the filesystem
If a hugetlb page is associated with an explicitly mounted filesystem,
this information in contained in the page_private field. At migration
time, this information is not preserved. To fix, simply transfer
page_private from old to new page at migration time if necessary.
There is a related race with removing a huge page from a file and
migration. When a huge page is removed from the pagecache, the
page_mapping() field is cleared, yet page_private remains set until the
page is actually freed by free_huge_page(). A page could be migrated
while in this state. However, since page_mapping() is not set the
hugetlbfs specific routine to transfer page_private is not called and we
leak the page count in the filesystem.
To fix that, check for this condition before migrating a huge page. If
the condition is detected, return EBUSY for the page.
Link: http://lkml.kernel.org/r/74510272-7319-7372-9ea6-ec914734c179@oracle.com
Link: http://lkml.kernel.org/r/20190212221400.3512-1-mike.kravetz@oracle.com
Fixes: bcc5422230 ("mm: hugetlb: introduce page_huge_active")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
[mike.kravetz@oracle.com: v2]
Link: http://lkml.kernel.org/r/7534d322-d782-8ac6-1c8d-a8dc380eb3ab@oracle.com
[mike.kravetz@oracle.com: update comment and changelog]
Link: http://lkml.kernel.org/r/420bcfd6-158b-38e4-98da-26d0cd85bd01@oracle.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2216322919 upstream.
The prepare_fb call always happens on new_plane_state.
The drm_atomic_helper_cleanup_planes checks to see if
plane state pointer has changed when deciding to call cleanup_fb on
either the new_plane_state or the old_plane_state.
For a non-async atomic commit the state pointer is swapped, so this
helper calls prepare_fb on the new_plane_state and cleanup_fb on the
old_plane_state. This makes sense, since we want to prepare the
framebuffer we are going to use and cleanup the the framebuffer we are
no longer using.
For the async atomic update helpers this differs. The async atomic
update helpers perform in-place updates on the existing state. They call
drm_atomic_helper_cleanup_planes but the state pointer is not swapped.
This means that prepare_fb is called on the new_plane_state and
cleanup_fb is called on the new_plane_state (not the old).
In the case where old_plane_state->fb == new_plane_state->fb then
there should be no behavioral difference between an async update
and a non-async commit. But there are issues that arise when
old_plane_state->fb != new_plane_state->fb.
The first is that the new_plane_state->fb is immediately cleaned up
after it has been prepared, so we're using a fb that we shouldn't
be.
The second occurs during a sequence of async atomic updates and
non-async regular atomic commits. Suppose there are two framebuffers
being interleaved in a double-buffering scenario, fb1 and fb2:
- Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
- Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
- Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
We call cleanup_fb on fb2 twice in this example scenario, and any
further use will result in use-after-free.
The simple fix to this problem is to block framebuffer changes
in the drm_atomic_helper_async_check function for now.
v2: Move check by itself, add a FIXME (Daniel)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: <stable@vger.kernel.org> # v4.14+
Fixes: fef9df8b59 ("drm/atomic: initial support for asynchronous plane update")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/275364/
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a1d52994d upstream.
security_mmap_addr() does a capability check with current_cred(), but
we can reach this code from contexts like a VFS write handler where
current_cred() must not be used.
This can be abused on systems without SMAP to make NULL pointer
dereferences exploitable again.
Fixes: 8869477a49 ("security: protect from stack expansion into low vm addresses")
Cc: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d07e9fadf3 upstream.
Free up the allocated memory in the case of error return
The value of mmc_host->cqe_enabled stays 'false'. Thus, cqhci_disable
(mmc_cqe_ops->cqe_disable) won't be called to free the memory. Also,
cqhci_disable() seems to be designed to disable and free all resources, not
suitable to handle this corner case.
Fixes: a4080225f5 ("mmc: cqhci: support for command queue enabled host")
Signed-off-by: Alamy Liu <alamy.liu@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27ec9dc17c upstream.
There is not enough space being allocated when DCMD is disabled.
CQE_DCMD is not necessary to be enabled when CQE is enabled.
(Software could halt CQE to send command)
In the case that CQE_DCMD is not enabled, it still needs to allocate
space for data transfer. For instance:
CQE_DCMD is enabled: 31 slots space (one slot used by DCMD)
CQE_DCMD is disabled: 32 slots space
Fixes: a4080225f5 ("mmc: cqhci: support for command queue enabled host")
Signed-off-by: Alamy Liu <alamy.liu@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5723f95d6 upstream.
In case of CQHCI, mrq->cmd may be NULL for data requests (non DCMD).
In such case mmc_should_fail_request is directly dereferencing
mrq->cmd while cmd is NULL.
Fix this by checking for mrq->cmd pointer.
Fixes: 72a5af554d ("mmc: core: Add support for handling CQE requests")
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5603731a15 upstream.
In R-Car Gen2 or later, the maximum number of transfer blocks are
changed from 0xFFFF to 0xFFFFFFFF. Therefore, Block Count Register
should use iowrite32().
If another system (U-boot, Hypervisor OS, etc) uses bit[31:16], this
value will not be cleared. So, SD/MMC card initialization fails.
So, check for the bigger register and use apropriate write. Also, mark
the register as extended on Gen2.
Signed-off-by: Takeshi Saito <takeshi.saito.xv@renesas.com>
[wsa: use max_blk_count in if(), add Gen2, update commit message]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@kernel.org
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
[Ulf: Fixed build error]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c27ff5db1 upstream.
I have encountered an interrupt storm during the eMMC chip probing (and
the chip finally didn't get detected). It turned out that U-Boot left
the DMAC interrupts enabled while the Linux driver didn't use those.
The SDHI driver's interrupt handler somehow assumes that, even if an
SDIO interrupt didn't happen, it should return IRQ_HANDLED. I think
that if none of the enabled interrupts happened and got handled, we
should return IRQ_NONE -- that way the kernel IRQ code recoginizes
a spurious interrupt and masks it off pretty quickly...
Fixes: 7729c7a232 ("mmc: tmio: Provide separate interrupt handlers")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9bd505dbd upstream.
When using the mmc_spi driver with a card-detect pin, I noticed that the
card was not detected immediately after probe, but only after it was
unplugged and plugged back in (and the CD IRQ fired).
The call tree looks something like this:
mmc_spi_probe
mmc_add_host
mmc_start_host
_mmc_detect_change
mmc_schedule_delayed_work(&host->detect, 0)
mmc_rescan
host->bus_ops->detect(host)
mmc_detect
_mmc_detect_card_removed
host->ops->get_cd(host)
mmc_gpio_get_cd -> -ENOSYS (ctx->cd_gpio not set)
mmc_gpiod_request_cd
ctx->cd_gpio = desc
To fix this issue, call mmc_detect_change after the card-detect GPIO/IRQ
is registered.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94a980c39c ]
Fix a call to userspace_mem_region_find to conform to its spec of
taking an inclusive, inclusive range. It was previously being called
with an inclusive, exclusive range. Also remove a redundant region bounds
check in vm_userspace_mem_region_add. Region overlap checking is already
performed by the call to userspace_mem_region_find.
Tested: Compiled tools/testing/selftests/kvm with -static
Ran all resulting test binaries on an Intel Haswell test machine
All tests passed
Signed-off-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 619ad846fc ]
kvm-unit-tests' eventinj "NMI failing on IDT" test results in NMI being
delivered to the host (L1) when it's running nested. The problem seems to
be: svm_complete_interrupts() raises 'nmi_injected' flag but later we
decide to reflect EXIT_NPF to L1. The flag remains pending and we do NMI
injection upon entry so it got delivered to L1 instead of L2.
It seems that VMX code solves the same issue in prepare_vmcs12(), this was
introduced with code refactoring in commit 5f3d579997 ("KVM: nVMX: Rework
event injection and recovery").
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb218fbcfa ]
In case of incomplete IPI with invalid interrupt type, the current
SVM driver does not properly emulate the IPI, and fails to boot
FreeBSD guests with multiple vcpus when enabling AVIC.
Fix this by update APIC ICR high/low registers, which also
emulate sending the IPI.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 93183bdbe7 ]
Recently, DMG frequency bands have been extended till 71GHz, so extend
the range check till 20GHz (45-71GHZ), else some channels will be marked
as disabled.
Signed-off-by: Chaitanya Tata <Chaitanya.Tata@bluwireless.co.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c53eb5d87 ]
During refactor in commit 9e478066ea ("mac80211: fix MU-MIMO
follow-MAC mode") a new struct 'action' was declared with packed
attribute as:
struct {
struct ieee80211_hdr_3addr hdr;
u8 category;
u8 action_code;
} __packed action;
But since struct 'ieee80211_hdr_3addr' is declared with an aligned
keyword as:
struct ieee80211_hdr {
__le16 frame_control;
__le16 duration_id;
u8 addr1[ETH_ALEN];
u8 addr2[ETH_ALEN];
u8 addr3[ETH_ALEN];
__le16 seq_ctrl;
u8 addr4[ETH_ALEN];
} __packed __aligned(2);
Solve the ambiguity of placing aligned structure in a packed one by
adding the aligned(2) attribute to struct 'action'.
This removes the following warning (W=1):
net/mac80211/rx.c:234:2: warning: alignment 1 of 'struct <anonymous>' is less than 2 [-Wpacked-not-aligned]
Cc: Johannes Berg <johannes.berg@intel.com>
Suggested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e95d22c69b ]
The IBM virtual ethernet driver's polling function continues
to process frames after rescheduling NAPI, resulting in a warning
if it exhausted its budget. Do not restart polling after calling
napi_reschedule. Instead let frames be processed in the following
instance.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b707c3008 ]
__bpf_redirect() and act_mirred checks this boolean
to determine whether to prefix an ethernet header.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6eea3527e6 ]
The ax88772_bind() should return error code immediately when the PHY
was not reset properly through ax88772a_hw_reset().
Otherwise, The asix_get_phyid() will block when get the PHY
Identifier from the PHYSID1 MII registers through asix_mdio_read()
due to the PHY isn't ready. Furthermore, it will produce a lot of
error message cause system crash.As follows:
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to send
software reset: ffffffb9
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable
software MII access
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable
software MII access
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read
reg index 0x0000: -71
...
Signed-off-by: Zhang Run <zhang.run@zte.com.cn>
Reviewed-by: Yang Wei <yang.wei9@zte.com.cn>
Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3c5e2cd79 ]
The bindings for Qualcomm opp levels changed after being Acked but
before landing. Thus the code in the GPU driver that was relying on
the old bindings is now broken.
Let's change the code to match the new bindings by adjusting the old
string 'qcom,level' to the new string 'opp-level'. See the patch
("dt-bindings: opp: Introduce opp-level bindings").
NOTE: we will do additional cleanup to totally remove the string from
the code and use the new dev_pm_opp_get_level() but we'll do it in a
future patch. This will facilitate getting the important code fix in
sooner without having to deal with cross-maintainer dependencies.
This patch needs to land before the patch ("arm64: dts: sdm845: Add
gpu and gmu device nodes") since if a tree contains the device tree
patch but not this one you'll get a crash at bootup.
Fixes: 4b565ca5a2 ("drm/msm: Add A6XX device support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78a61cd42a ]
Bit 6 in the ANACAP field is used to indicate that the ANA group ID
doesn't change while the namespace is attached to the controller.
There is an optimisation in the code to only allocate space
for the ANA group header, as the namespace list won't change and
hence would not need to be refreshed.
However, this optimisation was never carried over to the actual
workflow, which always assumes that the buffer is large enough
to hold the ANA header _and_ the namespace list.
So drop this optimisation and always allocate enough space.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4c174e6366 ]
Currently, we have several problems with the timeout
handler:
1. If we timeout on the controller establishment flow, we will hang
because we don't execute the error recovery (and we shouldn't because
the create_ctrl flow needs to fail and cleanup on its own)
2. We might also hang if we get a disconnet on a queue while the
controller is already deleting. This racy flow can cause the controller
disable/shutdown admin command to hang.
We cannot complete a timed out request from the timeout handler without
mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work).
So we serialize it in the timeout handler and teardown io and admin
queues to guarantee that no one races with us from completing the
request.
Reported-by: Jaesoo Lee <jalee@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 17d9125689 ]
Changing mtu, channels, or buffer sizes ops call to netvsc_attach(),
rndis_set_subchannel(), which always reset the hash key to default
value. That will override hash key changed previously. This patch
fixes the problem by save the hash key, then restore it when we re-
add the netvsc device.
Fixes: ff4a441990 ("netvsc: allow get/set of RSS indirection table")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
[sl: fix up subject line]
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c9f335a3f ]
These assignments occur in multiple places. The patch refactor them
to a function for simplicity. It also puts the struct to heap area
for future expension.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
[sl: fix up subject line]
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe35a40e67 ]
Assign fc_vport to ln->fc_vport before calling csio_fcoe_alloc_vnp() to
avoid a NULL pointer dereference in csio_vport_set_state().
ln->fc_vport is dereferenced in csio_vport_set_state().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c41f59884b ]
We cannot wait on a completion object in the lpfc_nvme_targetport structure
in the _destroy_targetport() code path because the NVMe/fc transport will
free that structure immediately after the .targetport_delete() callback.
This results in a use-after-free, and a hang if slub_debug=FZPU is enabled.
Fix this by putting the completion on the stack.
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7961cba6f7 ]
We cannot wait on a completion object in the lpfc_nvme_lport structure in
the _destroy_localport() code path because the NVMe/fc transport will free
that structure immediately after the .localport_delete() callback. This
results in a use-after-free, and a hang if slub_debug=FZPU is enabled.
Fix this by putting the completion on the stack.
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fc5854f8c ]
sync_inodes_sb() can race against cgwb (cgroup writeback) membership
switches and fail to writeback some inodes. For example, if an inode
switches to another wb while sync_inodes_sb() is in progress, the new
wb might not be visible to bdi_split_work_to_wbs() at all or the inode
might jump from a wb which hasn't issued writebacks yet to one which
already has.
This patch adds backing_dev_info->wb_switch_rwsem to synchronize cgwb
switch path against sync_inodes_sb() so that sync_inodes_sb() is
guaranteed to see all the target wbs and inodes can't jump wbs to
escape syncing.
v2: Fixed misplaced rwsem init. Spotted by Jiufei.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jiufei Xue <xuejiufei@gmail.com>
Link: http://lkml.kernel.org/r/dc694ae2-f07f-61e1-7097-7c8411cee12d@gmail.com
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b9433eb4d ]
On a DIO_SKIP_HOLES filesystem, the ->get_block() method is currently
not allowed to create blocks for an empty inode. This confusion comes
from trying to bit shift a negative number, so check the size of the
inode first.
The problem is most visible for hfsplus, because the fallback to
buffered I/O doesn't happen and the write fails with EIO. This is in
part the fault of the module, because it gives a wrong return value on
->get_block(); that will be fixed in a separate patch.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 31eb79db42 ]
Often userspace doesn't know when the kernel will be calling dma_buf_detach
on the buffer.
If userpace starts its CPU access at the same time as the sg list is being
freed it could end up accessing the sg list after it has been freed.
Thread A Thread B
- DMA_BUF_IOCTL_SYNC IOCT
- ion_dma_buf_begin_cpu_access
- list_for_each_entry
- ion_dma_buf_detatch
- free_duped_table
- dma_sync_sg_for_cpu
Fix this by getting the ion_buffer lock before freeing the sg table memory.
Fixes: 2a55e7b5e5 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e1bc251ce ]
Although TMDS clock is required for HDMI to properly function,
nobody called clk_prepare_enable(). This fixes reference counting
issues and makes sure clock is running when it needs to be running.
Due to TDMS clock being parent clock for DDC clock, TDMS clock
was turned on/off for each EDID probe, causing spurious failures
for certain HDMI/DVI screens.
Fixes: 9c5681011a ("drm/sun4i: Add HDMI support")
Signed-off-by: Priit Laes <priit.laes@paf.com>
[Maxime: Moved the TMDS clock enable earlier]
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190122073232.7240-1-plaes@plaes.org
Signed-off-by: Sasha Levin <sashal@kernel.org>