Commit Graph

797820 Commits

Author SHA1 Message Date
Mel Gorman
112ced56ce UPSTREAM: mm: use alloc_flags to record if kswapd can wake
This is a preparation patch that copies the GFP flag __GFP_KSWAPD_RECLAIM
into alloc_flags.  This is a preparation patch only that avoids having to
pass gfp_mask through a long callchain in a future patch.

Note that the setting in the fast path happens in alloc_flags_nofragment()
and it may be claimed that this has nothing to do with ALLOC_NO_FRAGMENT.
That's true in this patch but is not true later so it's done now for
easier review to show where the flag needs to be recorded.

No functional change.

[mgorman@techsingularity.net: ALLOC_KSWAPD flag needs to be applied in the !CONFIG_ZONE_DMA32 case]
  Link: http://lkml.kernel.org/r/20181126143503.GO23260@techsingularity.net
Link: http://lkml.kernel.org/r/20181123114528.28802-4-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I3f54fbfd87f02bd9f926a3913d88ba3055dde33c
Git-commit: 0a79cdad5e
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
(cherry picked from commit 0a79cdad5e)
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 150378964
2020-03-04 12:01:16 -08:00
Mel Gorman
8ad4b225e8 UPSTREAM: mm, page_alloc: spread allocations across zones before introducing fragmentation
Patch series "Fragmentation avoidance improvements", v5.

It has been noted before that fragmentation avoidance (aka
anti-fragmentation) is not perfect. Given sufficient time or an adverse
workload, memory gets fragmented and the long-term success of high-order
allocations degrades. This series defines an adverse workload, a definition
of external fragmentation events (including serious) ones and a series
that reduces the level of those fragmentation events.

The details of the workload and the consequences are described in more
detail in the changelogs. However, from patch 1, this is a high-level
summary of the adverse workload. The exact details are found in the
mmtests implementation.

The broad details of the workload are as follows;

1. Create an XFS filesystem (not specified in the configuration but done
   as part of the testing for this patch)
2. Start 4 fio threads that write a number of 64K files inefficiently.
   Inefficiently means that files are created on first access and not
   created in advance (fio parameterr create_on_open=1) and fallocate
   is not used (fallocate=none). With multiple IO issuers this creates
   a mix of slab and page cache allocations over time. The total size
   of the files is 150% physical memory so that the slabs and page cache
   pages get mixed
3. Warm up a number of fio read-only threads accessing the same files
   created in step 2. This part runs for the same length of time it
   took to create the files. It'll fault back in old data and further
   interleave slab and page cache allocations. As it's now low on
   memory due to step 2, fragmentation occurs as pageblocks get
   stolen.
4. While step 3 is still running, start a process that tries to allocate
   75% of memory as huge pages with a number of threads. The number of
   threads is based on a (NR_CPUS_SOCKET - NR_FIO_THREADS)/4 to avoid THP
   threads contending with fio, any other threads or forcing cross-NUMA
   scheduling. Note that the test has not been used on a machine with less
   than 8 cores. The benchmark records whether huge pages were allocated
   and what the fault latency was in microseconds
5. Measure the number of events potentially causing external fragmentation,
   the fault latency and the huge page allocation success rate.
6. Cleanup

Overall the series reduces external fragmentation causing events by over 94%
on 1 and 2 socket machines, which in turn impacts high-order allocation
success rates over the long term. There are differences in latencies and
high-order allocation success rates. Latencies are a mixed bag as they
are vulnerable to exact system state and whether allocations succeeded
so they are treated as a secondary metric.

Patch 1 uses lower zones if they are populated and have free memory
	instead of fragmenting a higher zone. It's special cased to
	handle a Normal->DMA32 fallback with the reasons explained
	in the changelog.

Patch 2-4 boosts watermarks temporarily when an external fragmentation
	event occurs. kswapd wakes to reclaim a small amount of old memory
	and then wakes kcompactd on completion to recover the system
	slightly. This introduces some overhead in the slowpath. The level
	of boosting can be tuned or disabled depending on the tolerance
	for fragmentation vs allocation latency.

Patch 5 stalls some movable allocation requests to let kswapd from patch 4
	make some progress. The duration of the stalls is very low but it
	is possible to tune the system to avoid fragmentation events if
	larger stalls can be tolerated.

The bulk of the improvement in fragmentation avoidance is from patches
1-4 but patch 5 can deal with a rare corner case and provides the option
of tuning a system for THP allocation success rates in exchange for
some stalls to control fragmentation.

This patch (of 5):

The page allocator zone lists are iterated based on the watermarks of each
zone which does not take anti-fragmentation into account.  On x86, node 0
may have multiple zones while other nodes have one zone.  A consequence is
that tasks running on node 0 may fragment ZONE_NORMAL even though
ZONE_DMA32 has plenty of free memory.  This patch special cases the
allocator fast path such that it'll try an allocation from a lower local
zone before fragmenting a higher zone.  In this case, stealing of
pageblocks or orders larger than a pageblock are still allowed in the fast
path as they are uninteresting from a fragmentation point of view.

This was evaluated using a benchmark designed to fragment memory before
attempting THP allocations.  It's implemented in mmtests as the following
configurations

configs/config-global-dhp__workload_thpfioscale
configs/config-global-dhp__workload_thpfioscale-defrag
configs/config-global-dhp__workload_thpfioscale-madvhugepage

e.g. from mmtests
./run-mmtests.sh --run-monitor --config configs/config-global-dhp__workload_thpfioscale test-run-1

The broad details of the workload are as follows;

1. Create an XFS filesystem (not specified in the configuration but done
   as part of the testing for this patch).
2. Start 4 fio threads that write a number of 64K files inefficiently.
   Inefficiently means that files are created on first access and not
   created in advance (fio parameter create_on_open=1) and fallocate
   is not used (fallocate=none). With multiple IO issuers this creates
   a mix of slab and page cache allocations over time. The total size
   of the files is 150% physical memory so that the slabs and page cache
   pages get mixed.
3. Warm up a number of fio read-only processes accessing the same files
   created in step 2. This part runs for the same length of time it
   took to create the files. It'll refault old data and further
   interleave slab and page cache allocations. As it's now low on
   memory due to step 2, fragmentation occurs as pageblocks get
   stolen.
4. While step 3 is still running, start a process that tries to allocate
   75% of memory as huge pages with a number of threads. The number of
   threads is based on a (NR_CPUS_SOCKET - NR_FIO_THREADS)/4 to avoid THP
   threads contending with fio, any other threads or forcing cross-NUMA
   scheduling. Note that the test has not been used on a machine with less
   than 8 cores. The benchmark records whether huge pages were allocated
   and what the fault latency was in microseconds.
5. Measure the number of events potentially causing external fragmentation,
   the fault latency and the huge page allocation success rate.
6. Cleanup the test files.

Note that due to the use of IO and page cache that this benchmark is not
suitable for running on large machines where the time to fragment memory
may be excessive.  Also note that while this is one mix that generates
fragmentation that it's not the only mix that generates fragmentation.
Differences in workload that are more slab-intensive or whether SLUB is
used with high-order pages may yield different results.

When the page allocator fragments memory, it records the event using the
mm_page_alloc_extfrag ftrace event.  If the fallback_order is smaller than
a pageblock order (order-9 on 64-bit x86) then it's considered to be an
"external fragmentation event" that may cause issues in the future.
Hence, the primary metric here is the number of external fragmentation
events that occur with order < 9.  The secondary metric is allocation
latency and huge page allocation success rates but note that differences
in latencies and what the success rate also can affect the number of
external fragmentation event which is why it's a secondary metric.

1-socket Skylake machine
config-global-dhp__workload_thpfioscale XFS (no special madvise)
4 fio threads, 1 THP allocating thread
--------------------------------------

4.20-rc3 extfrag events < order 9:   804694
4.20-rc3+patch:                      408912 (49% reduction)

thpfioscale Fault Latencies
                                   4.20.0-rc3             4.20.0-rc3
                                      vanilla           lowzone-v5r8
Amean     fault-base-1      662.92 (   0.00%)      653.58 *   1.41%*
Amean     fault-huge-1        0.00 (   0.00%)        0.00 (   0.00%)

                              4.20.0-rc3             4.20.0-rc3
                                 vanilla           lowzone-v5r8
Percentage huge-1        0.00 (   0.00%)        0.00 (   0.00%)

Fault latencies are slightly reduced while allocation success rates remain
at zero as this configuration does not make any special effort to allocate
THP and fio is heavily active at the time and either filling memory or
keeping pages resident.  However, a 49% reduction of serious fragmentation
events reduces the changes of external fragmentation being a problem in
the future.

Vlastimil asked during review for a breakdown of the allocation types
that are falling back.

vanilla
   3816 MIGRATE_UNMOVABLE
 800845 MIGRATE_MOVABLE
     33 MIGRATE_UNRECLAIMABLE

patch
    735 MIGRATE_UNMOVABLE
 408135 MIGRATE_MOVABLE
     42 MIGRATE_UNRECLAIMABLE

The majority of the fallbacks are due to movable allocations and this is
consistent for the workload throughout the series so will not be presented
again as the primary source of fallbacks are movable allocations.

Movable fallbacks are sometimes considered "ok" to fallback because they
can be migrated.  The problem is that they can fill an
unmovable/reclaimable pageblock causing those allocations to fallback
later and polluting pageblocks with pages that cannot move.  If there is a
movable fallback, it is pretty much guaranteed to affect an
unmovable/reclaimable pageblock and while it might not be enough to
actually cause a unmovable/reclaimable fallback in the future, we cannot
know that in advance so the patch takes the only option available to it.
Hence, it's important to control them.  This point is also consistent
throughout the series and will not be repeated.

1-socket Skylake machine
global-dhp__workload_thpfioscale-madvhugepage-xfs (MADV_HUGEPAGE)
-----------------------------------------------------------------

4.20-rc3 extfrag events < order 9:  291392
4.20-rc3+patch:                     191187 (34% reduction)

thpfioscale Fault Latencies
                                   4.20.0-rc3             4.20.0-rc3
                                      vanilla           lowzone-v5r8
Amean     fault-base-1     1495.14 (   0.00%)     1467.55 (   1.85%)
Amean     fault-huge-1     1098.48 (   0.00%)     1127.11 (  -2.61%)

thpfioscale Percentage Faults Huge
                              4.20.0-rc3             4.20.0-rc3
                                 vanilla           lowzone-v5r8
Percentage huge-1       78.57 (   0.00%)       77.64 (  -1.18%)

Fragmentation events were reduced quite a bit although this is known
to be a little variable. The latencies and allocation success rates
are similar but they were already quite high.

2-socket Haswell machine
config-global-dhp__workload_thpfioscale XFS (no special madvise)
4 fio threads, 5 THP allocating threads
----------------------------------------------------------------

4.20-rc3 extfrag events < order 9:  215698
4.20-rc3+patch:                     200210 (7% reduction)

thpfioscale Fault Latencies
                                   4.20.0-rc3             4.20.0-rc3
                                      vanilla           lowzone-v5r8
Amean     fault-base-5     1350.05 (   0.00%)     1346.45 (   0.27%)
Amean     fault-huge-5     4181.01 (   0.00%)     3418.60 (  18.24%)

                              4.20.0-rc3             4.20.0-rc3
                                 vanilla           lowzone-v5r8
Percentage huge-5        1.15 (   0.00%)        0.78 ( -31.88%)

The reduction of external fragmentation events is slight and this is
partially due to the removal of __GFP_THISNODE in commit ac5b2c1891
("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings") as THP
allocations can now spill over to remote nodes instead of fragmenting
local memory.

2-socket Haswell machine
global-dhp__workload_thpfioscale-madvhugepage-xfs (MADV_HUGEPAGE)
-----------------------------------------------------------------

4.20-rc3 extfrag events < order 9: 166352
4.20-rc3+patch:                    147463 (11% reduction)

thpfioscale Fault Latencies
                                   4.20.0-rc3             4.20.0-rc3
                                      vanilla           lowzone-v5r8
Amean     fault-base-5     6138.97 (   0.00%)     6217.43 (  -1.28%)
Amean     fault-huge-5     2294.28 (   0.00%)     3163.33 * -37.88%*

thpfioscale Percentage Faults Huge
                              4.20.0-rc3             4.20.0-rc3
                                 vanilla           lowzone-v5r8
Percentage huge-5       96.82 (   0.00%)       95.14 (  -1.74%)

There was a slight reduction in external fragmentation events although the
latencies were higher.  The allocation success rate is high enough that
the system is struggling and there is quite a lot of parallel reclaim and
compaction activity.  There is also a certain degree of luck on whether
processes start on node 0 or not for this patch but the relevance is
reduced later in the series.

Overall, the patch reduces the number of external fragmentation causing
events so the success of THP over long periods of time would be improved
for this adverse workload.

Link: http://lkml.kernel.org/r/20181123114528.28802-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: If804b49c46fe359ca6addacd4c3a8b36d8571ca6
Git-commit: 6bb154504f
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
(cherry picked from commit 6bb154504f)
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 150378964
2020-03-04 12:01:16 -08:00
Alistair Delva
bd287b5e45 ANDROID: GKI: update abi for ufshcd changes
Leaf changes summary: 2 artifacts changed
Changed leaf types summary: 2 leaf types changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

'struct ufs_hba at ufshcd.h:545:1' changed:
  type size changed from 14528 to 14592 (in bits)
  1 data member insertion:
    'size_t ufs_hba::sg_entry_size', at offset 1920 (in bits) at ufshcd.h:593:1
  there are data member changes:
   'unsigned int ufs_hba::irq' offset changed from 1920 to 1984 (in bits) (by +64 bits)
   'bool ufs_hba::is_irq_enabled' offset changed from 1952 to 2016 (in bits) (by +64 bits)
   'unsigned int ufs_hba::quirks' offset changed from 1984 to 2048 (in bits) (by +64 bits)
   'unsigned int ufs_hba::dev_quirks' offset changed from 2016 to 2080 (in bits) (by +64 bits)
   'wait_queue_head_t ufs_hba::tm_wq' offset changed from 2048 to 2112 (in bits) (by +64 bits)
   'wait_queue_head_t ufs_hba::tm_tag_wq' offset changed from 2240 to 2304 (in bits) (by +64 bits)
   'unsigned long int ufs_hba::tm_condition' offset changed from 2432 to 2496 (in bits) (by +64 bits)
   'unsigned long int ufs_hba::tm_slots_in_use' offset changed from 2496 to 2560 (in bits) (by +64 bits)
   'uic_command* ufs_hba::active_uic_cmd' offset changed from 2560 to 2624 (in bits) (by +64 bits)
   'mutex ufs_hba::uic_cmd_mutex' offset changed from 2624 to 2688 (in bits) (by +64 bits)
   'completion* ufs_hba::uic_async_done' offset changed from 2880 to 2944 (in bits) (by +64 bits)
   'u32 ufs_hba::ufshcd_state' offset changed from 2944 to 3008 (in bits) (by +64 bits)
   'u32 ufs_hba::eh_flags' offset changed from 2976 to 3040 (in bits) (by +64 bits)
   'u32 ufs_hba::intr_mask' offset changed from 3008 to 3072 (in bits) (by +64 bits)
   'u16 ufs_hba::ee_ctrl_mask' offset changed from 3040 to 3104 (in bits) (by +64 bits)
   'bool ufs_hba::is_powered' offset changed from 3056 to 3120 (in bits) (by +64 bits)
   'bool ufs_hba::is_init_prefetch' offset changed from 3064 to 3128 (in bits) (by +64 bits)
   'ufs_init_prefetch ufs_hba::init_prefetch_data' offset changed from 3072 to 3136 (in bits) (by +64 bits)
   'work_struct ufs_hba::eh_work' offset changed from 3136 to 3200 (in bits) (by +64 bits)
   'work_struct ufs_hba::eeh_work' offset changed from 3392 to 3456 (in bits) (by +64 bits)
   'u32 ufs_hba::errors' offset changed from 3648 to 3712 (in bits) (by +64 bits)
   'u32 ufs_hba::uic_error' offset changed from 3680 to 3744 (in bits) (by +64 bits)
   'u32 ufs_hba::saved_err' offset changed from 3712 to 3776 (in bits) (by +64 bits)
   'u32 ufs_hba::saved_uic_err' offset changed from 3744 to 3808 (in bits) (by +64 bits)
   'ufs_stats ufs_hba::ufs_stats' offset changed from 3776 to 3840 (in bits) (by +64 bits)
   'bool ufs_hba::silence_err_logs' offset changed from 8064 to 8128 (in bits) (by +64 bits)
   'ufs_dev_cmd ufs_hba::dev_cmd' offset changed from 8128 to 8192 (in bits) (by +64 bits)
   'ktime_t ufs_hba::last_dme_cmd_tstamp' offset changed from 9152 to 9216 (in bits) (by +64 bits)
   'ufs_dev_info ufs_hba::dev_info' offset changed from 9216 to 9280 (in bits) (by +64 bits)
   'bool ufs_hba::auto_bkops_enabled' offset changed from 9232 to 9296 (in bits) (by +64 bits)
   'ufs_vreg_info ufs_hba::vreg_info' offset changed from 9280 to 9344 (in bits) (by +64 bits)
   'list_head ufs_hba::clk_list_head' offset changed from 9536 to 9600 (in bits) (by +64 bits)
   'bool ufs_hba::wlun_dev_clr_ua' offset changed from 9664 to 9728 (in bits) (by +64 bits)
   'int ufs_hba::req_abort_count' offset changed from 9696 to 9760 (in bits) (by +64 bits)
   'u32 ufs_hba::lanes_per_direction' offset changed from 9728 to 9792 (in bits) (by +64 bits)
   'ufs_pa_layer_attr ufs_hba::pwr_info' offset changed from 9760 to 9824 (in bits) (by +64 bits)
   'ufs_pwr_mode_info ufs_hba::max_pwr_info' offset changed from 9984 to 10048 (in bits) (by +64 bits)
   'ufs_clk_gating ufs_hba::clk_gating' offset changed from 10240 to 10304 (in bits) (by +64 bits)
   'u32 ufs_hba::caps' offset changed from 12032 to 12096 (in bits) (by +64 bits)
   'devfreq* ufs_hba::devfreq' offset changed from 12096 to 12160 (in bits) (by +64 bits)
   'ufs_clk_scaling ufs_hba::clk_scaling' offset changed from 12160 to 12224 (in bits) (by +64 bits)
   'bool ufs_hba::is_sys_suspended' offset changed from 13568 to 13632 (in bits) (by +64 bits)
   'bkops_status ufs_hba::urgent_bkops_lvl' offset changed from 13600 to 13664 (in bits) (by +64 bits)
   'bool ufs_hba::is_urgent_bkops_lvl_checked' offset changed from 13632 to 13696 (in bits) (by +64 bits)
   'rw_semaphore ufs_hba::clk_scaling_lock' offset changed from 13696 to 13760 (in bits) (by +64 bits)
   'ufs_desc_size ufs_hba::desc_size' offset changed from 14016 to 14080 (in bits) (by +64 bits)
   'atomic_t ufs_hba::scsi_block_reqs_cnt' offset changed from 14240 to 14304 (in bits) (by +64 bits)
   'ufs_crypto_capabilities ufs_hba::crypto_capabilities' offset changed from 14272 to 14336 (in bits) (by +64 bits)
   'ufs_crypto_cap_entry* ufs_hba::crypto_cap_array' offset changed from 14336 to 14400 (in bits) (by +64 bits)
   'u32 ufs_hba::crypto_cfg_register' offset changed from 14400 to 14464 (in bits) (by +64 bits)
   'keyslot_manager* ufs_hba::ksm' offset changed from 14464 to 14528 (in bits) (by +64 bits)

  7 impacted interfaces

'struct utp_transfer_cmd_desc at ufshci.h:451:1' changed:
  type size changed from 24576 to 8192 (in bits)
  there are data member changes:
   type 'ufshcd_sg_entry[128]' of 'utp_transfer_cmd_desc::prd_table' changed:
     type name changed from 'ufshcd_sg_entry[128]' to 'u8[]'
     array type size changed from 16384 to infinity
     array type subrange 1 changed length from 128 to infinity
     array element type 'struct ufshcd_sg_entry' changed:
       entity changed from 'struct ufshcd_sg_entry' to 'typedef u8' at int-ll64.h:17:1
       type size changed from 128 to 8 (in bits)
   and size changed from 16384 to 0 (in bits) (by -16384 bits)

  7 impacted interfaces

Bug: 129991660
Change-Id: I239c2c3bf5de37f4522922a24d46e921e1e2cbd7
Signed-off-by: Alistair Delva <adelva@google.com>
2020-03-04 07:23:30 -08:00
Alistair Delva
0f1341d69c ANDROID: Unconditionally create bridge tracepoints
Even if the bridge module is not enabled, we may need the tracepoints
downstream in products that enable bridge.ko, so avoid defining the
export of these symbols based on a config option.

Bug: 150625937
Change-Id: Ib961fd6e353fe3bdfde11a38488568f42f1dbe7a
Signed-off-by: Alistair Delva <adelva@google.com>
2020-03-04 07:57:47 +00:00
Alistair Delva
1d0b376606 ANDROID: gki_defconfig: Enable MFD_SYSCON on x86
This option was accidentally skipped when it was added on arm64.

Bug: 144867487
Change-Id: Ifa87a894954ec9a26d5ab40e7d18e2f2f5e4f416
Signed-off-by: Alistair Delva <adelva@google.com>
2020-03-04 06:57:57 +00:00
Eric Biggers
c1825407b5 ANDROID: scsi: ufs: allow ufs variants to override sg entry size
Modify the UFSHCD core to allow 'struct ufshcd_sg_entry' to be
variable-length.  The default is the standard length, but variants can
override ufs_hba::sg_entry_size with a larger value if there are
vendor-specific fields following the standard ones.

This is needed to support inline encryption with ufs-exynos (FMP).

Bug: 129991660
Signed-off-by: Eric Biggers <ebiggers@google.com>

(cherry picked from android-mainline
 commit 8de80df7d7)
(resolved trivial merge conflict in ufshcd_alloc_host())
Change-Id: I6ab9458d5c23331013e6b736d6fea378a6b5b86c
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-03-03 23:57:13 +00:00
Ram Muthiah
553684e4d7 ANDROID: Re-add default y for VIRTIO_PCI_LEGACY
This is a partial revert of Change-Id
I6943cf6b1fedc2b82332c1dcf9a91281a3ca5627.

Bug: 139431025
Test: Treehugger
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
Change-Id: I2ebbe2f20e318cb01206ccbd8c6a5803f7f503b6
2020-03-03 23:28:01 +00:00
Ram Muthiah
c17dadeb28 ANDROID: GKI: build in HVC_DRIVER
Cuttlefish and Goldfish both rely on the virtio console and
HVC_DRIVER is a binary config which is a dep for that driver.

Bug: 150620456
Test: Treehugger
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
Change-Id: I54e7d95da4fcddd534d0f0f48b5c546cd2f2718d
2020-03-03 14:28:12 -08:00
Ram Muthiah
039ca8988a ANDROID: Removed default m for virtual sw crypto device
Goldfish and Cuttlefish already use software encrytion drivers and
don't use this one.

Bug: 150620456
Test: Treehugger
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
Change-Id: I72b0155b5db9bc54bfca0ed99734b7c2c513ceac
2020-03-03 14:26:54 -08:00
Ram Muthiah
b4667bd57d ANDROID: Remove default y on BRIDGE_IGMP_SNOOPING
This binary module gets enabled if BRIDGE, a tristate config, gets
enabled as either a builtin or y. This dependent config should also
be tristate but it seems like that hasn't been done upstream yet.

Bug: 150620456
Test: Treehugger
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
Change-Id: I699b73bfac8a0c6cb5e14fefe56b6c013e2410a8
2020-03-03 14:26:54 -08:00
Ram Muthiah
6ef43b796a ANDROID: GKI: Added missing SND configs
Bug: 150620456
Test: Treehugger
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
Change-Id: I809392ffe06327839be7574195b1f080db2d1f2e
2020-03-03 14:26:53 -08:00
Kiwoong Kim
2110145a33 FROMLIST: ufs: fix a bug on printing PRDT
In some architectures, an unit of PRDTO and PRDTL
in UFSHCI spec assume bytes, not double word specified
in the spec. W/o this patch, when the driver executes
this, kernel panic occurres because of abnormal accesses.

Bug: 149797634
Link: https://lore.kernel.org/linux-scsi/20200218224307.8017-1-kwmad.kim@samsung.com/
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>

(cherry picked from android-mainline
 commit 8ec7bddd87)
Change-Id: I58ffa07535df8011b8d357135b80030833e725f9
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-03-03 13:06:40 -08:00
Qais Yousef
cb0a2cdd48 UPSTREAM: sched/uclamp: Reject negative values in cpu_uclamp_write()
The check to ensure that the new written value into cpu.uclamp.{min,max}
is within range, [0:100], wasn't working because of the signed
comparison

 7301                 if (req.percent > UCLAMP_PERCENT_SCALE) {
 7302                         req.ret = -ERANGE;
 7303                         return req;
 7304                 }

	# echo -1 > cpu.uclamp.min
	# cat cpu.uclamp.min
	42949671.96

Cast req.percent into u64 to force the comparison to be unsigned and
work as intended in capacity_from_percent().

	# echo -1 > cpu.uclamp.min
	sh: write error: Numerical result out of range

Bug: 120440300
Fixes: 2480c09313 ("sched/uclamp: Extend CPU's cgroup controller")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200114210947.14083-1-qais.yousef@arm.com
(cherry picked from commit b562d14064)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I17fc2b119dcbffb212e130ed2c37ae3a8d5bbb61
2020-03-03 11:41:08 +00:00
Greg Kroah-Hartman
7cd2c86c50 Merge 4.19.107 into android-4.19
Changes in 4.19.107
	iommu/qcom: Fix bogus detach logic
	ALSA: hda: Use scnprintf() for printing texts for sysfs/procfs
	ALSA: hda/realtek - Apply quirk for MSI GP63, too
	ALSA: hda/realtek - Apply quirk for yet another MSI laptop
	ASoC: sun8i-codec: Fix setting DAI data format
	ecryptfs: fix a memory leak bug in parse_tag_1_packet()
	ecryptfs: fix a memory leak bug in ecryptfs_init_messaging()
	thunderbolt: Prevent crash if non-active NVMem file is read
	USB: misc: iowarrior: add support for 2 OEMed devices
	USB: misc: iowarrior: add support for the 28 and 28L devices
	USB: misc: iowarrior: add support for the 100 device
	floppy: check FDC index for errors before assigning it
	vt: fix scrollback flushing on background consoles
	vt: selection, handle pending signals in paste_selection
	vt: vt_ioctl: fix race in VT_RESIZEX
	staging: android: ashmem: Disallow ashmem memory from being remapped
	staging: vt6656: fix sign of rx_dbm to bb_pre_ed_rssi.
	xhci: Force Maximum Packet size for Full-speed bulk devices to valid range.
	xhci: fix runtime pm enabling for quirky Intel hosts
	xhci: Fix memory leak when caching protocol extended capability PSI tables - take 2
	usb: host: xhci: update event ring dequeue pointer on purpose
	USB: core: add endpoint-blacklist quirk
	USB: quirks: blacklist duplicate ep on Sound Devices USBPre2
	usb: uas: fix a plug & unplug racing
	USB: Fix novation SourceControl XL after suspend
	USB: hub: Don't record a connect-change event during reset-resume
	USB: hub: Fix the broken detection of USB3 device in SMSC hub
	usb: dwc2: Fix SET/CLEAR_FEATURE and GET_STATUS flows
	usb: dwc3: gadget: Check for IOC/LST bit in TRB->ctrl fields
	staging: rtl8188eu: Fix potential security hole
	staging: rtl8188eu: Fix potential overuse of kernel memory
	staging: rtl8723bs: Fix potential security hole
	staging: rtl8723bs: Fix potential overuse of kernel memory
	powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery
	jbd2: fix ocfs2 corrupt when clearing block group bits
	x86/mce/amd: Publish the bank pointer only after setup has succeeded
	x86/mce/amd: Fix kobject lifetime
	x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF
	serial: 8250: Check UPF_IRQ_SHARED in advance
	tty/serial: atmel: manage shutdown in case of RS485 or ISO7816 mode
	tty: serial: imx: setup the correct sg entry for tx dma
	serdev: ttyport: restore client ops on deregistration
	MAINTAINERS: Update drm/i915 bug filing URL
	Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"
	mm/memcontrol.c: lost css_put in memcg_expand_shrinker_maps()
	nvme-multipath: Fix memory leak with ana_log_buf
	genirq/irqdomain: Make sure all irq domain flags are distinct
	mm/vmscan.c: don't round up scan size for online memory cgroup
	drm/amdgpu/soc15: fix xclk for raven
	xhci: apply XHCI_PME_STUCK_QUIRK to Intel Comet Lake platforms
	KVM: nVMX: Don't emulate instructions in guest mode
	KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI
	tty: serial: qcom_geni_serial: Fix UART hang
	tty: serial: qcom_geni_serial: Remove interrupt storm
	tty: serial: qcom_geni_serial: Remove use of *_relaxed() and mb()
	tty: serial: qcom_geni_serial: Remove set_rfr_wm() and related variables
	tty: serial: qcom_geni_serial: Remove xfer_mode variable
	tty: serial: qcom_geni_serial: Fix RX cancel command failure
	lib/stackdepot.c: fix global out-of-bounds in stack_slabs
	drm/nouveau/kms/gv100-: Re-set LUT after clearing for modesets
	ext4: fix a data race in EXT4_I(inode)->i_disksize
	ext4: add cond_resched() to __ext4_find_entry()
	ext4: fix potential race between online resizing and write operations
	ext4: fix potential race between s_group_info online resizing and access
	ext4: fix potential race between s_flex_groups online resizing and access
	ext4: fix mount failure with quota configured as module
	ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
	ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
	KVM: nVMX: Refactor IO bitmap checks into helper function
	KVM: nVMX: Check IO instruction VM-exit conditions
	KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
	KVM: apic: avoid calculating pending eoi from an uninitialized val
	btrfs: fix bytes_may_use underflow in prealloc error condtition
	btrfs: reset fs_root to NULL on error in open_ctree
	btrfs: do not check delayed items are empty for single transaction cleanup
	Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents
	Revert "dmaengine: imx-sdma: Fix memory leak"
	scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout"
	scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session"
	usb: gadget: composite: Fix bMaxPower for SuperSpeedPlus
	usb: dwc2: Fix in ISOC request length checking
	staging: rtl8723bs: fix copy of overlapping memory
	staging: greybus: use after free in gb_audio_manager_remove_all()
	ecryptfs: replace BUG_ON with error handling code
	iommu/vt-d: Fix compile warning from intel-svm.h
	genirq/proc: Reject invalid affinity masks (again)
	bpf, offload: Replace bitwise AND by logical AND in bpf_prog_offload_info_fill
	ALSA: rawmidi: Avoid bit fields for state flags
	ALSA: seq: Avoid concurrent access to queue flags
	ALSA: seq: Fix concurrent access to queue current tick/time
	netfilter: xt_hashlimit: limit the max size of hashtable
	rxrpc: Fix call RCU cleanup using non-bh-safe locks
	ata: ahci: Add shutdown to freeze hardware resources of ahci
	xen: Enable interrupts when calling _cond_resched()
	s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in storage_key_init_range
	Revert "char/random: silence a lockdep splat with printk()"
	Linux 4.19.107

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I74e3d49c54d4afcfa4049042163cb879c3de3100
2020-03-03 07:33:01 +01:00
Suren Baghdasaryan
f78dc61d0e ANDROID: gki_defconfig: Disable CONFIG_RT_GROUP_SCHED
Disable CONFIG_RT_GROUP_SCHED to control RT cpu allowance globally.

ABI update report:

 ABI DIFFERENCES HAVE BEEN DETECTED! (RC=8)
========================================================
Leaf changes summary: 2 artifacts changed
Changed leaf types summary: 2 leaf types changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

'struct sched_rt_entity at sched.h:481:1' changed:
  type size changed from 576 to 384 (in bits)
  3 data member deletions:
    'sched_rt_entity* sched_rt_entity::parent', at offset 384 (in bits) at sched.h:491:1

    'rt_rq* sched_rt_entity::rt_rq', at offset 448 (in bits) at sched.h:493:1

    'rt_rq* sched_rt_entity::my_q', at offset 512 (in bits) at sched.h:495:1

  1033 impacted interfaces
========================================================

Bug: 149954332
Change-Id: I9487bd113502e52f19637e43109433cb13e97a23
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2020-03-02 13:40:28 -08:00
Ram Muthiah
c2eb4abe93 ANDROID: GKI: Remove CONFIG_BRIDGE from arm64 config
CONFIG_BRIDGE is not needed at boot time and is tristate.
Any GKI device which requires this config can load the bridge module
during init.

Bug: 135666008
Test: Treehugger
Signed-off-by: Ram Muthiah <rammuthiah@google.com>
Change-Id: If22ceac2982a0f6b7a922393fb1dd08c68f6bc70
2020-03-01 01:53:17 +00:00
Matthias Maennich
c9a157ba3f ANDROID: Add ABI Whitelist for qcom
This adds a ABI whitelist definition for qcom SoCs and updates the ABI
representation accordingly.

Leaf changes summary: 1451 artifacts changed
Changed leaf types summary: 1 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1450 Added functions
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

Bug: 150481249
Change-Id: I45c9dd1a6b6a7bd373a8e6c295ad66a8f21531e1
Signed-off-by: Matthias Maennich <maennich@google.com>
2020-02-28 23:45:43 +00:00
Siarhei Vishniakou
9906480a03 ANDROID: Enable HID_NINTENDO as y
This config will enable the Nintendo Switch Pro controller driver.

Change-Id: I50645a611566928e20a1afd4024f71803ed5fefa
Signed-off-by: Siarhei Vishniakou <svv@google.com>
Bug: 135136477
Test: tested via custom test app
Test: atest NintendoSwitchProTest
2020-02-28 21:32:52 +00:00
Daniel J. Ogorchock
8b094ebb4f FROMLIST: HID: nintendo: add nintendo switch controller driver
The hid-nintendo driver supports the Nintendo Switch Pro Controllers and
the Joy-Cons. The Pro Controllers can be used over USB or Bluetooth.

The Joy-Cons each create their own, independent input devices, so it is
up to userspace to combine them if desired.

Signed-off-by: Daniel J. Ogorchock <djogorchock@gmail.com>

Test: tested via custom test app
Test: atest NintendoSwitchProTest

Bug: 135136477
Link: https://patchwork.kernel.org/patch/11312547/
Link: https://lore.kernel.org/linux-input/20191230012720.2368987-2-djogorchock@gmail.com/
Change-Id: I179da1092faedc2ad25336224cf5ec8ff00e0d3f
Signed-off-by: Siarhei Vishniakou <svv@google.com>
2020-02-28 21:32:40 +00:00
zoro
cba87a6d12 UPSTREAM: regulator/of_get_regulator: add child path to find the regulator supplier
when the VIR_LDO1 regulator supplier is it's brother,
we can't find the supplier.

example code :
&vir_regulator {
	ldo0_vir: ldo0-virtual {
		regulator-compatible = "VIR_LDO0";
		regulator-name= "VIR_LDO0";
		regulator-min-microvolt = <1000000>;
		regulator-max-microvolt = <2000000>;
	};
	ldo1_vir: ldo1-virtual {
		regulator-compatible = "VIR_LDO1";
		regulator-name= "VIR_LDO1";
		regulator-min-microvolt = <1000000>;
		regulator-max-microvolt = <3000000>;
		ldo1-supply = <&ldo0_vir>;
	};
	...
}

so we add the child ptah to find the suppier.

Signed-off-by: zoro <long17.cool@163.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit fe06051dbf)
Bug: 146234416
Signed-off-by: Saravana Kannan <saravanak@google.com>
Change-Id: I660b64b9738d6dd2e2bc224bd18560d755c4baae
2020-02-28 20:57:51 +00:00
Roman Kiryanov
b99814ce58 ANDROID: gki_defconfig: Remove 'BRIDGE_NETFILTER is not set'
BRIDGE_NETFILTER is disabled by default now.

Bug: 147493341
Bug: 150463745
Test: presubmit
Signed-off-by: Roman Kiryanov <rkir@google.com>
Change-Id: Ife8bb0bf2aacf64ad2594a449389e3706e54c5b5
2020-02-28 10:43:10 -08:00
Roman Kiryanov
5d9319bf8a BACKPORT: net: disable BRIDGE_NETFILTER by default
The description says 'If unsure, say N.' but
the module is built as M by default (once
the dependencies are satisfied).

When the module is selected (Y or M), it enables
NETFILTER_FAMILY_BRIDGE and SKB_EXTENSIONS
which alter kernel internal structures.

We (Android Studio Emulator) currently do not
use this module and think this it is more consistent
to have it disabled by default as opposite to
disabling it explicitly to prevent enabling
NETFILTER_FAMILY_BRIDGE and SKB_EXTENSIONS.

Signed-off-by: Roman Kiryanov <rkir@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 98bda63e20)
[adelva: rediff against missing SKB_EXTENSIONS change]
Bug: 150463745
Change-Id: I664d752f504598d21747d046930e6d7257c31253
2020-02-28 10:42:51 -08:00
Quentin Perret
457e386a94 ANDROID: kbuild: fix UNUSED_KSYMS_WHITELIST backport
The backport of the UNUSED_KSYMS_WHITELIST feature to android-4.19
missed a dependency on $MODVERDIR (which is still there in 4.19), hence
causing it to un-export any symbol that is not on the whitelist, even if
it has an in-tree user.

I was _really_ close to call that a 'feature', but it wasn't exactly
intended, so I'll call it a 'bug' instead.

Kill the bugger.

Bug: 148277666
Fixes: 3d0431a87a ("BACKPORT: FROMLIST: kbuild: allow symbol
whitelisting with TRIM_UNUSED_KSYMS")
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I35449dc92437f2928a659c8ecd6fbf725e0e1b87
2020-02-28 16:21:09 +00:00
Greg Kroah-Hartman
a083db7611 Linux 4.19.107 2020-02-28 16:39:01 +01:00
Greg Kroah-Hartman
cfc30449bb Revert "char/random: silence a lockdep splat with printk()"
This reverts commit 15341b1dd4 which is
commit 1b710b1b10 upstream.

Lech writes:
	After upgrading kernel on our boards from v4.19.105 to v4.19.106
	we found out that syslog fails to read the messages after ones
	read initially after opening /proc/kmsg just after booting.

	I also found out, that output of 'dmesg --follow' also doesn't
	react on new printks appearing for whatever reason - to read new
	messages, reopening /proc/kmsg or /dev/kmsg was needed.

	I bisected this down to commit
	15341b1dd4 ("char/random: silence
	a lockdep splat with printk()"), and reverting it on top of
	v4.19.106 restored correct behaviour.

While people dig to find out how such an odd change causes a lockup,
let's just revert this for now as it's not all that big of a deal for
4.19.y.

Reported-by: Lech Perczak <l.perczak@camlintechnologies.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Nathan Chancellor
8541452acb s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in storage_key_init_range
commit 3803247349 upstream.

Clang warns:

 In file included from ../arch/s390/purgatory/purgatory.c:10:
 In file included from ../include/linux/kexec.h:18:
 In file included from ../include/linux/crash_core.h:6:
 In file included from ../include/linux/elfcore.h:5:
 In file included from ../include/linux/user.h:1:
 In file included from ../arch/s390/include/asm/user.h:11:
 ../arch/s390/include/asm/page.h:45:6: warning: converting the result of
 '<<' to a boolean always evaluates to false
 [-Wtautological-constant-compare]
         if (PAGE_DEFAULT_KEY)
            ^
 ../arch/s390/include/asm/page.h:23:44: note: expanded from macro
 'PAGE_DEFAULT_KEY'
 #define PAGE_DEFAULT_KEY        (PAGE_DEFAULT_ACC << 4)
                                                  ^
 1 warning generated.

Explicitly compare this against zero to silence the warning as it is
intended to be used in a boolean context.

Fixes: de3fa841e4 ("s390/mm: fix compile for PAGE_DEFAULT_KEY != 0")
Link: https://github.com/ClangBuiltLinux/linux/issues/860
Link: https://lkml.kernel.org/r/20200214064207.10381-1-natechancellor@gmail.com
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Thomas Gleixner
fee87e931c xen: Enable interrupts when calling _cond_resched()
commit 8645e56a4a upstream.

xen_maybe_preempt_hcall() is called from the exception entry point
xen_do_hypervisor_callback with interrupts disabled.

_cond_resched() evades the might_sleep() check in cond_resched() which
would have caught that and schedule_debug() unfortunately lacks a check
for irqs_disabled().

Enable interrupts around the call and use cond_resched() to catch future
issues.

Fixes: fdfd811ddd ("x86/xen: allow privcmd hypercalls to be preempted")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/878skypjrh.fsf@nanos.tec.linutronix.de
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Prabhakar Kushwaha
28a73a946a ata: ahci: Add shutdown to freeze hardware resources of ahci
commit 10a663a1b1 upstream.

device_shutdown() called from reboot or power_shutdown expect
all devices to be shutdown. Same is true for even ahci pci driver.
As no ahci shutdown function is implemented, the ata subsystem
always remains alive with DMA & interrupt support. File system
related calls should not be honored after device_shutdown().

So defining ahci pci driver shutdown to freeze hardware (mask
interrupt, stop DMA engine and free DMA resources).

Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
David Howells
43cac315be rxrpc: Fix call RCU cleanup using non-bh-safe locks
commit 963485d436 upstream.

rxrpc_rcu_destroy_call(), which is called as an RCU callback to clean up a
put call, calls rxrpc_put_connection() which, deep in its bowels, takes a
number of spinlocks in a non-BH-safe way, including rxrpc_conn_id_lock and
local->client_conns_lock.  RCU callbacks, however, are normally called from
softirq context, which can cause lockdep to notice the locking
inconsistency.

To get lockdep to detect this, it's necessary to have the connection
cleaned up on the put at the end of the last of its calls, though normally
the clean up is deferred.  This can be induced, however, by starting a call
on an AF_RXRPC socket and then closing the socket without reading the
reply.

Fix this by having rxrpc_rcu_destroy_call() punt the destruction to a
workqueue if in softirq-mode and defer the destruction to process context.

Note that another way to fix this could be to add a bunch of bh-disable
annotations to the spinlocks concerned - and there might be more than just
those two - but that means spending more time with BHs disabled.

Note also that some of these places were covered by bh-disable spinlocks
belonging to the rxrpc_transport object, but these got removed without the
_bh annotation being retained on the next lock in.

Fixes: 999b69f892 ("rxrpc: Kill the client connection bundle concept")
Reported-by: syzbot+d82f3ac8d87e7ccbb2c9@syzkaller.appspotmail.com
Reported-by: syzbot+3f1fd6b8cbf8702d134e@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hillf Danton <hdanton@sina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Cong Wang
acbc5071f0 netfilter: xt_hashlimit: limit the max size of hashtable
commit 8d0015a7ab upstream.

The user-specified hashtable size is unbound, this could
easily lead to an OOM or a hung task as we hold the global
mutex while allocating and initializing the new hashtable.

Add a max value to cap both cfg->size and cfg->max, as
suggested by Florian.

Reported-and-tested-by: syzbot+adf6c6c2be1c3a718121@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Takashi Iwai
5a2972600a ALSA: seq: Fix concurrent access to queue current tick/time
commit dc7497795e upstream.

snd_seq_check_queue() passes the current tick and time of the given
queue as a pointer to snd_seq_prioq_cell_out(), but those might be
updated concurrently by the seq timer update.

Fix it by retrieving the current tick and time via the proper helper
functions at first, and pass those values to snd_seq_prioq_cell_out()
later in the loops.

snd_seq_timer_get_cur_time() takes a new argument and adjusts with the
current system time only when it's requested so; this update isn't
needed for snd_seq_check_queue(), as it's called either from the
interrupt handler or right after queuing.

Also, snd_seq_timer_get_cur_tick() is changed to read the value in the
spinlock for the concurrency, too.

Reported-by: syzbot+fd5e0eaa1a32999173b2@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20200214111316.26939-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Takashi Iwai
b105447809 ALSA: seq: Avoid concurrent access to queue flags
commit bb51e669fa upstream.

The queue flags are represented in bit fields and the concurrent
access may result in unexpected results.  Although the current code
should be mostly OK as it's only reading a field while writing other
fields as KCSAN reported, it's safer to cover both with a proper
spinlock protection.

This patch fixes the possible concurrent read by protecting with
q->owner_lock.  Also the queue owner field is protected as well since
it's the field to be protected by the lock itself.

Reported-by: syzbot+65c6c92d04304d0a8efc@syzkaller.appspotmail.com
Reported-by: syzbot+e60ddfa48717579799dd@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20200214111316.26939-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:39:00 +01:00
Takashi Iwai
63495d1e1c ALSA: rawmidi: Avoid bit fields for state flags
commit dfa9a5efe8 upstream.

The rawmidi state flags (opened, append, active_sensing) are stored in
bit fields that can be potentially racy when concurrently accessed
without any locks.  Although the current code should be fine, there is
also no any real benefit by keeping the bitfields for this kind of
short number of members.

This patch changes those bit fields flags to the simple bool fields.
There should be no size increase of the snd_rawmidi_substream by this
change.

Reported-by: syzbot+576cc007eb9f2c968200@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20200214111316.26939-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Johannes Krude
bf3043d277 bpf, offload: Replace bitwise AND by logical AND in bpf_prog_offload_info_fill
commit e20d3a055a upstream.

This if guards whether user-space wants a copy of the offload-jited
bytecode and whether this bytecode exists. By erroneously doing a bitwise
AND instead of a logical AND on user- and kernel-space buffer-size can lead
to no data being copied to user-space especially when user-space size is a
power of two and bigger then the kernel-space buffer.

Fixes: fcfb126def ("bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info")
Signed-off-by: Johannes Krude <johannes@krude.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20200212193227.GA3769@phlox.h.transitiv.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Thomas Gleixner
3132696dd7 genirq/proc: Reject invalid affinity masks (again)
commit cba6437a18 upstream.

Qian Cai reported that the WARN_ON() in the x86/msi affinity setting code,
which catches cases where the affinity setting is not done on the CPU which
is the current target of the interrupt, triggers during CPU hotplug stress
testing.

It turns out that the warning which was added with the commit addressing
the MSI affinity race unearthed yet another long standing bug.

If user space writes a bogus affinity mask, i.e. it contains no online CPUs,
then it calls irq_select_affinity_usr(). This was introduced for ALPHA in

  eee45269b0 ("[PATCH] Alpha: convert to generic irq framework (generic part)")

and subsequently made available for all architectures in

  1840475676 ("genirq: Expose default irq affinity mask (take 3)")

which introduced the circumvention of the affinity setting restrictions for
interrupt which cannot be moved in process context.

The whole exercise is bogus in various aspects:

  1) If the interrupt is already started up then there is absolutely
     no point to honour a bogus interrupt affinity setting from user
     space. The interrupt is already assigned to an online CPU and it
     does not make any sense to reassign it to some other randomly
     chosen online CPU.

  2) If the interupt is not yet started up then there is no point
     either. A subsequent startup of the interrupt will invoke
     irq_setup_affinity() anyway which will chose a valid target CPU.

So the only correct solution is to just return -EINVAL in case user space
wrote an affinity mask which does not contain any online CPUs, except for
ALPHA which has it's own magic sauce for this.

Fixes: 1840475676 ("genirq: Expose default irq affinity mask (take 3)")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Qian Cai <cai@lca.pw>
Link: https://lkml.kernel.org/r/878sl8xdbm.fsf@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Joerg Roedel
ba2c07dfa0 iommu/vt-d: Fix compile warning from intel-svm.h
commit e7598fac32 upstream.

The intel_svm_is_pasid_valid() needs to be marked inline, otherwise it
causes the compile warning below:

  CC [M]  drivers/dma/idxd/cdev.o
In file included from drivers/dma/idxd/cdev.c:9:0:
./include/linux/intel-svm.h:125:12: warning: ‘intel_svm_is_pasid_valid’ defined but not used [-Wunused-function]
 static int intel_svm_is_pasid_valid(struct device *dev, int pasid)
            ^~~~~~~~~~~~~~~~~~~~~~~~

Reported-by: Borislav Petkov <bp@alien8.de>
Fixes: 15060aba71 ('iommu/vt-d: Helper function to query if a pasid has any active users')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Aditya Pakki
c0965be4b2 ecryptfs: replace BUG_ON with error handling code
commit 2c2a7552dd upstream.

In crypt_scatterlist, if the crypt_stat argument is not set up
correctly, the kernel crashes. Instead, by returning an error code
upstream, the error is handled safely.

The issue is detected via a static analysis tool written by us.

Fixes: 237fead619 (ecryptfs: fs/Makefile and fs/Kconfig)
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Tyler Hicks <code@tyhicks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Dan Carpenter
1bae8f424c staging: greybus: use after free in gb_audio_manager_remove_all()
commit b7db58105b upstream.

When we call kobject_put() and it's the last reference to the kobject
then it calls gb_audio_module_release() and frees module.  We dereference
"module" on the next line which is a use after free.

Fixes: c77f85bbc9 ("greybus: audio: Fix incorrect counting of 'ida'")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Vaibhav Agarwal <vaibhav.sr@gmail.com>
Link: https://lore.kernel.org/r/20200205123217.jreendkyxulqsool@kili.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Colin Ian King
568991c918 staging: rtl8723bs: fix copy of overlapping memory
commit 8ae9a588ca upstream.

Currently the rtw_sprintf prints the contents of thread_name
onto thread_name and this can lead to a potential copy of a
string over itself. Avoid this by printing the literal string RTWHALXT
instread of the contents of thread_name.

Addresses-Coverity: ("copy of overlapping memory")
Fixes: 554c0a3abf ("staging: Add rtl8723bs sdio wifi driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200126220549.9849-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Minas Harutyunyan
f8e6a3412d usb: dwc2: Fix in ISOC request length checking
commit 860ef6cd3f upstream.

Moved ISOC request length checking from dwc2_hsotg_start_req() function to
dwc2_hsotg_ep_queue().

Fixes: 4fca54aa58 ("usb: gadget: s3c-hsotg: add multi count support")
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Jack Pham
de8dbb7b02 usb: gadget: composite: Fix bMaxPower for SuperSpeedPlus
commit c724417baf upstream.

SuperSpeedPlus peripherals must report their bMaxPower of the
configuration descriptor in units of 8mA as per the USB 3.2
specification. The current switch statement in encode_bMaxPower()
only checks for USB_SPEED_SUPER but not USB_SPEED_SUPER_PLUS so
the latter falls back to USB 2.0 encoding which uses 2mA units.
Replace the switch with a simple if/else.

Fixes: eae5820b85 ("usb: gadget: composite: Write SuperSpeedPlus config descriptors")
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:59 +01:00
Bart Van Assche
1cad1a6497 scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session"
commit 807b9515b7 upstream.

Since commit e9d3009cb9 introduced a regression and since the fix for
that regression was not perfect, revert this commit.

Link: https://marc.info/?l=target-devel&m=158157054906195
Cc: Rahul Kundu <rahul.kundu@chelsio.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
Fixes: e9d3009cb9 ("scsi: target: iscsi: Wait for all commands to finish before freeing a session")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Bart Van Assche
c66b2b5712 scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout"
commit 76261ada16 upstream.

Since commit 04060db411 introduces soft lockups when toggling network
interfaces, revert it.

Link: https://marc.info/?l=target-devel&m=158157054906196
Cc: Rahul Kundu <rahul.kundu@chelsio.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Reported-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
Fixes: 04060db411 ("scsi: RDMA/isert: Fix a recently introduced regression related to logout")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Greg Kroah-Hartman
b046c6fec0 Revert "dmaengine: imx-sdma: Fix memory leak"
This reverts commit af8eca600b which is
commit 02939cd167 upstream.

Andreas writes:
	This patch breaks our imx6 board with the attached trace.
	Reverting the patch makes it boot again.

Reported-by: Andreas Tobler <andreas.tobler@onway.ch>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Robin Gong <yibin.gong@nxp.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Filipe Manana
cd26d53a27 Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents
commit e75fd33b3f upstream.

In btrfs_wait_ordered_range() once we find an ordered extent that has
finished with an error we exit the loop and don't wait for any other
ordered extents that might be still in progress.

All the users of btrfs_wait_ordered_range() expect that there are no more
ordered extents in progress after that function returns. So past fixes
such like the ones from the two following commits:

  ff612ba784 ("btrfs: fix panic during relocation after ENOSPC before
                   writeback happens")

  28aeeac1dd ("Btrfs: fix panic when starting bg cache writeout after
                   IO error")

don't work when there are multiple ordered extents in the range.

Fix that by making btrfs_wait_ordered_range() wait for all ordered extents
even after it finds one that had an error.

Link: https://github.com/kdave/btrfs-progs/issues/228#issuecomment-569777554
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Josef Bacik
4d886f91ca btrfs: do not check delayed items are empty for single transaction cleanup
commit 1e90315149 upstream.

btrfs_assert_delayed_root_empty() will check if the delayed root is
completely empty, but this is a filesystem-wide check.  On cleanup we
may have allowed other transactions to begin, for whatever reason, and
thus the delayed root is not empty.

So remove this check from cleanup_one_transation().  This however can
stay in btrfs_cleanup_transaction(), because it checks only after all of
the transactions have been properly cleaned up, and thus is valid.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Josef Bacik
68b7db197b btrfs: reset fs_root to NULL on error in open_ctree
commit 315bf8ef91 upstream.

While running my error injection script I hit a panic when we tried to
clean up the fs_root when freeing the fs_root.  This is because
fs_info->fs_root == PTR_ERR(-EIO), which isn't great.  Fix this by
setting fs_info->fs_root = NULL; if we fail to read the root.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Josef Bacik
0ba8e5f347 btrfs: fix bytes_may_use underflow in prealloc error condtition
commit b778cf962d upstream.

I hit the following warning while running my error injection stress
testing:

  WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
  RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
  Call Trace:
  btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
  __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
  elfcorehdr_read+0x40/0x40
  ? elfcorehdr_read+0x40/0x40
  ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
  ? dput+0xb4/0x2a0
  ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
  ? btrfs_sync_file+0x30e/0x420 [btrfs]
  ? do_fsync+0x38/0x70
  ? __x64_sys_fdatasync+0x13/0x20
  ? do_syscall_64+0x5b/0x1b0
  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happens if we fail to insert our reserved file extent.  At this
point we've already converted our reservation from ->bytes_may_use to
->bytes_reserved.  However once we break we will attempt to free
everything from [cur_offset, end] from ->bytes_may_use, but our extent
reservation will overlap part of this.

Fix this problem by adding ins.offset (our extent allocation size) to
cur_offset so we remove the actual remaining part from ->bytes_may_use.

I validated this fix using my inject-error.py script

python inject-error.py -o should_fail_bio -t cache_save_setup -t \
	__btrfs_prealloc_file_range \
	-t insert_reserved_file_extent.constprop.0 \
	-r "-5" ./run-fsstress.sh

where run-fsstress.sh simply mounts and runs fsstress on a disk.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Miaohe Lin
e541982a6e KVM: apic: avoid calculating pending eoi from an uninitialized val
commit 23520b2def upstream.

When pv_eoi_get_user() fails, 'val' may remain uninitialized and the return
value of pv_eoi_get_pending() becomes random. Fix the issue by initializing
the variable.

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00
Vitaly Kuznetsov
267eec2d21 KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
commit 91a5f413af upstream.

Even when APICv is disabled for L1 it can (and, actually, is) still
available for L2, this means we need to always call
vmx_deliver_nested_posted_interrupt() when attempting an interrupt
delivery.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 16:38:58 +01:00