commit 698c1aa9f8 upstream.
On the ThinkPad P71, we have one eDP connector exposed along with 5 DP
connectors, resulting in a total of 11 TMDS encoders. Since the GPU on
this system is also capable of MST, we create an additional 4 fake MST
encoders for each DP port. Unfortunately, we also do this for the eDP
port as well, resulting in:
1 eDP port: +1 TMDS encoder
+4 DPMST encoders
5 DP ports: +2 TMDS encoders
+4 DPMST encoders
*5 ports
== 35 encoders
Which breaks things, since DRM has a hard coded limit of 32 encoders.
So, fix this by not creating MSTMs for any eDP connectors. This brings
us down to 31 encoders, although we can do better.
This fixes driver probing for nouveau on the ThinkPad P71.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 443f2d5ba1 upstream.
Observe a segmentation fault when 'perf stat' is asked to repeat forever
with the interval option.
Without fix:
# perf stat -r 0 -I 5000 -e cycles -a sleep 10
# time counts unit events
5.000211692 3,13,89,82,34,157 cycles
10.000380119 1,53,98,52,22,294 cycles
10.040467280 17,16,79,265 cycles
Segmentation fault
This problem was only observed when we use forever option aka -r 0 and
works with limited repeats. Calling print_counter with ts being set to
NULL, is not a correct option when interval is set. Hence avoid
print_counter(NULL,..) if interval is set.
With fix:
# perf stat -r 0 -I 5000 -e cycles -a sleep 10
# time counts unit events
5.019866622 3,15,14,43,08,697 cycles
10.039865756 3,15,16,31,95,261 cycles
10.059950628 1,26,05,47,158 cycles
5.009902655 3,14,52,62,33,932 cycles
10.019880228 3,14,52,22,89,154 cycles
10.030543876 66,90,18,333 cycles
5.009848281 3,14,51,98,25,437 cycles
10.029854402 3,15,14,93,04,918 cycles
5.009834177 3,14,51,95,92,316 cycles
Committer notes:
Did the 'git bisect' to find the cset introducing the problem to add the
Fixes tag below, and at that time the problem reproduced as:
(gdb) run stat -r0 -I500 sleep 1
<SNIP>
Program received signal SIGSEGV, Segmentation fault.
print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866
866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep);
(gdb) bt
#0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866
#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938
#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411
#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370
#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429
#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473
#6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588
(gdb)
Mostly the same as just before this patch:
Program received signal SIGSEGV, Segmentation fault.
0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964
964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep);
(gdb) bt
#0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964
#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670)
at util/stat-display.c:1172
#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656
#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960
#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310
#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362
#6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406
#7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531
(gdb)
Fixes: d4f63a4741 ("perf stat: Introduce print_counters function")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 144783a80c upstream.
Converting from ms to s requires dividing by 1000, not multiplying. So
this is currently taking the smaller of new_timeout and 1.28e8,
i.e. effectively new_timeout.
The driver knows what it set max_hw_heartbeat_ms to, so use that
value instead of doing a division at run-time.
FWIW, this can easily be tested by booting into a busybox shell and
doing "watchdog -t 5 -T 130 /dev/watchdog" - without this patch, the
watchdog fires after 130&127 == 2 seconds.
Fixes: b07e228eee "watchdog: imx2_wdt: Fix set_timeout for big timeout values"
Cc: stable@vger.kernel.org # 5.2 plus anything the above got backported to
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20190812131356.23039-1-linux@rasmusvillemoes.dk
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e430d802d6 upstream.
The timer delayed for more than 3 seconds warning was triggered during
testing.
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0xee/0x100
...
Call Trace:
process_one_work+0x18c/0x3a0
worker_thread+0x30/0x380
kthread+0x113/0x130
ret_from_fork+0x22/0x40
The reason is that the code in collect_expired_timers() uses jiffies
unprotected:
if (next_event > jiffies)
base->clk = jiffies;
As the compiler is allowed to reload the value base->clk can advance
between the check and the store and in the worst case advance farther than
next event. That causes the timer expiry to be delayed until the wheel
pointer wraps around.
Convert the code to use READ_ONCE()
Fixes: 236968383c ("timers: Optimize collect_expired_timers() for NOHZ")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Liang ZhiCheng <liangzhicheng@baidu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1568894687-14499-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 314eed30ed upstream.
When running on a system with >512MB RAM with a 32-bit kernel built with:
CONFIG_DEBUG_VIRTUAL=y
CONFIG_HIGHMEM=y
CONFIG_HARDENED_USERCOPY=y
all execve()s will fail due to argv copying into kmap()ed pages, and on
usercopy checking the calls ultimately of virt_to_page() will be looking
for "bad" kmap (highmem) pointers due to CONFIG_DEBUG_VIRTUAL=y:
------------[ cut here ]------------
kernel BUG at ../arch/x86/mm/physaddr.c:83!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc8 #6
Hardware name: Dell Inc. Inspiron 1318/0C236D, BIOS A04 01/15/2009
EIP: __phys_addr+0xaf/0x100
...
Call Trace:
__check_object_size+0xaf/0x3c0
? __might_sleep+0x80/0xa0
copy_strings+0x1c2/0x370
copy_strings_kernel+0x2b/0x40
__do_execve_file+0x4ca/0x810
? kmem_cache_alloc+0x1c7/0x370
do_execve+0x1b/0x20
...
The check is from arch/x86/mm/physaddr.c:
VIRTUAL_BUG_ON((phys_addr >> PAGE_SHIFT) > max_low_pfn);
Due to the kmap() in fs/exec.c:
kaddr = kmap(kmapped_page);
...
if (copy_from_user(kaddr+offset, str, bytes_to_copy)) ...
Now we can fetch the correct page to avoid the pfn check. In both cases,
hardened usercopy will need to walk the page-span checker (if enabled)
to do sanity checking.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: f5509cc18d ("mm: Hardened usercopy")
Cc: Matthew Wilcox <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/201909171056.7F2FFD17@keescook
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 17f8607a16 upstream.
Original changelog from Steve Rostedt (except last sentence which
explains the problem, and the Fixes: tag):
I performed a three way histogram with the following commands:
echo 'irq_lat u64 lat pid_t pid' > synthetic_events
echo 'wake_lat u64 lat u64 irqlat pid_t pid' >> synthetic_events
echo 'hist:keys=common_pid:irqts=common_timestamp.usecs if function == 0xffffffff81200580' > events/timer/hrtimer_start/trigger
echo 'hist:keys=common_pid:lat=common_timestamp.usecs-$irqts:onmatch(timer.hrtimer_start).irq_lat($lat,pid) if common_flags & 1' > events/sched/sched_waking/trigger
echo 'hist:keys=pid:wakets=common_timestamp.usecs,irqlat=lat' > events/synthetic/irq_lat/trigger
echo 'hist:keys=next_pid:lat=common_timestamp.usecs-$wakets,irqlat=$irqlat:onmatch(synthetic.irq_lat).wake_lat($lat,$irqlat,next_pid)' > events/sched/sched_switch/trigger
echo 1 > events/synthetic/wake_lat/enable
Basically I wanted to see:
hrtimer_start (calling function tick_sched_timer)
Note:
# grep tick_sched_timer /proc/kallsyms
ffffffff81200580 t tick_sched_timer
And save the time of that, and then record sched_waking if it is called
in interrupt context and with the same pid as the hrtimer_start, it
will record the latency between that and the waking event.
I then look at when the task that is woken is scheduled in, and record
the latency between the wakeup and the task running.
At the end, the wake_lat synthetic event will show the wakeup to
scheduled latency, as well as the irq latency in from hritmer_start to
the wakeup. The problem is that I found this:
<idle>-0 [007] d... 190.485261: wake_lat: lat=27 irqlat=190485230 pid=698
<idle>-0 [005] d... 190.485283: wake_lat: lat=40 irqlat=190485239 pid=10
<idle>-0 [002] d... 190.488327: wake_lat: lat=56 irqlat=190488266 pid=335
<idle>-0 [005] d... 190.489330: wake_lat: lat=64 irqlat=190489262 pid=10
<idle>-0 [003] d... 190.490312: wake_lat: lat=43 irqlat=190490265 pid=77
<idle>-0 [005] d... 190.493322: wake_lat: lat=54 irqlat=190493262 pid=10
<idle>-0 [005] d... 190.497305: wake_lat: lat=35 irqlat=190497267 pid=10
<idle>-0 [005] d... 190.501319: wake_lat: lat=50 irqlat=190501264 pid=10
The irqlat seemed quite large! Investigating this further, if I had
enabled the irq_lat synthetic event, I noticed this:
<idle>-0 [002] d.s. 249.429308: irq_lat: lat=164968 pid=335
<idle>-0 [002] d... 249.429369: wake_lat: lat=55 irqlat=249429308 pid=335
Notice that the timestamp of the irq_lat "249.429308" is awfully
similar to the reported irqlat variable. In fact, all instances were
like this. It appeared that:
irqlat=$irqlat
Wasn't assigning the old $irqlat to the new irqlat variable, but
instead was assigning the $irqts to it.
The issue is that assigning the old $irqlat to the new irqlat variable
creates a variable reference alias, but the alias creation code
forgets to make sure the alias uses the same var_ref_idx to access the
reference.
Link: http://lkml.kernel.org/r/1567375321.5282.12.camel@kernel.org
Cc: Linux Trace Devel <linux-trace-devel@vger.kernel.org>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 7e8b88a30b ("tracing: Add hist trigger support for variable reference aliases")
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe55e77032 upstream.
when the battery is set to sbs-mode and no gpio detection is enabled
"health" is always returning a value even when the battery is not present.
All other fields return "not present".
This leads to a scenario where the driver is constantly switching between
"present" and "not present" state. This generates a lot of constant
traffic on the i2c.
This commit changes the response of "health" to an error when the battery
is not responding leading to a consistent "not present" state.
Fixes: 76b16f4cdf ("power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats")
Cc: <stable@vger.kernel.org>
Signed-off-by: Michael Nosthoff <committed@heine.so>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99956a9e08 upstream.
the type flag is stored in the chip->flags field not in the
client->flags field. This currently leads to never using the ti
specific health function as client->flags doesn't use that bit.
So it's always falling back to the general one.
Fixes: 76b16f4cdf ("power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats")
Cc: <stable@vger.kernel.org>
Signed-off-by: Michael Nosthoff <committed@heine.so>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76a95bd8f9 upstream.
When ccree driver runs it checks the state of the Trusted Execution
Environment CryptoCell driver before proceeding. We did not account
for cases where the TEE side is not ready or not available at all.
Fix it by only considering TEE error state after sync with the TEE
side driver.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Fixes: ab8ec9658f ("crypto: ccree - add FIPS support")
CC: stable@vger.kernel.org # v4.17+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48f89d2a29 upstream.
IV transfer from ofifo to class2 (set up at [29][30]) is not guaranteed
to be scheduled before the data transfer from ofifo to external memory
(set up at [38]:
[29] 10FA0004 ld: ind-nfifo (len=4) imm
[30] 81F00010 <nfifo_entry: ofifo->class2 type=msg len=16>
[31] 14820004 ld: ccb2-datasz len=4 offs=0 imm
[32] 00000010 data:0x00000010
[33] 8210010D operation: cls1-op aes cbc init-final enc
[34] A8080B04 math: (seqin + math0)->vseqout len=4
[35] 28000010 seqfifold: skip len=16
[36] A8080A04 math: (seqin + math0)->vseqin len=4
[37] 2F1E0000 seqfifold: both msg1->2-last2-last1 len=vseqinsz
[38] 69300000 seqfifostr: msg len=vseqoutsz
[39] 5C20000C seqstr: ccb2 ctx len=12 offs=0
If ofifo -> external memory transfer happens first, DECO will hang
(issuing a Watchdog Timeout error, if WDOG is enabled) waiting for
data availability in ofifo for the ofifo -> c2 ififo transfer.
Make sure IV transfer happens first by waiting for all CAAM internal
transfers to end before starting payload transfer.
New descriptor with jump command inserted at [37]:
[..]
[36] A8080A04 math: (seqin + math0)->vseqin len=4
[37] A1000401 jump: jsl1 all-match[!nfifopend] offset=[01] local->[38]
[38] 2F1E0000 seqfifold: both msg1->2-last2-last1 len=vseqinsz
[39] 69300000 seqfifostr: msg len=vseqoutsz
[40] 5C20000C seqstr: ccb2 ctx len=12 offs=0
[Note: the issue is present in the descriptor from the very beginning
(cf. Fixes tag). However I've marked it v4.19+ since it's the oldest
maintained kernel that the patch applies clean against.]
Cc: <stable@vger.kernel.org> # v4.19+
Fixes: 1acebad3d8 ("crypto: caam - faster aead implementation")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ba3c026e6 upstream.
skcipher_walk_done may be called with an error by internal or
external callers. For those internal callers we shouldn't unmap
pages but for external callers we must unmap any pages that are
in use.
This patch distinguishes between the two cases by checking whether
walk->nbytes is zero or not. For internal callers, we now set
walk->nbytes to zero prior to the call. For external callers,
walk->nbytes has always been non-zero (as zero is used to indicate
the termination of a walk).
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: 5cde0af2a9 ("[CRYPTO] cipher: Added block cipher type")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b82feb6c5 upstream.
It seems that smp_processor_id() is only used for a best-effort
load-balancing, refer to qat_crypto_get_instance_node(). It's not feasible
to disable preemption for the duration of the crypto requests. Therefore,
just silence the warning. This commit is similar to e7a9b05ca4
("crypto: cavium - Fix smp_processor_id() warnings").
Silences the following splat:
BUG: using smp_processor_id() in preemptible [00000000] code: cryptomgr_test/2904
caller is qat_alg_ablkcipher_setkey+0x300/0x4a0 [intel_qat]
CPU: 1 PID: 2904 Comm: cryptomgr_test Tainted: P O 4.14.69 #1
...
Call Trace:
dump_stack+0x5f/0x86
check_preemption_disabled+0xd3/0xe0
qat_alg_ablkcipher_setkey+0x300/0x4a0 [intel_qat]
skcipher_setkey_ablkcipher+0x2b/0x40
__test_skcipher+0x1f3/0xb20
? cpumask_next_and+0x26/0x40
? find_busiest_group+0x10e/0x9d0
? preempt_count_add+0x49/0xa0
? try_module_get+0x61/0xf0
? crypto_mod_get+0x15/0x30
? __kmalloc+0x1df/0x1f0
? __crypto_alloc_tfm+0x116/0x180
? crypto_skcipher_init_tfm+0xa6/0x180
? crypto_create_tfm+0x4b/0xf0
test_skcipher+0x21/0xa0
alg_test_skcipher+0x3f/0xa0
alg_test.part.6+0x126/0x2a0
? finish_task_switch+0x21b/0x260
? __schedule+0x1e9/0x800
? __wake_up_common+0x8d/0x140
cryptomgr_test+0x40/0x50
kthread+0xff/0x130
? cryptomgr_notify+0x540/0x540
? kthread_create_on_node+0x70/0x70
ret_from_fork+0x24/0x50
Fixes: ed8ccaef52 ("crypto: qat - Add support for SRIOV")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 677733e296 upstream.
The store ordering vs tlbie issue mentioned in commit
a5d4b5891c ("powerpc/mm: Fixup tlbie vs store ordering issue on
POWER9") is fixed for Nimbus 2.3 and Cumulus 1.3 revisions. We don't
need to apply the fixup if we are running on them
We can only do this on PowerNV. On pseries guest with KVM we still
don't support redoing the feature fixup after migration. So we should
be enabling all the workarounds needed, because whe can possibly
migrate between DD 2.3 and DD 2.2
Fixes: a5d4b5891c ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190924035254.24612-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56090a3902 upstream.
pnv_tce() returns a pointer to a TCE entry and originally a TCE table
would be pre-allocated. For the default case of 2GB window the table
needs only a single level and that is fine. However if more levels are
requested, it is possible to get a race when 2 threads want a pointer
to a TCE entry from the same page of TCEs.
This adds cmpxchg to handle the race. Note that once TCE is non-zero,
it cannot become zero again.
Fixes: a68bd1267b ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand")
CC: stable@vger.kernel.org # v4.19+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190718051139.74787-2-aik@ozlabs.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99ead78afd upstream.
The current code would fail on huge pages addresses, since the shift would
be incorrect. Use the correct page shift value returned by
__find_linux_pte() to get the correct physical address. The code is more
generic and can handle both regular and compound pages.
Fixes: ba41e1e1cc ("powerpc/mce: Hookup derror (load/store) UE errors")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[arbab@linux.ibm.com: Fixup pseries_do_memory_failure()]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820081352.8641-3-santosh@fossix.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1f373a11d upstream.
VAG power control is improved to fit the manual [1]. This patch fixes as
minimum one bug: if customer muxes Headphone to Line-In right after boot,
the VAG power remains off that leads to poor sound quality from line-in.
I.e. after boot:
- Connect sound source to Line-In jack;
- Connect headphone to HP jack;
- Run following commands:
$ amixer set 'Headphone' 80%
$ amixer set 'Headphone Mux' LINE_IN
Change VAG power on/off control according to the following algorithm:
- turn VAG power ON on the 1st incoming event.
- keep it ON if there is any active VAG consumer (ADC/DAC/HP/Line-In).
- turn VAG power OFF when there is the latest consumer's pre-down event
come.
- always delay after VAG power OFF to avoid pop.
- delay after VAG power ON if the initiative consumer is Line-In, this
prevents pop during line-in muxing.
According to the data sheet [1], to avoid any pops/clicks,
the outputs should be muted during input/output
routing changes.
[1] https://www.nxp.com/docs/en/data-sheet/SGTL5000.pdf
Cc: stable@vger.kernel.org
Fixes: 9b34e6cc3b ("ASoC: Add Freescale SGTL5000 codec support")
Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@toradex.com>
Reviewed-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20190719100524.23300-3-oleksandr.suvorov@toradex.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62bacb06b9 upstream.
The kHz to Hz is incorrectly converted in a few places in the code,
this results in a wrong frequency being calculated because devfreq core
uses OPP frequencies that are given in Hz to clamp the rate, while
tegra-devfreq gives to the core value in kHz and then it also expects to
receive value in kHz from the core. In a result memory freq is always set
to a value which is close to ULONG_MAX because of the bug. Hence the EMC
frequency is always capped to the maximum and the driver doesn't do
anything useful. This patch was tested on Tegra30 and Tegra124 SoC's, EMC
frequency scaling works properly now.
Cc: <stable@vger.kernel.org> # 4.14+
Tested-by: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9e006f5fc upstream.
This fixes a bug added in 4.10 with commit:
commit 9561a7ade0
Author: Josef Bacik <jbacik@fb.com>
Date: Tue Nov 22 14:04:40 2016 -0500
nbd: add multi-connection support
that limited the number of devices to 256. Before the patch we could
create 1000s of devices, but the patch switched us from using our
own thread to using a work queue which has a default limit of 256
active works.
The problem is that our recv_work function sits in a loop until
disconnection but only handles IO for one connection. The work is
started when the connection is started/restarted, but if we end up
creating 257 or more connections, the queue_work call just queues
connection257+'s recv_work and that waits for connection 1 - 256's
recv_work to be disconnected and that work instance completing.
Instead of reverting back to kthreads, this has us allocate a
workqueue_struct per device, so we can block in the work.
Cc: stable@vger.kernel.org
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ca9419227 upstream.
Reported by syzkaller:
WARNING: CPU: 0 PID: 6544 at /home/kernel/data/kvm/arch/x86/kvm//vmx/vmx.c:4689 handle_desc+0x37/0x40 [kvm_intel]
CPU: 0 PID: 6544 Comm: a.out Tainted: G OE 5.3.0-rc4+ #4
RIP: 0010:handle_desc+0x37/0x40 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbe/0x6b0 [kvm_intel]
vcpu_enter_guest+0x4dc/0x18d0 [kvm]
kvm_arch_vcpu_ioctl_run+0x407/0x660 [kvm]
kvm_vcpu_ioctl+0x3ad/0x690 [kvm]
do_vfs_ioctl+0xa2/0x690
ksys_ioctl+0x6d/0x80
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x74/0x720
entry_SYSCALL_64_after_hwframe+0x49/0xbe
When CR4.UMIP is set, guest should have UMIP cpuid flag. Current
kvm set_sregs function doesn't have such check when userspace inputs
sregs values. SECONDARY_EXEC_DESC is enabled on writes to CR4.UMIP
in vmx_set_cr4 though guest doesn't have UMIP cpuid flag. The testcast
triggers handle_desc warning when executing ltr instruction since
guest architectural CR4 doesn't set UMIP. This patch fixes it by
adding valid CR4 and CPUID combination checking in __set_sregs.
syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=138efb99600000
Reported-by: syzbot+0f1819555fbdce992df9@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff42df49e7 upstream.
On POWER9, when userspace reads the value of the DPDES register on a
vCPU, it is possible for 0 to be returned although there is a doorbell
interrupt pending for the vCPU. This can lead to a doorbell interrupt
being lost across migration. If the guest kernel uses doorbell
interrupts for IPIs, then it could malfunction because of the lost
interrupt.
This happens because a newly-generated doorbell interrupt is signalled
by setting vcpu->arch.doorbell_request to 1; the DPDES value in
vcpu->arch.vcore->dpdes is not updated, because it can only be updated
when holding the vcpu mutex, in order to avoid races.
To fix this, we OR in vcpu->arch.doorbell_request when reading the
DPDES value.
Cc: stable@vger.kernel.org # v4.13+
Fixes: 579006944e ("KVM: PPC: Book3S HV: Virtualize doorbell facility on POWER9")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d28eafc5a6 upstream.
When we are running multiple vcores on the same physical core, they
could be from different VMs and so it is possible that one of the
VMs could have its arch.mmu_ready flag cleared (for example by a
concurrent HPT resize) when we go to run it on a physical core.
We currently check the arch.mmu_ready flag for the primary vcore
but not the flags for the other vcores that will be run alongside
it. This adds that check, and also a check when we select the
secondary vcores from the preempted vcores list.
Cc: stable@vger.kernel.org # v4.14+
Fixes: 38c53af853 ("KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 959c5d5134 upstream.
Escalation interrupts are interrupts sent to the host by the XIVE
hardware when it has an interrupt to deliver to a guest VCPU but that
VCPU is not running anywhere in the system. Hence we disable the
escalation interrupt for the VCPU being run when we enter the guest
and re-enable it when the guest does an H_CEDE hypercall indicating
it is idle.
It is possible that an escalation interrupt gets generated just as we
are entering the guest. In that case the escalation interrupt may be
using a queue entry in one of the interrupt queues, and that queue
entry may not have been processed when the guest exits with an H_CEDE.
The existing entry code detects this situation and does not clear the
vcpu->arch.xive_esc_on flag as an indication that there is a pending
queue entry (if the queue entry gets processed, xive_esc_irq() will
clear the flag). There is a comment in the code saying that if the
flag is still set on H_CEDE, we have to abort the cede rather than
re-enabling the escalation interrupt, lest we end up with two
occurrences of the escalation interrupt in the interrupt queue.
However, the exit code doesn't do that; it aborts the cede in the sense
that vcpu->arch.ceded gets cleared, but it still enables the escalation
interrupt by setting the source's PQ bits to 00. Instead we need to
set the PQ bits to 10, indicating that an interrupt has been triggered.
We also need to avoid setting vcpu->arch.xive_esc_on in this case
(i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
xive_esc_irq() will run at some point and clear it, and if we race with
that we may end up with an incorrect result (i.e. xive_esc_on set when
the escalation interrupt has just been handled).
It is extremely unlikely that having two queue entries would cause
observable problems; theoretically it could cause queue overflow, but
the CPU would have to have thousands of interrupts targetted to it for
that to be possible. However, this fix will also make it possible to
determine accurately whether there is an unhandled escalation
interrupt in the queue, which will be needed by the following patch.
Fixes: 9b9b13a6d1 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100349.GD9567@blackberry
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab57588480 upstream.
ccw console is created early in start_kernel and used before css is
initialized or ccw console subchannel is registered. Until then console
subchannel does not have a parent. For that reason assume subchannels
with no parent are not pseudo subchannels. This fixes the following
kasan finding:
BUG: KASAN: global-out-of-bounds in sch_is_pseudo_sch+0x8e/0x98
Read of size 8 at addr 00000000000005e8 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc8-07370-g6ac43dd12538 #2
Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
Call Trace:
([<000000000012cd76>] show_stack+0x14e/0x1e0)
[<0000000001f7fb44>] dump_stack+0x1a4/0x1f8
[<00000000007d7afc>] print_address_description+0x64/0x3c8
[<00000000007d75f6>] __kasan_report+0x14e/0x180
[<00000000018a2986>] sch_is_pseudo_sch+0x8e/0x98
[<000000000189b950>] cio_enable_subchannel+0x1d0/0x510
[<00000000018cac7c>] ccw_device_recognition+0x12c/0x188
[<0000000002ceb1a8>] ccw_device_enable_console+0x138/0x340
[<0000000002cf1cbe>] con3215_init+0x25e/0x300
[<0000000002c8770a>] console_init+0x68a/0x9b8
[<0000000002c6a3d6>] start_kernel+0x4fe/0x728
[<0000000000100070>] startup_continue+0x70/0xd0
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3122a79a1 upstream.
arch_update_cpu_topology is first called from:
kernel_init_freeable->sched_init_smp->sched_init_domains
even before cpus has been registered in:
kernel_init_freeable->do_one_initcall->s390_smp_init
Do not trigger kobject_uevent change events until cpu devices are
actually created. Fixes the following kasan findings:
BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb40/0xee0
Read of size 8 at addr 0000000000000020 by task swapper/0/1
BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb36/0xee0
Read of size 8 at addr 0000000000000018 by task swapper/0/1
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
([<0000000143c6db7e>] show_stack+0x14e/0x1a8)
[<0000000145956498>] dump_stack+0x1d0/0x218
[<000000014429fb4c>] print_address_description+0x64/0x380
[<000000014429f630>] __kasan_report+0x138/0x168
[<0000000145960b96>] kobject_uevent_env+0xb36/0xee0
[<0000000143c7c47c>] arch_update_cpu_topology+0x104/0x108
[<0000000143df9e22>] sched_init_domains+0x62/0xe8
[<000000014644c94a>] sched_init_smp+0x3a/0xc0
[<0000000146433a20>] kernel_init_freeable+0x558/0x958
[<000000014599002a>] kernel_init+0x22/0x160
[<00000001459a71d4>] ret_from_fork+0x28/0x30
[<00000001459a71dc>] kernel_thread_starter+0x0/0x10
Cc: stable@vger.kernel.org
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a13b03bbb4 upstream.
If the KVM_S390_MEM_OP ioctl is called with an access register >= 16,
then there is certainly a bug in the calling userspace application.
We check for wrong access registers, but only if the vCPU was already
in the access register mode before (i.e. the SIE block has recorded
it). The check is also buried somewhere deep in the calling chain (in
the function ar_translation()), so this is somewhat hard to find.
It's better to always report an error to the userspace in case this
field is set wrong, and it's safer in the KVM code if we block wrong
values here early instead of relying on a check somewhere deep down
the calling chain, so let's add another check to kvm_s390_guest_mem_op()
directly.
We also should check that the "size" is non-zero here (thanks to Janosch
Frank for the hint!). If we do not check the size, we could call vmalloc()
with this 0 value, and this will cause a kernel warning.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lkml.kernel.org/r/20190829122517.31042-1-thuth@redhat.com
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8769f610fe upstream.
With THREAD_INFO_IN_TASK (which is selected on s390) task's stack usage
is refcounted and should always be protected by get/put when touching
other task's stack to avoid race conditions with task's destruction code.
Fixes: d5c352cdd0 ("s390: move thread_info into task_struct")
Cc: stable@vger.kernel.org # v4.10+
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18917d5147 upstream.
nfc_genl_deactivate_target() relies on the NFC_ATTR_TARGET_INDEX
attribute being present, but doesn't check whether it is actually
provided by the user. Same goes for nfc_genl_fw_download() and
NFC_ATTR_FIRMWARE_NAME.
This patch adds appropriate checks.
Found with syzkaller.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3675f052b4 upstream.
There is a logic bug in the current smack_bprm_set_creds():
If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be
acceptable (e.g. because the ptracer detached in the meantime), the other
->unsafe flags aren't checked. As far as I can tell, this means that
something like the following could work (but I haven't tested it):
- task A: create task B with fork()
- task B: set NO_NEW_PRIVS
- task B: install a seccomp filter that makes open() return 0 under some
conditions
- task B: replace fd 0 with a malicious library
- task A: attach to task B with PTRACE_ATTACH
- task B: execve() a file with an SMACK64EXEC extended attribute
- task A: while task B is still in the middle of execve(), exit (which
destroys the ptrace relationship)
Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in
bprm->unsafe, we reject the execve().
Cc: stable@vger.kernel.org
Fixes: 5663884caa ("Smack: unify all ptrace accesses in the smack")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e95584a889 ]
We have identified a problem with the "oversubscription" policy in the
link transmission code.
When small messages are transmitted, and the sending link has reached
the transmit window limit, those messages will be bundled and put into
the link backlog queue. However, bundles of data messages are counted
at the 'CRITICAL' level, so that the counter for that level, instead of
the counter for the real, bundled message's level is the one being
increased.
Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
to be tested against the unchanged counter for their own level, while
contributing to an unrestrained increase at the CRITICAL backlog level.
This leaves a gap in congestion control algorithm for small messages
that can result in starvation for other users or a "real" CRITICAL
user. Even that eventually can lead to buffer exhaustion & link reset.
We fix this by keeping a 'target_bskb' buffer pointer at each levels,
then when bundling, we only bundle messages at the same importance
level only. This way, we know exactly how many slots a certain level
have occupied in the queue, so can manage level congestion accurately.
By bundling messages at the same level, we even have more benefits. Let
consider this:
- One socket sends 64-byte messages at the 'CRITICAL' level;
- Another sends 4096-byte messages at the 'LOW' level;
When a 64-byte message comes and is bundled the first time, we put the
overhead of message bundle to it (+ 40-byte header, data copy, etc.)
for later use, but the next message can be a 4096-byte one that cannot
be bundled to the previous one. This means the last bundle carries only
one payload message which is totally inefficient, as for the receiver
also! Later on, another 64-byte message comes, now we make a new bundle
and the same story repeats...
With the new bundling algorithm, this will not happen, the 64-byte
messages will be bundled together even when the 4096-byte message(s)
comes in between. However, if the 4096-byte messages are sent at the
same level i.e. 'CRITICAL', the bundling algorithm will again cause the
same overhead.
Also, the same will happen even with only one socket sending small
messages at a rate close to the link transmit's one, so that, when one
message is bundled, it's transmitted shortly. Then, another message
comes, a new bundle is created and so on...
We will solve this issue radically by another patch.
Fixes: 365ad353c2 ("tipc: reduce risk of user starvation during link congestion")
Reported-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a761129e36 ]
xennet_fill_frags() uses ~0U as return value when the sk_buff is not able
to cache extra fragments. This is incorrect because the return type of
xennet_fill_frags() is RING_IDX and 0xffffffff is an expected value for
ring buffer index.
In the situation when the rsp_cons is approaching 0xffffffff, the return
value of xennet_fill_frags() may become 0xffffffff which xennet_poll() (the
caller) would regard as error. As a result, queue->rx.rsp_cons is set
incorrectly because it is updated only when there is error. If there is no
error, xennet_poll() would be responsible to update queue->rx.rsp_cons.
Finally, queue->rx.rsp_cons would point to the rx ring buffer entries whose
queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL.
This leads to NULL pointer access in the next iteration to process rx ring
buffer entries.
The symptom is similar to the one fixed in
commit 00b368502d ("xen-netfront: do not assume sk_buff_head list is
empty in error handling").
This patch changes the return type of xennet_fill_frags() to indicate
whether it is successful or failed. The queue->rx.rsp_cons will be
always updated inside this function.
Fixes: ad4f15dc2c ("xen/netfront: don't bug in case of too many frags")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>