[ARM] mm: change to read-allocate as default SMP cache policy

The "streaming" mode optimization, which skips cacheline allocation
for fully-dirty lines, is frequently defeated when coherent processors
perform stores simultaneously.

As a result, cachelines are allocated on SMP that would not be
allocated on a uniprocessor, which significantly reduces aggregate
write bandwidth. For example, on Tegra 2 systems with 300MHz DDR
main memory, running memset over a large buffer (i.e., one that
misses in L2) on a single processor achieves ~2GB/sec of write
bandwidth, but running the same operation in parallel on both CPUs
yields an aggregate write bandwidth of just 500MB/sec.

Changing the cache allocation policy to read-allocate recovers some
of this performance loss on SMP systems.

Change-Id: Ice47ab0a15f2490b7e9a007b4b37800566ed7be1
Signed-off-by: Gary King <gking@nvidia.com>
Gary King
2010-09-15 09:32:10 -07:00
committed by Rebecca Schultz Zavin
parent ea3f8f2347
commit 72e02a1815

@@ -295,7 +295,11 @@ __v7_setup:
  * NOS = PRRR[24+n] = 1 - not outer shareable
  */
  ldr r5, =0xff0a81a8 @ PRRR
- ldr r6, =0x40e040e0 @ NMRR
+#ifdef CONFIG_SMP
+ ldr r6, =0xc0e0c0e0 @ NMRR
+#else
+ ldr r6, =0x40e040e0
+#endif
  mcr p15, 0, r5, c10, c2, 0 @ write PRRR
  mcr p15, 0, r6, c10, c2, 1 @ write NMRR
 #endif
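
The two NMRR constants in the hunk above differ only in bits 15 and 31. As a rough aid to reading them, the following Python sketch decodes the 2-bit inner (IRn) and outer (ORn) cacheability fields using the ARMv7-A NMRR encoding. The helper names (`nmrr_fields`, `changed_indices`) are made up for illustration and are not part of the kernel. The sketch also assumes that the "read-allocate" policy named in the commit message corresponds to the architectural "write-back, no write-allocate" encoding.

```python
# Encoding of each 2-bit IRn/ORn field in the ARMv7-A NMRR
# (Normal Memory Remap Register).
POLICY = {
    0b00: "non-cacheable",
    0b01: "write-back, write-allocate",
    0b10: "write-through, no write-allocate",
    0b11: "write-back, no write-allocate",
}

OLD_NMRR = 0x40e040e0  # value before this change (still used without CONFIG_SMP)
SMP_NMRR = 0xc0e0c0e0  # value loaded when CONFIG_SMP is set


def nmrr_fields(nmrr, n):
    """Return (inner, outer) policy names for memory type index n (0..7).

    Hypothetical helper, for illustration only.
    """
    inner = (nmrr >> (2 * n)) & 0b11       # IRn = NMRR[2n+1:2n]
    outer = (nmrr >> (2 * n + 16)) & 0b11  # ORn = NMRR[2n+17:2n+16]
    return POLICY[inner], POLICY[outer]


def changed_indices(a, b):
    """List the memory type indices whose policy differs between a and b."""
    return [n for n in range(8) if nmrr_fields(a, n) != nmrr_fields(b, n)]


if __name__ == "__main__":
    for n in changed_indices(OLD_NMRR, SMP_NMRR):
        print(n, nmrr_fields(OLD_NMRR, n), "->", nmrr_fields(SMP_NMRR, n))
```

Under these assumptions, only index 7 changes: both its inner and outer policy go from write-back, write-allocate to write-back, no write-allocate, while every other memory type keeps its existing attributes.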