ARM: LPAE: Invalidate the TLB for module addresses during translation fault

During the free_pgtables() call all user and modules/pkmap entries are
removed. If a translation fault for the modules/pkmap area occurs before
we switched away from the current pgd, do_translation_fault() would copy
the init_mm pud into the user pud.

There is a small window between pud_clear() and pmd_free_tlb() in
free_pmd_range() where the pud entry was cleared but the TLB has not
been invalidated yet and the CPU may have cached the original (valid)
pud entry in the TLB. A scenario like below would get stuck in
continuous prefetch abort:

1. Current process exiting. The modules pmd entries not populated
2. exit_mmap() -> ... -> pmd_free_tlb()
3. pud_clear() for the 1GB pud containing user stack and modules (no TLB
   invalidation yet)
4. Interrupt -> module interrupt routine
5. Level 2 (pmd) translation fault occurs when executing the module
   interrupt routine. The CPU previously cached (TLB) the old valid pud
   value for the modules area, so we don't get a level 1 translation
   fault
6. do_translation fault() copies the pud_k into the pud
7. Linux returns to the faulty instruction. Goes back to 5

At point 7, since the CPU still has the old pud value, it goes back to
point 5 and never gets out of this loop. With this patch, the stale pud
TLB entry is invalidated after point 6 and the subsequent prefetch abort
doesn't occur.

Reported-by: Tony Thompson <Anthony.Thompson@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This commit is contained in:
Catalin Marinas
2012-02-23 17:16:41 +00:00
committed by Jon Medhurst
parent 8bb495e3f0
commit 8570503862

View File

@@ -446,8 +446,16 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
if (pud_none(*pud_k))
goto bad_area;
if (!pud_present(*pud))
if (!pud_present(*pud)) {
set_pud(pud, *pud_k);
/*
* There is a small window during free_pgtables() where the
* user *pud entry is 0 but the TLB has not been invalidated
* and we get a level 2 (pmd) translation fault caused by the
* intermediate TLB caching of the old level 1 (pud) entry.
*/
flush_tlb_kernel_page(addr);
}
pmd = pmd_offset(pud, addr);
pmd_k = pmd_offset(pud_k, addr);
@@ -470,8 +478,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
#endif
if (pmd_none(pmd_k[index]))
goto bad_area;
if (!pmd_present(pmd[index]))
copy_pmd(pmd, pmd_k);
copy_pmd(pmd, pmd_k);
return 0;
bad_area: