Commit f12b435

ryanhrob authored and ctmarinas committed
arm64: mm: Fix rodata=full block mapping support for realm guests
Commit a166563 ("arm64: mm: support large block mapping when rodata=full") enabled the linear map to be mapped by block/cont while still allowing granular permission changes on BBML2_NOABORT systems by lazily splitting the live mappings. This mechanism was intended to be usable by realm guests since they need to dynamically share DMA buffers with the host by "decrypting" them, which for Arm CCA means marking them as shared in the page tables.

However, it turns out that the mechanism was failing for realm guests because realms need to share their DMA buffers (via __set_memory_enc_dec()) much earlier during boot than split_kernel_leaf_mapping() was able to handle. The report linked below showed that GIC's ITS was one such user. But during the investigation I found other callsites that could not meet the split_kernel_leaf_mapping() constraints.

The problem is that we block map the linear map based on the boot CPU supporting BBML2_NOABORT, then check that all the other CPUs support it too when finalizing the caps. If they don't, then we stop_machine() and split to ptes. For safety, split_kernel_leaf_mapping() previously wouldn't permit splitting until after the caps were finalized. That ensured that if any secondary CPUs were running that didn't support BBML2_NOABORT, we wouldn't risk breaking them.

I've fixed this problem by reducing the black-out window where we refuse to split; there are now 2 windows. The first is from T0 until the page allocator is initialized. Splitting allocates memory from the page allocator, so it must be in use. The second covers the period between starting to online the secondary CPUs until the system caps are finalized (this is a very small window).

All of the problematic callers are calling __set_memory_enc_dec() before the secondary CPUs come online, so this solves the problem. However, one of these callers, swiotlb_update_mem_attributes(), was trying to split before the page allocator was initialized. So I have moved this call from arch_mm_preinit() to mem_init(), which solves the ordering issue.

I've added warnings and return an error if any attempt is made to split in the black-out windows.

Note there are other issues which prevent booting all the way to user space, which will be fixed in subsequent patches.

Reported-by: Jinjiang Tu <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Fixes: a166563 ("arm64: mm: support large block mapping when rodata=full")
Cc: [email protected]
Reviewed-by: Kevin Brodsky <[email protected]>
Signed-off-by: Ryan Roberts <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Tested-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
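The two black-out windows described in the message can be sketched as a small userspace model. This is an illustration only, not kernel code: `struct boot_state`, `split_gate()`, `GATE_SKIP` and `GATE_SPLIT` are hypothetical names, and the struct fields merely stand in for the kernel state that the patched split_kernel_leaf_mapping() consults. It also assumes that the system-wide BBML2_NOABORT check reports false until the caps are finalized.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state the patched checks look at. */
struct boot_state {
	bool linear_map_requires_bbml2; /* linear map was block/cont mapped */
	bool bbml2_noabort_system_wide; /* stands in for system_supports_bbml2_noabort() */
	bool caps_finalized;            /* stands in for system_capabilities_finalized() */
	bool page_alloc_available;      /* set from mem_init() in the patch */
	unsigned int online_cpus;       /* stands in for num_online_cpus() */
};

/* Hypothetical outcomes: nothing to do, or go on to take the lock and split. */
enum { GATE_SKIP = 0, GATE_SPLIT = 1 };

/*
 * Mirrors the order of the early checks in the patch. Returns GATE_SKIP,
 * GATE_SPLIT, or -EBUSY when the call lands in a black-out window.
 */
static int split_gate(const struct boot_state *s, bool kfence_addr)
{
	/* Already pte-mapped (or a kfence address): no split needed. */
	if (!s->linear_map_requires_bbml2 || kfence_addr)
		return GATE_SKIP;

	if (!s->bbml2_noabort_system_wide) {
		/* Caps finalized without BBML2_NOABORT: leave it to the caller. */
		if (s->caps_finalized)
			return GATE_SKIP;

		/* Window 1: splitting needs the page allocator. */
		if (!s->page_alloc_available)
			return -EBUSY;

		/* Window 2: secondaries are up but not yet vetted. */
		if (s->online_cpus > 1)
			return -EBUSY;
	}

	return GATE_SPLIT;
}
```

In this model, a call made before mem_init() returns -EBUSY, the same call made after mem_init() with only the boot CPU online proceeds to split, and a call made while secondaries are online but the caps are not yet finalized returns -EBUSY again.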
1 parent 1f318b9 commit f12b435

3 files changed

Lines changed: 42 additions & 14 deletions

arch/arm64/include/asm/mmu.h

Lines changed: 2 additions & 0 deletions
@@ -112,5 +112,7 @@ void kpti_install_ng_mappings(void);
 static inline void kpti_install_ng_mappings(void) {}
 #endif
 
+extern bool page_alloc_available;
+
 #endif /* !__ASSEMBLER__ */
 #endif

arch/arm64/mm/init.c

Lines changed: 8 additions & 1 deletion
@@ -350,7 +350,6 @@ void __init arch_mm_preinit(void)
 	}
 
 	swiotlb_init(swiotlb, flags);
-	swiotlb_update_mem_attributes();
 
 	/*
 	 * Check boundaries twice: Some fundamental inconsistencies can be
@@ -377,6 +376,14 @@ void __init arch_mm_preinit(void)
 	}
 }
 
+bool page_alloc_available __ro_after_init;
+
+void __init mem_init(void)
+{
+	page_alloc_available = true;
+	swiotlb_update_mem_attributes();
+}
+
 void free_initmem(void)
 {
 	void *lm_init_begin = lm_alias(__init_begin);

arch/arm64/mm/mmu.c

Lines changed: 32 additions & 13 deletions
@@ -768,30 +768,51 @@ static inline bool force_pte_mapping(void)
 }
 
 static DEFINE_MUTEX(pgtable_split_lock);
+static bool linear_map_requires_bbml2;
 
 int split_kernel_leaf_mapping(unsigned long start, unsigned long end)
 {
 	int ret;
 
-	/*
-	 * !BBML2_NOABORT systems should not be trying to change permissions on
-	 * anything that is not pte-mapped in the first place. Just return early
-	 * and let the permission change code raise a warning if not already
-	 * pte-mapped.
-	 */
-	if (!system_supports_bbml2_noabort())
-		return 0;
-
 	/*
 	 * If the region is within a pte-mapped area, there is no need to try to
 	 * split. Additionally, CONFIG_DEBUG_PAGEALLOC and CONFIG_KFENCE may
 	 * change permissions from atomic context so for those cases (which are
 	 * always pte-mapped), we must not go any further because taking the
-	 * mutex below may sleep.
+	 * mutex below may sleep. Do not call force_pte_mapping() here because
+	 * it could return a confusing result if called from a secondary cpu
+	 * prior to finalizing caps. Instead, linear_map_requires_bbml2 gives us
+	 * what we need.
 	 */
-	if (force_pte_mapping() || is_kfence_address((void *)start))
+	if (!linear_map_requires_bbml2 || is_kfence_address((void *)start))
 		return 0;
 
+	if (!system_supports_bbml2_noabort()) {
+		/*
+		 * !BBML2_NOABORT systems should not be trying to change
+		 * permissions on anything that is not pte-mapped in the first
+		 * place. Just return early and let the permission change code
+		 * raise a warning if not already pte-mapped.
+		 */
+		if (system_capabilities_finalized())
+			return 0;
+
+		/*
+		 * Boot-time: split_kernel_leaf_mapping_locked() allocates from
+		 * page allocator. Can't split until it's available.
+		 */
+		if (WARN_ON(!page_alloc_available))
+			return -EBUSY;
+
+		/*
+		 * Boot-time: Started secondary cpus but don't know if they
+		 * support BBML2_NOABORT yet. Can't allow splitting in this
+		 * window in case they don't.
+		 */
+		if (WARN_ON(num_online_cpus() > 1))
+			return -EBUSY;
+	}
+
 	/*
 	 * Ensure start and end are at least page-aligned since this is the
 	 * finest granularity we can split to.
@@ -891,8 +912,6 @@ static int range_split_to_ptes(unsigned long start, unsigned long end, gfp_t gfp
 	return ret;
 }
 
-static bool linear_map_requires_bbml2 __initdata;
-
 u32 idmap_kpti_bbml2_flag;
 
 static void __init init_idmap_kpti_bbml2_flag(void)
