Merged
5 changes: 5 additions & 0 deletions .github/actions/spelling/expect.txt
@@ -1,6 +1,10 @@
Akad

Check warning on line 1 in .github/actions/spelling/expect.txt

GitHub Actions / Check Spelling

Skipping `.github/actions/spelling/expect.txt` because it seems to have more noise (63) than unique words (9) (total: 63 / 9). (noisy-file)
Akida
Amodu
amodu
autodiff
Ayokunle
ayokunle
backpropation
Bassiouni
bassiouni
@@ -36,6 +40,7 @@
Medhane
medhane
Mtools
MLIR
Nellamakada
Neuromorphic
neuromorphic
25 changes: 25 additions & 0 deletions _data/contributors.yml
@@ -151,6 +151,31 @@
# TEAM #
################################################################################

- name: Ayokunle Amodu
  info: "C++ Alliance Fellowship 2026 Contributor"
  photo: Ayokunle.jpg
  email: [email protected]
  education: BSc in Computing Science, University of Alberta, Canada, 2022-2026
  active: 1
  projects:
    - title: "Optimize Usage of Source Locations in Clang Modules"
      status: Ongoing
      description: |
        This project aims to reduce memory pressure caused by duplicated source locations
        in Clang's modular builds. In large-scale C++ projects, repeatedly including the
        same headers across different modules can quickly exhaust Clang's 32-bit
        source-location representation. Instead of switching to a more costly 64-bit
        design, this project explores reusing existing source-location allocations through
        an interval-mapping mechanism and careful coordination during module loading. The
        work involves updating module deserialization, preserving diagnostic correctness,
        and ensuring include-stack reconstruction remains transparent. If successful,
        duplicated module inputs will no longer cause proportional growth in source-location
        allocations, modular builds will complete without exhaustion, and Clang will scale
        better for modern modular C++ codebases, all without requiring a disruptive global
        change to the compiler.
      proposal: /assets/docs/Ayokunle_Amodu_Proposal_2026.pdf
      mentors: Vassil Vassilev, Aaron Jomy

- name: Ezzeldin Ibrahim
  photo: Ezz.jpg
  info: "C++ Alliance Fellowship 2026 Contributor"
10 changes: 10 additions & 0 deletions _pages/team/ayokunle-amodu.md
@@ -0,0 +1,10 @@
---
title: "Compiler Research - Team - Ayokunle Amodu"
layout: gridlay
excerpt: "Compiler Research: Team members"
sitemap: false
permalink: /team/AyokunleAmodu
email: [email protected]
---

{% include team-profile.html %}
@@ -0,0 +1,75 @@
---
title: "Optimize Usage of Source Locations in Clang Modules"
layout: post
excerpt: "Reducing source-location memory pressure in modular C++ builds through allocation reuse and interval mapping."
sitemap: false
author: Ayokunle Amodu
permalink: blogs/optimize_source_locations_clang_modules_Ayokunle_Amodu_blog/
thumbnail_image: /images/cppalliance-logo.svg
date: 2026-05-20
tags: clang llvm modules source-locations cppalliance
---

{% include dual-banner.html
left_logo="/images/cppalliance-logo.svg"
right_logo="/images/cr-logo.png"
caption=""
height="20vh" %}

## Introduction

Hi everyone! I am Ayokunle Amodu, joining the Compiler Research team as a CppAlliance
Fellow for the 2026 cycle. I am an MLIR geek at heart. Most of my compiler work has
lived in the middle end, around progressive lowering. I love seeing how high-level
structure survives a full lowering chain and still means something at the bottom.

This project is front-end territory, not my typical residence, but being a comp arch
kid who has spent time with RISC-V assembly means a 32-bit space running out of room
is a very familiar kind of problem. Working under Vassil Vassilev with real upstream
review as the target is exactly the kind of opportunity I wanted.

## The Problem

Clang encodes every position in source code as a 32-bit `SourceLocation`, in essence an
offset into a global address space managed by `SourceManager`. That space is finite, and
in large modular builds it
fills up faster than you would expect. The issue is not the size of the representation
itself. It is that when multiple modules include the same headers, Clang has no memory
of having already mapped those files and simply maps them again. The address space
grows with the number of modules rather than the number of unique files, and once it
runs out, the build fails.
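
To put rough numbers on that growth, here is a back-of-the-envelope sketch. It is a toy model and not Clang code: `OffsetLimit`, `usedWithoutReuse`, and `maxModulesWithoutReuse` are hypothetical names, and the limit of about 2^31 offsets assumes one of the 32 bits is reserved to mark macro locations.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: roughly 2^31 usable offsets (one bit of the 32 is taken
// as reserved for distinguishing macro locations).
constexpr std::uint64_t OffsetLimit = 1ULL << 31;

// Offsets consumed when every module maps its own copy of the same
// shared headers: usage grows with the module count.
constexpr std::uint64_t usedWithoutReuse(std::uint64_t Modules,
                                         std::uint64_t SharedBytes) {
  return Modules * SharedBytes;
}

// Largest module count that still fits in the toy address space when
// each module re-maps SharedBytes of headers.
std::uint64_t maxModulesWithoutReuse(std::uint64_t SharedBytes) {
  assert(SharedBytes > 0 && "shared header size must be nonzero");
  return OffsetLimit / SharedBytes;
}
```

With a made-up figure of 1 MiB of shared headers per module, this model runs out after 2048 modules, while the unique content is still only 1 MiB.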

The community looked seriously at widening `SourceLocation` to 64 bits to buy more
room, but the proposal did not land. The concern was legitimate: `SourceLocation` is
embedded across a huge number of AST nodes, so making it bigger raises memory usage
for every build, not just the ones actually hitting the limit. It is a global cost for
a problem that has a much more local cause.

## The Approach

The core idea is to introduce deduplication at the point of allocation. Before
assigning a new region for a module's input files, we check whether those files have
already been allocated and reuse the existing range if they have. This keeps the
address space from growing proportionally with module count and lets it scale with
unique content instead.
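
The check-then-reuse idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: `ReusingAllocator` and `OffsetRange` are hypothetical types, files are keyed by a plain string, `std::unordered_map` stands in for whatever map the real code uses, and every request for the same file is assumed to ask for the same length.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <unordered_map>

// Hypothetical range of offsets handed out for one input file.
struct OffsetRange {
  std::uint32_t Begin;
  std::uint32_t Length;
};

// Toy allocator that remembers which files already own a range.
class ReusingAllocator {
  std::unordered_map<std::string, OffsetRange> Allocated;
  std::uint32_t NextOffset = 0;

public:
  // Returns the existing range when the file was seen before; otherwise
  // carves a fresh range off the end of the toy address space.
  OffsetRange allocate(const std::string &FileKey, std::uint32_t Length) {
    auto It = Allocated.find(FileKey);
    if (It != Allocated.end())
      return It->second; // seen before: hand back the existing range
    assert(Length <= std::numeric_limits<std::uint32_t>::max() - NextOffset &&
           "toy address space exhausted");
    OffsetRange Fresh{NextOffset, Length};
    NextOffset += Length;
    Allocated.emplace(FileKey, Fresh);
    return Fresh;
  }

  // Total offsets consumed so far.
  std::uint32_t used() const { return NextOffset; }
};
```

In this sketch, a second module asking for a file that is already mapped gets the same `Begin` back and `used()` does not move, so usage tracks unique files rather than module count.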

The work starts with a `DenseMap`-based prototype to validate the concept and confirm
that address-space usage actually drops. From there it extends to an
`llvm::IntervalMap` design that handles more complex scenarios, including cases where
different modules serialize different amounts of location data for the same file.
Alongside the implementation, the project covers diagnostic correctness, include-stack
reconstruction, and upstream patch preparation for LLVM review.
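
To show why a range-keyed structure helps once lengths can differ, here is a small stand-in built on `std::map`. It is only a sketch: `RangeOwners` is a hypothetical class, it does not use `llvm::IntervalMap`, it uses half-open ranges for simplicity, and the `canReuse` policy is an assumption made for illustration rather than the project's design.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for an interval map: non-overlapping half-open
// ranges [Begin, End) mapped to an integer file id.
class RangeOwners {
  struct Entry {
    std::uint32_t End;
    int FileId;
  };
  std::map<std::uint32_t, Entry> ByBegin;

public:
  // Records that [Begin, End) belongs to FileId. The caller is
  // responsible for keeping ranges non-overlapping.
  void insert(std::uint32_t Begin, std::uint32_t End, int FileId) {
    assert(Begin < End && "empty range");
    ByBegin[Begin] = Entry{End, FileId};
  }

  // Returns the file id owning Offset, or -1 when no range covers it.
  int lookup(std::uint32_t Offset) const {
    auto It = ByBegin.upper_bound(Offset); // first range starting after Offset
    if (It == ByBegin.begin())
      return -1;
    --It; // the range starting at or before Offset
    return Offset < It->second.End ? It->second.FileId : -1;
  }

  // True when a range starting at Begin is long enough to serve a
  // request for Length offsets.
  bool canReuse(std::uint32_t Begin, std::uint32_t Length) const {
    auto It = ByBegin.find(Begin);
    return It != ByBegin.end() && It->second.End - Begin >= Length;
  }
};
```

Unlike a plain hash map keyed by file, this shape can answer both "which file owns this offset?" and "is the range I already have long enough for this module's request?".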

## What We Hope to Achieve

If this works the way we expect, large modular builds that are currently approaching
the limit will complete without exhaustion. Duplicated headers stop eating into the
address space, diagnostics stay correct, and the fix is local enough that nothing
else in the compiler has to change. The address space scales with unique content
instead of module count.

## Relevant Links

- [LLVM Discourse: Revisiting 64-bit source locations](https://discourse.llvm.org/t/revisiting-64-bit-source-locations/86556)
- [LLVM Discourse: RFC: An opt-in CMake option for 64-bit Source Location](https://discourse.llvm.org/t/rfc-an-opt-in-cmake-option-for-64-bit-source-location/87538)

May the blessings of the Dragon be with us.
Binary file added assets/docs/Ayokunle_Amodu_Proposal_2026.pdf
Binary file not shown.
Binary file added images/team/Ayokunle.jpg

Contributor

Can you make this image smaller in size?

Contributor Author

Resized it. Is this better now?

Contributor

I think so. I guess it can become better optimized if you do webp (optimize for web usage).

Contributor Author

Okay I ran it through https://squoosh.app/ (a web optimizer for images). It's smaller in file size now.
