diff --git a/.github/actions/spelling/allow/terms.txt b/.github/actions/spelling/allow/terms.txt index 3c30b1e3..667e3c02 100644 --- a/.github/actions/spelling/allow/terms.txt +++ b/.github/actions/spelling/allow/terms.txt @@ -134,6 +134,7 @@ GSCHEP GSMODE HOMMEXX IDD +interactable interoperate IPDPS jacobians @@ -202,6 +203,7 @@ Fibroblasts fibroblasts Hesam macrophages +Mandelbrot MDSCs Montigny Mutlu @@ -219,3 +221,4 @@ LibTorch Nanjing pytorch PyTorch +unoptimised diff --git a/_data/standing_meetings.yml b/_data/standing_meetings.yml index 07e8d86c..60be9d95 100644 --- a/_data/standing_meetings.yml +++ b/_data/standing_meetings.yml @@ -3,6 +3,10 @@ time_cest: "17:00" connect: "[Link to zoom](https://princeton.zoom.us/j/97915651167?pwd=MXJ1T2lhc3Z5QWlYbUFnMTZYQlNRdz09)" agenda: + - title: "Creating teaching materials with xeus-cpp final report" + date: 2026-06-10 17:00:00 +0200 + speaker: "Hristiyan Shterev" + link: "[Slides](/assets/presentations/Hristiyan-Shterev-Teaching-materials-with-xeus-cpp-final.pdf)" - title: "Enhance Clang Diagnostics Initial Presentation" date: 2026-05-27 17:00:00 +0200 speaker: "Aditya Medhane" diff --git a/_posts/2026-06-10-creating-teaching-materials-xeus-cpp-final.md b/_posts/2026-06-10-creating-teaching-materials-xeus-cpp-final.md new file mode 100644 index 00000000..237d4937 --- /dev/null +++ b/_posts/2026-06-10-creating-teaching-materials-xeus-cpp-final.md @@ -0,0 +1,68 @@ +--- +title: "Final report for creating teaching materials with xeus-cpp" +layout: post +excerpt: "A final report on my project creating CUDA and OpenMP notebooks with xeus-cpp" +sitemap: false +author: Hristiyan Shterev +permalink: /blogs/xeus-cpp_Hristiyan_Shterev_blog_final/ +thumbnail_image: /images/mg-pld-logo.png +date: 2026-06-10 +tags: c++ xeus-cpp jupyter internship systems-programming high-school cuda +--- + + +{% include dual-banner.html +left_logo="/images/mg-pld-logo.png" +right_logo="/images/cr-logo_old.png" +caption="" +height="20vh" 
%} + +## Main goals of this project + +The main goal of this project was to create interactable examples for the CUDA and OpenMP programming models, targeting beginners who want hands-on experience with parallel computing on both the GPU and CPU. + +Each notebook builds on the previous one, introducing new concepts gradually through working code examples. + +Together the 16 notebooks cover the full beginner-to-intermediate journey, from launching a first thread to understanding memory hierarchies, synchronisation primitives, and performance optimisation on both CPU and GPU. + +## The CUDA notebooks + +The CUDA notebooks include 8 examples that build on one another. Here is what each one explains: + +- The first notebook is a basic introduction to CUDA with simple examples, such as defining a `__global__` kernel and calling it. +- Next we introduce some more fundamental concepts. +- The third notebook shows how parallel programming can speed up a computation when the threads are used the right way. +- After that we show thread cooperation: the threads split the work instead of having one thread per task. +- Then we demonstrate the Julia set, a complex mathematical shape. +- The sixth example creates a ripple pattern. +- Next we demonstrate the dot product, a mathematical operation that takes two equal-length vectors and returns a single number. +- Lastly there is a simple ray tracing example. + +### CUDA benchmark vs the CPU + +This benchmark adds two vectors with 10 million elements each into a third one. This is done 3 times using different methods. + +- The first one is a basic CPU-only demonstration. The time is around 21 ms. +- The second example uses the GPU but with only 1 thread per block. We can see that this is slower than the first one. This is a very unoptimised way to use the device. +- The third method is a lot faster than the other 2. We make each block use 256 threads, which brings the time down to around 2 ms. 
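To make the difference between the two GPU runs concrete, here is a minimal host-side sketch of the launch arithmetic, written in plain C++ so it runs without a GPU. It assumes the usual pattern of one thread per vector element with the block count rounded up; the post does not show the notebooks' actual launch code, and the helper names `blocks_needed` and `global_index` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the notebooks): number of blocks needed so that
// blocks * threads_per_block >= n, i.e. a ceiling division.
inline std::int64_t blocks_needed(std::int64_t n, std::int64_t threads_per_block) {
    assert(n >= 0 && threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

// Hypothetical helper: the element index a CUDA thread would typically compute
// inside a kernel as blockIdx.x * blockDim.x + threadIdx.x.
inline std::int64_t global_index(std::int64_t block_idx,
                                 std::int64_t block_dim,
                                 std::int64_t thread_idx) {
    return block_idx * block_dim + thread_idx;
}
```

Under these assumptions, 10 million elements with 1 thread per block need `blocks_needed(10000000, 1)` = 10,000,000 blocks, while 256 threads per block need only `blocks_needed(10000000, 256)` = 39,063 blocks. In the second case the last block has 128 threads past the end of the vector, so a kernel written this way would need a bounds check such as `if (i < n)`.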
+ +Benchmarked comparison of CUDA vs the CPU + +## The OpenMP notebooks + +The OpenMP notebooks also include 8 examples that build on one another. Here is what each one explains: + +- The first notebook is a basic introduction to OpenMP with simple examples, such as the `#pragma omp parallel` directive and thread creation. +- Next we introduce the fork-join model and how threads are spawned and joined back together. +- Then we demonstrate the Pi integral, a mathematical problem solved by splitting the work across multiple threads. +- The fourth notebook calculates the area of the Mandelbrot set, a complex mathematical shape whose area is computed in parallel by assigning different regions to different threads. +- After that we show linked list traversal and how pointer-based data structures interact with parallel execution. +- The sixth example demonstrates race conditions: what happens when threads write to the same memory without protection, and how to fix it. +- Next we demonstrate false sharing: how threads can slow each other down even when touching different variables, due to CPU cache line behaviour. +- Lastly there is Conway's Game of Life, a grid simulation where threads compute the next generation in parallel, using double buffering to avoid race conditions by design. 
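As an illustration of the Pi integral and of one common fix for a race condition, here is a minimal sketch. It assumes the textbook formulation, pi as the integral of 4 / (1 + x^2) over [0, 1] approximated with the midpoint rule. The post does not show the notebook's code, so the function name `estimate_pi` and the step count are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the notebooks): midpoint-rule estimate of
// pi = integral of 4 / (1 + x^2) over [0, 1], using num_steps intervals.
inline double estimate_pi(long num_steps) {
    assert(num_steps > 0);
    const double step = 1.0 / static_cast<double>(num_steps);
    double sum = 0.0;
    // The reduction clause gives each thread a private partial sum and combines
    // them after the loop. Without it, concurrent updates to `sum` would be a
    // race condition.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < num_steps; ++i) {
        const double x = (static_cast<double>(i) + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}
```

With GCC or Clang the loop runs in parallel when compiled with `-fopenmp`. Without that flag the pragma is ignored and the same code runs serially, which makes it easy to compare the two results.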
+ +## Related links + +- [xeus-cpp repository](https://github.com/compiler-research/xeus-cpp) +- [My GitHub account](https://github.com/HrisShterev) +- [Notebooks repository](https://github.com/compiler-research/live-cpp-tutorials/) diff --git a/assets/presentations/Hristiyan-Shterev-Teaching-materials-with-xeus-cpp-final.pdf b/assets/presentations/Hristiyan-Shterev-Teaching-materials-with-xeus-cpp-final.pdf new file mode 100644 index 00000000..8046bbf6 Binary files /dev/null and b/assets/presentations/Hristiyan-Shterev-Teaching-materials-with-xeus-cpp-final.pdf differ diff --git a/images/blog/cuda-vs-cpu-benchmark.png b/images/blog/cuda-vs-cpu-benchmark.png new file mode 100644 index 00000000..bd522ad3 Binary files /dev/null and b/images/blog/cuda-vs-cpu-benchmark.png differ