Jai Prakashsingh jvoltci

Going deep on the layer below the model: LLM serving engines, KV-cache and attention internals, and GPU kernels, all built from scratch.

🌐 jvoltci.github.io: the climb, and the log
📚 Mosaic: my open course on AI systems, ML compilers, and inference (7 tracks)
🔗 LinkedIn
🛠 Currently building: a from-scratch LLM inference engine (mini-vLLM). Benchmarks soon.
