Jiahong Dai jiahongsigma

🎯

Focusing

Time to dance.

4 followers · 5 following

Singapore
UTC +08:00
in/jiahong-dai

Popular repositories

Efficient-LLM-Inference-Serving-Systems (Public)

Why is LLM inference slow — and how do you make it fast? A hands-on, first-principles course: roofline → KV cache → quantization → parallelism → vLLM/SGLang, with GPU labs on open models.

Python · 19 stars · 1 fork