Building the machinery beneath mind.
  • Varanasi

Highlights

  • Pro

Organizations

@ivehement

jvoltci/README.md

Jai Prakashsingh — LLM Inference & AI Systems Engineer

Going deep on the layer below the model: LLM serving engines, KV-cache and attention internals, and GPU kernels, all built from scratch.

  • 🌐 jvoltci.github.io: the climb, and the log
  • 📚 Mosaic: my open course on AI systems, ML compilers, and inference (7 tracks)
  • 🔗 LinkedIn
  • 🛠 Currently building: a from-scratch LLM inference engine (mini-vLLM). Benchmarks soon.

Pinned

  1. stream-md (Public)

    Streaming markdown for LLMs. 300x fewer chars parsed per token.

    TypeScript

  2. naina (Public)

    An embeddable computer-vision runtime for face & person understanding. C++ core, plug-and-play bindings, runs everywhere — Pi to phone to GPU server.

    C++

  3. ivehement/saf (Public)

    Flutter plugin that uses the Storage Access Framework (SAF) API to access and operate on files and folders.

    Kotlin · 25 stars · 39 forks