megakernel

Here are 4 public repositories matching this topic...

Fast LLM speculative inference server for consumer hardware.

kernel cuda cuda-kernels nvidia-cuda luce rtx3090 llama-cpp local-ai qwen speculative-decoding dflash megakernel speculative-prefill pflash lucebox

Air.rs: 70B+ LLM inference on consumer GPUs, in Rust.

open-source kernel inference lora instruction-set nvidia-cuda open-models apple-silicon llama-cpp ggml qlora local-ai megakernel

A light, transparent, and modular inference & quantization engine for studying LLMs.

framework inference awq multi-backends quantum-kernel cuda-graph megakernel

Qwen3-TTS inference with CUDA megakernels

python agent cuda tts llm pipecat qwen qwen3 megakernel
