Sandermage

🏠

Working from home

6 followers · 1 following

Achievements

Popular repositories

genesis-vllm-patches Public

vLLM patcher for Qwen3.6 on consumer NVIDIA — Qwen3.6-35B-A3B-FP8 (192 tok/s, +68% over stock) + Qwen3.6-27B-int4-AutoRound + 256K context. 126 patches: TurboQuant k8v4 KV, MTP/DFlash spec-decode, …

Python 71 4
vllm Public

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python