Releases: openmodelsrun/openmodels
0.8.8
Added
- Kimi K2.7 Code (Moonshot AI, China) — open-source, coding-focused model in the Kimi K2 family, built for reliable end-to-end programming over long contexts. 1-trillion-parameter model that cuts reasoning token usage ~30% vs K2.6 while improving coding and agent performance (+21.8% Kimi Code Bench v2, +11.0% Program Bench, +31.5% MLS Bench Lite). Modified MIT License, 1M context window.
- 3 new mappings:
- Kimi K2.7 Code on Moonshot ($1.00/$3.00 per 1M tokens)
- Kimi K2.7 Code on Hugging Face Inference ($0.60/$2.40 per 1M tokens)
- Kimi K2.7 Code on OpenRouter ($0.60/$2.40 per 1M tokens)
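To make the price differences between the three mappings concrete, here is a minimal sketch that estimates the cost of a single request. It assumes each price pair is input/output USD per 1M tokens, which the notes above do not state explicitly. The provider keys and the `request_cost` helper are hypothetical names chosen for illustration and are not part of the registry.

```python
from decimal import Decimal

# Prices copied from the 0.8.8 mappings, in USD per 1M tokens.
# ASSUMPTION: each pair is (input price, output price).
# The provider keys are hypothetical identifiers, not registry IDs.
PRICES = {
    "moonshot": (Decimal("1.00"), Decimal("3.00")),
    "huggingface-inference": (Decimal("0.60"), Decimal("2.40")),
    "openrouter": (Decimal("0.60"), Decimal("2.40")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request on the given provider."""
    input_price, output_price = PRICES[provider]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Example workload: 200K input tokens and 50K output tokens.
    for name in PRICES:
        print(name, request_cost(name, 200_000, 50_000))
```

Under these assumptions, a request with 200K input tokens and 50K output tokens comes to $0.35 on Moonshot and $0.24 on either of the other two providers.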
0.8.7
Added
- Claude Fable 5 (Anthropic, US) — first publicly available Mythos-class model, exceeding any model Anthropic has previously made generally available. State-of-the-art on nearly all tested benchmarks with exceptional software engineering, knowledge work, vision, and scientific research; its lead grows on longer, more complex tasks. Ships with safeguards that route sensitive cybersecurity, biology, chemistry, and distillation queries to Opus 4.8. 300K context window.
- Claude Mythos 5 (Anthropic, US) — the same underlying model as Fable 5 with safeguards lifted in some areas. Strongest cybersecurity capabilities of any model in the world. Access restricted to trusted cyberdefenders and infrastructure providers via Project Glasswing. 300K context window.
- DiffusionGemma (Google, US) — experimental diffusion-based member of the Gemma 4 open family. Denoises a canvas of placeholder tokens to generate up to 256 tokens in parallel rather than autoregressively, delivering ~4x throughput of similarly sized Gemma models on local hardware. MoE with 26B total / 3.8B active parameters, Apache 2.0, 256K context window.
0.8.0
Added
- 16 new provider-model mappings (total: 151)
Changed
- meta/muse-spark mapping: regions updated from us-east-1, us-west-2 to global
- alibaba-model-studio provider: added ap-east-1 (Hong Kong) region
Improved
- validate_registry.py: added a referential integrity check; every region in a mapping's available_regions must exist in the provider's declared regions
- validate_registry.py: global in a provider's regions acts as a wildcard, allowing any region in its mappings
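The two validation rules above can be sketched as follows. This is an illustrative sketch only, not the actual code in validate_registry.py: the function name `undeclared_regions` and the plain-list inputs are assumptions, and the real registry schema may differ.

```python
from typing import Iterable, List

WILDCARD = "global"


def undeclared_regions(
    provider_regions: Iterable[str], mapping_regions: Iterable[str]
) -> List[str]:
    """Return the mapping regions that the provider does not declare.

    An empty list means the mapping passes the integrity check.
    HYPOTHETICAL helper: the name and signature are illustrative only.
    """
    declared = set(provider_regions)
    # A provider that declares "global" accepts any region in its mappings.
    if WILDCARD in declared:
        return []
    # Otherwise every mapping region must be declared by the provider.
    return [region for region in mapping_regions if region not in declared]


if __name__ == "__main__":
    # One undeclared region: the check reports it.
    print(undeclared_regions(["us-east-1", "us-west-2"], ["us-east-1", "eu-west-1"]))
    # Wildcard provider: any mapping region is accepted.
    print(undeclared_regions(["global"], ["ap-east-1"]))
```

In this sketch the wildcard applies only on the provider side. A mapping that lists global is still reported unless the provider also declares global, which is one plausible reading of the rule rather than confirmed behavior.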
0.7.8
Added
- 9 new models:
- Jamba Large 1.7 (AI21 Labs, Israel) — hybrid SSM-Transformer MoE, 256K context, enterprise-grade
- Yi-Lightning (01.AI, China) — MoE architecture, top Chatbot Arena in Chinese/Math/Code
- Falcon-H1 (TII, UAE) — hybrid Mamba-Transformer, outperforms Llama/Qwen in 30-70B range
- Falcon 3 10B (TII, UAE) — #1 on HuggingFace leaderboard under 13B params
- Palmyra X5 (Writer, USA) — 1M context window, adaptive reasoning, enterprise agents
- DBRX (Databricks, USA) — 132B MoE (36B active), open-source enterprise model
- Snowflake Arctic (Snowflake, USA) — 480B MoE (17B active), Apache 2.0, SQL/code specialist
- StableLM 2 12B (Stability AI, UK) — 12.1B decoder, 2T tokens multilingual training
- Alloma 8B Instruct (Uzbek LLM Lab, Uzbekistan) — first Uzbek-optimized LLM with custom tokenizer
- 5 new providers:
- AI21 Labs — Jamba model family with hybrid SSM-Transformer architecture
- Reka AI — multimodal models (text/image/video/audio) with Flash and Edge variants
- Lambda — GPU cloud and managed inference API for open-source models
- Snowflake Cortex AI — Arctic models integrated with Snowflake data platform
- 01.AI — Yi model family with strong Chinese/multilingual capabilities
- New countries represented: Israel (IL), UAE (AE), Uzbekistan (UZ), UK (GB)
0.7.7
0.7.6
Added
- Solar Pro 3 — Upstage's 102B MoE language model (12B active params) with 128K context. Optimized for Korean with English and Japanese support. Strong reasoning, structured output, and agentic workflows.
- K2 Think — LLM360/MBZUAI's 32B open-weights reasoning model (Apache 2.0). Trained with RL and verifiable rewards for math, science, and code. ~2000 tok/s on Cerebras WSE.
- New provider: Upstage — Korean AI company with OpenAI-compatible API
0.7.5
Added
- Qwen 3.7 Max — Alibaba's flagship proprietary model for advanced agentic coding, complex reasoning, and long-horizon task execution. Ranked #13 in Arena AI Text, #7 in Math, #10 in Coding. Supports 1000+ tool integrations and 35-hour sustained autonomous operation.
- Qwen 3.7 Plus — Alibaba's multimodal variant optimized for vision understanding. Ranked #5 globally in Arena AI Vision leaderboard.
0.7.4
Added
- Gemini 3 Flash — Google's balanced model combining Gemini 3 Pro reasoning with Flash-line latency and cost efficiency. 1M context, configurable thinking levels, streaming function calling.
- Gemini 3.1 Flash-Lite — Google's most cost-efficient model optimized for high-volume, low-latency tasks. 2.5x faster TTFT vs Gemini 2.5 Flash, 1M context, full multimodal support.
- Mappings: Gemini 3 Flash on Google Vertex AI and Google AI Studio ($0.50/$3.00 per 1M tokens)
- Mappings: Gemini 3.1 Flash-Lite on Google Vertex AI and Google AI Studio ($0.25/$1.50 per 1M tokens)
0.7.3
Added
- MiniCPM-V 4.6 — OpenBMB's ultra-efficient 1B multimodal model (vision + video), edge-deployable, 256K context
- Aya Expanse 32B — Cohere For AI's 32B multilingual model, 23 languages, 8K context
- Tiny Aya — Cohere For AI's compact 3.35B multilingual model, 70+ languages, edge-optimized
0.7.2
Added
- Hy3 Preview — Tencent's 295B MoE / 21B active, fast+slow thinking, 256K context, open-weight
- Laguna M.1 — Poolside AI's 225B MoE / 23B active, agentic coding flagship, 128K context
- Mappings: Hy3 Preview on SiliconFlow and OpenRouter, Laguna M.1 on OpenRouter