TextAudio is a comprehensive, production-ready text-to-speech platform that converts documents into high-quality audiobooks using advanced AI voice synthesis. Built with a modern microservices architecture, it supports 23 languages, voice cloning, and real-time progress tracking.
- 23 Language Support - Multi-language TTS with intelligent routing
- Voice Cloning - Clone voices for personalized audio
- Real-time Progress - Server-Sent Events (SSE) for live updates
- Smart Retry System - 3-tier retry logic with automatic credit bonuses
- Pitch-Preserving Speed Control - Adjust playback speed (0.75x-1.5x)
- Mobile-Responsive - Modern SvelteKit frontend (Svelte 5)
- Production-Ready - 97%+ test coverage (192 backend + 107 frontend tests)
- Docker-First - Complete containerized deployment
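As an illustration of the 0.75x-1.5x speed range, here is a minimal sketch of pitch-preserving speed adjustment. It assumes the ffmpeg CLI and its `atempo` audio filter, which changes tempo without shifting pitch; the README does not say which tool TextAudio actually uses, and `build_speed_command` is a hypothetical helper name.

```python
from pathlib import Path

MIN_SPEED = 0.75  # lower bound taken from the feature list
MAX_SPEED = 1.5   # upper bound taken from the feature list


def build_speed_command(src: Path, dst: Path, speed: float) -> list[str]:
    """Build an ffmpeg command that changes tempo while keeping pitch.

    Assumption: ffmpeg's `atempo` filter stands in for whatever the
    platform really uses for speed control.
    """
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    return ["ffmpeg", "-i", str(src), "-filter:a", f"atempo={speed}", str(dst)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.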
This project was developed as a commercial product and is now being released to the open source community. While feature-complete and production-ready, we believe it will be more valuable as a community project. We welcome contributions and hope this serves as a reference implementation for modern TTS platforms.
What's Included:
- Complete microservices architecture
- Comprehensive test suite
- Production-grade code quality
- Real-world authentication & session management
- Credit & retry system implementation
What's Missing (Contributions Welcome!):
- Payment integration (Stripe/PayPal planned)
- Storage encryption (AES-256-GCM planned)
Community-Driven Development: As of December 2025, we (the original creators) are transitioning this project to community maintenance. We will not be actively implementing new features, but the project is fully functional and production-ready.
Our Continued Role:
- Documentation Support - We'll help improve and clarify documentation
- Issue Guidance - We can provide context and guidance on reported issues
- Architecture Questions - Happy to explain design decisions and codebase structure
- Code Review - We may review significant PRs when time permits
We Encourage You To:
- Fork and extend the project for your needs
- Submit pull requests for new features
- Improve documentation
- Report and fix bugs
- Help other community members in discussions
As the "parents" of this project, we're excited to see where the community takes it! Feel free to open issues for guidance or questions about the codebase.
- Docker & Docker Compose
- Python 3.12+ (for local development)
- Node.js 18+ (for frontend development)
- GPU (optional): NVIDIA or AMD for faster TTS processing
# Clone the repository
git clone https://github.com/yourusername/textaudio-platform.git
cd textaudio-platform
# Copy environment template
cp env.template .env
# Edit .env with your configuration
nano .env
# Start all services
docker compose up
- Frontend: http://localhost:5173
- API Documentation: http://localhost:8000/docs
- API Health Check: http://localhost:8000/health
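After `docker compose up`, the services can take a while to become ready. The following is a minimal sketch of polling the health check URL listed above until it answers with HTTP 200. The function names, retry count, and delay are illustrative assumptions, not part of the project.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def wait_until_healthy(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll `url` until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:8000/health")` returns `True` once the API answers. The `fetch` parameter exists so the polling logic can be exercised without a running stack.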
TextAudio uses a microservices architecture with 6 main services:
- API Orchestrator (Port 8000)
  - FastAPI-based request routing
  - Job orchestration and session management
  - PostgreSQL for data persistence
  - Redis for job queuing and caching
- TTS Chatterbox (Port 8001)
  - 23 language text-to-speech
  - Voice cloning capabilities
  - GPU acceleration (CUDA/ROCm/CPU fallback)
  - PyTorch-based ML models
- Text Processor (Port 8002)
  - Language detection with confidence scoring
  - Multi-format extraction (PDF, EPUB, Markdown, TXT)
  - Token estimation
- Job Worker (Background)
  - Redis queue consumer
  - Asynchronous TTS processing
  - Real-time progress via SSE
- Preview Worker (Background)
  - High-priority queue for previews
  - 2-sentence preview generation
  - 5-minute caching
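The Preview Worker's 5-minute caching can be pictured with a small time-to-live cache. This is an in-memory sketch only: the README lists Redis for caching, so the real service most likely relies on Redis key expiry instead. `PreviewCache` and its cache keys are hypothetical.

```python
import time
from typing import Callable, Optional

PREVIEW_TTL_SECONDS = 5 * 60  # "5-minute caching" from the service list


class PreviewCache:
    """In-memory TTL cache for generated preview audio (illustrative only)."""

    def __init__(
        self,
        ttl: float = PREVIEW_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[float, bytes]] = {}

    def put(self, key: str, audio: bytes) -> None:
        """Store audio under `key` with an expiry time of now + ttl."""
        self._entries[key] = (self._clock() + self._ttl, audio)

    def get(self, key: str) -> Optional[bytes]:
        """Return cached audio, or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, audio = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop stale entries lazily on read
            return None
        return audio
```

A key could be derived from the preview text, voice, and language, so that a repeated preview request within five minutes skips synthesis.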
Frontend:
- SvelteKit (Svelte 5) with TailwindCSS 4
- Modern reactive components
- TypeScript for type safety
- Real-time SSE updates
- Mobile-responsive design
Data layer:
- PostgreSQL 15 - Jobs, sessions, users, credits
- Redis 7 - Job queue, SSE pub/sub, caching
- Filesystem Storage - Date-organized file storage
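Progress updates travel from the workers to the browser as Server-Sent Events. As a rough illustration of the wire format, the sketch below parses a stream of SSE text lines into `(event, data)` pairs. It follows the generic SSE framing (fields separated by a colon, events ended by a blank line); the `progress` event name and the payload in the usage note are assumptions, not TextAudio's documented schema.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Feeding it the lines `event: progress`, `data: {"percent": 40}`, and a blank line yields one pair whose data string can then be decoded with `json.loads`.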
Comprehensive documentation is available in the docs/ directory:
- Architecture Overview - Microservices design
- Frontend Architecture - SvelteKit structure
- Installation Guide - Detailed setup instructions
- Deployment Guide - Production deployment
- API Reference - Complete API documentation
- Testing Guide - How to run tests
- Troubleshooting - Common issues
# Start with hot reload
docker compose -f docker-compose.yml -f docker-compose.dev.yml up
# Run backend tests (192 tests, 97% coverage)
cd services/textaudio/backend/api
pytest
# Run frontend tests (107 tests)
cd services/textaudio/frontend/textaudio
npm test
# Run E2E tests
cd services/textaudio/e2e
npm test
# Format and fix linting (before commits)
make fix-all-full
# Full validation (before releases)
make validate-all
# Code analysis
make analyze
# Security audit
make audit
We welcome contributions! Please see our Contributing Guidelines for details on:
- Setting up your development environment
- Code quality standards
- Submitting pull requests
- Reporting issues
- Backend Tests: 192 tests passing, 97%+ coverage
- Frontend Tests: 107 tests passing
- Linting Errors: 0 (ruff, ESLint)
- Code Complexity: Average 7.70 (Grade B)
- Security Issues: 0 HIGH severity
Arabic (ar), Danish (da), German (de), Greek (el), English (en), Spanish (es), Finnish (fi), French (fr), Hebrew (he), Hindi (hi), Italian (it), Japanese (ja), Korean (ko), Malay (ms), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Russian (ru), Swedish (sv), Swahili (sw), Turkish (tr), Chinese (zh)
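For client code that needs to validate a language before submitting a job, the list above maps directly onto a lookup table. This is a sketch only: `normalize_language` is a hypothetical helper, and reducing regional tags such as `pt-BR` to their base code is an assumption about how the platform treats variants.

```python
# Language codes and names copied from the supported-languages list.
SUPPORTED_LANGUAGES: dict[str, str] = {
    "ar": "Arabic", "da": "Danish", "de": "German", "el": "Greek",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "he": "Hebrew", "hi": "Hindi", "it": "Italian", "ja": "Japanese",
    "ko": "Korean", "ms": "Malay", "nl": "Dutch", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "sv": "Swedish",
    "sw": "Swahili", "tr": "Turkish", "zh": "Chinese",
}


def normalize_language(code: str) -> str:
    """Return the supported base language code, or raise ValueError."""
    # Assumption: regional variants ("pt-BR", "zh_CN") map to the base code.
    base = code.strip().lower().replace("_", "-").split("-")[0]
    if base not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {code!r}")
    return base
```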
Input: PDF, TXT, EPUB, Markdown
Output: MP3, WAV, FLAC
- Magic link authentication (no password storage)
- SQL injection protection (SQLAlchemy ORM)
- XSS protection (Pydantic validation)
- CORS configuration
- Rate limiting
- Environment-based secrets management
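To make the magic link item more concrete, here is a minimal sketch of how such a flow is commonly built: a random token goes into the emailed link, and only its hash and an expiry time are stored. The function names and the 15-minute lifetime are assumptions; the README does not describe TextAudio's actual token format or lifetime.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

TOKEN_TTL_SECONDS = 15 * 60  # assumed lifetime; not stated in the README


def issue_token(now: Optional[float] = None) -> tuple[str, str, float]:
    """Return (token for the emailed link, hash to store, expiry timestamp)."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    return token, token_hash, now + TOKEN_TTL_SECONDS


def verify_token(
    token: str,
    stored_hash: str,
    expires_at: float,
    now: Optional[float] = None,
) -> bool:
    """Check a presented token against the stored hash and expiry."""
    now = time.time() if now is None else now
    if now >= expires_at:
        return False
    candidate = hashlib.sha256(token.encode()).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, stored_hash)
```

A real implementation would also mark the token as used after the first successful verification so that a link cannot be replayed.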
Please report security vulnerabilities via GitHub Security Advisories.
This project is licensed under the MIT License - see the LICENSE file for details.
- Chatterbox TTS - High-quality TTS engine
- FastAPI - Modern Python web framework
- SvelteKit - Reactive frontend framework
- All contributors and the open source community
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: See the docs/ directory
See CHANGELOG.md for version history.
Upcoming Features (contributions welcome):
- Payment integration (Stripe/PayPal)
- Storage encryption (AES-256-GCM)
- Email notifications
- Invoice system
- Multi-speaker audiobooks
- Custom voice training
- API access for developers
Made with ❤️ by the open source community
This project was originally developed as a commercial product and is now open source. We hope it serves the community well!