TextAudio - Open Source TTS Platform

TextAudio is a comprehensive, production-ready text-to-speech platform that converts documents into high-quality audiobooks using advanced AI voice synthesis. Built with a modern microservices architecture, it supports 23 languages, voice cloning, and real-time progress tracking.

✨ Key Features

🌍 23 Language Support - Multi-language TTS with intelligent routing
🎙️ Voice Cloning - Clone voices for personalized audio
⚡ Real-time Progress - Server-Sent Events (SSE) for live updates
🔄 Smart Retry System - 3-tier retry logic with automatic credit bonuses
🎯 Pitch-Preserving Speed Control - Adjust playback speed (0.75x-1.5x)
📱 Mobile-Responsive - Modern SvelteKit frontend (Svelte 5)
🧪 Production-Ready - 97%+ backend test coverage (192 backend + 107 frontend tests)
🐳 Docker-First - Complete containerized deployment
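
As an illustration of pitch-preserving speed control, the sketch below builds an ffmpeg command that uses the `atempo` audio filter, which changes tempo without shifting pitch. This README does not say how TextAudio implements the feature, so treat this as one possible approach: `build_speed_command` and the file names are hypothetical, and only the 0.75x-1.5x range comes from the feature list above.

```python
MIN_SPEED = 0.75  # lower bound from the feature list
MAX_SPEED = 1.5   # upper bound from the feature list


def build_speed_command(src: str, dst: str, speed: float) -> list[str]:
    """Return an ffmpeg command that retimes audio without changing pitch.

    Hypothetical helper: it only builds the argument list, it does not run ffmpeg.
    """
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    # -y overwrites the output file; atempo changes tempo while keeping pitch.
    return ["ffmpeg", "-y", "-i", src, "-filter:a", f"atempo={speed}", dst]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.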

🎯 Why Open Source?

This project was developed as a commercial product and is now being released to the open source community. Although it is feature-complete and production-ready, we believe it will be more valuable as a community project. We welcome contributions and hope it serves as a reference implementation for modern TTS platforms.

What's Included:

✅ Complete microservices architecture
✅ Comprehensive test suite
✅ Production-grade code quality
✅ Real-world authentication & session management
✅ Credit & retry system implementation

What's Missing (Contributions Welcome!):

⏳ Payment integration (Stripe/PayPal planned)
⏳ Storage encryption (AES-256-GCM planned)

🤝 Project Status & Maintainership

Community-Driven Development: As of December 2025, we (the original creators) are transitioning this project to community maintenance. We will not be actively implementing new features, but the project is fully functional and production-ready.

Our Continued Role:

📚 Documentation Support - We'll help improve and clarify documentation
🐛 Issue Guidance - We can provide context and guidance on reported issues
💡 Architecture Questions - Happy to explain design decisions and codebase structure
👀 Code Review - We may review significant PRs when time permits

We Encourage You To:

🔨 Fork and extend the project for your needs
🌟 Submit pull requests for new features
📝 Improve documentation
🐛 Report and fix bugs
💬 Help other community members in discussions

As the "parents" of this project, we're excited to see where the community takes it! Feel free to open issues for guidance or questions about the codebase.

🚀 Quick Start

Prerequisites

Docker & Docker Compose
Python 3.12+ (for local development)
Node.js 18+ (for frontend development)
GPU (optional): NVIDIA or AMD for faster TTS processing

Installation

# Clone the repository
git clone https://github.com/yourusername/textaudio-platform.git
cd textaudio-platform

# Copy environment template
cp env.template .env

# Edit .env with your configuration
nano .env

# Start all services
docker compose up

Access Points

Frontend: http://localhost:5173
API Documentation: http://localhost:8000/docs
API Health Check: http://localhost:8000/health
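
Once the services are up, the health endpoint listed above can be polled from a script. The following is a minimal sketch using only the standard library. It assumes the endpoint answers with HTTP 200 when the API is healthy; the response body format is not documented here, so it is ignored. `fetch` and `is_healthy` are hypothetical helper names.

```python
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/health"  # from the access points above


def fetch(url: str, timeout: float = 5.0) -> tuple[int, bytes]:
    """GET a URL and return (status code, body)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read()


def is_healthy(
    url: str = HEALTH_URL,
    get: Callable[[str], tuple[int, bytes]] = fetch,
) -> bool:
    """Return True when the endpoint answers with HTTP 200 (assumed contract)."""
    try:
        status, _body = get(url)
    except OSError:
        # URLError, HTTPError and connection failures are all OSError subclasses.
        return False
    return status == 200
```

Calling `is_healthy()` with no arguments checks the local deployment. The `get` parameter exists so the check can be exercised without a running server.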

🏗️ Architecture

TextAudio uses a microservices architecture with 6 main services:

Backend Services

API Orchestrator (Port 8000)
- FastAPI-based request routing
- Job orchestration and session management
- PostgreSQL for data persistence
- Redis for job queuing and caching

TTS Chatterbox (Port 8001)
- 23 language text-to-speech
- Voice cloning capabilities
- GPU acceleration (CUDA/ROCm/CPU fallback)
- PyTorch-based ML models

Text Processor (Port 8002)
- Language detection with confidence scoring
- Multi-format extraction (PDF, EPUB, Markdown, TXT)
- Token estimation

Job Worker (Background)
- Redis queue consumer
- Asynchronous TTS processing
- Real-time progress via SSE

Preview Worker (Background)
- High-priority queue for previews
- 2-sentence preview generation
- 5-minute caching

Frontend

SvelteKit (Svelte 5) with Tailwind CSS 4
Modern reactive components
TypeScript for type safety
Real-time SSE updates
Mobile-responsive design

Infrastructure

PostgreSQL 15 - Jobs, sessions, users, credits
Redis 7 - Job queue, SSE pub/sub, caching
Filesystem Storage - Date-organized file storage
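
To make "date-organized file storage" concrete, here is a small sketch of one plausible layout, `root/YYYY/MM/DD/<job_id>.<ext>`. The actual directory scheme is not documented in this README, so the layout, the `storage_path` helper, and the example root are all assumptions.

```python
from datetime import date
from pathlib import PurePosixPath


def storage_path(root: str, job_id: str, ext: str, day: date) -> PurePosixPath:
    """Build root/YYYY/MM/DD/job_id.ext (the layout itself is an assumption)."""
    return (
        PurePosixPath(root)
        / f"{day:%Y}"
        / f"{day:%m}"
        / f"{day:%d}"
        / f"{job_id}.{ext}"
    )
```

A date-based layout keeps any single directory small and makes age-based cleanup a matter of removing whole day folders.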

📚 Documentation

Comprehensive documentation is available in the docs/ directory:

Architecture Overview - Microservices design
Frontend Architecture - SvelteKit structure
Installation Guide - Detailed setup instructions
Deployment Guide - Production deployment
API Reference - Complete API documentation
Testing Guide - How to run tests
Troubleshooting - Common issues

🛠️ Development

Development Mode

# Start with hot reload
docker compose -f docker-compose.yml -f docker-compose.dev.yml up

# Run backend tests (192 tests, 97% coverage)
# Each command runs in a subshell, so all paths are relative to the repository root
(cd services/textaudio/backend/api && pytest)

# Run frontend tests (107 tests)
(cd services/textaudio/frontend/textaudio && npm test)

# Run E2E tests
(cd services/textaudio/e2e && npm test)

Code Quality

# Format and fix linting (before commits)
make fix-all-full

# Full validation (before releases)
make validate-all

# Code analysis
make analyze

# Security audit
make audit

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details on:

Setting up your development environment
Code quality standards
Submitting pull requests
Reporting issues

📊 Project Status

Code Quality Metrics

Backend Tests: 192 tests passing, 97%+ coverage
Frontend Tests: 107 tests passing
Linting Errors: 0 (ruff, ESLint)
Code Complexity: Average 7.70 (Grade B)
Security Issues: 0 HIGH severity

Supported Languages

Arabic (ar), Danish (da), German (de), Greek (el), English (en), Spanish (es), Finnish (fi), French (fr), Hebrew (he), Hindi (hi), Italian (it), Japanese (ja), Korean (ko), Malay (ms), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Russian (ru), Swedish (sv), Swahili (sw), Turkish (tr), Chinese (zh)
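
For client code that needs to validate a language before submitting a job, the list above can be expressed as a lookup table. The codes and names below are taken directly from that list; the `normalize_language` helper and its handling of region suffixes such as `en-US` are assumptions, not part of the project's API.

```python
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "da": "Danish", "de": "German", "el": "Greek",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "he": "Hebrew", "hi": "Hindi", "it": "Italian", "ja": "Japanese",
    "ko": "Korean", "ms": "Malay", "nl": "Dutch", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "sv": "Swedish",
    "sw": "Swahili", "tr": "Turkish", "zh": "Chinese",
}


def normalize_language(code: str) -> str:
    """Return the base language code, or raise ValueError if unsupported.

    Hypothetical helper: region suffixes ("en-US", "pt_BR") are dropped.
    """
    base = code.strip().lower().replace("_", "-").split("-", 1)[0]
    if base not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {code!r}")
    return base
```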

Supported File Formats

Input: PDF, TXT, EPUB, Markdown
Output: MP3, WAV, FLAC

🔒 Security

Magic link authentication (no password storage)
SQL injection protection (SQLAlchemy ORM)
Input validation to reduce XSS risk (Pydantic)
CORS configuration
Rate limiting
Environment-based secrets management

Please report security vulnerabilities via GitHub Security Advisories.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Chatterbox TTS - High-quality TTS engine
FastAPI - Modern Python web framework
SvelteKit - Reactive frontend framework
All contributors and the open source community

💬 Community & Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: See docs/ directory

🗺️ Roadmap

See CHANGELOG.md for version history.

Upcoming Features (contributions welcome):

Payment integration (Stripe/PayPal)
Storage encryption (AES-256-GCM)
Email notifications
Invoice system
Multi-speaker audiobooks
Custom voice training
API access for developers

Made with ❤️ by the open source community

This project was originally developed as a commercial product and is now open source. We hope it serves the community well!
