STT

A FastAPI-based service for real-time speech-to-text using faster-whisper and WebRTC VAD.


Installation

# Install system requirements
sudo apt install portaudio19-dev

# Install Python dependencies into the project virtual environment, then activate it
python3 src/setup.py
source src/stt-venv/bin/activate

Usage

Start the service:

cd src/
python app.py

Python example:

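import requests

# Upload a local audio file to the running service and print the transcript.
with open("audio.wav", "rb") as f:
    response = requests.post(
        "http://localhost:47102/transcribe",
        files={"file": f},
    )

# Fail loudly on HTTP errors instead of raising a KeyError on the JSON body.
response.raise_for_status()
print(response.json()["text"])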
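Before uploading, it can help to check what format the audio file is in. The sketch below uses only the standard library to report a WAV file's basic properties. The helper names `describe_wav` and `is_vad_friendly` are hypothetical and not part of this project. The check reflects the input WebRTC VAD itself accepts (16-bit mono PCM at 8, 16, 32, or 48 kHz); whether this service converts other formats before analysis is not stated here, so treat the result as a hint rather than a requirement.

```python
import wave

# Sample rates accepted by WebRTC VAD. Assumption: the service may or may not
# resample other rates itself; this README does not say.
VAD_SAMPLE_RATES = {8000, 16000, 32000, 48000}


def describe_wav(source):
    """Return basic format facts for a WAV file path or binary file-like object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": rate,
            "duration_s": wav.getnframes() / rate,
        }


def is_vad_friendly(info):
    """True if the audio is 16-bit mono PCM at a rate WebRTC VAD accepts."""
    return (
        info["channels"] == 1
        and info["sample_width_bytes"] == 2
        and info["sample_rate"] in VAD_SAMPLE_RATES
    )
```

For example, `is_vad_friendly(describe_wav("audio.wav"))` reports whether the file from the example above is already in that format.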

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | `/health` | Check service health and loaded model |
| POST | `/transcribe` | Transcribe audio, with optional segments, word timestamps, or translation |
| POST | `/vad/analyze` | Analyze uploaded audio for voice activity |
| GET | `/vad/status` | Check VAD availability |
| WebSocket | `/ws/vad` | Real-time voice activity detection |
| WebSocket | `/ws/stt` | Streaming speech-to-text |
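The endpoints above can be addressed from a small client helper. This is a minimal sketch using only the standard library. The names `endpoint` and `get_json` are hypothetical, the base URL reuses the port from the usage example, and the assumption that the GET endpoints return JSON is based on FastAPI's default behaviour rather than on anything stated in this README.

```python
import json
import urllib.request
from urllib.parse import urljoin

# Port taken from the usage example; adjust if the service runs elsewhere.
BASE_URL = "http://localhost:47102"


def endpoint(path, base_url=BASE_URL, websocket=False):
    """Build the full URL for one of the paths listed in the table."""
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    if websocket:
        # Map http -> ws and https -> wss for the WebSocket endpoints.
        url = "ws" + url[len("http"):]
    return url


def get_json(path, base_url=BASE_URL, timeout=5.0):
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the service running, `get_json("/health")` and `get_json("/vad/status")` should return the decoded responses, and `endpoint("/ws/stt", websocket=True)` gives the URL to pass to a WebSocket client of your choice.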
