ramonics/stt

STT

A FastAPI-based service for real-time speech-to-text using faster-whisper and WebRTC VAD.

Preview video: stt_preview.mov

Installation

# Install system requirements
sudo apt install portaudio19-dev

# Install python dependencies
python3 src/setup.py
source src/stt-venv/bin/activate

Usage

Start the service:

cd src/
python app.py

Python example:

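import requests

# Upload a WAV file to the /transcribe endpoint as multipart form data.
with open("audio.wav", "rb") as f:
    response = requests.post(
        "http://localhost:47102/transcribe",
        files={"file": f},
        timeout=60,
    )

# Fail loudly on HTTP errors instead of raising a KeyError on the JSON below.
response.raise_for_status()
print(response.json()["text"])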
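
Health check example:

Before uploading audio it can help to confirm the service is reachable through the /health endpoint listed under Endpoints. The sketch below is a minimal, hedged example: it assumes /health answers with a JSON body (the README does not document the response schema, so the decoded dict is returned untouched), and the helper name `check_health` is hypothetical. It uses the standard library's `urllib` so it runs without extra packages; `requests.get` would work equally well.

```python
import json
import urllib.request

# Host and port are taken from the usage example above.
BASE_URL = "http://localhost:47102"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET /health and return the decoded JSON body.

    Assumption: the endpoint returns JSON. Its exact keys are not
    documented here, so no specific field is read from the result.
    """
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `check_health()` while the service is running should return whatever status information the service reports; a connection error usually means the service has not been started yet.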

Endpoints

Method     Path          Description
GET        /health       Check service health and loaded model
POST       /transcribe   Transcribe audio, with optional segments, word timestamps, or translation
POST       /vad/analyze  Analyze uploaded audio for voice activity
GET        /vad/status   Check VAD availability
WebSocket  /ws/vad       Real-time voice activity detection
WebSocket  /ws/stt       Streaming speech-to-text
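
Preparing audio for the WebSocket endpoints:

The message format of /ws/vad and /ws/stt is not documented in this README. WebRTC VAD itself works on 16-bit mono PCM at 8, 16, 32, or 48 kHz, in frames of 10, 20, or 30 ms, so a streaming client will likely need to cut raw audio into frames of that shape before sending it. The sketch below shows only that framing step. Whether the service expects pre-framed audio, and which sample rate it uses, are assumptions; the helper names `frame_size_bytes` and `split_frames` are hypothetical.

```python
from typing import Iterator

SAMPLE_RATE = 16000   # Hz; assumed default, not confirmed by the README
FRAME_MS = 30         # WebRTC VAD accepts 10, 20, or 30 ms frames
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM

VALID_RATES = (8000, 16000, 32000, 48000)
VALID_FRAME_MS = (10, 20, 30)


def frame_size_bytes(sample_rate: int = SAMPLE_RATE, frame_ms: int = FRAME_MS) -> int:
    """Return the byte length of one VAD frame of 16-bit mono PCM."""
    if sample_rate not in VALID_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    if frame_ms not in VALID_FRAME_MS:
        raise ValueError(f"unsupported frame length: {frame_ms} ms")
    return sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE


def split_frames(
    pcm: bytes, sample_rate: int = SAMPLE_RATE, frame_ms: int = FRAME_MS
) -> Iterator[bytes]:
    """Yield consecutive full-length frames; a trailing partial frame is dropped."""
    size = frame_size_bytes(sample_rate, frame_ms)
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]
```

Each yielded frame could then be sent as one binary WebSocket message, if that is what the service expects. Dropping the trailing partial frame keeps every frame at a length WebRTC VAD accepts; a real client would more likely buffer the remainder until more audio arrives.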
