A T L A S

Agent-driven Transcription, Learning, Analysis and Summarization

ATLAS is a Streamlit app that transcribes and analyzes lecture audio, video, or YouTube content using Groq Whisper and a multi-agent LLM pipeline. It produces a summary, structured overview, key points, and a quiz, then exports results to PDF, Word, or JSON with validation scores.

Features

Live microphone recording with start/stop and preview
File uploads for audio/video (wav, mp3, m4a, mp4, avi, mov)
YouTube URL transcription via yt-dlp
Groq Whisper transcription with chunking for large files
Multi-agent pipeline with validator feedback and scoring
Summary, overview, key points, and quiz generation
Export to PDF, Word, or JSON
Session history and validation panels
Artifacts stored per run (inputs and outputs)
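
The chunking for large files mentioned above could be planned along these lines. This is a minimal sketch, not the repository's code: the function name `chunk_spans`, the millisecond units, and the 10-minute default chunk length are all assumptions made for illustration.

```python
from typing import List, Tuple


def chunk_spans(total_ms: int, chunk_ms: int = 10 * 60 * 1000) -> List[Tuple[int, int]]:
    """Split a recording of total_ms milliseconds into consecutive (start, end) spans.

    chunk_ms is an assumed default (10 minutes); the real chunk size may differ.
    The last span is shortened so it never runs past the end of the recording.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


# A 25-minute recording split into 10-minute spans.
spans = chunk_spans(25 * 60 * 1000)
```

Each span can then be cut from the audio and transcribed separately, with the partial transcripts joined in order. pydub's `AudioSegment` supports slicing by milliseconds, so spans in this form map directly onto it.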

Agent Pipeline (8 Agents)

ATLAS runs 8 agents in total: 4 worker agents that generate content and 4 validator agents that check quality.

Worker agents:

Summarizer: writes the overall summary
Overview: creates a structured, hierarchical outline
Key Points: extracts definitions, facts, methods, and tips
Quiz: generates MCQ and short-answer questions

Validator agents (one per worker):

Summary Validator
Overview Validator
Key Points Validator
Quiz Validator

How it works:

Each worker agent receives the transcript and generates its output (summary, overview, key points, or quiz).
The paired validator scores the output (1-10) and provides feedback.
If the score is below the threshold, the worker retries using that feedback.
This loop repeats up to the configured retry limit, and the best output is kept.
Final results include validation scores shown in the UI and exports.
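
The loop described above can be sketched as follows. This is a hedged illustration, not the code in agents/: the function name `run_with_validation`, the worker and validator signatures, and the default values for `threshold` and `max_retries` are assumptions.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: a worker takes (transcript, feedback) and returns text;
# a validator takes (transcript, output) and returns (score from 1 to 10, feedback).
Worker = Callable[[str, Optional[str]], str]
Validator = Callable[[str, str], Tuple[int, str]]


def run_with_validation(
    transcript: str,
    worker: Worker,
    validator: Validator,
    threshold: int = 7,    # assumed value; the real threshold is configurable
    max_retries: int = 2,  # assumed value; the real retry limit is configurable
) -> Tuple[str, int]:
    """Run one worker/validator pair and return the best output with its score."""
    best_output, best_score = "", 0
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):  # first attempt plus the allowed retries
        output = worker(transcript, feedback)
        score, feedback = validator(transcript, output)
        if score > best_score:
            best_output, best_score = output, score
        if score >= threshold:
            break  # good enough, no retry needed
    return best_output, best_score


# Stub agents for demonstration only; real agents would call the LLM.
def demo_worker(transcript: str, feedback: Optional[str]) -> str:
    return "short" if feedback is None else "a longer, revised summary"


def demo_validator(transcript: str, output: str) -> Tuple[int, str]:
    return (4, "too short") if len(output) < 10 else (8, "ok")


result = run_with_validation("lecture text", demo_worker, demo_validator)
```

Tracking the best score separately from the latest one matters because a retry can score lower than an earlier attempt; keeping the maximum matches the "best output is kept" behavior.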

Architecture

🎤 Input Layer
├── Microphone (live start/stop)
├── Audio/Video Upload
└── YouTube URL
  ↓
🔊 Transcription (Groq Whisper)
- Chunking for large files
- Full transcript
  ↓
┌──────────────────────────────┐
│ 🤖 Agent Manager (8 agents)  │
└──────────────────────────────┘
  ↓
Summarizer → Summary Validator → retry if below threshold
  ↓
Overview → Overview Validator → retry if below threshold
  ↓
Key Points → Key Points Validator → retry if below threshold
  ↓
Quiz → Quiz Validator → retry if below threshold
  ↓
📄 Export Layer
├── PDF
├── Word
└── JSON
  ↓
Artifacts stored in artifacts/<source>/<timestamp>/{inputs,outputs}

Tech Stack

| Component | Technology |
| --- | --- |
| UI | Streamlit + custom CSS |
| Transcription | Groq Whisper (OpenAI-compatible) |
| LLM | Groq Llama 3.3 70B (OpenAI-compatible) |
| Audio | sounddevice, scipy, pydub |
| YouTube | yt-dlp |
| Export | fpdf, python-docx, json |

Project Structure

main.py
pipeline.py
config.py
core/
  audio.py
  transcription.py
  llm.py
  export.py
agents/
  agent_base.py
  pipeline.py
  summarizer.py
  overview.py
  keypoints.py
  quiz.py
  validators.py
style.css
artifacts/

Setup

Clone the repo.
Install dependencies (Python 3.9+):
```
pip install -r requirements.txt
```
Create a local env file and set your key:

Windows:
```
copy example.env .env
```
macOS/Linux:
```
cp example.env .env
```
Edit .env and set GROQ_API_KEY if you are using Groq. The pipeline is model-agnostic, so you can swap in another provider or even a local LLM.
Run the app:
```
streamlit run main.py
```

Note: Do not commit .env, since it holds your API key. Commit example.env as the template instead.

Environment Variables

The app is model-agnostic. Groq is the default in this repo, but you can plug in any provider or a local LLM by adapting the client in core/llm.py.

GROQ_API_KEY (required only if using Groq)
GROQ_BASE_URL (optional, default https://api.groq.com/openai/v1)
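
Reading these two variables could look like the sketch below. The helper name `load_groq_settings` is hypothetical, and the sketch assumes the values from .env are already present in the process environment; how the app actually loads them is not shown here.

```python
import os
from typing import Mapping, Optional, Tuple

# Default taken from the variable list above.
DEFAULT_GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def load_groq_settings(env: Mapping[str, str] = os.environ) -> Tuple[Optional[str], str]:
    """Return (api_key, base_url); api_key is None when GROQ_API_KEY is unset or empty."""
    api_key = env.get("GROQ_API_KEY") or None
    base_url = env.get("GROQ_BASE_URL") or DEFAULT_GROQ_BASE_URL
    return api_key, base_url


api_key, base_url = load_groq_settings()
if api_key is None:
    # Only a problem when Groq is the provider in use.
    print("GROQ_API_KEY is not set")
```

Passing the environment as a parameter keeps the helper easy to test with a plain dict.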

Usage

Choose input method: YouTube URL, file upload, or microphone recording.
Process the input.
Review summary, overview, key points, and quiz.
Download results in the desired format.

Output Location

Each run is stored under:

artifacts/<source>/<timestamp>/{inputs,outputs}
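
As a rough illustration of this layout, a run directory could be created as shown below. The function name `make_run_dirs` and the timestamp format are assumptions; the repository may name its folders differently.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional, Tuple


def make_run_dirs(
    root: Path, source: str, now: Optional[datetime] = None
) -> Tuple[Path, Path]:
    """Create <root>/<source>/<timestamp>/{inputs,outputs} and return both paths."""
    # Assumed timestamp format; sorts chronologically as plain text.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    run_dir = root / source / stamp
    inputs, outputs = run_dir / "inputs", run_dir / "outputs"
    inputs.mkdir(parents=True, exist_ok=True)
    outputs.mkdir(parents=True, exist_ok=True)
    return inputs, outputs
```

For example, `make_run_dirs(Path("artifacts"), "youtube")` would create the inputs and outputs folders for a YouTube run started at the current time.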

License

MIT License.
