9 changes: 9 additions & 0 deletions .gitignore
@@ -0,0 +1,9 @@
__pycache__/
*.py[cod]
.pytest_cache/
.mypy_cache/
.venv/
.env
dist/
build/
*.egg-info/
157 changes: 157 additions & 0 deletions zoom-live-assistant/README.md
@@ -0,0 +1,157 @@
# Zoom Live Assistant

A starter tool for real-time call assistance:

1. Ingest transcript events from Zoom (or any source).
2. Detect when a customer asks a question.
3. Send recent transcript context + question to ChatGPT.
4. Return a concise suggested answer you can read to the customer.
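
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the detector shipped in this repo: `looks_like_question`, `ContextBuffer`, and the starter-word list are hypothetical names, and the default thresholds only mirror the `MIN_QUESTION_LENGTH` and `MAX_CONTEXT_MESSAGES` settings.

```python
from collections import deque

# Hypothetical heuristic; the real detector in this project may differ.
QUESTION_STARTERS = (
    "can", "could", "how", "what", "when", "where",
    "which", "who", "why", "do", "does", "is", "are", "will",
)


def looks_like_question(text: str, min_length: int = 12) -> bool:
    """Return True if the line is long enough and reads like a question."""
    cleaned = text.strip()
    if not cleaned or len(cleaned) < min_length:
        return False
    if cleaned.endswith("?"):
        return True
    # Fall back to the first word, since ASR output often drops punctuation.
    first_word = cleaned.split()[0].lower().strip(",.")
    return first_word in QUESTION_STARTERS


class ContextBuffer:
    """Keep only the most recent transcript lines as context for the model."""

    def __init__(self, max_messages: int = 30) -> None:
        self._lines = deque(maxlen=max_messages)

    def add(self, speaker: str, text: str) -> None:
        self._lines.append(f"{speaker}: {text}")

    def as_prompt(self) -> str:
        return "\n".join(self._lines)
```

A bounded `deque` drops the oldest line automatically, so the prompt sent to the model stays a fixed maximum size no matter how long the call runs.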

## Important note about Zoom integration

This repo includes the core runtime assistant and API/CLI interfaces.
Direct Zoom call audio capture/transcription depends on your Zoom account type and selected integration method (Zoom Apps/SDK/webhooks or external audio capture + ASR).
Use this service as the central "brain" that receives transcript lines and returns answers.

## Quick start

### 1) Install

```bash
cd /workspace/zoom-live-assistant
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```

### 2) Configure environment

Create `.env`:

```env
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4o-mini
MIN_QUESTION_LENGTH=12
MAX_CONTEXT_MESSAGES=30
API_HOST=0.0.0.0
API_PORT=8080
ZOOM_WEBHOOK_SECRET=your_zoom_webhook_secret
VERIFY_ZOOM_SIGNATURES=true
```

### 3) Run API mode

```bash
zoom-live-assistant-api
```

Ingest a transcript line:

```bash
curl -X POST "http://localhost:8080/ingest" \
  -H "Content-Type: application/json" \
  -d '{
    "speaker":"customer",
    "text":"Can you explain how your pricing scales with usage?",
    "source":"zoom"
  }'
```

Or send a Zoom webhook envelope:

```bash
curl -X POST "http://localhost:8080/zoom/webhook" \
  -H "Content-Type: application/json" \
  -d '{
    "event":"meeting.transcript_received",
    "event_ts":1739923528123,
    "payload":{
      "content":{
        "transcript_segments":[
          {"user_name":"customer","text":"Can you explain your pricing model?"}
        ]
      }
    }
  }'
```

If a question is detected, the response includes:

```json
{
  "accepted": true,
  "answer": {
    "question": "...",
    "answer": "...suggested response...",
    "speaker": "customer",
    "timestamp": "..."
  },
  "context_size": 14
}
```

### 4) Run CLI mode (stdin)

Plain text mode:

```bash
printf "Hello team\nCan you describe your SLA commitments?\n" | zoom-live-assistant-cli
```

JSONL mode:

```bash
printf '{"speaker":"customer","text":"How fast can we migrate?"}\n' | zoom-live-assistant-cli --jsonl
```

## How to connect this to Zoom in practice

Common production pattern:

- Zoom transcript source -> your "bridge" service
- Bridge sends each transcript line to `POST /ingest`
- On non-null `answer`, display in desktop overlay or private chat panel
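
A bridge of this kind could look roughly like the sketch below, which uses only the standard library. The helper names `build_ingest_request` and `forward_line` are hypothetical, and the URL assumes the API is running locally on the port from the example `.env`.

```python
import json
import urllib.request
from typing import Optional

# Assumes the API is reachable locally on the port from the example .env.
INGEST_URL = "http://localhost:8080/ingest"


def build_ingest_request(speaker: str, text: str, url: str = INGEST_URL) -> urllib.request.Request:
    """Build the POST /ingest request for one transcript line."""
    body = json.dumps({"speaker": speaker, "text": text, "source": "zoom"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def forward_line(speaker: str, text: str) -> Optional[str]:
    """Send one line to the assistant; return the suggested answer text, if any."""
    with urllib.request.urlopen(build_ingest_request(speaker, text), timeout=10) as resp:
        result = json.load(resp)
    answer = result.get("answer")
    # `answer` is null when no question was detected in this line.
    return answer["answer"] if answer else None
```

Calling `forward_line("customer", "How fast can we migrate?")` against a running API would return the suggestion to show in the overlay or chat panel, or `None` when the line was only added to context.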

Concrete Zoom webhook path supported in this project:

- Set Zoom Event Notification endpoint to: `https://<your-domain>/zoom/webhook`
- Supports Zoom challenge event `endpoint.url_validation`
- Supports signature verification via `x-zm-signature` and `x-zm-request-timestamp`
- Parses transcript payload variants from:
  - `payload.content.transcript_segments[]`
  - `payload.content.text` / `payload.content.transcript`
  - `payload.object.transcript[]` / `payload.object.transcript_entries[]`
  - `payload.object.text`
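
For orientation, the signature check and the `endpoint.url_validation` reply can be sketched as below. This assumes Zoom's documented scheme (an HMAC-SHA256 over `v0:{timestamp}:{raw body}` keyed with the webhook secret, sent as `v0=<hex digest>`). The function names are hypothetical and are not the helpers in `zoom_webhook_adapter`, whose implementation may differ.

```python
import hashlib
import hmac
from typing import Optional


def compute_zoom_signature(secret: str, timestamp: str, raw_body: bytes) -> str:
    """Compute the value expected in the x-zm-signature header."""
    message = b"v0:" + timestamp.encode("utf-8") + b":" + raw_body
    digest = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return f"v0={digest}"


def signature_is_valid(
    secret: str,
    timestamp: Optional[str],
    raw_body: bytes,
    received: Optional[str],
) -> bool:
    """Reject requests with missing headers or a mismatching signature."""
    if not timestamp or not received:
        return False
    expected = compute_zoom_signature(secret, timestamp, raw_body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


def validation_response(secret: str, plain_token: str) -> dict:
    """Build the reply body for the endpoint.url_validation challenge."""
    encrypted = hmac.new(
        secret.encode("utf-8"), plain_token.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return {"plainToken": plain_token, "encryptedToken": encrypted}
```

The signature must be computed over the raw request bytes, not over re-serialized JSON, because any change in whitespace or key order changes the digest.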

### Desktop overlay UI

Run the overlay window (always on top) to see answers instantly:

```bash
zoom-live-assistant-overlay --ws-url ws://127.0.0.1:8080/ws/answers
```

How it works:

- API publishes each generated answer to websocket endpoint `ws://.../ws/answers`
- Overlay subscribes and updates question + suggested answer in near-real-time
- Use this while you are on a call; keep human-in-the-loop and read/adjust responses

You can build the bridge with:

- Zoom meeting transcript webhooks/events (if available in your plan)
- Zoom SDK app capturing transcript events
- Audio capture pipeline + speech-to-text (Whisper/Deepgram/Azure) then forwarding text here

## Safety guidance

- Keep a human in the loop; do not auto-send answers to customers.
- Log all suggestions for QA and compliance.
- Add domain constraints into `ASSISTANT_SYSTEM_PROMPT`.
- Mask sensitive data before sending transcripts to external APIs where required.
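
As one possible approach to the masking step, the sketch below redacts email addresses and long digit runs before a transcript line leaves your system. The patterns and the `mask_sensitive` name are illustrative assumptions only; real compliance requirements usually need a broader, audited rule set.

```python
import re

# Illustrative patterns only; they are not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens (card-like numbers).
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def mask_sensitive(text: str) -> str:
    """Replace email addresses and card-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return CARD_RE.sub("[NUMBER]", text)
```

Applying the mask before the line is stored in context keeps the unredacted value out of both the model request and any suggestion logs.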

## Tests

```bash
pytest
```
39 changes: 39 additions & 0 deletions zoom-live-assistant/pyproject.toml
@@ -0,0 +1,39 @@
[project]
name = "zoom-live-assistant"
version = "0.1.0"
description = "Real-time Zoom transcript assistant powered by OpenAI"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "fastapi",
    "httpx",
    "openai",
    "pydantic-settings",
    "python-dotenv",
    "websocket-client",
    "uvicorn[standard]",
]

[project.optional-dependencies]
dev = [
    "pytest",
]

[tool.pytest.ini_options]
pythonpath = ["src"]
testpaths = ["tests"]

[project.scripts]
zoom-live-assistant-cli = "zoom_live_assistant.cli:main"
zoom-live-assistant-api = "zoom_live_assistant.api:run"
zoom-live-assistant-overlay = "zoom_live_assistant.overlay:main"

[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]
Empty file.
163 changes: 163 additions & 0 deletions zoom-live-assistant/src/zoom_live_assistant/api.py
@@ -0,0 +1,163 @@
from __future__ import annotations

import json
from collections import deque
from typing import Deque

from fastapi import FastAPI, Header, HTTPException, Request, WebSocket, WebSocketDisconnect
from pydantic import BaseModel
import uvicorn

from zoom_live_assistant.assistant import LiveAssistant
from zoom_live_assistant.config import get_settings
from zoom_live_assistant.models import AssistantAnswer, TranscriptEvent
from zoom_live_assistant.zoom_webhook_adapter import (
    build_zoom_validation_response,
    extract_transcript_events,
    verify_zoom_signature,
)

settings = get_settings()
assistant = LiveAssistant(settings)
recent_answers: Deque[AssistantAnswer] = deque(maxlen=100)
connected_clients: set[WebSocket] = set()
app = FastAPI(
    title="Zoom Live Assistant API",
    description="Accepts transcript events and returns suggested live answers when a question is detected.",
)


class IngestResponse(BaseModel):
    accepted: bool = True
    answer: AssistantAnswer | None = None
    context_size: int


class ZoomTranscriptEvent(BaseModel):
    speaker: str = "customer"
    text: str
    timestamp: str | None = None


class ZoomWebhookEnvelope(BaseModel):
    event: str
    payload: dict
    event_ts: int | None = None


class ZoomWebhookResponse(BaseModel):
    accepted: bool = True
    validation: dict | None = None
    ingested_events: int = 0
    answers: list[AssistantAnswer] = []
    message: str | None = None


@app.get("/health")
def health() -> dict[str, str]:
    return {"status": "ok"}


@app.post("/ingest", response_model=IngestResponse)
async def ingest(event: TranscriptEvent) -> IngestResponse:
    try:
        answer = assistant.ingest_event(event)
        if answer:
            await _store_answer(answer)
        return IngestResponse(answer=answer, context_size=assistant.context_size())
    except Exception as exc:  # pragma: no cover
        raise HTTPException(status_code=500, detail=f"Failed to ingest event: {exc}") from exc


@app.post("/zoom/transcript", response_model=IngestResponse)
async def ingest_zoom_transcript(event: ZoomTranscriptEvent) -> IngestResponse:
    normalized = TranscriptEvent(speaker=event.speaker, text=event.text, source="zoom")
    return await ingest(normalized)


@app.post("/zoom/webhook", response_model=ZoomWebhookResponse)
async def zoom_webhook(
    request: Request,
    x_zm_request_timestamp: str | None = Header(default=None),
    x_zm_signature: str | None = Header(default=None),
) -> ZoomWebhookResponse:
    raw = await request.body()
    try:
        body = json.loads(raw.decode("utf-8"))
    except json.JSONDecodeError as exc:
        raise HTTPException(status_code=400, detail=f"Invalid JSON payload: {exc}") from exc

    envelope = ZoomWebhookEnvelope.model_validate(body)
    if envelope.event == "endpoint.url_validation":
        plain_token = envelope.payload.get("plainToken", "")
        if not settings.zoom_webhook_secret:
            raise HTTPException(
                status_code=500,
                detail="ZOOM_WEBHOOK_SECRET must be configured for endpoint.url_validation.",
            )
        validation = build_zoom_validation_response(
            plain_token=plain_token, webhook_secret=settings.zoom_webhook_secret
        )
        return ZoomWebhookResponse(validation=validation.model_dump(), message="url validated")

    if settings.verify_zoom_signatures:
        if not settings.zoom_webhook_secret:
            raise HTTPException(
                status_code=500,
                detail="ZOOM_WEBHOOK_SECRET must be configured when VERIFY_ZOOM_SIGNATURES is true.",
            )
        if not verify_zoom_signature(
            raw_body=raw,
            request_timestamp=x_zm_request_timestamp,
            zoom_signature=x_zm_signature,
            webhook_secret=settings.zoom_webhook_secret,
        ):
            raise HTTPException(status_code=401, detail="Invalid Zoom webhook signature.")

    events = extract_transcript_events(body)
    answers: list[AssistantAnswer] = []
    for event in events:
        answer = assistant.ingest_event(event)
        if answer:
            answers.append(answer)
            await _store_answer(answer)
    return ZoomWebhookResponse(ingested_events=len(events), answers=answers)


@app.websocket("/ws/answers")
async def answers_ws(websocket: WebSocket) -> None:
    await websocket.accept()
    connected_clients.add(websocket)
    try:
        for answer in recent_answers:
            await websocket.send_json(answer.model_dump(mode="json"))
        while True:
            # Keep alive; clients do not need to send data.
            await websocket.receive_text()
    except WebSocketDisconnect:
        pass
    finally:
        connected_clients.discard(websocket)


async def _store_answer(answer: AssistantAnswer) -> None:
    recent_answers.append(answer)
    payload = answer.model_dump(mode="json")
    stale_clients: list[WebSocket] = []
    for client in list(connected_clients):
        try:
            # Best effort push to live overlays.
            await client.send_json(payload)
        except Exception:
            stale_clients.append(client)
    for stale in stale_clients:
        connected_clients.discard(stale)


def run() -> None:
    uvicorn.run(
        "zoom_live_assistant.api:app",
        host=settings.api_host,
        port=settings.api_port,
        reload=False,
    )