Skip to content

Commit f4cc6c7

Browse files
committed
Optimize the crawler task interface, add ChatGPT as the support of LLM part, optimize the database structure, and update the readme file.
1 parent d611508 commit f4cc6c7

26 files changed

Lines changed: 843 additions & 54 deletions

README-ZH.md

Lines changed: 39 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@
2727
* **异步模型池** :本项目实现了一个高效的异步AI模型池,在线程安全的情况下支持 OpenAI Whisper 和 Faster Whisper 模型的多实例并发处理场景,在支持CUDA加速且拥有多个GPU的场景中,通过智能加载机制可以将多个模型智能的加载在多个GPU上,然后模型实例间自动分配任务,确保任务处理速度和系统负载均衡,但是在单一GPU场景下无法提供并发功能。
2828
* **异步数据库**:本项目支持使用MySQL和SQLite作为数据库,在本机运行时无需安装和配置MySQL,使用SQLite即可快速运行项目,如果使用MySQL则可以更好的配合分布式计算,多个节点使用同一个数据库作为任务源。
2929
* **异步网络爬虫**:本项目内置了多个平台的数据爬虫模块,当前支持`抖音``TikTok`,用户只需要输入对应的视频链接即可快速的对媒体进行语音识别,并且未来计划支持更多社交媒体平台。
30+
* **ChatGPT集成**:本项目已经集成了ChatGPT作为LLM部分的支持,可以使用数据库中的数据与ChatGPT进行交互。
3031
* **工作流与组件化设计(待实现)** :围绕 Whisper 转录任务,项目支持高度自定义的工作流系统。用户可以通过 JSON 文件定义组件、任务依赖和执行顺序,甚至可以使用 Python 编写自定义组件,灵活扩展系统功能,轻松实现复杂的多步骤处理流程。
3132
* **事件驱动的智能工作流(待实现)** :工作流系统支持事件触发,可以基于时间、手动触发,或由爬虫模块自动触发。相比单一任务,工作流更加智能,支持条件分支、任务依赖、动态参数传递和重试策略,为用户提供更高的自动化和可控性。
3233

@@ -53,11 +54,12 @@
5354
- **生成字幕文件**:用户可以通过指定的任务ID来生成指定任务的字幕,并且支持指定输出格式(`output_format`),当前支持(`srt`)以及(`vtt`)作为字幕文件格式。
5455
- **创建TikTok任务**:用户可以通过 TikTok 视频链接爬取视频并创建任务。
5556
- **创建抖音任务**:用户可以通过抖音视频链接爬取视频并创建任务。
57+
- **使用ChatGPT总结任务**:用户可以使用任务ID将已经转义好的自然语言交给ChatGPT进行内容总结和其他交互,并且支持在接口选择模型和自定义提示词。
5658

5759
## 📸 项目截图
5860

5961

60-
![2024_07_56_AM.png](https://github.com/Evil0ctal/Fast-Powerful-Whisper-AI-Services-API/blob/main/github/screenshots/2024_07_56_AM.png?raw=true)
62+
![2024_02_16_AM.png](https://github.com/Evil0ctal/Fast-Powerful-Whisper-AI-Services-API/blob/main/github/screenshots/2024_02_16_AM.png?raw=true)
6163

6264
## 🚀 快速部署
6365

@@ -132,11 +134,19 @@
132134
├── 📁 app/
133135
│ ├── 📁 api/ -> API layer containing models and routes
134136
│ │ ├── 📁 models/
135-
│ │ │ └── 📄 APIResponseModel.py -> Defines API response models
137+
│ │ │ ├── 📄 APIResponseModel.py -> Defines API response models
138+
│ │ │ ├── 📄 ChatGPTTaskRequest.py -> Request model for ChatGPT tasks
139+
│ │ │ ├── 📄 DouyinTaskRequest.py -> Request model for Douyin tasks
140+
│ │ │ ├── 📄 TikTokTaskRequest.py -> Request model for TikTok tasks
141+
│ │ │ ├── 📄 WhisperTaskRequest.py -> Request model for Whisper tasks
142+
│ │ │ └── 📄 WorkFlowModels.py -> Workflow data models
136143
│ │ ├── 📁 routers/
137144
│ │ │ ├── 🔍 health_check.py -> Health check endpoint
138145
│ │ │ ├── 📝 whisper_tasks.py -> Routes for Whisper tasks
139-
│ │ │ └── 🔄 work_flows.py -> Routes for workflow management
146+
│ │ │ ├── 🔄 work_flows.py -> Routes for workflow management
147+
│ │ │ ├── 💬 chatgpt_tasks.py -> Routes for ChatGPT-related tasks
148+
│ │ │ ├── 🌐 douyin_tasks.py -> Routes for Douyin-related tasks
149+
│ │ │ └── 🎥 tiktok_tasks.py -> Routes for TikTok-related tasks
140150
│ │ └── 📄 router.py -> Main router module
141151
│ ├── 🕸️ crawlers/ -> Modules for web crawling
142152
│ │ ├── 📁 platforms/
@@ -145,17 +155,20 @@
145155
│ │ │ │ ├── 🚀 crawler.py -> Douyin data crawler
146156
│ │ │ │ ├── 📡 endpoints.py -> API endpoints for Douyin crawler
147157
│ │ │ │ ├── 🧩 models.py -> Models for Douyin data
148-
│ │ │ │ ── 🛠️ utils.py -> Utility functions for Douyin crawler
158+
│ │ │ │ ── 🛠️ utils.py -> Utility functions for Douyin crawler
149159
│ │ │ │ └── 📘 README.md -> Douyin module documentation
150160
│ │ │ └── 📁 tiktok/
151161
│ │ │ ├── 🚀 crawler.py -> TikTok data crawler
152162
│ │ │ ├── 📡 endpoints.py -> API endpoints for TikTok crawler
153163
│ │ │ ├── 🧩 models.py -> Models for TikTok data
154164
│ │ │ └── 📘 README.md -> TikTok module documentation
155165
│ ├── 💾 database/ -> Database models and management
156-
│ │ ├── 🗄️ DatabaseManager.py -> Handles database connections
157-
│ │ ├── 📂 TaskModels.py -> Task-related database models
158-
│ │ └── 📂 WorkFlowModels.py -> Workflow-related database models
166+
│ │ ├── 📁 models/
167+
│ │ │ ├── 📂 TaskModels.py -> Task-related database models
168+
│ │ │ ├── 📂 WorkFlowModels.py -> Workflow-related database models
169+
│ │ │ ├── 🧠 ChatGPTModels.py -> Models for ChatGPT tasks
170+
│ │ │ └── 🕸️ CrawlerModels.py -> Models for crawlers and platforms
171+
│ │ └── 🗄️ DatabaseManager.py -> Handles database connections
159172
│ ├── 🌐 http_client/ -> HTTP client setup
160173
│ │ ├── ⚙️ AsyncHttpClient.py -> Asynchronous HTTP client
161174
│ │ └── ❗ HttpException.py -> Custom HTTP exceptions
@@ -183,8 +196,8 @@
183196
│ └── 📂 -> Default TEMP Files Folder
184197
├── 📁 log_files/ -> Log files folder
185198
│ └── 📂 -> Default LOG Files Folder
186-
── 📂 WhisperServiceAPI.db -> Default SQLite DB File
187-
── 📄 requirements.txt -> Python package requirements
199+
── 📂 WhisperServiceAPI.db -> Default SQLite DB File
200+
── 📄 requirements.txt -> Python package requirements
188201
└── 📝 start.py -> Run to start the API
189202
```
190203
@@ -1255,15 +1268,15 @@ class Settings:
12551268
# 项目名称 | Project name
12561269
title: str = "Fast-Powerful-Whisper-AI-Services-API"
12571270
# 项目描述 | Project description
1258-
description: str = "An open source speech-to-text API that runs completely locally. The project is based on the OpenAI Whisper model and the faster inference Faster Whisper model, and implements an asynchronous model pool, using the asynchronous features of FastAPI for efficient packaging, supporting thread-safe asynchronous task queues, asynchronous file IO, asynchronous database IO, asynchronous web crawler modules, and more custom features."
1271+
description: str = "⚡ A high-performance asynchronous API for Automatic Speech Recognition (ASR) and translation. No need to purchase the Whisper API—perform inference using a locally running Whisper model with support for multi-GPU concurrency and designed for distributed deployment. It also includes built-in crawlers for social media platforms like TikTok and Douyin, enabling seamless media processing from multiple social platforms. This provides a powerful and scalable solution for automated media content data processing."
12591272
# 项目版本 | Project version
1260-
version: str = "1.0.3"
1273+
version: str = "1.0.4"
12611274
# Swagger 文档 URL | Swagger docs URL
12621275
docs_url: str = "/"
12631276
# 是否开启 debug 模式 | Whether to enable debug mode
12641277
debug: bool = False
12651278
# 当检测到项目代码变动时是否自动重载项目 | Whether to automatically reload the project when changes to the project code are detected
1266-
reload_on_file_change: bool = os.getenv("RELOAD_ON_FILE_CHANGE", True)
1279+
reload_on_file_change: bool = os.getenv("RELOAD_ON_FILE_CHANGE", False)
12671280
# FastAPI 服务 IP | FastAPI service IP
12681281
ip: str = "0.0.0.0"
12691282
# FastAPI 服务端口 | FastAPI service port
@@ -1408,6 +1421,20 @@ class Settings:
14081421
web_cookie: str = os.getenv("DOUYIN_WEB_COOKIE", "")
14091422
# Proxy
14101423
proxy: str = os.getenv("DOUYIN_PROXY", None)
1424+
1425+
# ChatGPT API 设置 | ChatGPT API settings
1426+
class ChatGPTSettings:
1427+
# OpenAI API Key
1428+
API_Key: str = os.getenv("OPENAI_API_KEY", "")
1429+
# OpenAI ChatGPT Model
1430+
GPT_Model: str = "gpt-3.5-turbo"
1431+
1432+
# TikHub.io API 设置 | TikHub.io API settings
1433+
class TikHubAPISettings:
1434+
# TikHub.io API URL
1435+
api_domain: str = "https://api.tikhub.io"
1436+
# TikHub.io API Token
1437+
api_key: str = os.getenv("TIKHUB_API_KEY", "")
14111438
```
14121439

14131440
## 🛡️ 许可协议

README.md

Lines changed: 40 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@ The system efficiently manages resource scheduling and task management through a
2727
* **Asynchronous Model Pool** : Implements an efficient asynchronous AI model pool that supports multi-instance concurrent processing for OpenAI Whisper and Faster Whisper models under thread-safe conditions. In CUDA-accelerated, multi-GPU environments, intelligent loading mechanisms dynamically assign models to GPUs, balancing load and optimizing task processing. Note: Concurrency is unavailable on single-GPU setups.
2828
* **Asynchronous Database** : Supports MySQL and SQLite databases. It can run locally without MySQL, as SQLite allows for quick setup. When using MySQL, it facilitates distributed computing with multiple nodes accessing the same database for tasks.
2929
* **Asynchronous Web Crawlers** : Equipped with data crawler modules for multiple platforms, currently supporting `Douyin` and `TikTok`. By simply entering the video link, users can quickly process media for speech recognition, with plans for more social media platform support in the future.
30+
* **ChatGPT integration**: This project has integrated ChatGPT as the support for the LLM part, and can use the data in the database to interact with ChatGPT.
3031
* **Workflow and Component Design (Pending)** : With a focus on Whisper transcription tasks, the project will support a highly customizable workflow system. Users can define components, task dependencies, and execution orders in JSON files or write custom components in Python, facilitating complex multi-step processing.
3132
* **Event-Driven Intelligent Workflow (Pending)** : The workflow system supports event-driven triggers, including time-based, manual, or crawler module auto-triggers. More than single-task processing, workflows will offer intelligent, automated control with conditional branching, task dependencies, dynamic parameter passing, and retry strategies.
3233

@@ -52,10 +53,11 @@ The system efficiently manages resource scheduling and task management through a
5253
* **Generate Subtitle File** : Users can generate subtitles for a task by specifying the `task_id` and output format (`output_format`). Currently supports (`srt`) and (`vtt`) subtitle file formats.
5354
* **Create TikTok Task** : Users can create tasks by crawling TikTok videos through a video link.
5455
* **Create Douyin Task** : Users can create tasks by crawling Douyin videos through a video link.
56+
- **Use ChatGPT to summarize tasks**: Users can use the task ID to give the translated natural language to ChatGPT for content summarization and other interactions, and support selecting models and custom prompt words in the interface.
5557

5658
## 📸 Project Screenshots
5759

58-
![2024_07_56_AM.png](https://github.com/Evil0ctal/Fast-Powerful-Whisper-AI-Services-API/blob/main/github/screenshots/2024_07_56_AM.png?raw=true)
60+
![2024_02_16_AM.png](https://github.com/Evil0ctal/Fast-Powerful-Whisper-AI-Services-API/blob/main/github/screenshots/2024_02_16_AM.png?raw=true)
5961

6062
## 🚀 Quick Deployment
6163

@@ -138,30 +140,41 @@ pip install torch torchvision torchaudio --index-url https://download.pytorch.or
138140
├── 📁 app/
139141
│ ├── 📁 api/ -> API layer containing models and routes
140142
│ │ ├── 📁 models/
141-
│ │ │ └── 📄 APIResponseModel.py -> Defines API response models
143+
│ │ │ ├── 📄 APIResponseModel.py -> Defines API response models
144+
│ │ │ ├── 📄 ChatGPTTaskRequest.py -> Request model for ChatGPT tasks
145+
│ │ │ ├── 📄 DouyinTaskRequest.py -> Request model for Douyin tasks
146+
│ │ │ ├── 📄 TikTokTaskRequest.py -> Request model for TikTok tasks
147+
│ │ │ ├── 📄 WhisperTaskRequest.py -> Request model for Whisper tasks
148+
│ │ │ └── 📄 WorkFlowModels.py -> Workflow data models
142149
│ │ ├── 📁 routers/
143150
│ │ │ ├── 🔍 health_check.py -> Health check endpoint
144151
│ │ │ ├── 📝 whisper_tasks.py -> Routes for Whisper tasks
145-
│ │ │ └── 🔄 work_flows.py -> Routes for workflow management
152+
│ │ │ ├── 🔄 work_flows.py -> Routes for workflow management
153+
│ │ │ ├── 💬 chatgpt_tasks.py -> Routes for ChatGPT-related tasks
154+
│ │ │ ├── 🌐 douyin_tasks.py -> Routes for Douyin-related tasks
155+
│ │ │ └── 🎥 tiktok_tasks.py -> Routes for TikTok-related tasks
146156
│ │ └── 📄 router.py -> Main router module
147157
│ ├── 🕸️ crawlers/ -> Modules for web crawling
148158
│ │ ├── 📁 platforms/
149159
│ │ │ ├── 📁 douyin/
150-
│ │ │ │ ├── 🐛 abogus.py -> (`・ω・´) Whats This?
160+
│ │ │ │ ├── 🐛 abogus.py -> (`・ω・´) Whats This?
151161
│ │ │ │ ├── 🚀 crawler.py -> Douyin data crawler
152162
│ │ │ │ ├── 📡 endpoints.py -> API endpoints for Douyin crawler
153163
│ │ │ │ ├── 🧩 models.py -> Models for Douyin data
154-
│ │ │ │ ── 🛠️ utils.py -> Utility functions for Douyin crawler
164+
│ │ │ │ ── 🛠️ utils.py -> Utility functions for Douyin crawler
155165
│ │ │ │ └── 📘 README.md -> Douyin module documentation
156166
│ │ │ └── 📁 tiktok/
157167
│ │ │ ├── 🚀 crawler.py -> TikTok data crawler
158168
│ │ │ ├── 📡 endpoints.py -> API endpoints for TikTok crawler
159169
│ │ │ ├── 🧩 models.py -> Models for TikTok data
160170
│ │ │ └── 📘 README.md -> TikTok module documentation
161171
│ ├── 💾 database/ -> Database models and management
162-
│ │ ├── 🗄️ DatabaseManager.py -> Handles database connections
163-
│ │ ├── 📂 TaskModels.py -> Task-related database models
164-
│ │ └── 📂 WorkFlowModels.py -> Workflow-related database models
172+
│ │ ├── 📁 models/
173+
│ │ │ ├── 📂 TaskModels.py -> Task-related database models
174+
│ │ │ ├── 📂 WorkFlowModels.py -> Workflow-related database models
175+
│ │ │ ├── 🧠 ChatGPTModels.py -> Models for ChatGPT tasks
176+
│ │ │ └── 🕸️ CrawlerModels.py -> Models for crawlers and platforms
177+
│ │ └── 🗄️ DatabaseManager.py -> Handles database connections
165178
│ ├── 🌐 http_client/ -> HTTP client setup
166179
│ │ ├── ⚙️ AsyncHttpClient.py -> Asynchronous HTTP client
167180
│ │ └── ❗ HttpException.py -> Custom HTTP exceptions
@@ -189,8 +202,8 @@ pip install torch torchvision torchaudio --index-url https://download.pytorch.or
189202
│ └── 📂 -> Default TEMP Files Folder
190203
├── 📁 log_files/ -> Log files folder
191204
│ └── 📂 -> Default LOG Files Folder
192-
── 📂 WhisperServiceAPI.db -> Default SQLite DB File
193-
── 📄 requirements.txt -> Python package requirements
205+
── 📂 WhisperServiceAPI.db -> Default SQLite DB File
206+
── 📄 requirements.txt -> Python package requirements
194207
└── 📝 start.py -> Run to start the API
195208
```
196209

@@ -1260,15 +1273,15 @@ class Settings:
12601273
# 项目名称 | Project name
12611274
title: str = "Fast-Powerful-Whisper-AI-Services-API"
12621275
# 项目描述 | Project description
1263-
description: str = "An open source speech-to-text API that runs completely locally. The project is based on the OpenAI Whisper model and the faster inference Faster Whisper model, and implements an asynchronous model pool, using the asynchronous features of FastAPI for efficient packaging, supporting thread-safe asynchronous task queues, asynchronous file IO, asynchronous database IO, asynchronous web crawler modules, and more custom features."
1276+
description: str = "⚡ A high-performance asynchronous API for Automatic Speech Recognition (ASR) and translation. No need to purchase the Whisper API—perform inference using a locally running Whisper model with support for multi-GPU concurrency and designed for distributed deployment. It also includes built-in crawlers for social media platforms like TikTok and Douyin, enabling seamless media processing from multiple social platforms. This provides a powerful and scalable solution for automated media content data processing."
12641277
# 项目版本 | Project version
1265-
version: str = "1.0.3"
1278+
version: str = "1.0.4"
12661279
# Swagger 文档 URL | Swagger docs URL
12671280
docs_url: str = "/"
12681281
# 是否开启 debug 模式 | Whether to enable debug mode
12691282
debug: bool = False
12701283
# 当检测到项目代码变动时是否自动重载项目 | Whether to automatically reload the project when changes to the project code are detected
1271-
reload_on_file_change: bool = os.getenv("RELOAD_ON_FILE_CHANGE", True)
1284+
reload_on_file_change: bool = os.getenv("RELOAD_ON_FILE_CHANGE", False)
12721285
# FastAPI 服务 IP | FastAPI service IP
12731286
ip: str = "0.0.0.0"
12741287
# FastAPI 服务端口 | FastAPI service port
@@ -1413,6 +1426,20 @@ class Settings:
14131426
web_cookie: str = os.getenv("DOUYIN_WEB_COOKIE", "")
14141427
# Proxy
14151428
proxy: str = os.getenv("DOUYIN_PROXY", None)
1429+
1430+
# ChatGPT API 设置 | ChatGPT API settings
1431+
class ChatGPTSettings:
1432+
# OpenAI API Key
1433+
API_Key: str = os.getenv("OPENAI_API_KEY", "")
1434+
# OpenAI ChatGPT Model
1435+
GPT_Model: str = "gpt-3.5-turbo"
1436+
1437+
# TikHub.io API 设置 | TikHub.io API settings
1438+
class TikHubAPISettings:
1439+
# TikHub.io API URL
1440+
api_domain: str = "https://api.tikhub.io"
1441+
# TikHub.io API Token
1442+
api_key: str = os.getenv("TIKHUB_API_KEY", "")
14161443
```
14171444

14181445
## 🛡️ License

0 commit comments

Comments
 (0)