This is a C++ implementation of the TransNetV2 video shot boundary detection model. It consists of 5 subprojects and supports dynamic input shapes, multi-threaded inference, and automatic video splitting.

| Subproject | Description | Model source |
|---|---|---|
| TransNet | Executable that loads an external model file | External model file |
| TransNet-embedding | Executable with the model embedded | Embedded in executable |
| TransNetDLL | DLL that loads an external model file | External model file |
| TransNetDLL-embedding | DLL with the model embedded | Embedded in DLL |
| video_splitter | Video splitting tool | N/A |

Project layout:

    cpp-TransNet/
    ├── CMakeLists.txt              # Main build configuration
    ├── README.md                   # Project documentation
    │
    ├── src/                        # Source code directory
    │   ├── common/                 # Common code
    │   │   ├── common.h            # Common header
    │   │   └── common.cpp          # Common implementation
    │   │
    │   ├── TransNet/               # Subproject 1: External model version
    │   │   ├── CMakeLists.txt
    │   │   └── main.cpp
    │   │
    │   ├── TransNet-embedding/     # Subproject 2: Embedded version
    │   │   ├── CMakeLists.txt
    │   │   ├── main.cpp
    │   │   ├── model_data.h
    │   │   └── model_data.cpp
    │   │
    │   ├── TransNetDLL/            # Subproject 3: External model DLL
    │   │   ├── CMakeLists.txt
    │   │   ├── transnet_dll.cpp
    │   │   └── transnet_dll.h
    │   │
    │   ├── TransNetDLL-embedding/  # Subproject 4: Embedded DLL
    │   │   ├── CMakeLists.txt
    │   │   ├── transnet_dll.cpp
    │   │   ├── transnet_dll.h
    │   │   ├── model_data.h
    │   │   └── model_data.cpp
    │   │
    │   └── video_splitter/         # Subproject 5: Video splitter
    │       ├── CMakeLists.txt
    │       └── main.cpp
    │
    ├── libtorch/                   # PyTorch libraries
    ├── opencv4120/                 # OpenCV libraries
    ├── example/                    # Example videos
    ├── install/                    # Build output
    └── build/                      # Build directory

Download the following dependencies:

- LibTorch (PyTorch C++): https://pytorch.org/cppinstall/
  - Download libtorch-win-shared-with-deps-latest.zip
  - Extract to: libtorch/
- OpenCV (4.12.0): https://opencv.org/releases/
  - Download opencv-4.12.0-windows-x64-vc17.exe
  - Extract to: opencv4120/

Build with CMake and Visual Studio 2022:

    cd cpp-TransNet
    mkdir build
    cd build
    cmake .. -G "Visual Studio 17 2022" -A x64
    cmake --build . --config Release

Command-line usage:

    TransNet.exe transnetv2_scripted_dynamic.pt video.mp4 [output_dir]
    TransNet-embedding.exe video.mp4 [output_dir]
    video_splitter.exe video.mp4 scenes.txt [output_dir]

TransNet example:

    # The model file path is required
    TransNet.exe transnetv2_scripted_dynamic.pt video.mp4

TransNet-embedding examples:

    # Use the default output directory ./output
    TransNet-embedding.exe video.mp4
    # Specify an output directory
    TransNet-embedding.exe video.mp4 my_output

DLL usage:

    #include "transnet_dll.h"
    // Create an instance
    void* handle = transnet_create();
    // Configure the worker threads and select the input video
    transnet_set_num_threads(handle, 4);
    transnet_load_video(handle, "video.mp4");
    // Run inference and write the results to the output directory
    transnet_run_inference(handle);
    transnet_save_results(handle, "./output");
    // Release the instance
    transnet_destroy(handle);

video_splitter example:

    video_splitter.exe video.mp4 output/scenes.txt segments

Processing pipeline:

                       Input Video
                            │
                            ▼
     Frame Splitting (Divide video into 4 segments)
                            │
          ┌───────────┬─────┴─────┬───────────┐
          ▼           ▼           ▼           ▼
      Thread 0    Thread 1    Thread 2    Thread 3
       Frames      Frames      Frames      Frames
       0-3129     3130-6259   6260-9389  9390-12517
          │           │           │           │
          └───────────┴─────┬─────┴───────────┘
                            │
                            ▼
             TransNetV2 Inference (Parallel)
                            │
                            ▼
                   Result Aggregation
                            │
                            ▼
              predictions.txt + scenes.txt

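The frame ranges shown in the diagram (12518 frames over 4 threads) are consistent with a chunk size that is rounded up, so the last thread receives slightly fewer frames. The following is a minimal sketch of such a split, not the project's actual code; `FrameRange` and `split_frames` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inclusive frame range handled by one worker thread (hypothetical type).
struct FrameRange {
    int first;
    int last;
};

// Split [0, total_frames) into at most num_threads contiguous ranges.
// The chunk size is rounded up, so the final range may be shorter.
std::vector<FrameRange> split_frames(int total_frames, int num_threads) {
    assert(total_frames >= 0 && num_threads > 0);
    std::vector<FrameRange> ranges;
    if (total_frames == 0) return ranges;
    const int chunk = (total_frames + num_threads - 1) / num_threads;
    for (int start = 0; start < total_frames; start += chunk) {
        const int end = std::min(start + chunk, total_frames);  // exclusive
        ranges.push_back({start, end - 1});
    }
    return ranges;
}
```

Calling `split_frames(12518, 4)` yields the four ranges from the diagram. When there are fewer frames than threads, the function simply returns fewer ranges.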
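- Video frames are no longer loaded into memory all at once
- Each thread opens the video file independently and reads only the frames it needs
- 4 threads process different video segments concurrently
- Concurrency is managed with `std::thread` and `std::atomic`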
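The concurrency model described above can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the project's implementation: `process_segments` and the `work` callback are hypothetical, and the callback stands in for the per-thread video decoding and inference.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Run work(first, last) once per segment, one thread per segment, and return
// the total number of frames the workers reported as processed.
// Segments are inclusive [first, last] pairs.
int process_segments(const std::vector<std::pair<int, int>>& segments,
                     const std::function<int(int, int)>& work) {
    std::atomic<int> frames_done{0};
    std::vector<std::thread> workers;
    workers.reserve(segments.size());
    for (const auto& seg : segments) {
        workers.emplace_back([&frames_done, &work, seg] {
            // A real worker would open the video file itself here and
            // decode only frames seg.first..seg.second (assumption).
            frames_done.fetch_add(work(seg.first, seg.second),
                                  std::memory_order_relaxed);
        });
    }
    for (auto& t : workers) t.join();  // join makes all updates visible
    return frames_done.load();
}
```

A shared `std::atomic` counter avoids a mutex for simple progress tracking. Per-segment results would normally be written to separate output slots so that workers never touch the same data.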

Dependencies:

- LibTorch 2.0+ - PyTorch C++ library
- OpenCV 4.12.0 - computer vision library
- CMake 3.18+ - build tool
- Visual Studio 2022 - compiler

Tools:

- `model_to_header.py` - converts the PyTorch model into a C++ header file
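One way an embedded model can be consumed is by wrapping the generated byte array in an input stream. LibTorch provides a `torch::jit::load` overload that accepts a `std::istream&`, so a stream like the one below could be passed to it; whether this project does so is an assumption. The names `kModelData`, `kModelDataLen`, and `make_model_stream` are hypothetical, and the four bytes are only a placeholder (the ZIP signature, since TorchScript archives are ZIP files).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-ins for what a generated model_data.cpp might define.
// A real file would hold the full serialized model, not four bytes.
static const unsigned char kModelData[] = {0x50, 0x4B, 0x03, 0x04};
static const std::size_t kModelDataLen = sizeof(kModelData);

// Wrap an embedded byte array in a binary input stream. The stream owns a
// copy of the bytes, so it stays valid independently of the source array.
std::istringstream make_model_stream(const unsigned char* data,
                                     std::size_t len) {
    assert(data != nullptr || len == 0);
    std::string bytes(reinterpret_cast<const char*>(data), len);
    return std::istringstream(bytes, std::ios::in | std::ios::binary);
}
```

Copying the bytes doubles memory use for the duration of loading. A custom `std::streambuf` over the array would avoid the copy at the cost of more code.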

| File | Description |
|---|---|
| `src/TransNet/main.cpp` | Main program (external model) |
| `src/TransNet-embedding/main.cpp` | Main program (embedded model) |
| `src/TransNetDLL/transnet_dll.cpp` | DLL implementation (external model) |
| `src/TransNetDLL-embedding/transnet_dll.cpp` | DLL implementation (embedded model) |
| `src/video_splitter/main.cpp` | Video splitter |
| `model_data.cpp` | Embedded TransNetV2 model data |
| `model_to_header.py` | Script that converts the PyTorch model into a C++ header |

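The DLL sources listed in the table export a C ABI in which every call takes an opaque `void*` handle. The sketch below illustrates that general pattern only. All `demo_*` names and the `DemoContext` type are hypothetical, and the return-code convention (0 for success, -1 for invalid arguments) is an assumption rather than documented behaviour of the real DLL.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <string>

// Hypothetical C++ object hidden behind the opaque handle.
struct DemoContext {
    std::string video_path;
    int32_t num_threads = 4;
};

extern "C" {

// Returns nullptr on allocation failure instead of throwing across the ABI.
void* demo_create(void) { return new (std::nothrow) DemoContext(); }

// Deleting a null handle is a no-op.
void demo_destroy(void* handle) { delete static_cast<DemoContext*>(handle); }

// Returns 0 on success, -1 on invalid arguments (assumed convention).
int32_t demo_load_video(void* handle, const char* video_path) {
    if (handle == nullptr || video_path == nullptr) return -1;
    static_cast<DemoContext*>(handle)->video_path = video_path;
    return 0;
}

// Ignores invalid input rather than reporting it, since it returns void.
void demo_set_num_threads(void* handle, int32_t num_threads) {
    if (handle == nullptr || num_threads < 1) return;
    static_cast<DemoContext*>(handle)->num_threads = num_threads;
}

}  // extern "C"
```

Keeping only plain C types in the exported signatures lets callers written in C, C#, Python (ctypes), or other languages use the library without sharing a C++ runtime.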

DLL API:

    // Create / destroy a handle
    void* transnet_create(void);
    void transnet_destroy(void* handle);
    // Load a video or a folder
    int32_t transnet_load_video(void* handle, const char* video_path);
    int32_t transnet_load_folder(void* handle, const char* folder_path);
    // Set the number of threads
    void transnet_set_num_threads(void* handle, int32_t num_threads);
    // Run inference
    int32_t transnet_run_inference(void* handle);
    // Get results
    TransNetResults transnet_get_results(void* handle);
    TransNetScenes transnet_get_scenes(void* handle, float threshold);
    // Save results
    int32_t transnet_save_results(void* handle, const char* output_dir);

- ✅ Split into 5 subprojects
  - TransNet (external model version)
  - TransNet-embedding (embedded version)
  - TransNetDLL (external model DLL)
  - TransNetDLL-embedding (embedded DLL)
  - video_splitter (video splitting tool)
- ✅ Common code extraction
- ✅ Modular build system
- ✅ Unified source directory structure
- ✅ Dynamic input shape support
- ✅ 4-thread parallel inference
- ✅ Model embedded in executable
- ✅ Automatic video splitting
- ✅ C ABI DLL interface
- ✅ Bilingual documentation (Chinese/English)
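
The `transnet_get_scenes` declaration takes a `threshold`, which suggests that per-frame transition scores are converted into scene ranges. The sketch below shows one way to do that, modeled on the scene-extraction logic of the reference TransNetV2 Python utilities. Whether the DLL behaves identically is an assumption, and `predictions_to_scenes` here is an illustrative stand-alone function, not part of the DLL.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Convert per-frame transition scores into inclusive [start, end] scenes.
// A frame whose score exceeds `threshold` is treated as a transition frame.
std::vector<std::pair<int, int>> predictions_to_scenes(
        const std::vector<float>& predictions, float threshold) {
    std::vector<std::pair<int, int>> scenes;
    if (predictions.empty()) return scenes;

    bool prev_cut = false;
    int start = 0;
    const int n = static_cast<int>(predictions.size());
    for (int i = 0; i < n; ++i) {
        const bool cut = predictions[i] > threshold;
        // Leaving a transition: the next scene starts at this frame.
        if (prev_cut && !cut) start = i;
        // Entering a transition: the current scene ends at this frame.
        if (!prev_cut && cut && i != 0) scenes.emplace_back(start, i);
        prev_cut = cut;
    }
    // Close the final scene if the video does not end on a transition.
    if (!prev_cut) scenes.emplace_back(start, n - 1);
    // No scene found (every frame is a transition): use the whole video.
    if (scenes.empty()) scenes.emplace_back(0, n - 1);
    return scenes;
}
```

A higher threshold produces fewer, longer scenes. A lower one splits more aggressively and may report false boundaries on fast motion.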