murasakii0118/cpp-TransNetv2

TransNetV2 Video Shot Detection - C++ Implementation

Project Introduction

This is a C++ implementation of the TransNetV2 video shot boundary detection model. It consists of 5 subprojects and supports dynamic input shapes, multi-threaded inference, and automatic video splitting.


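Subprojects

| Subproject | Description | Model used |
|---|---|---|
| TransNet | External model file version | External model file |
| TransNet-embedding | Executable with the model embedded | Embedded in executable |
| TransNetDLL | DLL that uses an external model | External model file |
| TransNetDLL-embedding | DLL with the model embedded | Embedded in DLL |
| video_splitter | Video splitting tool | N/A |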

Project Structure

cpp-TransNet/
├── CMakeLists.txt                    # Main build configuration
├── README.md                         # Project documentation
│
├── src/                              # Source code directory
│   ├── common/                       # Common code
│   │   ├── common.h                  # Common header
│   │   └── common.cpp                # Common implementation
│   │
│   ├── TransNet/                     # Subproject 1: External model version
│   │   ├── CMakeLists.txt
│   │   └── main.cpp
│   │
│   ├── TransNet-embedding/           # Subproject 2: Embedded version
│   │   ├── CMakeLists.txt
│   │   ├── main.cpp
│   │   ├── model_data.h
│   │   └── model_data.cpp
│   │
│   ├── TransNetDLL/                  # Subproject 3: External model DLL
│   │   ├── CMakeLists.txt
│   │   ├── transnet_dll.cpp
│   │   └── transnet_dll.h
│   │
│   ├── TransNetDLL-embedding/        # Subproject 4: Embedded DLL
│   │   ├── CMakeLists.txt
│   │   ├── transnet_dll.cpp
│   │   ├── transnet_dll.h
│   │   ├── model_data.h
│   │   └── model_data.cpp
│   │
│   └── video_splitter/               # Subproject 5: Video splitter
│       ├── CMakeLists.txt
│       └── main.cpp
│
├── libtorch/                         # PyTorch libraries
├── opencv4120/                       # OpenCV libraries
├── example/                          # Example videos
├── install/                          # Build output
└── build/                            # Build directory

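Quick Start

1. Download Dependencies

Download the required dependency libraries, LibTorch and OpenCV (versions are listed under Dependencies). The project structure shows them under the libtorch/ and opencv4120/ directories.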

2. Building the Project

cd cpp-TransNet
mkdir build
cd build
cmake .. -G "Visual Studio 17 2022" -A x64
cmake --build . --config Release

3. Running Programs

External model version (the model file path must be provided)

TransNet.exe transnetv2_scripted_dynamic.pt video.mp4 [output_dir]

Embedded version (the model is embedded in the executable)

TransNet-embedding.exe video.mp4 [output_dir]

Video splitting

video_splitter.exe video.mp4 scenes.txt [output_dir]

Usage Examples

1. External Model Version Inference

# The model file path must be provided
TransNet.exe transnetv2_scripted_dynamic.pt video.mp4

2. Embedded Version Inference

# Use the default output directory ./output
TransNet-embedding.exe video.mp4

# Specify the output directory
TransNet-embedding.exe video.mp4 my_output

3. DLL Usage Example

#include "transnet_dll.h"

// Create an instance
void* handle = transnet_create();
// Use 4 inference threads
transnet_set_num_threads(handle, 4);
// Load the video, run inference, and write the results to ./output
transnet_load_video(handle, "video.mp4");
transnet_run_inference(handle);
transnet_save_results(handle, "./output");
// Release the instance
transnet_destroy(handle);

4. Video Splitting

video_splitter.exe video.mp4 output/scenes.txt segments
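
The splitter takes the scenes.txt written by the inference step as input. This README does not document the layout of that file, so the sketch below assumes one scene per line in the form "<start_frame> <end_frame>". The helper name parse_scenes is hypothetical and is not taken from the project sources.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parses scene boundaries from a text stream.
// ASSUMPTION: each non-empty line is "<start_frame> <end_frame>"; the real
// scenes.txt layout is not documented in this README.
std::vector<std::pair<int, int>> parse_scenes(std::istream& in) {
    std::vector<std::pair<int, int>> scenes;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int start = 0;
        int end = 0;
        if (!(fields >> start >> end)) continue;  // skip blank or malformed lines
        if (start < 0 || end < start) continue;   // skip ranges that make no sense
        scenes.emplace_back(start, end);
    }
    return scenes;
}
```

Passing a std::ifstream opened on output/scenes.txt would yield one (start, end) pair per scene, which a splitter could then map to output segments.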

Technical Architecture

Multi-threaded Inference

Input Video
    │
    ▼
Frame Splitting (Divide video into 4 segments)
    │
    ├───────────┬───────────┬───────────┐
    ▼           ▼           ▼           ▼
Thread 0    Thread 1    Thread 2    Thread 3
Frames      Frames      Frames      Frames
0-3129      3130-6259   6260-9389   9390-12517
    │           │           │           │
    ├───────────┴───────────┴───────────┘
    │
    ▼
TransNetV2 Inference (Parallel)
    │
    ▼
Result Aggregation
    │
    ▼
predictions.txt + scenes.txt
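
The frame ranges in the diagram are consistent with dividing a 12518-frame video into 4 contiguous chunks of a rounded-up size. The sketch below reproduces that arithmetic. The names FrameRange and split_frames are hypothetical, and the project's actual splitting code may differ, for example by overlapping segments at their borders.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inclusive frame range handled by one worker thread (hypothetical type).
struct FrameRange {
    int first;
    int last;
};

// Splits [0, total_frames) into at most num_threads contiguous ranges.
// The chunk size is rounded up, so the last range may be shorter than the rest.
std::vector<FrameRange> split_frames(int total_frames, int num_threads) {
    assert(total_frames >= 0 && num_threads > 0);
    std::vector<FrameRange> ranges;
    const int chunk = (total_frames + num_threads - 1) / num_threads;
    for (int start = 0; start < total_frames; start += chunk) {
        const int last = std::min(start + chunk, total_frames) - 1;
        ranges.push_back({start, last});
    }
    return ranges;
}
```

Calling split_frames(12518, 4) gives the ranges 0-3129, 3130-6259, 6260-9389, and 9390-12517 shown in the diagram.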

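Performance Optimization

Memory Optimization

  • Video frames are no longer loaded into memory all at once
  • Each thread opens the video file independently and reads only the frames it needs

Parallel Strategy

  • 4 threads process different video segments simultaneously
  • Concurrency is managed with std::thread and std::atomic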
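
One way to combine std::thread and std::atomic for this kind of segment-level parallelism is sketched below. It is an illustration under assumptions, not the project's implementation: the function run_segments and its work callback are hypothetical, and the real code may bind each thread to one fixed segment instead of using a shared counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs work(segment_index) once per segment on num_threads std::thread workers.
// A shared std::atomic counter hands out segment indices, so no mutex is needed.
// Returns the number of segments that were processed.
std::size_t run_segments(std::size_t num_segments, unsigned num_threads,
                         const std::function<void(std::size_t)>& work) {
    std::atomic<std::size_t> next{0};
    std::atomic<std::size_t> done{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);  // claim the next segment
                if (i >= num_segments) break;
                work(i);  // each call should touch only its own segment's data
                done.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) w.join();  // wait for every worker before returning
    return done.load();
}
```

Because every worker writes only to the slot of the segment it claimed, the per-segment results can be stored in a pre-sized vector and aggregated after the join without extra locking.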

Dependencies

Required

  • LibTorch 2.0+ - PyTorch C++ library
  • OpenCV 4.12.0 - Computer vision library
  • CMake 3.18+ - Build tool
  • Visual Studio 2022 - Compiler

Model Embedding

  • model_to_header.py - Converts a PyTorch model into a C++ header file
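
The idea behind embedding a model is to turn the serialized model file into a byte array that is compiled into the binary. The project's converter is a Python script whose contents are not shown here, so the following is only a C++ illustration of the same technique. The function bytes_to_cpp_array, the generated identifiers, and the output layout are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical C++ counterpart of a model-to-header converter: renders raw
// bytes as a C++ array definition plus a size constant. The output layout is
// illustrative and is not what model_to_header.py is known to emit.
std::string bytes_to_cpp_array(const std::vector<unsigned char>& bytes,
                               const std::string& array_name) {
    assert(!bytes.empty());  // an empty initializer would not be a valid array
    std::ostringstream out;
    out << "const unsigned char " << array_name << "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        out << (i % 12 == 0 ? "\n    " : " ");  // 12 values per line
        out << static_cast<int>(bytes[i]);
        if (i + 1 < bytes.size()) out << ",";
    }
    out << "\n};\n";
    out << "const unsigned long long " << array_name << "_size = "
        << bytes.size() << ";\n";
    return out.str();
}
```

At run time, an array like this can be wrapped in an in-memory stream and handed to the std::istream overload of torch::jit::load, which avoids shipping a separate model file. Whether the embedded subprojects load the model this way is not stated in this README.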

File Descriptions

| File | Description |
|---|---|
| src/TransNet/main.cpp | Main program (external model version) |
| src/TransNet-embedding/main.cpp | Main program (embedded version) |
| src/TransNetDLL/transnet_dll.cpp | DLL implementation (external model) |
| src/TransNetDLL-embedding/transnet_dll.cpp | DLL implementation (embedded version) |
| src/video_splitter/main.cpp | Video splitting tool |
| model_data.cpp | Embedded TransNetV2 model data |
| model_to_header.py | Model converter: script that turns a PyTorch model into a C++ header file |

DLL Interface

// Create/destroy a handle
void* transnet_create(void);
void transnet_destroy(void* handle);

// Load a video or a folder
int32_t transnet_load_video(void* handle, const char* video_path);
int32_t transnet_load_folder(void* handle, const char* folder_path);

// Set the number of threads
void transnet_set_num_threads(void* handle, int32_t num_threads);

// Run inference
int32_t transnet_run_inference(void* handle);

// Get results
TransNetResults transnet_get_results(void* handle);
TransNetScenes transnet_get_scenes(void* handle, float threshold);

// Save results
int32_t transnet_save_results(void* handle, const char* output_dir);
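
transnet_get_scenes takes a threshold, which suggests that scenes are derived by thresholding the per-frame transition probabilities. The sketch below shows one common way to do that, modeled on the post-processing usually applied to TransNetV2 predictions. It is an assumption about the approach, not the DLL's actual code, and predictions_to_scenes is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Converts per-frame transition probabilities into inclusive [start, end]
// scenes. Frames whose probability exceeds `threshold` count as transitions.
// A sketch only; the DLL's transnet_get_scenes may behave differently.
std::vector<std::pair<int, int>> predictions_to_scenes(
        const std::vector<float>& predictions, float threshold) {
    std::vector<std::pair<int, int>> scenes;
    if (predictions.empty()) return scenes;

    const int n = static_cast<int>(predictions.size());
    bool prev = false;  // was the previous frame a transition frame?
    bool cur = false;   // is the current frame a transition frame?
    int start = 0;      // first frame of the scene currently being built
    for (int i = 0; i < n; ++i) {
        cur = predictions[static_cast<std::size_t>(i)] > threshold;
        if (prev && !cur) start = i;  // transition ended: a new scene begins
        if (!prev && cur && i != 0) scenes.emplace_back(start, i);  // close scene
        prev = cur;
    }
    if (!cur) scenes.emplace_back(start, n - 1);        // close the trailing scene
    if (scenes.empty()) scenes.emplace_back(0, n - 1);  // no cut found: one scene
    return scenes;
}
```

With the probabilities {0.1, 0.1, 0.9, 0.1, 0.1} and a threshold of 0.5, the function returns the two scenes (0, 2) and (3, 4).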

Version History

v2.0 (2026-05-31)

  • ✅ Split into 5 subprojects
    • TransNet (external model version)
    • TransNet-embedding (embedded version)
    • TransNetDLL (external model DLL)
    • TransNetDLL-embedding (embedded DLL)
    • video_splitter (video splitting tool)
  • ✅ Common code extraction
  • ✅ Modular build system
  • ✅ Unified source directory structure

v1.0 (2026-05-12)

  • ✅ Dynamic input shape support
  • ✅ 4-thread parallel inference
  • ✅ Model embedded in executable
  • ✅ Automatic video splitting
  • ✅ C ABI DLL interface
  • ✅ Bilingual documentation (Chinese/English)

About

A TransNetv2 inference engine based on libtorch and OpenCV, written in C++.
