OpenThermalAction (OTA) is an open-source thermal human action recognition dataset with frame-level skeleton annotations. It contains 2,488 thermal videos collected from 55 subjects performing 28 action classes, including single-person and multi-person actions.
mkdir dataset
huggingface-cli download issai/OpenThermalAction --repo-type dataset --local-dir dataset/
mkdir models
huggingface-cli download issai/thermal-skeleton-models --repo-type model --local-dir models/
Project structure:
.
├── README.md
├── ota-balcony/ # Place downloaded train and validation sets here
│ ├── train/
│ │ ├── session_1/
│ │ └── session_2/
│ ├── val/
│ │ ├── session_1/
│ │ └── session_2/
│ ├── annotations_train.txt
│ └── annotations_val.txt
├── ota-office/ # Place downloaded test set here
│ ├── test/
│ │ ├── test_s1/
│ │ └── test_s2/
│ └── annotations_test.txt
├── pyskl/
│ ├── LICENSE
│ ├── README.md
│ ├── configs/
│ ├── data/ # move downloaded PKL files from dataset to here
│ │ ├── pyskl_ota.pkl
│ │ ├── pyskl_ota_75.pkl
│ │ ├── pyskl_ota_50.pkl
│ │ ├── pyskl_ota_25.pkl
│ │ ├── pyskl_ota_session1.pkl
│ │ ├── pyskl_ota_session2.pkl
│ │ └── ... (session-specific variants for 25/50/75)
│ ├── demo/
│ ├── examples/
│ ├── helpers/
│ ├── pyskl/
│ ├── pyskl.egg-info/
│ ├── pyskl.yaml
│ ├── pyskl_310.yaml # Minimal conda environment
│ ├── requirements.txt # Python dependencies (install after PyTorch & mmcv)
│ ├── setup.cfg
│ ├── setup.py
│ └── tools/
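After moving the downloaded files into place, a quick check of the layout can catch misplaced folders before training. The following is a minimal sketch, assuming it is run from the repository root; the path list is copied from the tree above, and the helper name find_missing is hypothetical rather than part of the repository.

```python
from pathlib import Path

# Paths copied from the project tree above, relative to the repository root.
EXPECTED_PATHS = [
    "ota-balcony/train/session_1",
    "ota-balcony/train/session_2",
    "ota-balcony/val/session_1",
    "ota-balcony/val/session_2",
    "ota-balcony/annotations_train.txt",
    "ota-balcony/annotations_val.txt",
    "ota-office/test/test_s1",
    "ota-office/test/test_s2",
    "ota-office/annotations_test.txt",
    "pyskl/data/pyskl_ota.pkl",
]


def find_missing(root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing paths:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected paths are present.")
```

The list covers only the full-dataset PKL file; the frame-reduced and session-specific variants can be appended to EXPECTED_PATHS if they are needed for testing.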
The example below is for NVIDIA Docker with PyTorch 2.5.0 + CUDA 12.6 (e.g., nvcr.io/nvidia/pytorch:24.10-py3).
# 1. Create minimal conda environment
cd pyskl
conda env create -f pyskl_310.yaml
conda activate pyskl_310
# 2. Install PyTorch for CUDA 12.x
pip install torch==2.5.0 torchvision==0.20.0 torchaudio --index-url https://download.pytorch.org/whl/cu124
# 3. Install mmcv-full for PyTorch 2.x + CUDA 12.x (exactly 1.7.0 for mmdet compatibility)
pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.0.0/index.html
# 4. Install remaining dependencies
pip install -r requirements.txt
# 5. Install mmpose without chumpy dependency
pip install mmpose==0.29.0 --no-deps
# 6. Install pyskl in editable mode
pip install -e .
# 7. Verify everything works
python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
python -c "import mmcv, mmdet, mmpose, pyskl; print(f'✓ MMCV: {mmcv.__version__}, MMDet: {mmdet.__version__}, MMPose: {mmpose.__version__}')"
Notes:
- The pyskl_310.yaml file creates a minimal Python 3.10 environment. All dependencies are installed via pip in the correct order to avoid conflicts.
- mmpose is installed with --no-deps to skip chumpy (used for SMPL body models), which is not needed and has build issues.
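The verification one-liners in step 7 stop at the first failing import, which can hide which package is actually missing. As a complementary sketch, the script below reads installed versions from package metadata without importing anything. The distribution names are taken from the pip commands above where they appear; "mmdet" and "pyskl" are assumed to be the distribution names behind those imports.

```python
from importlib import metadata

# Distribution names from the pip commands above. "mmdet" and "pyskl" are
# assumptions: they are presumed to match the import names used in step 7.
PACKAGES = ["torch", "torchvision", "mmcv-full", "mmdet", "mmpose", "pyskl"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

This reports only what pip has recorded; it does not confirm that CUDA is usable, so the torch.cuda.is_available() check in step 7 is still worth running.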
Important: All models are trained only once on the full 100% version of the dataset (data/pyskl_ota.pkl). All other dataset variants (25%, 50%, 75% frame-reduced) are used only for testing the trained models.
Each model is trained in both setups (pretrained and scratch) on the full OTA dataset (data/pyskl_ota.pkl).
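Before launching a run, it can help to confirm that the PKL file loads and contains the expected number of clips and classes. The sketch below assumes the file follows the usual PYSKL annotation layout: a dict with a "split" entry (split name mapped to a list of clip identifiers) and an "annotations" entry (a list of dicts, each with a "label"). The split names inside the OTA files are not stated here, so the code does not assume them, and both helper names are hypothetical.

```python
import pickle
from collections import Counter


def load_pkl(path):
    """Load a pickled annotation file (NumPy must be installed to unpickle arrays)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize_annotations(data):
    """Count clips per split and per label in a PYSKL-style annotation dict.

    Assumes data["split"] maps split names to lists of clip identifiers and
    data["annotations"] is a list of dicts that each carry a "label" key.
    """
    split_sizes = {name: len(ids) for name, ids in data["split"].items()}
    label_counts = Counter(ann["label"] for ann in data["annotations"])
    return split_sizes, label_counts
```

From the pyskl directory, calling summarize_annotations(load_pkl("data/pyskl_ota.pkl")) should report 28 distinct labels if the file matches the dataset description.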
Single GPU:
cd pyskl
CUDA_VISIBLE_DEVICES=<gpu_id> RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/train.py configs/<model>/<model>_custom_28classes_hrnet/j_pretrained_full.py --validate
Multi-GPU (distributed):
cd pyskl
CUDA_VISIBLE_DEVICES=<gpu_ids> torchrun --standalone --nnodes=1 --nproc_per_node=<num_procs> --master_port <port> tools/train.py configs/<model>/<model>_custom_28classes_hrnet/j_pretrained_full.py --validate --launcher pytorch
Examples:
# Train ST-GCN on single GPU
cd pyskl
CUDA_VISIBLE_DEVICES=1 RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/train.py configs/stgcn/stgcn_custom_28classes_hrnet/j_pretrained_full.py --validate
# Train AAGCN on 2 GPUs
cd pyskl
CUDA_VISIBLE_DEVICES=0,1 torchrun --standalone --nnodes=1 --nproc_per_node=2 --master_port 29523 tools/train.py configs/aagcn/aagcn_custom_28classes_hrnet/j_pretrained_full.py --validate --launcher pytorch
Test trained models on the full OTA dataset, frame-reduced variants (25%, 50%, 75%), and per session:
cd pyskl
CUDA_VISIBLE_DEVICES=<gpu_id> RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/test.py configs/<model>/<model>_custom_28classes_hrnet/<config>.py -C /path/to/checkpoint.pth --eval top_k_accuracy mean_class_accuracy
Examples:
# 25% frames, session 1, pretrained model
cd pyskl
CUDA_VISIBLE_DEVICES=0 RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/test.py configs/stgcn/stgcn_custom_28classes_hrnet/j_pretrained_full_25_session1.py -C work_dirs/stgcn/pretrained_full/best_top1_acc_epoch_X.pth --eval top_k_accuracy mean_class_accuracy
# 50% frames, session 2, pretrained model
cd pyskl
CUDA_VISIBLE_DEVICES=0 RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/test.py configs/stgcn/stgcn_custom_28classes_hrnet/j_pretrained_full_50_session2.py -C work_dirs/stgcn/pretrained_full/best_top1_acc_epoch_X.pth --eval top_k_accuracy mean_class_accuracy
# 75% frames, session 1, from-scratch model
cd pyskl
CUDA_VISIBLE_DEVICES=0 RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/test.py configs/stgcn/stgcn_custom_28classes_hrnet/j_scratch_75_session1.py -C work_dirs/stgcn/scratch/best_top1_acc_epoch_X.pth --eval top_k_accuracy mean_class_accuracy
# Full (100%) session 2, pretrained model
cd pyskl
CUDA_VISIBLE_DEVICES=0 RANK=0 WORLD_SIZE=1 MASTER_ADDR=localhost MASTER_PORT=29500 python tools/test.py configs/stgcn/stgcn_custom_28classes_hrnet/j_pretrained_full_session2.py -C work_dirs/stgcn/pretrained_full/best_top1_acc_epoch_X.pth --eval top_k_accuracy mean_class_accuracy
If you run into issues, first check that the environment is active (conda activate pyskl_310) and that the PKL files are present under pyskl/data when training or testing with PYSKL configs.
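The test commands above differ only in the config file name, which appears to combine a setup prefix, an optional frame percentage, and an optional session. The sketch below encodes that pattern as inferred from the examples; it is an assumption about the naming scheme, not a listing of the configs that actually exist, and the helper names are hypothetical.

```python
import os
import shlex


def config_name(setup, fraction=None, session=None):
    """Build a config file name such as j_pretrained_full_25_session1.py.

    setup:    "pretrained_full" or "scratch" (the two prefixes seen in the examples)
    fraction: None for the full dataset, or 25, 50, 75 for frame-reduced variants
    session:  None for both sessions, or 1 or 2
    """
    if setup not in ("pretrained_full", "scratch"):
        raise ValueError(f"unknown setup: {setup!r}")
    if fraction not in (None, 25, 50, 75):
        raise ValueError(f"unknown fraction: {fraction!r}")
    if session not in (None, 1, 2):
        raise ValueError(f"unknown session: {session!r}")
    parts = ["j", setup]
    if fraction is not None:
        parts.append(str(fraction))
    if session is not None:
        parts.append(f"session{session}")
    return "_".join(parts) + ".py"


def build_test_command(model, setup, checkpoint, fraction=None, session=None):
    """Return the tools/test.py argument list for one model and dataset variant."""
    config = (
        f"configs/{model}/{model}_custom_28classes_hrnet/"
        f"{config_name(setup, fraction, session)}"
    )
    return [
        "python", "tools/test.py", config, "-C", checkpoint,
        "--eval", "top_k_accuracy", "mean_class_accuracy",
    ]


def single_gpu_env(gpu_id, port=29500):
    """Copy the current environment and add the single-GPU variables used above."""
    env = dict(os.environ)
    env.update({
        "CUDA_VISIBLE_DEVICES": str(gpu_id),
        "RANK": "0",
        "WORLD_SIZE": "1",
        "MASTER_ADDR": "localhost",
        "MASTER_PORT": str(port),
    })
    return env


if __name__ == "__main__":
    cmd = build_test_command(
        "stgcn", "pretrained_full",
        "work_dirs/stgcn/pretrained_full/best_top1_acc_epoch_X.pth",
        fraction=25, session=1,
    )
    print(shlex.join(cmd))
```

To run the command rather than print it, pass the list to subprocess.run(cmd, env=single_gpu_env(0), cwd="pyskl", check=True). Check that the generated config file exists first, since the name is derived from the pattern rather than read from disk.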
Two visualization scripts are available. Both create stacked videos with the thermal frame on top and the skeleton on the bottom.
Use visualize/visualize_pkl.py to visualize pre-extracted keypoints from the OTA PKL file. Train/val frame_dirs refer to ota-balcony (session_1, session_2), and test frame_dirs refer to ota-office (test_s1, test_s2):
# OTA train/val (balcony, image folders)
python visualize/visualize_pkl.py --dataset balcony --split train --session 2 --exp 22 --action 13
# OTA test (office)
python visualize/visualize_pkl.py --dataset office --split test --session 1 --subject 14 --action 12 --view 2
# List available clips
python visualize/visualize_pkl.py --dataset office --split test --session 1 --list
# Exact frame_dir match
python visualize/visualize_pkl.py --dataset office --split test --frame_dir "test_s1/sub_14/thermal/12_2" --save_images
Output is saved with the frame_dir hierarchy preserved: visualize/output/{dataset}/{frame_dir}.mp4 (for videos) or visualize/output/{dataset}/{frame_dir}/ (for images).
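To locate a result after a run, the output location can be derived from the same two values. This is a small sketch of the path rule stated above, assuming {dataset} is the value passed to --dataset (for example office); the function name is hypothetical and the script itself may build paths differently.

```python
def output_path(dataset, frame_dir, save_images=False):
    """Return where visualize_pkl.py output is expected for one clip.

    Videos go to visualize/output/{dataset}/{frame_dir}.mp4 and image
    sequences to the directory visualize/output/{dataset}/{frame_dir}/.
    """
    base = f"visualize/output/{dataset}/{frame_dir.strip('/')}"
    return f"{base}/" if save_images else f"{base}.mp4"


if __name__ == "__main__":
    print(output_path("office", "test_s1/sub_14/thermal/12_2"))
    print(output_path("office", "test_s1/sub_14/thermal/12_2", save_images=True))
```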
Download YOLO-pose models for thermal human pose estimation from here.
Use visualize/visualize_raw.py to run YOLO pose detection on raw images from ota-balcony or ota-office:
# balcony - train/val image folder (using path arguments)
python visualize/visualize_raw.py --model PATH_TO_YOLO --dataset balcony --split train --session 2 --exp 22 --action 13
# office - test image folder (using path arguments)
python visualize/visualize_raw.py --model PATH_TO_YOLO --dataset office --split test --session 1 --subject 4 --action 12 --view 1
# Direct path mode (positional argument)
python visualize/visualize_raw.py --model PATH_TO_YOLO ota-balcony/train/session_2/exp_22/thermal/13
python visualize/visualize_raw.py --model PATH_TO_YOLO ota-office/test/test_s1/sub_4/thermal/12_1
Additional options:
- --conf 0.3 - Confidence threshold
- --fps 30 - Output video FPS
- --device cuda:0 - GPU device
- --batch_size 32 - YOLO batch size
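The path-argument and direct-path examples above point at the same image folders, so the mapping between them can be written down explicitly. The sketch below reproduces the folder layout as inferred from those examples. It assumes that val clips follow the same session/exp layout as train clips, which the examples do not show, and the function names are hypothetical.

```python
from pathlib import PurePosixPath


def balcony_folder(split, session, exp, action):
    """Image folder for an ota-balcony clip (train or val split)."""
    if split not in ("train", "val"):
        raise ValueError("ota-balcony holds only the train and val splits")
    return PurePosixPath(
        "ota-balcony", split, f"session_{session}", f"exp_{exp}", "thermal", str(action)
    )


def office_folder(session, subject, action, view):
    """Image folder for an ota-office test clip."""
    return PurePosixPath(
        "ota-office", "test", f"test_s{session}", f"sub_{subject}",
        "thermal", f"{action}_{view}",
    )


if __name__ == "__main__":
    # Same clips as the path-argument examples above.
    print(balcony_folder("train", session=2, exp=22, action=13))
    print(office_folder(session=1, subject=4, action=12, view=1))
```

Either result can be passed as the positional argument in direct path mode, relative to the repository root.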
Example outputs:
Session 1: Side jacks example
Session 2: Fake slapping example


