CHILI/
├── asset/
├── config/
│ ├── diffusion_hwdb1.yaml
│ └── vae_hwdb1.yaml
├── logs/
├── model/
│ ├── diffusion/
│ │ ├── __init__.py
│ │ ├── model.py
│ │ ├── scheduler.py
│ │ └── unet.py
│ ├── vae/
│ │ ├── __init__.py
│ │ ├── decoder.py
│ │ ├── encoder.py
│ │ └── vae.py
│ └── content_encoder.py
├── runs/
├── scripts/
│ ├── data_scope.ipynb
│ ├── infer_vae.ipynb
│ ├── test_vae_ocr.py
│ ├── train_diffusion.py
│ ├── train_vae.py
│ ├── hwdb_download.sh
│ ├── train_diffusion.slurm
│ └── train_vae.slurm
├── utils/
│ ├── config.py
│ ├── dataset.py
│ ├── log.py
│ ├── loss.py
│ ├── ocr_score.py
│ └── seed.py
├── LICENSE
├── .gitignore
├── README.md
├── requirements.txt
└── environment.yml

conda env create -f environment.yml
conda activate chili311
#OR
pip install -r requirements.txt

bash scripts/hwdb_download.sh

- Download the HWDB1.0 and HWDB1.1 datasets from the official site;
- Extract them to the `data/HWDB1/` folder;
- Folder structure:
  - `data/CASIA/HWDB1.0/train/<char_id>-f.gnt` for the training set;
  - `data/CASIA/HWDB1.0/test/<char_id>-t.gnt` for the test set;
- Use `scripts/data_scope.ipynb` to explore dataset statistics.
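
For orientation, a `.gnt` file is a plain concatenation of binary records. The sketch below shows one way to iterate over them, assuming the commonly documented CASIA layout (4-byte little-endian record size, 2-byte GB tag code, 2-byte width, 2-byte height, then `width * height` grayscale bytes). The function name `read_gnt` is hypothetical and is not necessarily how `utils/dataset.py` reads the data.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

HEADER_SIZE = 10  # 4 (record size) + 2 (tag code) + 2 (width) + 2 (height)


def read_gnt(stream: BinaryIO) -> Iterator[Tuple[str, int, int, bytes]]:
    """Yield (character, width, height, pixels) for each record in a .gnt stream.

    Assumes the commonly documented CASIA GNT record layout; this is a sketch,
    not the project's own loader.
    """
    while True:
        header = stream.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            break  # end of file, or a truncated trailing record
        # "<" = little-endian without padding; the record size is not needed here.
        _record_size, tag, width, height = struct.unpack("<I2sHH", header)
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            break  # truncated bitmap
        # The tag holds the GB-encoded character; single-byte tags are null-padded.
        char = tag.rstrip(b"\x00").decode("gbk", errors="replace")
        yield char, width, height, pixels
```

To read a file on disk, open it in binary mode and pass the handle, for example `with open(path, "rb") as f: samples = list(read_gnt(f))`. Each `pixels` buffer is row-major, one byte per pixel.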
- Config files are in the `config/` folder:
  - `config/vae_hwdb1.yaml` for VAE training on the HWDB1 dataset;
  - `config/diffusion_hwdb1.yaml` for diffusion model training on the HWDB1 dataset.
- Train the VAE with `scripts/train_vae.py`, which:
  - loads `config/vae_hwdb1.yaml`;
  - creates `runs/<ts>/vae/`;
  - copies the config to `vae_hwdb1_<ts>.yaml`;
  - saves the checkpoints `vae_hwdb1_best_<ts>.pt` and `vae_hwdb1_last_<ts>.pt`;
  - writes logs and TensorBoard data under `runs/<ts>/`.
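
The run-directory convention above can be sketched as follows. This is an illustration only, not the code in `scripts/train_vae.py`: the helper name `prepare_run_dir`, the timestamp format `%Y%m%d_%H%M%S`, and the choice to place the config snapshot inside the stage folder are all assumptions.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def prepare_run_dir(config_path: Path, runs_root: Path,
                    stage: str = "vae", ts: Optional[str] = None) -> Path:
    """Create runs/<ts>/<stage>/ and snapshot the config next to the checkpoints.

    Hypothetical helper: the timestamp format and the snapshot location are
    assumptions made for this sketch.
    """
    ts = ts or datetime.now().strftime("%Y%m%d_%H%M%S")
    run_dir = runs_root / ts / stage
    # Fail loudly rather than silently reusing an existing run folder.
    run_dir.mkdir(parents=True, exist_ok=False)
    # e.g. vae_hwdb1.yaml -> vae_hwdb1_<ts>.yaml
    snapshot = run_dir / f"{config_path.stem}_{ts}{config_path.suffix}"
    shutil.copy2(config_path, snapshot)
    return run_dir
```

Calling `prepare_run_dir(Path("config/vae_hwdb1.yaml"), Path("runs"))` would then return the new `runs/<ts>/vae/` folder. Keeping a copy of the config with each run makes it possible to tell later which settings produced a given checkpoint.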
- Run VAE inference with the notebook `scripts/infer_vae.ipynb`, which:
  - loads the trained VAE from `model/vae/vae_hwdb1_best_<ts>.pt`;
  - writes outputs to `Generated/vae_infer/`.

TODO
- Train the diffusion model with `scripts/train_diffusion.py`, which:
  - loads `config/diffusion_hwdb1.yaml`;
  - creates `runs/<ts>/diffusion/`;
  - copies the config to `diffusion_hwdb1_<ts>.yaml`;
  - saves the checkpoints `diffusion_hwdb1_best_<ts>.pt` and `diffusion_hwdb1_last_<ts>.pt`;
  - writes logs and TensorBoard data under `runs/<ts>/`.
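
The best/last checkpoint pair mentioned above is typically maintained with a small amount of bookkeeping. The sketch below shows one possible version. The class `CheckpointTracker` and its `save_fn` callback are hypothetical, and the assumption that "best" means the lowest validation loss is mine, not something the training script is confirmed to do.

```python
from pathlib import Path
from typing import Callable


class CheckpointTracker:
    """Keep a rolling `last` checkpoint and a `best` one (lowest loss so far).

    Hypothetical helper for illustration; `save_fn` is whatever actually
    serializes the model to the given path.
    """

    def __init__(self, run_dir: Path, prefix: str, ts: str,
                 save_fn: Callable[[Path], None]) -> None:
        self.best_path = run_dir / f"{prefix}_best_{ts}.pt"
        self.last_path = run_dir / f"{prefix}_last_{ts}.pt"
        self.save_fn = save_fn
        self.best_loss = float("inf")

    def update(self, val_loss: float) -> bool:
        """Save `last` every time; save `best` only on improvement."""
        self.save_fn(self.last_path)
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.save_fn(self.best_path)
            return True
        return False
```

With PyTorch, `save_fn` could be something like `lambda p: torch.save(model.state_dict(), p)`, and `update` would be called once per validation pass.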
- Gp 1: One-DM reimplementation on the HWDB1 dataset
- Gp 1': training continued from a Gp 1 checkpoint (lower LR)
- Gp 2: CHILI with DDIM

| ExpID | Comments | Batch Size | LR | Backbone | HighNCE | LowNCE | Content | Xstart | Load Ckpt | Best Epoch | Best Step | Recon Loss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 132836 | completely chaotic | 1024 | 5e-4 | resnet18 | 1.0 | 1.0 | - | - | N/A | 4 | 4000 | 0.109876 |
| 133224 | completely chaotic | 1024 | 5e-4 | resnet32 | 1.0 | 1.0 | - | - | N/A | 3 | 3000 | 0.111537 |
| 135252 | completely chaotic | 1024 | 5e-4 | resnet32 | 1.0 | 1.0 | - | - | N/A | 4 | 4000 | 0.110261 |
| 183205 | completely chaotic | 1024 | 2e-5 | resnet32 | 1.0 | 1.0 | - | - | N/A | 7 | 8500 | 0.119643 |
| 184510 | epoch 4: a jumbled mess | 1024 | 2e-5 | resnet18 | 1.0 | 1.0 | - | - | runs/diff_hwdb1_20251212_132836/diff_epoch0003_best.pt (resume) | 4 | 4000 | 0.109466 |
| 194209 | epoch 4: a jumbled mess | 1024 | 2e-5 | resnet18 | 1.0 | 1.0 | - | - | runs/diff_hwdb1_20251212_132836/diff_epoch0003_best.pt (resume) | 4 | 4000 | 0.109466 |
| 213138 | epoch 1: a jumbled mess | 1024 | 5e-5 | resnet32 | 1.0 | 1.0 | - | - | runs/diff_hwdb1_20251212_183205/diff_epoch0006_best.pt (resume) | 12 | 15000 | 0.113213 |
| 004312 | epoch 1: a jumbled mess | 128 | 5e-5 | resnet18 | 1.0 | 1.0 | - | 1.0 | N/A | 1 | 4000 | 0.153818 |
| 015232 | epoch 4/5: a few characters recognizable | 1024 | 5e-5 | resnet32 | 1.0 | 1.0 | - | - | N/A | 5 | 40000 | 0.114299 |
| 121215 | epoch 14: good; complex characters have broken strokes / evo results are good | 128 | 1e-5 | resnet18 | 0.5 | 0.5 | - | 0.0 | runs/diff_hwdb1_20251213_015232/diff_epoch0005_best.pt (resume) | 8 | 66000 | 0.103914 |
| 164212 | epoch 1: completely chaotic | 128 | 1e-4 | resnet18 | 0.5 | 0.5 | 1.0 | 1.0 | N/A | 1 | 2000 | 0.553935 |
| 173327 | epoch 14: complex characters unclear / evo has some good images | 128 | 5e-5 | resnet18 | 1.0 | 1.0 | 1.0 | 1.0 | N/A | 1 | 2000 | 0.525631 |
| 230032 | epoch 1: completely chaotic | 128 | 2e-5 | resnet18 | 1.0 | 1.0 | 1.0 | 0.1 | N/A | 1 | 2000 | 0.562972 |
| 001631 | epoch 1: completely chaotic | 128 | 2e-5 | resnet18 | 1.0 | 1.0 | 0.5 | 1.0 | N/A | 1 | 1000 | 0.396908 |
| 015721 | epoch 4: complex characters unclear | 128 | 2e-5 | resnet18 | 1.0 | 1.0 | 0.0 | 1.0 | N/A | 4 | 33000 | 0.124525 |
| 133318 | epoch 6: images very blurry, strokes disconnected, unrecognizable | 256 | 1e-4 | resnet32 | 1.0 | 1.0 | 0.0 | 1.0 | N/A | 3 | 15000 | 0.126090 |
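
Gp 2 relies on DDIM sampling. As a reminder of what a single deterministic DDIM update (eta = 0) computes, here is a minimal pure-Python sketch. It is not taken from `model/diffusion/scheduler.py`; the function name `ddim_step` and the arguments `abar_t` and `abar_prev` (cumulative products of the noise schedule's alphas at the current and previous timestep) are assumptions chosen for illustration.

```python
import math
from typing import List


def ddim_step(x_t: List[float], eps_pred: List[float],
              abar_t: float, abar_prev: float) -> List[float]:
    """One deterministic DDIM update (eta = 0), applied element-wise.

    x_t       : current noisy sample (flattened)
    eps_pred  : the model's noise prediction for x_t
    abar_t    : cumulative alpha product at the current timestep
    abar_prev : cumulative alpha product at the timestep being stepped to
    """
    out = []
    for x, eps in zip(x_t, eps_pred):
        # Estimate the clean sample implied by the noise prediction.
        x0_pred = (x - math.sqrt(1.0 - abar_t) * eps) / math.sqrt(abar_t)
        # Re-noise that estimate to the previous timestep's noise level.
        out.append(math.sqrt(abar_prev) * x0_pred
                   + math.sqrt(1.0 - abar_prev) * eps)
    return out
```

Because no fresh noise is injected when eta = 0, the sampler can skip timesteps, which is the usual reason for preferring DDIM over ancestral DDPM sampling at inference time.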
- One-DM: One-Shot Diffusion Mimicker for Handwritten Text Generation
- MetaScript: Few-Shot Handwritten Chinese Content Generation via Generative Adversarial Networks

