Interpretable Art Image Classification via Multifractal and Topological Analysis

Master's thesis project · IPN-UPIITA · Advanced Technologies

This repository implements an interpretable classification system for artwork images that combines multifractal analysis and topological data analysis (TDA) — two tools from complexity science — with XGBoost and SHAP explanations.

The key idea: instead of treating a CNN as a black box, every feature has a direct physical meaning (fractal dimension, Hurst exponent, persistence entropy…), so the model's decisions can be traced back to measurable structural properties of the painting.

Results snapshot

Task	Classes	Balanced Accuracy	AUC
Movement (multiclass)	12	0.49	—
Genre (multiclass)	10	0.49	—
Artist (multiclass)	60	0.38	—
Impressionism vs. rest	2	0.76	—
Van Gogh vs. rest	2	0.77	—
Savrasov vs. Levitan	2	0.90	0.95
Landscape vs. Portrait	2	0.90	0.97
Cubism vs. Romanticism	2	0.92	0.98
Claude Monet vs. Aivazovsky	2	—	—

Random baseline: ≈ 0.083 (movement) · 0.100 (genre) · 0.017 (artist).

Methods

Feature extraction

Each painting is normalised to 1380 × 1380 px and analysed at three spatial scales: global (1×1), quadrant (2×2) and nonet (3×3). For each segment, three descriptors are computed:

Method	Output	Key descriptors
MF-DFA 2D	14 features / segment	Hurst exponent, Δα, α*, asymmetry index, τ(q) curvature
MF-Rényi	14 features × 3 measures / segment	D₀, D₁, D₂, Δα, α* for intensity sum, variance and entropy
TDA (H0/H1)	10 features / segment	Persistence entropy, mean/std/max lifetime, component and cycle counts

Classification pipeline

Variance threshold — remove near-zero variance features
StandardScaler + Isolation Forest — outlier removal (5% contamination)
SMOTE — per-fold oversampling to handle class imbalance
XGBoost — 5-fold stratified cross-validation
SHAP TreeExplainer — out-of-fold importance aggregation

Repository structure

ArtClassification/
├── utils.py                  # Image normalisation, segmentation, scale helpers
├── mfdfa.py                  # 2-D MF-DFA (low-level, image-by-image)
├── mfrenyi.py                # MF-Rényi box-counting (intensity/variance/entropy)
├── multifractal_functions.py # Batch pipelines: mf_dfa_features, mf_renyi_features,
│                             #   mf_combined_features (single scale loop)
├── tda.py                    # Persistent homology H0/H1 (Union-Find + numba)
├── tda_functions.py          # TDA feature extraction wrappers
├── validationtda.py          # Synthetic tests for H1 cycle computation
│
├── compute_features.ipynb    # Feature extraction over the full dataset
├── classification.ipynb      # Classification experiments + SHAP analysis
├── Validacion.ipynb          # Validation and ablation studies
├── pruebas.ipynb             # Exploratory experiments
└── DATA/                     # Feature files — not tracked in git (see .gitignore)

Classification experiments (`classification.ipynb`)

The notebook runs the full pipeline for the following cases, each corresponding to a result in the thesis presentation:

Case	`validos`	Slide
Impressionism vs. rest	`['Impressionism']`	p. 24
Van Gogh vs. rest	`['vincent-van-gogh']`	p. 25
Savrasov vs. Levitan	`['aleksey-savrasov', 'isaac-levitan']`	p. 26
Landscape vs. Portrait	`['landscape', 'portrait']`	p. 27
Cubism vs. Romanticism	`['Cubism', 'Romanticism']`	p. 28
Monet vs. Aivazovsky	`['claude-monet', 'ivan-aivazovsky']`	p. 28

Switch between cases by uncommenting the corresponding validos = [...] line in Cell 14 of the notebook.

Tech stack

Python 3.10+ · numpy · scipy · numba (JIT compilation)
Computer vision: opencv-python
ML: scikit-learn · xgboost · imbalanced-learn · shap
Dataset: WikiArt (portraits, landscapes — via SNIC)

pip install numpy scipy numba opencv-python scikit-learn xgboost imbalanced-learn shap

Background

Deep learning achieves ~79% accuracy on art style classification (Ma et al., 2025) but offers little interpretability. This project trades some accuracy for full transparency: every prediction can be attributed to specific spatial regions and physical descriptors of the painting's texture.

Key findings from SHAP:

TDA captures brushstroke topology differences between artists with similar techniques.
MF-Rényi detects intensity contrast patterns characteristic of portrait backgrounds and faces.
MF-DFA reveals that Romanticism paintings have smoother textures in upper quadrants — consistent with their frequent depiction of skies and horizons.

Jardi Yulistian García Bustamante · Director: Dr. Lev Guzmán Vargas · Co-director: Dr. Daniel Aguilar Velázquez

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Interpretable Art Image Classification via Multifractal and Topological Analysis

Results snapshot

Methods

Feature extraction

Classification pipeline

Repository structure

Classification experiments (`classification.ipynb`)

Tech stack

Background

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.gitignore		.gitignore
README.md		README.md
Validacion.ipynb		Validacion.ipynb
classification.ipynb		classification.ipynb
compute_features.ipynb		compute_features.ipynb
compute_mf.ipynb		compute_mf.ipynb
mfdfa.py		mfdfa.py
mfrenyi.py		mfrenyi.py
multifractal_functions.py		multifractal_functions.py
pruebas.ipynb		pruebas.ipynb
tda.py		tda.py
tda_functions.py		tda_functions.py
utils.py		utils.py
validationtda.py		validationtda.py

Folders and files

Latest commit

History

Repository files navigation

Interpretable Art Image Classification via Multifractal and Topological Analysis

Results snapshot

Methods

Feature extraction

Classification pipeline

Repository structure

Classification experiments (classification.ipynb)

Tech stack

Background

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Classification experiments (`classification.ipynb`)

Packages