This project evaluates state-of-the-art few-shot anomaly classification methods on the AeBAD dataset, a real-world aero-engine blade inspection dataset. We specifically implement and test WinCLIP and WinCLIP+, leveraging zero-shot and few-shot learning paradigms to improve anomaly classification and localization. The objective is to demonstrate the robustness of few-shot methods in low-data regimes and to ensure a fair comparison with existing benchmarks.
..
├── WinCLIP # WinCLIP implementation for anomaly detection
├── MMR_fewer_shot # MMR Benchmark (Reduced-Data Evaluation)
├── report.pdf # The main report of this project
├── poster.pdf # Poster from our poster presentation
├── LICENCE # MIT Licence
└── README.md # This README
- WinCLIP (Zero-Shot): WinCLIP leverages a pre-trained vision-language model for anomaly detection without any fine-tuning. It demonstrates strong generalization capabilities but lacks pixel-level alignment in certain cases.
- WinCLIP+ (Few-Shot Learning): Incorporating a small number of normal samples (1–4 shots) significantly enhances WinCLIP's performance. This reduces false positives and improves anomaly localization, especially in scenarios involving domain shifts (e.g., illumination or background changes).
- MMR Benchmark (Reduced-Data Evaluation): MMR, a reconstruction-based method, is evaluated on smaller subsets of AeBAD. This provides a balanced benchmark for comparison with WinCLIP+.
- Benchmark Comparisons: The project evaluates the Mean AUROC (%) of WinCLIP+ against full-data benchmarks like PatchCore, ReverseDistillation, and other state-of-the-art methods.
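
To make the difference between the zero-shot and few-shot scores more concrete, here is a minimal image-level sketch in plain Python. It is not the code in the `WinCLIP` directory: it assumes embeddings are already available as lists of floats, it omits the multi-scale window and patch features that the actual method relies on, and the function names, the `temperature` value, and the equal weighting in `combined_score` are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_score(image_emb, normal_text_emb, anomalous_text_emb, temperature=0.07):
    """Softmax probability assigned to the 'anomalous' text prompt.

    The temperature is an illustrative assumption, not the project's setting.
    """
    logit_normal = cosine(image_emb, normal_text_emb) / temperature
    logit_anomalous = cosine(image_emb, anomalous_text_emb) / temperature
    # A two-class softmax is equivalent to a logistic function of the logit gap.
    return 1.0 / (1.0 + math.exp(logit_normal - logit_anomalous))


def few_shot_score(image_emb, normal_reference_embs):
    """Cosine distance to the closest normal reference, scaled to [0, 1]."""
    return min(0.5 * (1.0 - cosine(image_emb, ref)) for ref in normal_reference_embs)


def combined_score(image_emb, normal_text_emb, anomalous_text_emb, normal_reference_embs):
    """Average of the text-guided and reference-guided scores (assumed weighting)."""
    text_score = zero_shot_score(image_emb, normal_text_emb, anomalous_text_emb)
    reference_score = few_shot_score(image_emb, normal_reference_embs)
    return 0.5 * (text_score + reference_score)


# Toy 2-D embeddings, for illustration only.
normal_prompt = [1.0, 0.0]
anomalous_prompt = [0.0, 1.0]
references = [[0.9, 0.1], [1.0, 0.0]]  # stands in for 2 normal shots
test_image = [0.2, 0.8]
print(combined_score(test_image, normal_prompt, anomalous_prompt, references))
```

With zero reference images only `zero_shot_score` applies; adding 1 to 4 normal shots supplies the reference term, which is the intuition behind the few-shot variant.
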
The experiments evaluated WinCLIP and WinCLIP+ across the four settings reported in the results table (Same, Background, Illumination, and View). Key observations include:
- Zero-Shot Learning: WinCLIP achieves reasonable performance without requiring labeled data but is limited in fine-grained anomaly segmentation.
- Few-Shot Learning: WinCLIP+ improves on the zero-shot results, raising mean AUROC from 78.0% (0-shot) to 78.3% (1-shot) and 78.6% (4-shot), which shows the benefit of incorporating even minimal training data.
- Fair Comparison: WinCLIP+ was extended to 4 shots for direct comparison with benchmarks trained on full datasets. MMR was evaluated on a reduced dataset to ensure balanced baseline evaluation.
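
A k-shot evaluation depends on which normal images are used as references, so repeated runs are usually made reproducible by seeding the selection. The sketch below shows one way this could be done. The function name, the file names, and the seeds are hypothetical, and the repository's actual sampling procedure may differ.

```python
import random


def sample_reference_shots(normal_image_paths, k, seed):
    """Pick k normal reference images reproducibly for a given seed."""
    if k > len(normal_image_paths):
        raise ValueError("k exceeds the number of available normal images")
    rng = random.Random(seed)
    # Sort first so the draw does not depend on directory listing order.
    return rng.sample(sorted(normal_image_paths), k)


# Hypothetical file names, for illustration only.
normal_pool = [f"normal_{i:03d}.png" for i in range(20)]

for seed in (0, 1, 2):
    for k in (1, 4):
        shots = sample_reference_shots(normal_pool, k, seed)
        print(f"seed={seed} k={k}: {shots}")
```

The same idea applies to a reduced-data baseline: drawing a fixed-size subset of the training images with a recorded seed keeps the comparison repeatable.
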
| Source | Method | Same | Background | Illumination | View | Mean |
| --- | --- | --- | --- | --- | --- | --- |
| Zhang et al. | PatchCore | 75.2 ± 0.3 | 74.1 ± 0.3 | 74.6 ± 0.4 | 60.1 ± 0.4 | 71.0 |
| Zhang et al. | ReverseDistillation | 82.4 ± 0.6 | 84.3 ± 0.9 | 85.5 ± 0.9 | 71.9 ± 0.8 | 81.0 |
| Zhang et al. | DRAEM | 64.0 ± 0.4 | 62.1 ± 6.1 | 61.6 ± 2.7 | 62.3 ± 0.9 | 62.5 |
| Zhang et al. | NSA | 66.5 ± 1.4 | 48.8 ± 3.5 | 55.5 ± 3.2 | 55.9 ± 1.1 | 56.7 |
| Zhang et al. | RIAD | 38.6 ± 0.6 | 41.6 ± 1.3 | 46.8 ± 0.8 | 33.0 ± 0.6 | 40.0 |
| Zhang et al. | InTra | 39.8 ± 0.8 | 46.1 ± 0.5 | 44.7 ± 0.3 | 46.3 ± 1.5 | 44.2 |
| Our work | MMR (Recreated Benchmark) | 85.6 ± 0.5 | 84.4 ± 0.7 | 88.8 ± 0.5 | 79.9 ± 0.6 | 84.7 |
| Our work | WinCLIP+ (0-Shot) | 80.3 ± 0.2 | 82.9 ± 0.5 | 67.0 ± 0.3 | 82.0 ± 0.3 | 78.0 |
| Our work | WinCLIP+ (1-Shot) | 80.7 ± 0.5 | 83.1 ± 0.5 | 67.4 ± 0.6 | 82.1 ± 0.4 | 78.3 |
| Our work | WinCLIP+ (4-Shot) | 80.9 ± 0.2 | 83.7 ± 0.4 | 67.7 ± 0.4 | 81.9 ± 0.3 | 78.6 |
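
For readers who want to check how the numbers relate, the sketch below computes AUROC for a toy set of scores and then reproduces a Mean entry as the unweighted average of the four per-subset values. The pairwise AUROC definition is the standard one. The helper names are illustrative and are not taken from this repository, which may compute the metric with a library instead.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (anomalous, normal) pairs ranked correctly.

    labels: 1 for anomalous, 0 for normal. Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(per_subset):
    """Unweighted mean over the per-subset AUROC values."""
    return sum(per_subset.values()) / len(per_subset)


# Toy example: 3 of the 4 (anomalous, normal) pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# AUROC (%) values copied from the MMR row of the table above.
mmr = {"Same": 85.6, "Background": 84.4, "Illumination": 88.8, "View": 79.9}
print(f"{mean_auroc(mmr):.1f}")  # 84.7
```
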
This work was completed as part of the ETH Zurich Data Science Lab. Special thanks to the ETH AI Center and IBM Research Europe for their support and collaboration.
This project is released under the MIT License.