sumith25-dev/ML-ops-Image-Classification

🎨 Adobe MLOps Image Classifier

Built for Adobe Machine Learning Engineer (2026 Batch) Application
A production-grade MLOps pipeline that trains, deploys, monitors, and auto-retrains an image classification model, modeled on the kind of production infrastructure behind Adobe's Firefly and Creative Cloud AI features.

CI/CD Python PyTorch MLflow Docker


🏆 What I Built

A complete end-to-end MLOps system with:

  • Image Classification API: EfficientNet-B0 model trained on 14,000 real images, reaching 86.85% validation accuracy
  • A/B Testing: Deterministic per-user routing (80% stable / 20% canary) with instant rollback
  • MLflow Tracking: Full experiment tracking, model registry, and versioning
  • Drift Detection: Automatic data drift monitoring with Evidently AI
  • CI/CD Pipeline: GitHub Actions with lint, test, and Docker build stages ✅ Passing

Try the live demo on the [project website](https://ml-ops-image-classification-production.up.railway.app/docs#/default/predict_predict_post). It is currently deployed on railway.app. The deployed model was trained for only 5 epochs; with around 10 epochs of training, I expect prediction confidence to rise from roughly 20% to 80-90%.

🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                    GitHub Actions CI/CD                      │
│          push → lint → test → Docker build → deploy          │
└───────────────────────┬──────────────────────────────────────┘
                        │
        ┌───────────────┼───────────────┐
        ▼               ▼               ▼
 ┌─────────────┐ ┌──────────────┐ ┌──────────────────┐
 │  FastAPI    │ │   MLflow     │ │   PostgreSQL     │
 │  A/B Test   │ │  Registry    │ │   Database       │
 │  port 8000  │ │  port 5001   │ │   port 5432      │
 └─────────────┘ └──────────────┘ └──────────────────┘

🛠️ Tech Stack

| Category | Tools |
| --- | --- |
| ML Framework | PyTorch 2.3, EfficientNet-B0, Scikit-learn |
| MLOps | MLflow 2.13 (experiment tracking, model registry) |
| Serving | FastAPI, Uvicorn |
| Containerisation | Docker, Docker Compose |
| CI/CD | GitHub Actions ✅ |
| Drift Detection | Evidently AI |
| Database | PostgreSQL |
| Cloud Ready | Terraform (AWS ECR, S3, SageMaker) |

📊 Model Performance

| Metric | Value |
| --- | --- |
| Model | EfficientNet-B0 (ImageNet pretrained) |
| Dataset | Intel Image Classification (14,000 real images) |
| Classes | buildings, forest, glacier, mountain, sea, street |
| Training Epochs | 10 |
| Best Val Accuracy | 86.85% |
| Model Version | v2 (registered in MLflow) |

🚀 Quick Start

Prerequisites

  • Docker Desktop installed
  • 4GB free disk space

Run locally

git clone https://github.com/sumith25-dev/ML-ops-Image-Classification.git
cd ML-ops-Image-Classification
docker-compose up -d api mlflow postgres

Open the UIs

| Service | URL |
| --- | --- |
| FastAPI docs | http://localhost:8000/docs |
| MLflow UI | http://localhost:5001 |

🧪 Test the API

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://images.pexels.com/photos/1547813/pexels-photo-1547813.jpeg",
    "user_id": "user_001"
  }'

Response:

{
  "label": "forest",
  "confidence": 0.87,
  "model_version": "stable",
  "latency_ms": 245.32
}
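
The same request can be issued from Python. The sketch below uses only the standard library and assumes the local stack from Quick Start is running on port 8000; the helper names `build_predict_request` and `predict` are hypothetical and are not part of this repository.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/predict"  # local endpoint from the curl example


def build_predict_request(image_url: str, user_id: str,
                          api_url: str = API_URL) -> urllib.request.Request:
    """Build the POST request with the same JSON body as the curl example."""
    payload = json.dumps({"image_url": image_url, "user_id": user_id}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(image_url: str, user_id: str,
            api_url: str = API_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (requires the API to be up)."""
    request = build_predict_request(image_url, user_id, api_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `predict(image_url, "user_001")` should return a dictionary with the `label`, `confidence`, `model_version`, and `latency_ms` fields shown in the response above.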

🎯 Key Features

1. A/B Testing

User ID → MD5 Hash → Bucket (0-9999)
  0-7999    → stable model (80% traffic)
  8000-9999 → canary model (20% traffic)

The same user always hits the same model, so the experience stays consistent.
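
The routing rule can be sketched in a few lines of Python. This is a minimal illustration, not the code in app/model_manager.py: the helper names `bucket_for` and `route`, the reduction of the full MD5 digest modulo 10,000, and the `canary_share` parameter are all assumptions.

```python
import hashlib

NUM_BUCKETS = 10_000  # buckets 0-9999, as in the routing diagram


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in [0, 9999] via MD5 (hypothetical helper)."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


def route(user_id: str, canary_share: float = 0.20) -> str:
    """Return 'stable' or 'canary'; the highest buckets are sent to the canary."""
    canary_buckets = round(NUM_BUCKETS * canary_share)  # 2000 for a 20% canary
    stable_cutoff = NUM_BUCKETS - canary_buckets        # 8000 for a 20% canary
    return "stable" if bucket_for(user_id) < stable_cutoff else "canary"
```

Because the bucket depends only on the user ID, repeated calls for the same user return the same model. Setting `canary_share` to 0 sends all traffic to the stable model, which is one way an instant rollback could be realised.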

2. Model Lifecycle

Train → MLflow Registry → Assign 'stable' alias → API loads automatically
                        → Assign 'canary' alias → 20% traffic routed
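
A minimal sketch of the alias flow, assuming the MLflow 2.x registry alias API (`MlflowClient.set_registered_model_alias` and `models:/<name>@<alias>` URIs). The registered model name `image-classifier` and the helpers `model_uri` and `promote` are hypothetical. The client is passed in as an argument so the sketch runs without a tracking server.

```python
MODEL_NAME = "image-classifier"  # hypothetical registered model name
ALLOWED_ALIASES = ("stable", "canary")


def model_uri(alias: str, name: str = MODEL_NAME) -> str:
    """Build the alias-based registry URI that MLflow resolves to a model version."""
    return f"models:/{name}@{alias}"


def promote(client, version: int, alias: str, name: str = MODEL_NAME) -> str:
    """Point `alias` at `version` and return the URI the API would load.

    `client` is expected to behave like mlflow.MlflowClient, i.e. to expose
    set_registered_model_alias(name, alias, version).
    """
    if alias not in ALLOWED_ALIASES:
        raise ValueError(f"unknown alias: {alias!r}")
    client.set_registered_model_alias(name, alias, str(version))
    return model_uri(alias, name)
```

Against a real tracking server, `client` would be an `mlflow.MlflowClient()` instance, and the API could load the model with `mlflow.pyfunc.load_model(model_uri("stable"))`.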

3. Drift Detection

Run drift_detector.py
→ If drift > 30% OR accuracy < 80%
→ Flag for retraining
→ Promote new model to canary

4. CI/CD Pipeline

git push → GitHub Actions
         → Lint (ruff)      ✅
         → Unit Tests       ✅
         → Docker Build     ✅

📁 Project Structure

ML-ops-Image-Classification/
├── app/
│   ├── main.py              # FastAPI: predict, health, metrics, rollback
│   └── model_manager.py     # MLflow loading, A/B routing, inference
├── model/
│   └── train.py             # EfficientNet training with MLflow tracking
├── monitoring/
│   └── drift_detector.py    # Evidently drift analysis
├── pipeline/
│   └── mlops_dag.py         # Airflow DAG: drift → retrain → promote
├── tests/
│   └── test_api.py          # Unit tests
├── infra/
│   ├── main.tf              # Terraform: AWS ECR, S3, SageMaker
│   └── prometheus.yml       # Prometheus config (future deployment)
├── notebooks/
│   └── exploration.ipynb    # EDA and drift visualisation
├── .github/workflows/
│   └── ci_cd.yml            # GitHub Actions CI/CD ✅ passing
├── docker-compose.yml       # Full local stack
├── Dockerfile               # API container
└── requirements.txt

🌊 Drift Detection

# Run manually or schedule via Airflow
python monitoring/drift_detector.py

# Auto-retraining triggers when:
# - drift_share >= 30% of features drifted
# - model accuracy < 80%
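
The trigger condition can be written as a small decision function. This is a sketch only: it assumes `drift_share` (the fraction of drifted features, as reported by Evidently) and `accuracy` have already been computed, and it uses the inclusive 30% threshold from the comment above. The names `should_retrain` and `RetrainDecision` are hypothetical and do not come from monitoring/drift_detector.py.

```python
from dataclasses import dataclass
from typing import Tuple

DRIFT_SHARE_THRESHOLD = 0.30  # retrain when at least 30% of features drifted
MIN_ACCURACY = 0.80           # retrain when accuracy falls below 80%


@dataclass(frozen=True)
class RetrainDecision:
    retrain: bool
    reasons: Tuple[str, ...]


def should_retrain(drift_share: float, accuracy: float) -> RetrainDecision:
    """Apply the two trigger rules; either one alone is enough to flag retraining."""
    reasons = []
    if drift_share >= DRIFT_SHARE_THRESHOLD:
        reasons.append(f"drift_share {drift_share:.2f} >= {DRIFT_SHARE_THRESHOLD:.2f}")
    if accuracy < MIN_ACCURACY:
        reasons.append(f"accuracy {accuracy:.2f} < {MIN_ACCURACY:.2f}")
    return RetrainDecision(retrain=bool(reasons), reasons=tuple(reasons))
```

Returning the reasons alongside the boolean makes it easy to log why a retraining run was flagged.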

☁️ AWS Deployment (Ready)

cd infra
terraform init
terraform apply -var="environment=prod"

Provisions:

  • ECR repository for Docker images
  • S3 bucket for MLflow artifacts
  • SageMaker endpoint with blue/green deployment
  • IAM roles and policies

Project highlights:

  1. 86.85% accuracy on 14,000 real images using EfficientNet-B0

  2. A/B testing: Deterministic per-user hash routing for a consistent user experience, with a split that is configurable at runtime

  3. CI/CD: Every commit is automatically linted, tested, and built into a Docker image via GitHub Actions ✅

  4. Drift detection: Evidently AI detects when production data drifts from the training distribution, triggering automatic retraining

  5. MLflow registry: Complete model versioning with aliases, metrics tracking, and artifact storage

  6. Zero-downtime deployment: Blue/green deployment with instant rollback via a single API call


👨‍💻 Author

Sumith B R, applying for Adobe Machine Learning Engineer (2026 Batch)

GitHub: @sumith25-dev


📄 License

MIT License

About

Production-grade MLOps pipeline for image classification. EfficientNet-B0 trained on 14,000 real images (86.85% accuracy) with FastAPI serving, A/B testing, MLflow model registry, drift detection, and GitHub Actions CI/CD. Deployed on Railway. Built for Adobe ML Engineer application.
