RitAreaSciencePark/sem-image-classifier-api

Reusable Async ML API Platform

Codegen-driven async ML API platform on Kubernetes. Define a service in services/<id>/service.yaml; the generator renders identical deployment shapes for dev (Stencil K3s) and prod (manual kubectl). Ships with SEM image classifiers as reference implementations.

DOI: 10.5281/zenodo.20702007

Stack: BentoML + Redis + KrakenD + PostgreSQL usage tracking. JWT RS256 via shared Authentik (authentik-reusable-ml-services).
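As a quick client-side sanity check, the sketch below decodes the header segment of a token and confirms that it declares RS256. It is only an illustration: it does not verify the signature (the gateway is responsible for that), and the helper names `jwt_header` and `is_rs256` are hypothetical, not part of this repository.

```python
import base64
import json


def jwt_header(token: str) -> dict:
    """Decode the header segment of a compact JWT WITHOUT verifying it."""
    header_b64 = token.split(".")[0]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_rs256(token: str) -> bool:
    """Return True if the token header declares the RS256 algorithm."""
    return jwt_header(token).get("alg") == "RS256"
```

This is handy when debugging a 401 from the gateway: a token whose header names a different algorithm was probably issued by a misconfigured provider.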

Maintainer journey

1. make onboard SERVICE=x MODEL_ID=org/model     → scaffold service.yaml + model stub
2. Edit service.yaml, src/models/, secrets        → hand-written only
3. make render / make render-prod                 → generated/ (never edit)
4. make deploy (dev) or make prod-pack (prod)    → deploy bundle
5. usage-report/run.sh                            → HTML usage report in browser

Hand-written files: service.yaml, src/models/, secrets.local.yaml, prod.overlay.yaml. Everything under generated/ is codegen output.
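The split between hand-written and generated files can be expressed as a small path check, for example in a pre-commit guard that rejects manual edits under generated/. This is a minimal sketch under the rules stated above; the function names are hypothetical and the repository may enforce the rule differently, or not at all.

```python
from pathlib import PurePosixPath

# File names the README lists as hand-written.
HAND_WRITTEN_NAMES = {"service.yaml", "secrets.local.yaml", "prod.overlay.yaml"}


def is_generated(path: str) -> bool:
    """True if any path component is the codegen output directory."""
    return "generated" in PurePosixPath(path).parts


def is_hand_written(path: str) -> bool:
    """True for the files and directories the README marks as hand-written."""
    p = PurePosixPath(path)
    if is_generated(path):
        return False  # codegen output always wins, whatever the file name
    if p.name in HAND_WRITTEN_NAMES:
        return True
    # Anything below src/models/ is a per-model source file.
    return p.parts[:2] == ("src", "models")
```

A guard built on `is_generated` would flag any staged path under services/<id>/generated/ and point the maintainer back to service.yaml and a re-render.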

Architecture

flowchart LR
  Client -->|JWT| KrakenD
  KrakenD --> BentoML
  BentoML --> Redis
  KrakenD --> PostgreSQL
  Authentik -->|M2M tokens| Client

| Service | Namespace | Dev ports (API / Auth PF) |
| --- | --- | --- |
| sem-classifier | sem-classifier | 8080 / 9001 |
| sem-scale-classifier | sem-scale-classifier | 8082 / 9002 |

Quick start (dev)

cp k8s/.env.example k8s/.env          # set GHCR_TOKEN (write:packages PAT)
# Optional: k8s/env/dev/cluster.local.env — SSH/tunnel overrides (see dev-environment-setup.md)

make check-prereqs
make infra-deploy
make deploy-all DEPLOY_ARGS=--rebuild
make configure-all
make access SERVICE=sem-classifier    # repeat per service or see docs
make test-all
make usage-report SERVICE=sem-classifier   # HTML report → /tmp/

Registry handoff (forks)

  1. Set ghcr_owner in ml_platform/config.yaml.
  2. make render-all ENV=dev && make render-prod-all
  3. The first run of make deploy SERVICE=x DEPLOY_ARGS=--rebuild creates the GHCR package.
  4. Set each package to Public on GitHub so that K3s can pull it without imagePullSecrets.

Image path (dev = prod): ghcr.io/<ghcr_owner>/<service-id>:latest
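The image path pattern above can be written as a one-line helper. This is a sketch, not the generator's code: `image_ref` is a hypothetical name, and lowercasing the owner is an assumption added here because container image repository names must be lowercase.

```python
def image_ref(ghcr_owner: str, service_id: str, tag: str = "latest") -> str:
    """Build ghcr.io/<ghcr_owner>/<service-id>:<tag> (same for dev and prod)."""
    if not ghcr_owner or not service_id:
        raise ValueError("ghcr_owner and service_id must be non-empty")
    # Assumption: normalise the owner, since image repository names are lowercase.
    return f"ghcr.io/{ghcr_owner.lower()}/{service_id}:{tag}"
```

For a fork, `image_ref("my-org", "sem-classifier")` gives the reference that the rendered manifests should contain once ghcr_owner is set and the service is re-rendered.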

Repository layout

sem-classifier-api/
├── ml_platform/           # Generator, templates, config.yaml
├── services/<id>/         # service.yaml, secrets, prod.overlay.yaml
│   └── generated/         # Rendered dev/prod artifacts (do not edit)
├── src/core/              # Shared async pipeline
├── src/models/            # Per-model BentoML services
├── gateway/               # KrakenD flexible config + usage plugin
├── k8s/app.sh             # Dev deploy only
├── k8s/infra.sh           # Shared Authentik (dev)
├── Makefile               # Primary operator interface
└── docs/README.md         # Documentation index

Makefile targets

Run make help for the full list. Common targets:

| Target | Purpose |
| --- | --- |
| make onboard SERVICE=x MODEL_ID=org/model | Scaffold + render + validate + secrets |
| make render SERVICE=x | Generate services/x/generated/dev/ |
| make deploy SERVICE=x DEPLOY_ARGS=--rebuild | Render, build, push GHCR, deploy |
| make fresh SERVICE=x | Delete namespace + rebuild + configure |
| make test-all | E2E all services (reads secrets.local.yaml) |
| make render-prod SERVICE=x | Generate prod bundle |
| make verify-prod SERVICE=x | Prod preflight (must pass) |
| make prod-pack SERVICE=x | Tarball for prod operator |
| make usage-report SERVICE=x | Last 24h API usage HTML report |

New services: docs/adding-a-service.md.

Public API

The KrakenD gateway exposes the following endpoints (the port comes from dev_access.api_port in service.yaml, default 8080):

| Method | Endpoint | Auth |
| --- | --- | --- |
| GET | /__health | No |
| GET | /health | No |
| POST | /api/v1/inference | JWT |
| POST | /api/v1/jobs/status | JWT |
| POST | /api/v1/jobs/results | JWT |
| GET | /api/v1/version | No |
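A client call against these endpoints might be prepared as in the sketch below, using only the standard library. Several details are assumptions rather than facts from this README: that the JWT travels as an Authorization: Bearer header, that request bodies are JSON, and the `job_id` field shown in the usage note. The helper `build_request` is hypothetical; see the repository docs for the real request schemas.

```python
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, method: str, path: str,
                  token: Optional[str] = None,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Prepare (but do not send) a request to the gateway."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(base_url.rstrip("/") + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if token is not None:
        # Assumption: JWT-protected endpoints accept a standard Bearer token.
        req.add_header("Authorization", f"Bearer {token}")
    return req
```

For example, `build_request("http://localhost:8080", "POST", "/api/v1/jobs/status", token=tok, payload={"job_id": "..."})` yields a request that can be passed to `urllib.request.urlopen`, while `build_request("http://localhost:8080", "GET", "/health")` needs no token.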

Production

Production does not use k8s/app.sh. See docs/production-deployment.md:

make render-prod SERVICE=sem-classifier
make verify-prod SERVICE=sem-classifier
make prod-pack SERVICE=sem-classifier
# Operator applies generated/prod/apply-order.txt
# Usage report: cd usage-report && ./run.sh --namespace $NS

Documentation

Full index: docs/README.md — setup, architecture, workers, autoscaling, production, usage reports.
