Image Classification (ImageNet) in DuckDB

A Query.Farm VGI worker for DuckDB.

Image Classification (ImageNet) in DuckDB

vgi-vision · a Query.Farm VGI worker · powered by ONNX Runtime

A VGI worker that brings image classification to DuckDB/SQL. Point it at a column of image bytes (BLOB) — or a file path — and get ImageNet-1k predictions back: a single top label per row, or the top-K labels with confidence scores, as plain SQL scalar and table functions. Inference runs out-of-process on MobileNetV2 via ONNX Runtime (no PyTorch); the model and labels download on first use and cache locally. It is built to be commercial-marketplace-safe — deliberately no AGPL-licensed Ultralytics/YOLO — so every dependency is permissively licensed.

INSTALL vgi FROM community; LOAD vgi;
ATTACH 'vision' (TYPE vgi, LOCATION 'uv run vision_worker.py');

SELECT id, vision.top_label(image)                       FROM photos;
SELECT * FROM vision.classify((SELECT image FROM photos LIMIT 1));      -- top 5
SELECT * FROM vision.classify((SELECT image FROM photos LIMIT 1), 10);  -- top 10
SELECT * FROM vision.image_classes();                                   -- 1000 labels

Inference runs out-of-process on a permissively-licensed ONNX model via onnxruntime. The model + labels are downloaded on first use and cached; nothing large is committed to the repo.

Models & licensing

This worker is built specifically to be safe for a commercial marketplace. It deliberately does not use Ultralytics / YOLOv8, whose code is AGPL-3.0 — unacceptable for proprietary/commercial distribution (the same reason vgi-pdf rejected PyMuPDF). Everything below is permissively licensed.

Component	What	License
Classifier weights	MobileNetV2 (opset 12), trained on ImageNet-1k (1000 classes). Sourced from the ONNX Model Zoo, mirrored on Hugging Face as `onnxmodelzoo/mobilenetv2-12`.	Apache-2.0
Inference runtime	onnxruntime	MIT
Image decoding	Pillow	MIT-CMU (HPND)
Numerics	NumPy	BSD-3-Clause
Labels	ImageNet-1k synset list (`synset.txt`)	Apache-2.0 (ONNX Model Zoo)
This worker	vgi-vision	MIT (see `LICENSE`)

The exact model file is mobilenetv2-12.onnx (~13.3 MB). It is downloaded on first use to ~/.cache/vgi-vision/ (override with VGI_VISION_CACHE_DIR, or point VGI_VISION_MODEL / VGI_VISION_LABELS at local copies). Pre-fetch it with make fetch-model. The weights are never committed (gitignored).

Object detection — future work

Object detection (detect_objects, count_objects, has_object) is not shipped in v1. The popular detection models (YOLOv5/v8) carry AGPL-3.0 or otherwise non-permissive terms, and the clearly-permissive alternatives (e.g. an SSD-MobileNet exported from the Apache-2.0 TensorFlow Object Detection API) lack a single stable, well-labelled ONNX artifact we were comfortable bundling for a commercial product. Rather than ship a model with murky provenance, v1 ships classification only. Detection can be added later behind the same classify-style surface once a vetted Apache/BSD/MIT detection model + labels are pinned.

Function surface

Classification

Function	Kind	Returns	Notes
`classify(image[, top_k])`	table	`(label VARCHAR, confidence DOUBLE)`	Top-k predictions (default `top_k=5`), ordered by `confidence` desc. `image` is a BLOB; a `classify(path[, top_k])` VARCHAR overload reads a file off disk.
`top_label(image)`	scalar	`VARCHAR`	The #1 predicted label. BLOB input; `top_label(path)` VARCHAR overload reads a file.
`image_classes()`	table	`(idx INT, label VARCHAR)`	The model's full 1000-class ImageNet label set.

confidence values are softmax probabilities in [0, 1].

NULL / robustness semantics

Images are untrusted. The worker never crashes on bad input:

NULL image → NULL (top_label) / no rows (classify).
Malformed / non-image / empty bytes → NULL / no rows.
Over-large images (a decompression-bomb guard caps decode at 64 MP) → NULL / no rows.
A genuine model-load failure (offline, disk) raises a clear, actionable error telling you how to fetch the model — it is not silently swallowed per row.

Performance / warm-up

The ONNX session + labels load once per worker process and are cached for its lifetime, so the cost is amortised across every row of every query. The worker also warms the model at startup (vision_worker.py's run() calls model.warm_up()), so the first query of an ATTACH is fast and the end-to-end SQL suite stays deterministic under load.

Development

export PATH="$HOME/.local/bin:$PATH"
uv sync --extra dev
make fetch-model            # download + cache the model and labels (one-off)
uv run --no-sync pytest -q  # unit + integration tests
make test-sql               # DuckDB end-to-end (.test) over a tiny generated PNG
uv run --no-sync ruff check . && uv run --no-sync mypy vgi_vision/

Model-dependent tests skip cleanly when the model isn't present (e.g. offline), so a bare checkout still goes green; they run for real once make fetch-model (or any first query) has populated the cache.

Layout

vision_worker.py        # worker entry point (PEP 723 header) + startup warm-up
vgi_vision/
  model.py              # pure inference: download/cache, warm-up, preprocess, classify
  scalars.py            # top_label(image) / top_label(path)
  tables.py             # classify(...), image_classes()
  schema_utils.py       # field() column-comment helper
tests/                  # pytest: pure logic + in-process scalar/table + Client E2E
test/sql/               # DuckDB sqllogictest .test files

Authorship & License

Written by Query.Farm.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github/workflows		.github/workflows
ci		ci
docs		docs
test/sql		test/sql
tests		tests
vgi_vision		vgi_vision
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
conftest.py		conftest.py
pyproject.toml		pyproject.toml
uv.lock		uv.lock
vision_worker.py		vision_worker.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Image Classification (ImageNet) in DuckDB

Models & licensing

Object detection — future work

Function surface

Classification

NULL / robustness semantics

Performance / warm-up

Development

Layout

Authorship & License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Image Classification (ImageNet) in DuckDB

Models & licensing

Object detection — future work

Function surface

Classification

NULL / robustness semantics

Performance / warm-up

Development

Layout

Authorship & License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages