doc

PDF → Markdown via docling, with a live progress bar and MiniCPM-V 4.6 figure captions.

Two scripts, ~250 lines. No package, no framework.

doc — drives docling page-by-page so you get a real progress bar (page 47/232 · ETA 3m) and crash-resume. Output lands next to the source PDF.
minicpm-describe.py — captions every figure with MiniCPM-V 4.6 (MLX, Apple Silicon). Runs in its own ephemeral uv env because mlx-vlm and docling pin incompatible transformers; doc --describe invokes it as a subprocess.

Usage

doc book.pdf                 # standard pipeline: layout + OCR (Apple Vision if scanned) + tables
doc --vlm book.pdf           # Granite-Docling MLX VLM — one model does layout+OCR+tables
doc --vlm --resume book.pdf  # continue an interrupted run
doc --describe book.pdf      # standard OCR + a MiniCPM-V 4.6 caption of each figure, inline
doc --describe --short book.pdf   # terser captions (~1.9x faster)
doc --out DIR book.pdf       # output directory (default: next to the source PDF)

Setup

# docling, with the python that doc's shebang points at:
uv tool install docling

# symlink the command (this repo is the source of truth):
ln -s "$PWD/doc" ~/.local/bin/doc

doc --describe and --vlm need Apple Silicon (MLX). MiniCPM-V 4.6 weights download on first use into HF_HUB_CACHE (defaults to ~/models). The captioner env is resolved automatically by uv run from the PEP-723 header in minicpm-describe.py — nothing to install.

Why MiniCPM-V 4.6

Replaced moondream2 after a head-to-head on real decks: both transcribe chart numbers, but MiniCPM stays grounded where moondream confabulates a business narrative and misreads a matrix as a "bar chart". MiniCPM is built for text-bearing images and runs on MLX (no Ollama, no API).

Layout

doc                   # the CLI (docling page-by-page + resume + describe flow)
minicpm-describe.py   # PEP-723 mlx-vlm figure captioner (isolated env)

This repo was once a multi-provider pdftoolkit package with a benchmark harness; it collapsed to the two scripts that are the actual tool. That history is in git (git log --follow).

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.env.example		.env.example
.gitattributes		.gitattributes
.gitignore		.gitignore
.python-version		.python-version
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
README.md		README.md
doc		doc
minicpm-describe.py		minicpm-describe.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

doc

Usage

Setup

Why MiniCPM-V 4.6

Layout

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

doc

Usage

Setup

Why MiniCPM-V 4.6

Layout

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages