Releases are published with every new version: https://github.com/LegeApp/Lege/releases/
Lege is a document-processing program (CLI + desktop GUI) that converts scanned documents into reader-optimized PDF or DjVu, focusing on better readability, smaller output size, and fast page turns on e-ink devices. It uses optional layout-aware processing to detect image areas so that they can be excluded from the text binarization process, which makes the original scanned documents readable on e-ink readers with small file size.
Lege targets two kinds of input: the output of commercial book-scanning utilities (image folders of JPEG or PNG files), and downloads from the Internet Archive (PDF, JP2 zip, or image folder), since the Internet Archive is the largest digital repository of scanned books and documents. If there is something old you want to read on e-ink, it is probably on Archive.org, but as yellowed, aged page scans in a file of around 500MB. Lege is for those files. Further information is in the in-program documentation file.
- CLI: guided interactive mode (no args) + direct command modes
- GUI: Freya desktop app using the same processing core; queue-based workflow with progress + cancel
git clone https://github.com/LegeApp/Lege.git
cd Lege
cargo build --release
You’ll get:
- CLI: target/release/lege
- GUI: target/release/lege-gui
# simplest: optimized PDF output
lege input.pdf
# DjVu output (optionally with OCR)
lege input.pdf --output-format djvu --ocr
# process a page range
lege input.pdf --pages 10-50
The CLI also supports an interactive guided mode when run without arguments.
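The `--pages 10-50` form shown above suggests a START-END range. As a rough sketch of how such an argument could be parsed, assuming 1-based inclusive page numbers (`parse_page_range` is a hypothetical helper, not Lege's actual parser, which may accept other forms):

```rust
/// Parse a "START-END" page range such as "10-50" into inclusive bounds.
/// Hypothetical helper for illustration; assumes 1-based page numbers.
fn parse_page_range(s: &str) -> Result<(u32, u32), String> {
    let (a, b) = s
        .split_once('-')
        .ok_or_else(|| format!("expected START-END, got {s:?}"))?;
    let start: u32 = a.trim().parse().map_err(|e| format!("bad start page: {e}"))?;
    let end: u32 = b.trim().parse().map_err(|e| format!("bad end page: {e}"))?;
    // Reject page 0 and reversed ranges.
    if start == 0 || end < start {
        return Err(format!("invalid range {start}-{end}"));
    }
    Ok((start, end))
}
```

Returning a `Result` lets a caller report a malformed range before any rendering starts.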
- PDF files (with optional page range selection)
- Image-folder mode for sequential page images (used for batch/page-image workflows)
- Debug modes for exporting rendered pages / crops (useful for model and pipeline inspection)
- PDF: mixed region encoding (compressed bi-level text + preserved image regions as overlays)
- DjVu: native Rust encoder with JB2 (bi-level) + IW44 (continuous-tone) layering
Lege requires several external files to be placed alongside the executables:
ONNX Models (AI inference):
- yolo-layout.onnx - Layout detection (Linux production model)
- paddle-layout.onnx - Layout detection (Windows and macOS model)
- paddle-rotate.onnx - Page orientation detection
- paddle-deskew.onnx - Page deskew correction
- sauvola.onnx - Heavy neural binarization model
Platform-specific GPU libraries:
Windows:
- DirectML.dll - DirectML acceleration provider
- onnxruntime.dll - ONNX Runtime main library
- onnxruntime_providers_shared.dll - Shared provider library
- pdfium.dll - PDF rendering engine
Linux:
- libonnxruntime.so - ONNX Runtime
- libonnxruntime_providers_shared.so - Provider library
- libwebgpu_dawn.so - WebGPU/Vulkan backend
- libpdfium.so - PDF rendering engine
- eng.traineddata - Tesseract English language data (for OCR)
macOS:
- libonnxruntime.dylib - ONNX Runtime
- libpdfium.dylib - PDF rendering engine
- Tesseract language data (system installation)
Lege is an end-to-end document transformation system with distinct pipelines for PDF and DjVu output.
- Render pages (PDF → images) using PDFium (with thread-safety guardrails).
- Layout inference (optional): run an ONNX layout model on a low-res render; map detections into text-like vs image-like buckets.
- Region processing
  - Text regions: binarize + encode with bi-level codecs
  - Image regions: preserve/encode separately; composite as overlays where applicable
  - Optional OCR integration at region or page level
- Assemble output
  - PDF writer actor: ordered page finalize into a single PDF
  - DjVu writer actor: out-of-order page submission + multipage finalize
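The "ordered page finalize" step above implies that pages finishing out of order must be held until their predecessors arrive. One common way to do that is a reorder buffer. The sketch below is illustrative only: `OrderedWriter`, its fields, and the `Vec<u8>` payload type are assumptions, not Lege's actual writer actor.

```rust
use std::collections::BTreeMap;

/// Hypothetical reorder buffer: accepts finished pages in any order and
/// releases them strictly by ascending page index.
struct OrderedWriter {
    next: usize,                       // index of the next page to emit
    pending: BTreeMap<usize, Vec<u8>>, // pages that arrived early
    written: Vec<(usize, Vec<u8>)>,    // stand-in for the real output file
}

impl OrderedWriter {
    fn new() -> Self {
        Self { next: 0, pending: BTreeMap::new(), written: Vec::new() }
    }

    /// Submit an encoded page, then flush every page that is now contiguous.
    fn submit(&mut self, index: usize, payload: Vec<u8>) {
        self.pending.insert(index, payload);
        while let Some(page) = self.pending.remove(&self.next) {
            self.written.push((self.next, page));
            self.next += 1;
        }
    }
}
```

Submitting pages 2, 0, 1 in that order writes page 0 as soon as it arrives, then pages 1 and 2 together once page 1 shows up. A writer that accepts out-of-order submission, as described for DjVu, would not need the `pending` map.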
The PDF pipeline is implemented as a multi-stage async pipeline with bounded channels and configurable concurrency:
- render → inference → CPU page processing → ordered writer/finalizer
- supports page ranges and optional two-pass margin normalization
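To illustrate what bounded channels buy in a staged pipeline like the one above, here is a minimal sketch using standard-library threads and `sync_channel` in place of an async runtime. The stage bodies are placeholders and `run_pipeline` is a hypothetical name; this is not how Lege's stages are actually wired.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Run a three-stage pipeline (render -> process -> write) over `pages` page
/// indices. `capacity` bounds each channel, so a fast stage blocks instead of
/// buffering without limit (backpressure).
fn run_pipeline(pages: u32, capacity: usize) -> Vec<(u32, usize)> {
    let (render_tx, render_rx) = sync_channel::<(u32, Vec<u8>)>(capacity);
    let (proc_tx, proc_rx) = sync_channel::<(u32, usize)>(capacity);

    let renderer = thread::spawn(move || {
        for page in 0..pages {
            // Placeholder "render": a buffer whose size depends on the page.
            let pixels = vec![0u8; (page as usize + 1) * 4];
            if render_tx.send((page, pixels)).is_err() {
                break; // downstream hung up
            }
        }
    });

    let processor = thread::spawn(move || {
        for (page, pixels) in render_rx {
            // Placeholder "processing": report the buffer length.
            if proc_tx.send((page, pixels.len())).is_err() {
                break;
            }
        }
    });

    // The writer stage runs on the calling thread. Pages arrive in order
    // here only because each stage is single-threaded.
    let out: Vec<(u32, usize)> = proc_rx.into_iter().collect();
    renderer.join().unwrap();
    processor.join().unwrap();
    out
}
```

With `capacity` set to 2, the renderer can be at most a couple of pages ahead of processing, which caps the number of rendered page buffers held in memory at once.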
The DjVu pipeline is separate, to match DjVu constraints:
- similar render/inference conceptually
- produces DjVu page payloads submitted to a DjVu writer actor
- supports layered JB2/IW44 output, and optional hidden text
Lege can run layout detection to segment a page into regions and apply different encoding strategies. The exact classes depend on the model used (the existing README references a PaddleX-style detector).
When layout detection is disabled, Lege follows a more uniform “whole-page” processing strategy.
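A minimal sketch of the bucket mapping and the whole-page fallback described above. The class label strings, the `Detection` struct, and the choice to treat an undetected page as a single text region are all assumptions for illustration; the real classes depend on the model in use.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Bucket {
    Text,
    Image,
}

/// Bounding box as (x, y, width, height) in page pixels.
type BBox = (u32, u32, u32, u32);

/// Hypothetical detector output for one region.
struct Detection {
    label: String,
    bbox: BBox,
}

/// Map a detector class label to a processing bucket.
/// The label names are illustrative, not taken from any specific model.
fn bucket_for(label: &str) -> Bucket {
    match label {
        "figure" | "image" | "chart" | "photo" => Bucket::Image,
        _ => Bucket::Text, // paragraphs, titles, and unknown classes
    }
}

/// Plan the regions for a page. `None` means layout detection is disabled,
/// in which case the whole page is handled as one region (assumed: text).
fn plan_regions(page_w: u32, page_h: u32, detections: Option<&[Detection]>) -> Vec<(Bucket, BBox)> {
    match detections {
        Some(dets) => dets.iter().map(|d| (bucket_for(&d.label), d.bbox)).collect(),
        None => vec![(Bucket::Text, (0, 0, page_w, page_h))],
    }
}
```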
- Text-like regions are typically converted to 1-bit (bi-level) using adaptive binarization logic in the encoding layer.
- Image-like regions can be preserved/encoded separately and overlaid onto the output (so photos/diagrams don’t get crushed into 1-bit).
Dithering can be used for halftone/image handling depending on the chosen mode and encoder strategy.
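For readers unfamiliar with adaptive binarization, the sketch below shows the classical Sauvola local threshold, T = m · (1 + k · (s / R − 1)), where m and s are the mean and standard deviation of a window around each pixel. It is offered as a generic illustration of the technique, not as Lege's encoding-layer logic or its neural model; the function name, the naive windowing, and the parameter values are assumptions.

```rust
/// Sauvola-style local threshold on an 8-bit grayscale image (row-major).
/// Returns `true` for pixels classified as ink. Naive windowed version
/// written for clarity, not speed.
fn binarize_sauvola(gray: &[u8], width: usize, height: usize, radius: usize, k: f64) -> Vec<bool> {
    assert_eq!(gray.len(), width * height);
    const R: f64 = 128.0; // assumed dynamic range of the std deviation for 8-bit input
    let mut out = vec![false; gray.len()];
    for y in 0..height {
        for x in 0..width {
            // Clamp the window to the image borders.
            let (x0, x1) = (x.saturating_sub(radius), (x + radius).min(width - 1));
            let (y0, y1) = (y.saturating_sub(radius), (y + radius).min(height - 1));
            let (mut sum, mut sum_sq, mut n) = (0.0f64, 0.0f64, 0.0f64);
            for yy in y0..=y1 {
                for xx in x0..=x1 {
                    let v = gray[yy * width + xx] as f64;
                    sum += v;
                    sum_sq += v * v;
                    n += 1.0;
                }
            }
            let mean = sum / n;
            let std_dev = (sum_sq / n - mean * mean).max(0.0).sqrt();
            let threshold = mean * (1.0 + k * (std_dev / R - 1.0));
            out[y * width + x] = (gray[y * width + x] as f64) < threshold;
        }
    }
    out
}
```

Because the threshold follows the local mean, a yellowed page background is pushed to white as long as the text is darker than its surroundings, which is the property that matters for aged scans.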
OCR is optional:
- Linux/macOS: Tesseract
- Windows: WinRT OCR
Strategy:
- prefer bounded region OCR when layout segmentation is workable
- fall back to tiled or full-page OCR as needed
- when OCR is disabled, Lege can optionally reuse/extract text from PDFs that already have a text layer to synthesize a text overlay where possible
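The strategy above reads as a small decision procedure. The following sketch encodes one plausible version of it. The `PageInfo` fields, the tile-size cutoff, and the rule for choosing tiled over full-page OCR are assumptions, and the sketch always reuses an existing text layer when OCR is off, whereas Lege treats that as optional.

```rust
/// Where the hidden text for a page comes from.
#[derive(Debug, PartialEq)]
enum TextSource {
    RegionOcr,
    TiledOcr,
    FullPageOcr,
    ExistingPdfText,
    NoText,
}

/// Hypothetical per-page facts the decision depends on.
struct PageInfo {
    ocr_enabled: bool,
    text_regions: usize,  // text-like regions found by layout detection
    has_text_layer: bool, // source PDF already carries extractable text
    width_px: u32,
    height_px: u32,
}

/// Assumed cutoff above which a full-page pass is split into tiles.
const TILE_THRESHOLD_PX: u64 = 4_000_000;

fn choose_text_source(p: &PageInfo) -> TextSource {
    if !p.ocr_enabled {
        // OCR off: reuse the PDF's own text layer when there is one.
        return if p.has_text_layer { TextSource::ExistingPdfText } else { TextSource::NoText };
    }
    if p.text_regions > 0 {
        return TextSource::RegionOcr; // preferred: bounded region OCR
    }
    // No usable segmentation: fall back based on page size.
    if u64::from(p.width_px) * u64::from(p.height_px) > TILE_THRESHOLD_PX {
        TextSource::TiledOcr
    } else {
        TextSource::FullPageOcr
    }
}
```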
Lege uses a dedicated encoding crate (Legencode) for in-memory processing and multiple output encoders, and a dedicated native DjVu encoder (DJVULibRust) for DjVu generation.
- JBIG2 (via a Rust port under Legencode)
- CCITT Group 4 (fax-style bi-level compression)
- JPEG2000 (used for cover/photo regions in common paths)
- DjVu IW44 (continuous-tone layer inside DjVu)
- Concurrent pipeline with bounded channels/backpressure
- Cancellation + progress tracking shared by CLI and GUI
- Runtime dependency discovery (models/libs) via executable-adjacent paths, env vars, and platform fallback dirs
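The discovery order in the last bullet (executable-adjacent paths, then environment variables, then platform fallback directories) could be sketched as follows. The environment variable name `LEGE_MODEL_DIR` and both function names are hypothetical; the actual variable names and fallback locations are not stated here.

```rust
use std::path::PathBuf;

/// Return the full path of `file_name` in the first candidate directory
/// that contains it as a regular file.
fn find_in_dirs(file_name: &str, dirs: &[PathBuf]) -> Option<PathBuf> {
    dirs.iter().map(|d| d.join(file_name)).find(|p| p.is_file())
}

/// Build the search order: directory of the running executable, then the
/// directory named by an environment variable, then caller-supplied fallbacks.
/// `LEGE_MODEL_DIR` is a made-up name used only for this sketch.
fn candidate_dirs(fallbacks: &[&str]) -> Vec<PathBuf> {
    let mut dirs = Vec::new();
    if let Ok(exe) = std::env::current_exe() {
        if let Some(parent) = exe.parent() {
            dirs.push(parent.to_path_buf());
        }
    }
    if let Some(dir) = std::env::var_os("LEGE_MODEL_DIR") {
        dirs.push(PathBuf::from(dir));
    }
    dirs.extend(fallbacks.iter().map(|s| PathBuf::from(*s)));
    dirs
}
```

A caller would combine the two, for example `find_in_dirs("paddle-layout.onnx", &candidate_dirs(&["/usr/share/lege"]))`, where the fallback path is again only an example.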
Lege is a Rust workspace with multiple crates:
- src/ — main app + pipeline orchestration (CLI core)
- Legencode/ — encoding + binarization + region utilities
- DJVULibRust/ — native DjVu encoder crate
- GUI/Freya/ — desktop GUI frontend
AGPL-3.0. See LICENSE. Third-party licenses are documented under docs/.

