docOCR is a macOS command-line OCR tool that converts document images into Markdown text. It can run as a batch CLI tool or as a local HTTP server for browser uploads and API clients.
- Converts image files to Markdown text.
- Writes batch OCR output next to each source image using the same basename and a .md extension.
- Converts detected paragraphs, lists, and tables into Markdown when Apple's document recognition API identifies them.
- Provides a local web UI for uploading an image and viewing OCR output.
- Provides a JSON API for image upload and OCR response.
- Uses Apple's RecognizeDocumentsRequest API, available on macOS 26+.
- Performs OCR locally on the Mac. OCR recognition does not require sending images to an external network service.
The HTTP server is implemented with Vapor.
- macOS 26 or later.
- Xcode / Swift toolchain that supports the package's Swift tools version.
- Network access may be required during the first build so Swift Package Manager can fetch dependencies such as Vapor.
Show help:
docOCR -h
docOCR --help
Show version:
docOCR -V
docOCR --version
Convert image files to Markdown:
docOCR ~/Desktop/book_imgs/*.jpg
This prints the OCR Markdown text to the terminal.
Write Markdown files next to the source images:
docOCR -o ~/Desktop/book_imgs/*.jpg
Each input file is written as a Markdown file next to the image:
~/Desktop/book_imgs/01.jpg -> ~/Desktop/book_imgs/01.md
~/Desktop/book_imgs/02.jpg -> ~/Desktop/book_imgs/02.md
Existing .md files with the same name are overwritten.
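Because existing files are overwritten without a prompt, it can help to check first. The mapping shown above amounts to swapping the image extension for .md. The following is a minimal sketch under that assumption; the helper names md_path_for and would_overwrite are hypothetical and not part of docOCR, and how docOCR treats file names with several dots is not stated here.

```shell
#!/bin/sh
# md_path_for is a hypothetical helper, not part of docOCR.
# It mirrors the documented mapping: same directory, same basename, .md extension.
# Assumes the file name has an extension; only the last one is replaced.
md_path_for() {
  printf '%s\n' "${1%.*}.md"
}

# would_overwrite (also hypothetical) prints every .md file that already
# exists and would therefore be replaced by a `docOCR -o` run on the inputs.
would_overwrite() {
  for img in "$@"; do
    md=$(md_path_for "$img")
    if [ -e "$md" ]; then
      printf '%s\n' "$md"
    fi
  done
}
```

For example, `would_overwrite ~/Desktop/book_imgs/*.jpg` prints nothing when no Markdown files are at risk.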
Start the HTTP server:
docOCR -s
By default, the server listens on port 8080.
Use a custom port:
docOCR -s -p 8000
The -s and -o modes are mutually exclusive.
When the server is running, open:
http://0.0.0.0:8080
If you start the server with a custom port, use that port instead:
http://0.0.0.0:8000
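The two URLs differ only in the port. As a small sketch, a script that starts the server could derive the URL from the same port value it passes to -p. The helper name dococr_url is hypothetical, and it uses the loopback address 127.0.0.1 (as the curl examples in this document do) rather than 0.0.0.0; that choice is an assumption about how you reach the server from the same Mac.

```shell
#!/bin/sh
# dococr_url is a hypothetical helper, not part of docOCR.
# Prints the local URL for a server started with `docOCR -s [-p PORT]`.
# Falls back to 8080, the documented default port.
dococr_url() {
  port=${1:-8080}
  case $port in
    *[!0-9]*)
      echo "invalid port: $port" >&2
      return 1
      ;;
  esac
  printf 'http://127.0.0.1:%s\n' "$port"
}
```

On macOS, `open "$(dococr_url 8000)"` would then open the web UI in the default browser.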
The web page uses:
POST /upload
This route is intended for browser form uploads and returns an HTML result page.
Use the JSON API endpoint:
POST /api/ocr
Example:
curl -X POST http://127.0.0.1:8000/api/ocr \
-F "[email protected]"
The API also accepts image as the multipart field name:
curl -X POST http://127.0.0.1:8000/api/ocr \
-F "[email protected]"
Successful response:
{
"success": true,
"message": "OK",
"text": "OCR text..."
}
Error response:
{
"success": false,
"message": "Error message",
"text": ""
}
Build a debug executable:
swift build
Build a release executable:
swift build -c release
The release binary is generated at:
.build/release/docOCR
Build the release binary:
swift build -c release
Install it somewhere on your PATH, for example:
install -m 755 .build/release/docOCR /usr/local/bin/docOCR
Then run:
docOCR -h
If /usr/local/bin is not writable or not on your PATH, choose another directory such as ~/bin and make sure that directory is included in your shell PATH.
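To check whether a candidate install directory is already on your PATH, a sketch like the following may help. The helper name dir_on_path is hypothetical and not part of docOCR. It compares exact PATH entries, so an entry written with a trailing slash or an unexpanded ~ would not match.

```shell
#!/bin/sh
# dir_on_path is a hypothetical helper, not part of docOCR.
# Succeeds if the given directory appears as an exact entry in $PATH.
# Wrapping both sides in colons lets the first and last entries match too.
dir_on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `dir_on_path "$HOME/bin" || echo 'add ~/bin to PATH in your shell profile'` prints a reminder only when the directory is missing.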
Run directly with SwiftPM:
swift run docOCR -o ~/Desktop/book_imgs/*.jpg
swift run docOCR -s -p 8000
docOCR can also be used with the Shortcuts app on macOS to turn a screenshot into Markdown text.
In this workflow, the shortcut captures a screen selection, passes the screenshot image path to docOCR, reads the Markdown text from stdout, copies it to the clipboard, and then lets you paste the result into any text editor.
Then run the macOS shortcut:
The shortcut flow is:
- Capture a screenshot.
- Save the screenshot as a temporary image file.
- Run docOCR <screenshot-image-path>.
- Read the OCR Markdown text from stdout and copy it to the clipboard.
- Paste the result into your editor.
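The OCR and clipboard steps of this flow could be scripted roughly as follows, for example inside a shell-script action of the shortcut. This is a sketch, not the shortcut's actual contents: ocr_to_clipboard is a hypothetical helper, and DOCOCR_BIN and CLIP_CMD are illustrative override variables rather than docOCR options. By default it calls docOCR from the PATH and the macOS pbcopy command.

```shell
#!/bin/sh
# ocr_to_clipboard is a hypothetical helper, not part of docOCR.
# DOCOCR_BIN and CLIP_CMD are illustrative overrides; they default to
# docOCR on the PATH and the macOS pbcopy clipboard command.
ocr_to_clipboard() {
  image_path=$1
  if [ ! -f "$image_path" ]; then
    echo "image not found: $image_path" >&2
    return 1
  fi
  # docOCR prints the Markdown to stdout; capture it first so a failed
  # OCR run does not replace the clipboard contents.
  markdown=$("${DOCOCR_BIN:-docOCR}" "$image_path") || return 1
  printf '%s\n' "$markdown" | "${CLIP_CMD:-pbcopy}"
}
```

Call it as `ocr_to_clipboard /path/to/screenshot.png`. If the shortcut's shell environment does not include your install directory in PATH, setting DOCOCR_BIN to the binary's absolute path is one way around that.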
Alternatively, the shortcut can call the /api/ocr API instead of running docOCR directly. Start the local server first:
docOCR -s
If you use Codex, you can install the companion skill for docOCR: dococr-skill
The skill gives Codex reusable context for docOCR CLI usage, local HTTP API calls, OCR execution, and troubleshooting.



