ColCh/paperless-ngx-skill

paperless-ngx-skill

A reusable skill + Python helpers for driving the Paperless-NGX document-management REST API from the command line via openapi-to-cli (ocli).

Paperless-NGX exposes a full OpenAPI schema, and ocli turns that schema into a set of generated CLI commands automatically. This repo wraps ocli in a thin Python CLI for command discovery, inspection, and execution, and adds standalone urllib helpers for the few operations ocli cannot handle correctly — document upload and array-field PATCH. It's aimed at anyone scripting Paperless (bulk imports, metadata cleanup, task triage) and at AI agents that need a reliable, schema-driven interface to a Paperless instance.

Everything is configured from the environment — nothing is hardcoded, and there is no real instance URL anywhere in this repo.

✨ Features

  • 🔎 Command discovery — search the generated command surface by natural language (BM25 ranking) or by regex, then inspect any command's parameters before running it.
  • ▶️ Execute any endpoint — run any Paperless API command through ocli with --page, --page_size, --ordering, and Django-ORM-style filters (--name__icontains, --tags__id__all, …). All responses are JSON, ready to pipe through jq.
  • ⬆️ Document upload (single + batch): pcli_upload.py posts one or many files via real multipart/form-data, applying metadata (title, correspondent, document type, tags, storage path, ASN, created date) at upload time. It polls the consume task and prints the resulting document_id, or returns the task UUID immediately with --no-wait.
  • 🩹 Reliable array-field PATCH: pcli_update.py sends a proper JSON PATCH for fields ocli mangles (notably the tags array), replaces the full tag list, and verifies the stored state with a follow-up GET (order-insensitive on tags).
  • 🗑️ Delete — single-document delete works directly through ocli; the skill documents the HTTP 204 "empty body = success" behavior and the verify-with-GET-404 follow-up.
  • 🧾 Task & duplicate handling — read the background task queue (/api/tasks/), distinguish a real consume FAILURE from a SHA256 duplicate, and acknowledge failed-task rows (with the right permission) — including the array-serialization workaround ocli needs for the acknowledge call.
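To make the array-field PATCH bullet concrete, here is a minimal standard-library sketch of the idea: send tags as a real JSON integer array, then read the document back and compare tag lists without regard to order. It is not the code in pcli_update.py; the function names (build_tags_patch, tags_match, patch_and_verify) are hypothetical, and the /api/documents/<id>/ path is assumed from the Paperless-NGX REST API.

```python
import json
import urllib.request


def build_tags_patch(base_url: str, token: str, doc_id: int,
                     tags: list[int]) -> urllib.request.Request:
    """Build a PATCH request whose body carries tags as a JSON integer array."""
    body = json.dumps({"tags": [int(t) for t in tags]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/documents/{doc_id}/",  # assumed path
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


def tags_match(requested: list[int], stored: list[int]) -> bool:
    """Order-insensitive comparison of requested and stored tag lists."""
    return sorted(requested) == sorted(stored)


def patch_and_verify(base_url: str, token: str, doc_id: int,
                     tags: list[int]) -> bool:
    """Send the PATCH, then GET the document and compare tags.

    Needs a live instance; a 200 OK on the PATCH alone is not trusted.
    """
    with urllib.request.urlopen(build_tags_patch(base_url, token, doc_id, tags)) as resp:
        resp.read()
    get_req = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/documents/{doc_id}/",
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(get_req) as resp:
        stored = json.load(resp)["tags"]
    return tags_match(tags, stored)
```

Because the PATCH replaces the whole tag list, a caller that wants to add one tag would first read the current list and send the union.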

🤔 Why this exists

ocli is excellent for read-style and scalar-argument endpoints, but it has two hard limitations against the Paperless API:

  1. It cannot serialize arrays in PATCH/PUT bodies. A command like pn_api_documents_id_patch passes tags as a string rather than a JSON integer array. The API often returns 200 OK while the data silently does not change.
  2. It cannot build multipart bodies, so it cannot upload a document at all.

Rather than abandon ocli (which gives you the whole schema for free), this repo keeps ocli for everything it does well and routes only those two cases through small, dependency-free urllib helpers — pcli_upload.py for multipart uploads and pcli_update.py for JSON PATCH with read-back verification. You get full API coverage without silent data-loss footguns.
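For readers curious what "real multipart/form-data" involves without third-party packages, the following is a rough sketch of a body builder using only the standard library. It is an illustration, not the contents of pcli_upload.py: build_multipart is a hypothetical name, and the form field names (document, title, correspondent, tags) are assumptions based on the Paperless-NGX upload endpoint.

```python
import mimetypes
import uuid


def build_multipart(fields: list[tuple[str, str]], file_field: str,
                    filename: str, content: bytes) -> tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data request.

    Repeated field names (e.g. several "tags") are emitted as separate parts.
    Filenames containing double quotes are not escaped in this sketch.
    """
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields:
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"'.encode(),
        f"Content-Type: {mime}".encode(),
        b"",
        content,
        f"--{boundary}--".encode(),  # closing delimiter
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


# Example payload; the field names here are assumptions, not confirmed by this repo.
body, content_type = build_multipart(
    [("title", "Invoice 2026-05"), ("correspondent", "40"),
     ("tags", "54"), ("tags", "55")],
    file_field="document",
    filename="invoice.pdf",
    content=b"%PDF-1.4 placeholder bytes",
)
```

The resulting body and content_type would be passed to urllib.request.Request as data and the Content-Type header, alongside the Authorization: Token header described under Configuration.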

📦 Requirements

| Dependency | Notes |
| --- | --- |
| ocli | Install globally: npm install -g openapi-to-cli. Must be on PATH. |
| Python ≥ 3.10 | Standard library only; the helper scripts have no third-party deps. |
| jq | Optional, but recommended for filtering the JSON output in examples. |
| keyring | Optional; only needed if you store the API token in the system keyring instead of an env var. |
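A quick way to check the two hard requirements before running anything is a small preflight like the one below. This is only a sketch; preflight is a hypothetical helper and is not part of the repo's scripts.

```python
import shutil
import sys


def preflight(min_python: tuple[int, int] = (3, 10)) -> list[str]:
    """Return human-readable problems; an empty list means the basics are in place."""
    problems: list[str] = []
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    if shutil.which("ocli") is None:
        problems.append("ocli not found on PATH (npm install -g openapi-to-cli)")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem, file=sys.stderr)
```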

⚙️ Configuration

Connection details come entirely from the environment — nothing is hardcoded, and there is no baked-in instance URL. PAPERLESS_URL has no default: if it is not set, the scripts exit with guidance rather than guessing.

| Variable | Required | Default | Purpose |
| --- | --- | --- | --- |
| PAPERLESS_URL | yes | (none, must be set) | Base URL of your instance, e.g. https://paperless.example.com. Used as the API base and the OpenAPI schema source (<URL>/api/schema/). |
| PAPERLESS_API_TOKEN | yes* | (none) | API token, sent as Authorization: Token <value>. Create it in Paperless under Settings → My Account → API Token. |
| PAPERLESS_KEYRING_SERVICE | no | paperless | Keyring service name the token is read from. The keyring username is always PAPERLESS_API_TOKEN. |

* The token may instead live in the system keyring (keyring set <service> PAPERLESS_API_TOKEN). When both the env var and a keyring entry exist, the env var takes precedence. The token value is never printed.
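The lookup order described above (required URL, env var before keyring, configurable service name) could be expressed roughly as follows. This is a sketch under the stated rules, not the scripts' actual code; resolve_config and _keyring_lookup are hypothetical names, and the keyring package is imported lazily because it is optional.

```python
import os
from typing import Callable, Mapping, Optional


def _keyring_lookup(service: str, username: str) -> Optional[str]:
    """Read a secret from the system keyring; None if keyring is not installed."""
    try:
        import keyring  # optional third-party package
    except ImportError:
        return None
    return keyring.get_password(service, username)


def resolve_config(
    env: Mapping[str, str] = os.environ,
    keyring_get: Callable[[str, str], Optional[str]] = _keyring_lookup,
) -> tuple[str, str]:
    """Return (base_url, token), exiting with guidance if either is missing."""
    url = env.get("PAPERLESS_URL", "").strip().rstrip("/")
    if not url:
        raise SystemExit(
            "PAPERLESS_URL is not set; export it, e.g. https://paperless.example.com"
        )
    # The env var takes precedence; the keyring is consulted only as a fallback.
    token = env.get("PAPERLESS_API_TOKEN", "").strip()
    if not token:
        service = env.get("PAPERLESS_KEYRING_SERVICE", "paperless")
        token = keyring_get(service, "PAPERLESS_API_TOKEN") or ""
    if not token:
        raise SystemExit("No API token in PAPERLESS_API_TOKEN or the system keyring")
    return url, token
```

Passing env and keyring_get as parameters keeps the function easy to exercise without touching the real environment or keyring, and the token is returned to the caller but never printed.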

See .env.example for a copy-ready template.

🚀 Quick start

# 1. Point at your instance and provide a token (env var OR keyring).
export PAPERLESS_URL=https://paperless.example.com
export PAPERLESS_API_TOKEN='<your-token>'
#   …or store the token in the keyring instead:
#   keyring set paperless PAPERLESS_API_TOKEN

# 2. Install ocli (one time, global).
npm install -g openapi-to-cli

# 3. Configure the ocli profile from the live schema (one time).
python3 scripts/pcli.py setup

# 4. Discover, inspect, and run commands.
python3 scripts/pcli.py search "list documents by tag"
python3 scripts/pcli.py inspect pn_api_documents
python3 scripts/pcli.py run pn_api_documents --ordering -created --page_size 10 \
  | jq -r '.results[] | "#\(.id) \(.title)"'

# Upload a document with metadata (multipart — NOT ocli).
python3 scripts/pcli_upload.py --file invoice.pdf \
    --title "Invoice 2026-05" --correspondent 40 --tag 54 --tag 55

# Update tags reliably (JSON PATCH with read-back verify — NOT ocli).
python3 scripts/pcli_update.py --id 49 --tags 54,55

pcli.py setup removes any existing paperless profile, re-creates it against $PAPERLESS_URL/api/schema/ with the pn_ command prefix and the Authorization: Token … header, and activates the profile. Re-running setup is the safe fix if the wrong ocli profile is ever active.
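For orientation, the run call in step 4 corresponds roughly to a filtered GET against the documents endpoint. The sketch below shows how such flags might map to a query string; documents_url is a hypothetical helper, and both the one-to-one mapping of ocli flags to query parameters and the comma-separated form of tags__id__all are assumptions rather than something this repo confirms.

```python
from urllib.parse import urlencode


def documents_url(base_url: str, **filters) -> str:
    """Build a /api/documents/ URL from keyword filters, skipping unset ones."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{base_url.rstrip('/')}/api/documents/" + (f"?{query}" if query else "")


url = documents_url(
    "https://paperless.example.com",
    ordering="-created",      # newest first
    page_size=10,
    tags__id__all="54,55",    # assumed comma-separated id list
)
print(url)
# https://paperless.example.com/api/documents/?ordering=-created&page_size=10&tags__id__all=54%2C55
```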

For the full workflow — discovery, common tasks, bulk/async uploads, duplicate detection, deletion, and task acknowledgement — read SKILL.md and references/api_overview.md.

🤖 Using as an AI agent skill

SKILL.md is a Claude / agent skill file: it has skill frontmatter (name, description, trigger keywords) followed by task-oriented instructions an agent can act on directly. Drop this repo into your agent's skills directory (for example, the skills/ folder used by Claude Code or a similar agent) and the agent can discover it via the paperless trigger and use the helper scripts the same way you would from the shell.

The skill encodes the operational knowledge that's easy to get wrong: which operations must bypass ocli, why a 200 OK PATCH can silently no-op, how SHA256 dedup misses semantic duplicates, why a delete returns an empty body, and how to drain the serial consume queue on a bulk import. When invoking the scripts from outside the skill directory, use their absolute paths.
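The point about SHA256 dedup missing semantic duplicates follows from how content hashing works: only byte-identical files share a digest. A small illustration, using made-up byte strings in place of real files (sha256_hex is a hypothetical helper):

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the raw bytes, the kind of checksum a byte-level dedup compares."""
    return hashlib.sha256(data).hexdigest()


# Placeholder contents standing in for real files.
original = b"%PDF-1.4\nInvoice 2026-05 total 100.00\n"
byte_copy = bytes(original)                               # exact copy
re_exported = original + b"% re-saved by another tool\n"  # same invoice, different bytes

exact_copy_detected = sha256_hex(original) == sha256_hex(byte_copy)
re_export_detected = sha256_hex(original) == sha256_hex(re_exported)

print(exact_copy_detected)  # True
print(re_export_detected)   # False
```

A re-scan or re-export of the same paper document therefore passes a checksum-based check and has to be caught some other way, for example by comparing titles, dates, or correspondents before upload.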

🔒 Security

  • Never commit your .ocli/ profile cache or a filled-in .env. The .ocli/ cache contains your live API base URL, the Authorization token header, and a full dump of your instance's OpenAPI schema; a filled .env contains your token and instance URL. Both are already covered by .gitignore (.ocli/, *.ocli/, .env, .env.* except .env.example) — keep it that way.
  • The helper scripts never print the token value, and the only URL that appears anywhere in this repo is the non-resolvable placeholder https://paperless.example.com.
  • Prefer the system keyring over a plaintext env file for the token when you can.

📄 License

MIT.
