337 changes: 337 additions & 0 deletions SKILL.md
@@ -0,0 +1,337 @@
name: mindee-v2-node.js
description:
---

# Mindee API – Node.js SDK Skill Guide

## Overview

The `mindee` npm package is the official Node.js client library for the
[Mindee API](https://app.mindee.com). It lets you send documents (PDFs, images,
URLs) to Mindee's AI-powered document-processing platform and get back
structured, machine-readable data.

This guide covers **API v2**, the current generation of the Mindee platform.

Key capabilities exposed by the library:

- **Extraction** – pull structured fields out of any document type using a
model you configure in the Mindee console.
- **Classification** – route documents to the right workflow by categorising
them automatically.
- **OCR** – retrieve the full plain-text content of a document.
- **Crop** – detect and isolate sub-regions of interest within a page.
- **Split** – segment a multi-page document into logical sub-documents.

All operations follow the same asynchronous enqueue-and-poll pattern: a
document is submitted to a processing queue and the result is retrieved once
the job is complete. The library handles the polling loop for you, or you can
manage it manually when you need finer control.
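
For readers who want to see what the polling loop amounts to, here is a minimal sketch of the enqueue-and-poll pattern. It does not use the `mindee` package: `JobState`, `pollUntilDone`, and `makeFakeJob` are hypothetical stand-ins, and the interval and attempt limits are arbitrary values chosen for illustration. Only the three status strings are taken from this guide.

```typescript
// Hypothetical stand-ins: none of these names come from the mindee package.
type JobStatus = "Processing" | "Processed" | "Failed";

interface JobState<T> {
  status: JobStatus;
  result?: T;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchJob repeatedly until the job is processed, fails, or the
// attempt budget is exhausted.
async function pollUntilDone<T>(
  fetchJob: () => Promise<JobState<T>>,
  intervalMs = 10,
  maxAttempts = 20,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchJob();
    if (job.status === "Processed" && job.result !== undefined) {
      return job.result;
    }
    if (job.status === "Failed") {
      throw new Error("Job failed");
    }
    await sleep(intervalMs);
  }
  throw new Error(`Job not finished after ${maxAttempts} attempts`);
}

// Fake queue: reports "Processing" twice, then "Processed".
function makeFakeJob(): () => Promise<JobState<string>> {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls < 3
      ? { status: "Processing" }
      : { status: "Processed", result: "structured data" };
  };
}
```

Calling `pollUntilDone(makeFakeJob())` resolves on the third fetch. A real integration would replace the fake with a call that fetches the job state from the API.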

## Requirements

### Node.js

Node.js **20.1 or later** is required.

### API Key

You need a Mindee API key.

Refer to the [API Keys documentation](https://docs.mindee.com/integrations/api-keys) for
instructions on creating one.

## Installation

Install the package via npm:

```bash
npm install mindee
```

Optional dependencies are included by default. They enable additional features:

| Feature | Package(s) |
|----------------------------------|-----------------------------------------|
| PDF manipulation & compression | `@cantoo/pdf-lib`, `node-poppler` |
| PDF text extraction | `pdf.js-extract` |
| Image compression | `sharp` |

To skip them for a lighter install:

```bash
npm install mindee --omit=optional
```

## Getting Started

### Creating a Client

All interactions with the Mindee API go through the `Client` class. Instantiate
it with your API key:

```typescript
import * as mindee from "mindee";

// Pass the key explicitly, or omit `apiKey` to have it read from the
// MINDEE_V2_API_KEY environment variable
const mindeeClient = new mindee.Client({ apiKey: "MY_API_KEY" });
```

### Client Options

The constructor accepts an optional `ClientOptions` object:

| Option | Type | Default | Description |
|--------------|--------------|-------------|--------------------------------------------------|
| `apiKey` | `string` | `undefined` | Your Mindee API key. |
| `debug` | `boolean` | `false` | Enable debug-level logging. |
| `dispatcher` | `Dispatcher` | `undefined` | Custom `undici` dispatcher (e.g. for proxy use). |

### API Key via Environment Variable

Instead of hardcoding the key, you can set the `MINDEE_V2_API_KEY` environment
variable and omit `apiKey` from the constructor entirely:

```bash
export MINDEE_V2_API_KEY="MY_API_KEY"
```
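
The fallback described above can be pictured with a small sketch. `resolveApiKey` is a hypothetical helper, not part of the `mindee` package, and the precedence it applies (an explicit key wins over the environment variable) is an assumption made for illustration.

```typescript
// Hypothetical helper, not part of the mindee package.
// Assumption: an explicitly passed key takes precedence over the environment.
function resolveApiKey(
  explicitKey?: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const key = explicitKey ?? env["MINDEE_V2_API_KEY"];
  if (!key) {
    throw new Error("No API key: pass apiKey or set MINDEE_V2_API_KEY");
  }
  return key;
}
```

Failing fast with a clear message when neither source provides a key tends to be easier to debug than an authentication error returned later by the API.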

## Loading an Input Document

Before sending a document to the API, wrap it in one of the input source
classes exported from the `mindee` package. All input sources are passed
directly to the client methods — you never need to call `init()` manually.

### File Path

The most common option. Pass a path string to the file on disk:

```typescript
const filePath = "/path/to/the/file.ext";
const inputSource = new mindee.PathInput({ inputPath: filePath });
```

### Buffer

Use when you already have the file contents in memory as a Node.js `Buffer`.
A filename (with extension) is required:

```typescript
import * as fs from "node:fs";

// readFile() without an encoding already returns a Buffer
const buffer = await fs.promises.readFile("/path/to/the/file.ext");

const inputSource = new mindee.BufferInput({
  buffer: buffer,
  filename: "file.ext",
});
```

### Base64 String

Use when the file is provided as a Base64-encoded string, e.g. from a web upload or an external API.
A filename (with extension) is required:

```typescript
const b64String = "iVBORw0KGgoAAAANSUhEUgAAABgAAA ...";

const inputSource = new mindee.Base64Input({
  inputString: b64String,
  filename: "base64_file.txt",
});
```
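
Because a filename is required, it can help to check what a Base64 string contains before choosing one. The sketch below decodes the string with Node's `Buffer` and tests for the eight-byte PNG signature, which is what the example string above starts with. `looksLikePng` is a hypothetical helper, not part of the `mindee` package.

```typescript
// Hypothetical helper, not part of the mindee package.
// The eight-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function looksLikePng(b64String: string): boolean {
  const bytes = Buffer.from(b64String, "base64");
  return (
    bytes.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((value, index) => bytes[index] === value)
  );
}
```

The same approach extends to other formats by comparing against their own leading bytes, for example `%PDF` for PDF files.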

### Bytes (Uint8Array)

Use when the file content is available as raw bytes.
A filename (with extension) is required:

```typescript
import * as fs from "node:fs";

const inputBytes = new Uint8Array(
  await fs.promises.readFile("/path/to/the/file.ext")
);

const inputSource = new mindee.BytesInput({
  inputBytes: inputBytes,
  filename: "file.ext",
});
```

### Readable Stream

Use when reading from a Node.js `Readable` stream.
A filename (with extension) is required:

```typescript
import * as fs from "node:fs";

const stream = fs.createReadStream("/path/to/the/file.ext");

const inputSource = new mindee.StreamInput({
  inputStream: stream,
  filename: "file.ext",
});
```

### Remote URL

Use to pass an HTTPS URL directly to the Mindee API without downloading the file locally first.
Only HTTPS URLs are accepted:

```typescript
const inputSource = new mindee.UrlInput({ url: "https://example.com/file.ext" });
```
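
Since only HTTPS URLs are accepted, user-supplied URLs can be checked before an input source is built. The sketch below uses the standard `URL` class. `isHttpsUrl` is a hypothetical helper, not part of the `mindee` package, and whatever validation the library performs may differ.

```typescript
// Hypothetical helper, not part of the mindee package.
function isHttpsUrl(candidate: string): boolean {
  try {
    return new URL(candidate).protocol === "https:";
  } catch {
    // new URL() throws on strings that are not absolute URLs
    return false;
  }
}
```

A caller could reject the input early, for instance `if (!isHttpsUrl(url)) throw new Error("HTTPS URL required")`, instead of waiting for the request to fail.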

## Choosing a Product

A product class tells the client which Mindee AI pipeline to use. All product
classes are exported from the `mindee` package under the `product` namespace.

| Product class | Import | Use case |
|----------------------|-------------------------------|----------------------------------------------------------------|
| `Extraction` | `mindee.product.Extraction` | Pull structured fields from any document using a custom model. |
| `Classification` | `mindee.product.Classification` | Sort documents into categories. |
| `Ocr` | `mindee.product.Ocr` | Extract raw text from any image or scanned document. |
| `Crop` | `mindee.product.Crop` | Detect and isolate document borders on each page. |
| `Split` | `mindee.product.Split` | Break a multi-page file into separate logical documents. |

The product class is passed as the first argument to all client methods. You
never instantiate it directly:

```typescript
import * as mindee from "mindee";

const response = await mindeeClient.enqueueAndGetResult(
  mindee.product.Extraction, // <-- product class
  inputSource,
  params,
);
```

## Sending a Request

### Parameters

Every request requires a `params` object. At minimum, `modelId` must be set.
The `modelId` is the ID of the model you configured in the Mindee console:

```typescript
const params = {
  modelId: "MY_MODEL_ID",
};
```

For `Extraction`, additional options are available:

```typescript
const params = {
  modelId: "MY_MODEL_ID",

  // Optional overrides: set to `true` or `false` to override the defaults,
  // or leave `undefined` to keep them.

  // Enhance extraction accuracy with Retrieval-Augmented Generation.
  rag: undefined,
  // Extract the full text content from the document as strings.
  rawText: undefined,
  // Calculate bounding box polygons for all fields.
  polygon: undefined,
  // Calculate confidence scores for all fields.
  confidence: undefined,
};
```
```
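
One way to keep the `params` object tidy is to include only the options that were explicitly set. The sketch below assumes that omitting a key is equivalent to leaving it `undefined`. `ExtractionOverrides` and `buildExtractionParams` are hypothetical names, not part of the `mindee` package; only the option names come from the block above.

```typescript
// Hypothetical types and helper, not part of the mindee package.
interface ExtractionOverrides {
  rag?: boolean;
  rawText?: boolean;
  polygon?: boolean;
  confidence?: boolean;
}

function buildExtractionParams(
  modelId: string,
  overrides: ExtractionOverrides = {},
): { modelId: string } & ExtractionOverrides {
  // Keep only the options that were explicitly set to true or false.
  const explicit = Object.fromEntries(
    Object.entries(overrides).filter(([, value]) => value !== undefined),
  ) as ExtractionOverrides;
  return { modelId, ...explicit };
}
```

For example, `buildExtractionParams("MY_MODEL_ID", { rawText: true })` yields an object with just `modelId` and `rawText`.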

### All-in-One: `enqueueAndGetResult()`

The simplest way to process a document.
The library enqueues the document, polls until the result is ready, and returns it:

```typescript
import * as mindee from "mindee";

const apiKey = "MY_API_KEY";
const filePath = "/path/to/the/file.ext";
const modelId = "MY_MODEL_ID";

// Init a new client
const mindeeClient = new mindee.Client({ apiKey: apiKey });

// Set product parameters
const params = {
  modelId: modelId,
};

// Load a file from disk
const inputSource = new mindee.PathInput({ inputPath: filePath });

// Send for processing and wait for the result
const response = await mindeeClient.enqueueAndGetResult(
  mindee.product.Extraction, // <-- product class
  inputSource,
  params,
);

// Print a string summary
console.log(response.inference.toString());
```

### Webhook Workflow

If you have configured a webhook in the Mindee console, you can pass its ID in the `params` object.
The document is enqueued, and Mindee calls your webhook URL when processing is complete, so no polling is required on your side:

```typescript
import * as mindee from "mindee";

const apiKey = "MY_API_KEY";
const filePath = "/path/to/the/file.ext";
const modelId = "MY_MODEL_ID";
const webhookId = "MY_WEBHOOK_ID";

// Init a new client
const mindeeClient = new mindee.Client({ apiKey: apiKey });

// Set product parameters, including the webhook ID
const params = {
  modelId: modelId,
  webhookIds: [webhookId],
};

// Load a file from disk
const inputSource = new mindee.PathInput({ inputPath: filePath });

// Enqueue the document; the result will be delivered to your webhook
const jobResponse = await mindeeClient.enqueue(
  mindee.product.Extraction, // <-- product class
  inputSource,
  params,
);

// Save the job ID so you can check the status if the webhook never arrives
const jobId = jobResponse.job.id;
console.log(`Enqueued with job ID: ${jobId}`);
```

Tip: persist `jobId` in your database alongside your own record for the document.
If the webhook callback never arrives (e.g. due to a network issue or a misconfigured endpoint), you can use it to check the job status at any time:

```typescript
const jobResponse = await mindeeClient.getJob(jobId);
console.log(`Job status: ${jobResponse.job.status}`);
// Possible statuses: "Processing", "Processed", "Failed"
```
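
Each of the three statuses listed above calls for a different follow-up, which the sketch below makes explicit. `NextStep` and `nextStepFor` are hypothetical names, not part of the `mindee` package, and the actions are only one possible policy. The status strings are the ones listed above.

```typescript
// Hypothetical helper; only the three status strings come from this guide.
type JobStatus = "Processing" | "Processed" | "Failed";
type NextStep = "wait" | "fetch-result" | "report-error";

function nextStepFor(status: JobStatus): NextStep {
  switch (status) {
    case "Processing":
      return "wait"; // check again later
    case "Processed":
      return "fetch-result"; // the inference is ready to be retrieved
    case "Failed":
      return "report-error"; // surface the failure to the caller
  }
}
```

Typing the status as a union lets the compiler flag any branch that is missing from the `switch`.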

When Mindee calls your webhook, the payload it sends can be deserialized using `LocalResponse`:
```typescript
import * as mindee from "mindee";

// `rawPayload` is the raw JSON string body received by your webhook handler
const localResponse = new mindee.v2.LocalResponse(rawPayload);
await localResponse.init();

const response = await localResponse.deserializeResponse(
  mindee.product.ExtractionResponse, // <-- product response class
);

console.log(response.inference.toString());
```

You can also verify the HMAC signature of the webhook payload to ensure it genuinely came from Mindee:
```typescript
const secretKey = "MY_WEBHOOK_SECRET_KEY";
const signature = request.headers["x-mindee-hmac-signature"];

if (!localResponse.isValidHmacSignature(secretKey, signature)) {
  throw new Error("Invalid webhook signature");
}
```
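
To illustrate what an HMAC check of this kind involves, here is a sketch built on Node's `node:crypto` module. It is not the library's implementation. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body, which should be confirmed against the Mindee webhook documentation. `verifyHmacSha256` is a hypothetical name.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper, not the library's implementation.
// Assumption: the signature is a hex-encoded HMAC-SHA256 of the raw body.
function verifyHmacSha256(
  secretKey: string,
  rawPayload: string,
  signatureHex: string,
): boolean {
  const expected = createHmac("sha256", secretKey)
    .update(rawPayload, "utf8")
    .digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The digest must be computed over the exact bytes received. Re-serializing parsed JSON can change whitespace or key order and break the comparison.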
19 changes: 9 additions & 10 deletions tests/input/sources.spec.ts
@@ -22,10 +22,9 @@ import { RESOURCE_PATH, V1_PRODUCT_PATH } from "../index.js";
 describe("Input Sources - load different types of input", () => {

   it("should accept base64 inputs", async () => {
-    const b64Input = await fs.promises.readFile(
-      path.join(RESOURCE_PATH, "file_types/receipt.txt")
+    const b64String = await fs.promises.readFile(
+      path.join(RESOURCE_PATH, "file_types/receipt.txt"), "ascii"
     );
-    const b64String = b64Input.toString();
     // don't provide an extension to see if we can detect MIME
     // type based on contents
     const filename = "receipt";
@@ -49,15 +48,15 @@ describe("Input Sources - load different types of input", () => {

   it("should accept JPEG files from a path", async () => {
     const inputSource = new PathInput({
-      inputPath: path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg"),
+      inputPath: path.join(RESOURCE_PATH, "file_types/receipt.jpg"),
     });
     await inputSource.init();

     const expectedResult = await fs.promises.readFile(
-      path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg")
+      path.join(RESOURCE_PATH, "file_types/receipt.jpg")
     );
     assert.strictEqual(inputSource.inputType, INPUT_TYPE_PATH);
-    assert.strictEqual(inputSource.filename, "default_sample.jpg");
+    assert.strictEqual(inputSource.filename, "receipt.jpg");
     assert.strictEqual(inputSource.mimeType, "image/jpeg");
     assert.ok(!inputSource.isPdf());
     assert.strictEqual(await inputSource.getPageCount(), 1);
@@ -113,9 +112,9 @@ describe("Input Sources - load different types of input", () => {
   });

   it("should accept read streams", async () => {
-    const filePath = path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg");
+    const filePath = path.join(RESOURCE_PATH, "file_types/receipt.jpg");
     const stream = fs.createReadStream(filePath);
-    const filename = "default_sample.jpg";
+    const filename = "receipt.jpg";
     const inputSource = new StreamInput({
       inputStream: stream,
       filename: filename,
@@ -207,8 +206,8 @@ describe("Input Sources - load different types of input", () => {
   });

   it("should accept raw bytes", async () => {
-    const filePath = path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg");
-    const inputBytes = await fs.promises.readFile(filePath);
+    const filePath = path.join(RESOURCE_PATH, "file_types/receipt.jpg");
+    const inputBytes = new Uint8Array(await fs.promises.readFile(filePath));
     // don't provide an extension to see if we can detect MIME
     // type based on contents
     const filename = "receipt";