diff --git a/SKILL.md b/SKILL.md
new file mode 100644
index 00000000..ed9ea361
--- /dev/null
+++ b/SKILL.md
@@ -0,0 +1,337 @@
+name: mindee-v2-node.js
+description:
+---
+
+# Mindee API – Node.js SDK Skill Guide
+
+## Overview
+
+The `mindee` npm package is the official Node.js client library for the
+[Mindee API](https://app.mindee.com). It lets you send documents (PDFs, images,
+URLs) to Mindee's AI-powered document-processing platform and get back
+structured, machine-readable data.
+
+This guide covers **API v2**, the current generation of the Mindee platform.
+
+Key capabilities exposed by the library:
+
+- **Extraction** – pull structured fields out of any document type using a
+  model you configure in the Mindee console.
+- **Classification** – route documents to the right workflow by categorising
+  them automatically.
+- **OCR** – retrieve the full plain-text content of a document.
+- **Crop** – detect and isolate sub-regions of interest within a page.
+- **Split** – segment a multi-page document into logical sub-documents.
+
+All operations follow the same asynchronous enqueue-and-poll pattern: a
+document is submitted to a processing queue and the result is retrieved once
+the job is complete. The library handles the polling loop for you, or you can
+manage it manually when you need finer control.
+
+## Requirements
+
+### Node.js
+
+Node.js **20.1 or later** is required.
+
+### API Key
+
+You need a Mindee API key.
+
+Refer to the [API Keys documentation](https://docs.mindee.com/integrations/api-keys) for
+instructions on creating one.
+
+## Installation
+
+Install the package via npm:
+
+```bash
+npm install mindee
+```
+
+Optional dependencies are included by default. They enable additional features:
+
+| Feature                          | Package(s)                              |
+|----------------------------------|-----------------------------------------|
+| PDF manipulation & compression   | `@cantoo/pdf-lib`, `node-poppler`       |
+| PDF text extraction              | `pdf.js-extract`                        |
+| Image compression                | `sharp`                                 |
+
+To skip them for a lighter install:
+
+```bash
+npm install mindee --omit=optional
+```
+
+## Getting Started
+
+### Creating a Client
+
+All interactions with the Mindee API go through the `Client` class. Instantiate
+it with your API key:
+
+```typescript
+import * as mindee from "mindee";
+
+// Pass the key explicitly, or omit `apiKey` to read it from MINDEE_V2_API_KEY
+const mindeeClient = new mindee.Client({ apiKey: "MY_API_KEY" });
+```
+
+### Client Options
+
+The constructor accepts an optional `ClientOptions` object:
+
+| Option       | Type         | Default     | Description                                      |
+|--------------|--------------|-------------|--------------------------------------------------|
+| `apiKey`     | `string`     | `undefined` | Your Mindee API key.                             |
+| `debug`      | `boolean`    | `false`     | Enable debug-level logging.                      |
+| `dispatcher` | `Dispatcher` | `undefined` | Custom `undici` dispatcher (e.g. for proxy use). |
+
+### API Key via Environment Variable
+
+Instead of hardcoding the key, you can set the `MINDEE_V2_API_KEY` environment
+variable and omit `apiKey` from the constructor entirely:
+
+```bash
+export MINDEE_V2_API_KEY="MY_API_KEY"
+```
+
+## Loading an Input Document
+
+Before sending a document to the API, wrap it in one of the input source
+classes exported from the `mindee` package. All input sources are passed
+directly to the client methods; you never need to call `init()` manually.
+
+### File Path
+
+The most common option. Pass a path string to the file on disk:
+
+```typescript
+const filePath = "/path/to/the/file.ext";
+const inputSource = new mindee.PathInput({ inputPath: filePath });
+```
+
+### Buffer
+
+Use when you already have the file contents in memory as a Node.js `Buffer`.
+A filename (with extension) is required:
+
+```typescript
+const buffer = Buffer.from(
+  await fs.promises.readFile("/path/to/the/file.ext")
+);
+
+const inputSource = new mindee.BufferInput({
+  buffer: buffer,
+  filename: "file.ext",
+});
+```
+
+### Base64 String
+
+Use when the file is provided as a Base64-encoded string, e.g. from a web upload or an external API.
+A filename (with extension) is required:
+
+```typescript
+const b64String = "iVBORw0KGgoAAAANSUhEUgAAABgAAA ...";
+
+const inputSource = new mindee.Base64Input({
+  inputString: b64String,
+  filename: "base64_file.txt",
+});
+```
+
+### Bytes (Uint8Array)
+Use when the file content is available as raw bytes.
+A filename (with extension) is required:
+
+```typescript
+const inputBytes = new Uint8Array(
+  await fs.promises.readFile("/path/to/the/file.ext")
+);
+
+const inputSource = new mindee.BytesInput({
+  inputBytes: inputBytes,
+  filename: "file.ext",
+});
+```
+
+### Readable Stream
+Use when reading from a Node.js `Readable` stream.
+A filename (with extension) is required:
+
+```typescript
+const stream = fs.createReadStream("/path/to/the/file.ext");
+
+const inputSource = new mindee.StreamInput({
+  inputStream: stream,
+  filename: "file.ext",
+});
+```
+
+### Remote URL
+Use to pass an HTTPS URL directly to the Mindee API without downloading the file locally first.
+Only HTTPS URLs are accepted:
+```typescript
+const inputSource = new mindee.UrlInput({ url: "https://example.com/file.ext" });
+```
+
+## Choosing a Product
+
+A product class tells the client which Mindee AI pipeline to use. All product
+classes are exported from the `mindee` package under the `product` namespace.
+
+| Product class        | Import                        | Use case                                                       |
+|----------------------|-------------------------------|----------------------------------------------------------------|
+| `Extraction`         | `mindee.product.Extraction`   | Pull structured fields from any document using a custom model. |
+| `Classification`     | `mindee.product.Classification` | Sort documents into categories.                                |
+| `Ocr`                | `mindee.product.Ocr`          | Extract raw text from any image or scanned document.           |
+| `Crop`               | `mindee.product.Crop`         | Detect and isolate document borders on each page.              |
+| `Split`              | `mindee.product.Split`        | Break a multi-page file into separate logical documents.       |
+
+The product class is passed as the first argument to all client methods. You
+never instantiate it directly:
+
+```typescript
+import * as mindee from "mindee";
+
+const response = await mindeeClient.enqueueAndGetResult(
+  mindee.product.Extraction, // <-- product class
+  inputSource,
+  params,
+);
+```
+
+## Sending a Request
+
+### Parameters
+
+Every request requires a `params` object. At minimum, `modelId` must be set.
+The `modelId` is the ID of the model you configured in the Mindee console:
+
+```typescript
+const params = {
+  modelId: "MY_MODEL_ID",
+};
+```
+
+For `Extraction`, additional options are available:
+```typescript
+const params = {
+  modelId: "MY_MODEL_ID",
+
+  // Options: set to `true` or `false` to override defaults
+
+  // Enhance extraction accuracy with Retrieval-Augmented Generation.
+  rag: undefined,
+  // Extract the full text content from the document as strings.
+  rawText: undefined,
+  // Calculate bounding box polygons for all fields.
+  polygon: undefined,
+  // Calculate confidence scores for all fields.
+  confidence: undefined,
+};
+```
+
+### All-in-One: `enqueueAndGetResult()`
+The simplest way to process a document.
+The library enqueues the document, polls until the result is ready, and returns it:
+
+```typescript
+import * as mindee from "mindee";
+
+const apiKey = "MY_API_KEY";
+const filePath = "/path/to/the/file.ext";
+const modelId = "MY_MODEL_ID";
+
+// Init a new client
+const mindeeClient = new mindee.Client({ apiKey: apiKey });
+
+// Set product parameters
+const params = {
+  modelId: modelId,
+};
+
+// Load a file from disk
+const inputSource = new mindee.PathInput({ inputPath: filePath });
+
+// Send for processing and wait for the result
+const response = await mindeeClient.enqueueAndGetResult(
+  mindee.product.Extraction, // <-- product class
+  inputSource,
+  params,
+);
+
+// Print a string summary
+console.log(response.inference.toString());
+```
+
+### Webhook Workflow
+
+If you have configured a webhook in the Mindee console, you can pass its ID in the `params` object.
+The document is enqueued and Mindee will call your webhook URL when processing is complete, so no polling is required on your side:
+
+```typescript
+import * as mindee from "mindee";
+
+const apiKey = "MY_API_KEY";
+const filePath = "/path/to/the/file.ext";
+const modelId = "MY_MODEL_ID";
+const webhookId = "MY_WEBHOOK_ID";
+
+// Init a new client
+const mindeeClient = new mindee.Client({ apiKey: apiKey });
+
+// Set product parameters, including the webhook ID
+const params = {
+  modelId: modelId,
+  webhookIds: [webhookId],
+};
+
+// Load a file from disk
+const inputSource = new mindee.PathInput({ inputPath: filePath });
+
+// Enqueue the document; the result will be delivered to your webhook
+const jobResponse = await mindeeClient.enqueue(
+  mindee.product.Extraction, // <-- product class
+  inputSource,
+  params,
+);
+
+// Save the job ID; you can use it to check the status if the webhook never arrives
+const jobId = jobResponse.job.id;
+console.log(`Enqueued with job ID: ${jobId}`);
+```
+
+Tip: Persist `jobId` in your database alongside your own record for this document.
+If the webhook callback never arrives (e.g. due to a network issue or a misconfigured endpoint), you can use it to check the job status at any time:
+```typescript
+const jobResponse = await mindeeClient.getJob(jobId);
+console.log(`Job status: ${jobResponse.job.status}`);
+// Possible statuses: "Processing", "Processed", "Failed"
+```
+
+When Mindee calls your webhook, the payload it sends can be deserialized using `LocalResponse`:
+```typescript
+import * as mindee from "mindee";
+
+// `rawPayload` is the raw JSON string body received by your webhook handler
+const localResponse = new mindee.v2.LocalResponse(rawPayload);
+await localResponse.init();
+
+const response = await localResponse.deserializeResponse(
+  mindee.product.ExtractionResponse, // <-- product response class
+);
+
+console.log(response.inference.toString());
+```
+
+You can also verify the HMAC signature of the webhook payload to ensure it genuinely came from Mindee:
+```typescript
+const secretKey = "MY_WEBHOOK_SECRET_KEY";
+const signature = request.headers["x-mindee-hmac-signature"];
+
+if (!localResponse.isValidHmacSignature(secretKey, signature)) {
+  throw new Error("Invalid webhook signature");
+}
+```
diff --git a/tests/input/sources.spec.ts b/tests/input/sources.spec.ts
index 717ea008..26968e77 100644
--- a/tests/input/sources.spec.ts
+++ b/tests/input/sources.spec.ts
@@ -22,10 +22,9 @@ import { RESOURCE_PATH, V1_PRODUCT_PATH } from "../index.js";

 describe("Input Sources - load different types of input", () => {
   it("should accept base64 inputs", async () => {
-    const b64Input = await fs.promises.readFile(
-      path.join(RESOURCE_PATH, "file_types/receipt.txt")
+    const b64String = await fs.promises.readFile(
+      path.join(RESOURCE_PATH, "file_types/receipt.txt"), "ascii"
     );
-    const b64String = b64Input.toString();
     // don't provide an extension to see if we can detect MIME
     // type based on contents
     const filename = "receipt";
@@ -49,15 +48,15 @@ describe("Input Sources - load different types of input", () => {

   it("should accept JPEG files from a path", async () => {
     const inputSource = new PathInput({
-      inputPath: path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg"),
+      inputPath: path.join(RESOURCE_PATH, "file_types/receipt.jpg"),
     });
     await inputSource.init();

     const expectedResult = await fs.promises.readFile(
-      path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg")
+      path.join(RESOURCE_PATH, "file_types/receipt.jpg")
     );
     assert.strictEqual(inputSource.inputType, INPUT_TYPE_PATH);
-    assert.strictEqual(inputSource.filename, "default_sample.jpg");
+    assert.strictEqual(inputSource.filename, "receipt.jpg");
     assert.strictEqual(inputSource.mimeType, "image/jpeg");
     assert.ok(!inputSource.isPdf());
     assert.strictEqual(await inputSource.getPageCount(), 1);
@@ -113,9 +112,9 @@ describe("Input Sources - load different types of input", () => {
   });

   it("should accept read streams", async () => {
-    const filePath = path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg");
+    const filePath = path.join(RESOURCE_PATH, "file_types/receipt.jpg");
     const stream = fs.createReadStream(filePath);
-    const filename = "default_sample.jpg";
+    const filename = "receipt.jpg";
     const inputSource = new StreamInput({
       inputStream: stream,
       filename: filename,
@@ -207,8 +206,8 @@ describe("Input Sources - load different types of input", () => {
   });

   it("should accept raw bytes", async () => {
-    const filePath = path.join(V1_PRODUCT_PATH, "expense_receipts/default_sample.jpg");
-    const inputBytes = await fs.promises.readFile(filePath);
+    const filePath = path.join(RESOURCE_PATH, "file_types/receipt.jpg");
+    const inputBytes = new Uint8Array(await fs.promises.readFile(filePath));
     // don't provide an extension to see if we can detect MIME
     // type based on contents
     const filename = "receipt";