For browser build, set the background web worker explicitly. (#37)
* Add TypeDoc config, docs renames, and LoadParameters type
Introduces a TypeDoc configuration file and updates .gitignore for typedoc output. Renames documentation files for clarity and updates references in README.md. Adds a new LoadParameters type definition for PDF loading options. Adds typedoc as a dev dependency and a build script. Includes new test data and a test for HTML PDF parsing. Updates Vite config to use the correct entry point.
* Rename DocumentInitParameters to LoadParameters
Replaces all usage and documentation of DocumentInitParameters with LoadParameters for clarity and consistency. Updates type exports, API docs, examples, and internal references. Also improves TypeDoc config, adds type documentation link to reports, and fixes a typo in the report:build script.
* Update API extractor configs and demo usage
Changed API extractor config files to use 'undocumented' report filenames and disabled doc model and TSDoc metadata generation. Removed generated API docs. Updated demo HTML files to explicitly set the worker path for PDFParse and improved CDN import examples. Bumped package version to 2.4.5.
* Refactor API Extractor configs and update build scripts
Moved API Extractor config files to the 'configs/' directory and updated their paths and token usage for improved maintainability. Updated build scripts in package.json to reference the new config locations. Added generated API documentation files for node, pdf-parse, and worker builds.
* Add example test script and update test workflow
Introduces scripts/example.test.mjs to run example scripts for testing. Updates package.json with new test:e and test:all commands, adds tsx as a dev dependency, and modifies the GitHub Actions workflow to use npm run test:all. Also updates example HTML files to set the correct worker path and improves exception handling in exception-handling.ts.
* Remove extra logging and update test scripts
Eliminated unnecessary console output in example and integration test scripts for cleaner logs. Updated 'test:u' npm script to use the 'dot' reporter for unsupported tests. Commented out a log in the worker and clarified error output for PDF worker loading.
* Update CDN links, add funding info, and improve pack script
Updated README to use new CDN links and worker configuration for pdf-parse v2.4.5. Added .github/FUNDING.yml and funding field in package.json for GitHub sponsorship. Modified npm pack script to check for outdated packages before packing, and corrected unpkg field to use UMD build.
* Refactor worker config docs and add troubleshooting guide
Streamlined worker configuration instructions in README.md and moved detailed troubleshooting steps to a new docs/troubleshooting.md file. The new guide covers common errors, platform-specific setup, Node.js version compatibility, and manual worker configuration for custom environments.
- Retrieve headers and validate PDF : [`getHeader`](#getheader--node-utility-pdf-header-retrieval-and-validation)
- Extract document info : [`getInfo`](#getinfo--extract-metadata-and-document-information)
@@ -57,7 +56,7 @@ run();
- Detect and extract tabular data : [`getTable`](#gettable--extract-tabular-data)
- Well-covered with [`unit tests`](./tests)
- [`Integration tests`](./tests/integration) to validate end-to-end behavior across environments.
-- See [DocumentInitParameters](./docs/README.options.md#documentinitparameters) and [ParseParameters](./docs/README.options.md#parseparameters) for all available options.
+- See [LoadParameters](./docs/options.md#load-parameters) and [ParseParameters](./docs/options.md#parse-parameters) for all available options.
- Examples: [`live demo`](./reports/demo/), [`examples`](./examples/), [`tests`](./tests/unit/) and [`tests example`](./tests/unit/test-example/) folders.
Next.js & Vercel, Edge Functions, Serverless Functions, AWS Lambda, Netlify Functions, or Cloudflare Workers may require additional worker configuration.
-
-This will most likely resolve all worker-related issues.
-```js
-import 'pdf-parse/worker'; // Import this before importing "pdf-parse"
-import {PDFParse} from 'pdf-parse';
-
-// or CommonJS
-require ('pdf-parse/worker'); // Import this before importing "pdf-parse"
-const {PDFParse} = require('pdf-parse');
-```
-
-To ensure `pdf-parse` works correctly with Next.js (especially on serverless platforms like Vercel), add the following configuration to your `next.config.ts` file. This allows Next.js to include `pdf-parse` as an external package for server-side usage:
>**Note:** Similar configuration may be required for other serverless platforms (such as AWS Lambda, Netlify, or Cloudflare Workers) to ensure that `pdf-parse` and its worker files are properly included and executed in your deployment environment.
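The `next.config.ts` snippet that the Next.js paragraph refers to is not reproduced in this diff. A minimal sketch of what such a configuration might look like follows. It assumes Next.js 15 or later, where `serverExternalPackages` is a top-level option; the variable name `nextConfig` is illustrative, and the README's actual snippet may differ.

```ts
// next.config.ts: a sketch, not the README's original snippet.
// Assumption: Next.js 15+, where `serverExternalPackages` is a top-level option
// (older releases used `experimental.serverComponentsExternalPackages`).
const nextConfig = {
  // Leave pdf-parse out of the server bundle so it is loaded from node_modules at runtime.
  serverExternalPackages: ['pdf-parse'],
};

export default nextConfig;
```

Listing the package here tells Next.js to resolve it with a native `require` on the server instead of bundling it, which is the usual approach for packages that load their own worker files.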
+**Worker Options:**
-Custom builds, Electron/NW.js, or specific deployment environments: you may need to manually configure the worker source.
See [docs/troubleshooting.md](./docs/troubleshooting.md) for detailed troubleshooting steps and worker configuration for Node.js and serverless environments.
Requires additional setup — import and configure a compatible CanvasFactory or worker implementation before initializing pdf-parse; see the examples below.
-
-ESM
-```js
-// Import this before importing "pdf-parse"
-import { CanvasFactory } from 'pdf-parse/worker';
-import { PDFParse } from 'pdf-parse';
-
-const parser = new PDFParse({ data: buffer, CanvasFactory });