epam/badgerdoc

Badgerdoc 2

Badgerdoc is a human-in-the-loop tool designed for working with documents that have been analyzed by AI. It provides a platform for users to review, validate, and interact with the output of various AI tools, including OCR, table and chart extractions, and more.

How to install Badgerdoc

  1. Clone the repository:
git clone <repository-url>
cd badgerdoc-2
  2. Configure environment variables:
cp .env_example .env
  3. Start all services:
make build_all
docker compose up --build
  4. Access the application:

Post-Setup Configuration

After the first run:

  1. Create a superuser:
docker compose exec web uv run python manage.py createsuperuser
  2. Generate a token for the superuser (the command below assumes the superuser is named admin):
docker compose exec web uv run python manage.py drf_create_token admin
  3. Put the token in the .env file:
BADGERDOC_TOKEN=<token>
  4. Navigate to http://localhost:9001/, log in with minioadmin, and create a bucket named badgerdoc so that documents can be uploaded.
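Step 3 can be scripted so that repeated runs do not leave duplicate BADGERDOC_TOKEN lines in .env. The following is a minimal sketch, not part of the repository: set_env_var is a hypothetical helper name, and it assumes .env uses plain KEY=VALUE lines.

```shell
# Hypothetical helper (not part of Badgerdoc): set KEY=VALUE in an env file.
# Any existing line for KEY is dropped and the new assignment is appended,
# so the key ends up as the last line of the file.
set_env_var() {
    key=$1
    value=$2
    file=${3:-.env}
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line except an existing assignment to this key.
        # grep exits non-zero when nothing is left to print, which is fine.
        grep -v "^${key}=" "$file" > "$tmp" || true
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through redirection so the file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

After defining the function, run set_env_var BADGERDOC_TOKEN "<token>" from the repository root, replacing <token> with the value printed by drf_create_token.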

MLX on macOS

MLX, Apple's machine learning framework for Apple silicon, is available on macOS for running VLM (Vision Language Model) inference locally. This project uses MLX-VLM to run OCR models such as DeepSeek-OCR-2 and PaddleOCR-VL.

Note: MinIO runs inside Docker and is referenced by the hostname minio in pre-signed URLs returned by the API. When using MLX locally, the host machine must be able to resolve that hostname. Add the following entry to /etc/hosts:

127.0.0.1 minio
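To confirm that the entry is in place before starting the MLX servers, a small check like the one below may help. It is only a sketch: has_host_entry is a hypothetical helper, and it inspects the hosts file directly rather than querying the system resolver.

```shell
# Hypothetical helper (not part of Badgerdoc): succeed if a hosts-format file
# maps the given hostname on a non-commented line.
has_host_entry() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }   # strip comments before inspecting the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}

if has_host_entry minio; then
    echo "minio is mapped in /etc/hosts"
else
    echo "minio is not mapped; add: 127.0.0.1 minio"
fi
```

The function only reports whether the name appears; it does not verify that the address is 127.0.0.1.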

Installation

Install the MLX dependency group using uv:

uv sync --group mlx

Or install it along with dev dependencies:

uv sync --group dev --group mlx

Starting MLX VLM Servers

After installation, start the MLX VLM servers using:

make start_mlx

This will start two VLM servers:

  • Port 11434: DeepSeek-OCR-2-bf16
  • Port 11435: PaddleOCR-VL-1.5-bf16

Stop the servers using Ctrl+C.

How to Contribute

Setting up Badgerdoc locally (see How to install Badgerdoc above) is required before contributing. Once the application is running, follow the contribution guidelines in How to Contribute.

How to Make a Pull Request

  1. Carefully read the How to Contribute documentation.
  2. Create a fork of the Badgerdoc GitHub repository.
  3. Make all your changes in your own fork.
  4. Create a Pull Request to the Badgerdoc repository targeting the main branch.
  5. Squash your changes into a single commit before submitting; PRs must contain exactly one commit.
  6. One of the core developers will review the PR, approve it, and merge it.
