Clean Data Export API turns incomplete FieldOps Desk job exports into files a team can review: clean CSV, matching JSON, rejected rows, run history, and a short run summary.
The project is local by design. It uses fictional data, reads only jobs and
job_updates, and never connects to a real customer system.
The demo export for 2026-06-01 through 2026-06-05 returns:
source_count=9
clean_count=3
duplicate_count=1
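rejected_count=6
rejected_count includes the duplicate row, so clean_count plus rejected_count equals source_count (3 + 6 = 9).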
This is one export workflow, not a broad backend platform. It:
- pulls a limited set of source records.
- maps source fields into clean output columns.
- keeps invalid records out of the clean export without hiding them.
- removes later duplicate records for the same job_id.
- writes files that can be reviewed in a spreadsheet or handed to another system.
- records run history in SQLite.
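The validation and duplicate steps in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `validation.py`: the required field names, the reason codes, and the `split_rows` helper are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical required fields; the project's real rules may differ.
REQUIRED_FIELDS = ("job_id", "customer_name", "scheduled_date")


@dataclass
class ExportResult:
    clean: list[dict]
    rejected: list[dict]


def split_rows(rows: list[dict]) -> ExportResult:
    """Reject rows with missing fields, then keep the first valid row per job_id."""
    clean: list[dict] = []
    rejected: list[dict] = []
    seen: set[str] = set()
    for row in rows:
        missing = [name for name in REQUIRED_FIELDS if not row.get(name)]
        if missing:
            # Invalid rows stay visible with a reason code (hypothetical format).
            rejected.append({**row, "reason": "missing_" + missing[0]})
        elif row["job_id"] in seen:
            # A later valid row for an already accepted job_id is a duplicate.
            rejected.append({**row, "reason": "duplicate_job_id"})
        else:
            seen.add(row["job_id"])
            clean.append(row)
    return ExportResult(clean, rejected)
```

Every input row ends up in exactly one of the two lists, which is why the clean and rejected counts add up to the source count.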
FieldOps Desk is a fictional operations tool for field jobs. Its export is limited, and manual spreadsheet edits have left missing fields and duplicate rows.
The local workflow produces:
- a clean job export for review.
- a JSON copy with the same accepted records.
- a rejected-row file that explains what failed.
- a short run summary with counts and file paths.
sequenceDiagram
actor User
participant Entry as CLI or FastAPI
participant Service as Export service
participant Source as FieldOps fixture
participant Rules as Mapping and validation
participant Reports as Output files
participant History as SQLite run history
User->>Entry: Request jobs export
Entry->>Service: Submit date range
Service->>Source: Fetch paginated jobs
Source-->>Service: Return source jobs
Service->>Source: Fetch related job updates
Source-->>Service: Return job updates
Service->>Rules: Map fields and validate rows
Rules-->>Service: Return clean rows and rejected rows
Service->>Rules: Remove later duplicates
Rules-->>Service: Return final clean rows and duplicate rejects
Service->>Reports: Write CSV, JSON, rejected rows, and summary
Service->>History: Store run counts and output paths
Service-->>Entry: Return export summary
Entry-->>User: Print or return summary
The FastAPI endpoint and CLI use the same export service, so both entry points produce the same results.
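One way to picture the shared-service arrangement is two thin wrappers around a single function that also records run history. The sketch below uses only the standard library; `run_export`, `cli_entry`, `api_entry`, the `export_runs` table, and the `scheduled_date` field are hypothetical names, and the real project wires this through Typer and FastAPI instead.

```python
import sqlite3


def run_export(conn: sqlite3.Connection, rows: list[dict],
               from_date: str, to_date: str) -> dict:
    """Hypothetical shared service: count rows in range and store the run."""
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    in_range = [r for r in rows if from_date <= r["scheduled_date"] <= to_date]
    summary = {
        "from_date": from_date,
        "to_date": to_date,
        "source_count": len(in_range),
    }
    conn.execute(
        "CREATE TABLE IF NOT EXISTS export_runs ("
        "id INTEGER PRIMARY KEY, from_date TEXT, to_date TEXT, "
        "source_count INTEGER)"
    )
    conn.execute(
        "INSERT INTO export_runs (from_date, to_date, source_count) "
        "VALUES (:from_date, :to_date, :source_count)",
        summary,
    )
    conn.commit()
    return summary


def cli_entry(conn: sqlite3.Connection, rows: list[dict],
              from_date: str, to_date: str) -> str:
    """CLI-style wrapper: format the summary as printable key=value lines."""
    summary = run_export(conn, rows, from_date, to_date)
    return "\n".join(f"{key}={value}" for key, value in summary.items())


def api_entry(conn: sqlite3.Connection, rows: list[dict], payload: dict) -> dict:
    """API-style wrapper: return the summary as a JSON-ready dict."""
    return run_export(conn, rows, payload["from_date"], payload["to_date"])
```

Because neither wrapper contains export logic, a change to the service reaches both entry points at once, and each call leaves one row in the run history.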
outputs/
  clean_jobs.csv
  clean_jobs.json
  rejected_jobs.csv
  run_summary.md
clean_jobs.csv is the spreadsheet export. clean_jobs.json contains the same
accepted records in JSON form. rejected_jobs.csv keeps invalid and duplicate
records visible with reason codes. run_summary.md records the run counts and
file paths.
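As a rough illustration of how the four files relate, the sketch below writes them with the standard `csv` and `json` modules. The `write_outputs` helper, the column handling, and the summary layout are assumptions for the example; the project's `reports.py` may be organized differently.

```python
import csv
import json
from pathlib import Path

OUTPUT_NAMES = ("clean_jobs.csv", "clean_jobs.json", "rejected_jobs.csv", "run_summary.md")


def write_csv(path: Path, rows: list[dict]) -> None:
    """Write rows as CSV, using keys in order of first appearance as columns."""
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def write_outputs(output_dir: Path, clean: list[dict], rejected: list[dict]) -> dict[str, Path]:
    """Hypothetical writer: same accepted records in CSV and JSON, plus rejects and a summary."""
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = {name: output_dir / name for name in OUTPUT_NAMES}

    write_csv(paths["clean_jobs.csv"], clean)
    write_csv(paths["rejected_jobs.csv"], rejected)
    paths["clean_jobs.json"].write_text(json.dumps(clean, indent=2), encoding="utf-8")

    # The summary lists counts first, then the path of every output file.
    lines = ["# Run summary", f"- clean_count: {len(clean)}", f"- rejected_count: {len(rejected)}"]
    lines += [f"- {name}: {path}" for name, path in paths.items()]
    paths["run_summary.md"].write_text("\n".join(lines) + "\n", encoding="utf-8")
    return paths
```

Writing the CSV and JSON from the same `clean` list is what keeps the two files in agreement.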
clean-data-export-api/
├── README.md
├── pyproject.toml
├── uv.lock
├── src/clean_data_export_api/
│   ├── app.py             # FastAPI entry point
│   ├── cli.py             # Typer CLI
│   ├── config.py          # local runtime paths
│   ├── models.py          # Pydantic contracts
│   ├── source_api.py      # fixture-backed source API
│   ├── export_service.py  # shared export workflow
│   ├── mapping.py         # source-to-output mapping
│   ├── validation.py      # required-field and duplicate rules
│   ├── repository.py      # SQLite run history
│   └── reports.py         # CSV, JSON, and summary writers
├── sample_data/
│   ├── source_jobs.json
│   └── source_job_updates.json
├── outputs/
│   ├── clean_jobs.csv
│   ├── clean_jobs.json
│   ├── rejected_jobs.csv
│   └── run_summary.md
├── docs/
│   ├── PRD.md
│   ├── ARCHITECTURE.md
│   ├── DELIVERY.md
│   └── adr/
│       └── 0001-project-scope.md
└── tests/
Run the CLI export:
uv sync
uv run clean-data-export-api export jobs --from-date 2026-06-01 --to-date 2026-06-05
Serve the local API:
uv run clean-data-export-api serve
POST /exports/jobs
Content-Type: application/json
{
"from_date": "2026-06-01",
"to_date": "2026-06-05"
}
The API uses the configured local output directory. The CLI also supports
explicit --output-dir, --database-path, and --sample-data-dir options for
local runs.
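For a quick check from Python, the request above can be built with the standard library. This is only a sketch: the README does not state the host or port the local server listens on, so `127.0.0.1:8000` is an assumption, and `build_export_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

# Assumption: host and port are not given in this README; adjust to your local server.
BASE_URL = "http://127.0.0.1:8000"


def build_export_request(from_date: str, to_date: str, base_url: str = BASE_URL) -> Request:
    """Build the POST /exports/jobs request with a JSON date-range body."""
    body = json.dumps({"from_date": from_date, "to_date": to_date}).encode("utf-8")
    return Request(
        f"{base_url}/exports/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_export_request("2026-06-01", "2026-06-05")
```

With the server running, passing `request` to `urllib.request.urlopen` would send it; the block above only constructs the request and makes no network call.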
This project does not use real credentials, scraping, login bypass, paid APIs, or customer data. All sample data is fictional.
This project should not claim readiness for live operations, guaranteed business outcomes, advanced security guarantees, or support for a real vendor API before that API has been reviewed.