PMX Research Scaffold

Research-grade Python scaffold for deterministic and auditable market forecasting workflows.

See CONTRIBUTING.md for verification and merge policy.

Scope of this milestone

Repository scaffold and quality guardrails.
CI checks (lint, typecheck, tests).
Local PostgreSQL for development.
Run context primitives for reproducibility (run_id, config_hash, code_version).

No real ingestion/model/backtest implementation is included in this milestone.

Quickstart

Copy environment variables:
```
Copy-Item .env.example .env
```
Start PostgreSQL:
```
docker compose up -d
docker compose ps
```

Install dependencies (default):

python -m pip install -e ".[dev]"

Faster alternative with uv:

uv pip install --system -e ".[dev]"

Run quality gates:

ruff format --check .
ruff check .
mypy src
pytest -q

Single entrypoint:

python scripts/verify.py

Make / Just shortcuts

make install-pip or just install-pip
make install or just install
make lint or just lint
make typecheck or just typecheck
make test or just test
make verify or just verify
make check or just check

Migrations bootstrap

Alembic is initialized under migrations/.
For DB-backed tests, set DATABASE_URL or APP_DATABASE_URL (APP_DATABASE_URL recommended).
DB tests resolve DSN in this order: DATABASE_URL then APP_DATABASE_URL.
Example DSN: postgresql+psycopg://pmx:pmx_dev_password@localhost:5432/pmx
Apply schema:
```
python -m alembic upgrade head
```
See docs/db.md for partition strategy and future monthly partition creation.

Gamma catalog refresh

Required env:
- APP_DATABASE_URL or DATABASE_URL (lookup order in code: DATABASE_URL then APP_DATABASE_URL)
- GAMMA_BASE_URL (default: https://gamma-api.polymarket.com)
- GAMMA_TIMEOUT_SECONDS (default: 20)
- GAMMA_PAGE_SIZE (default: 200)

Run full refresh:

python -m pmx.jobs.gamma_catalog_refresh

Run incremental refresh:

python -m pmx.jobs.gamma_catalog_refresh --since-updated-at 2026-01-01T00:00:00Z

Dev limit:

python -m pmx.jobs.gamma_catalog_refresh --max-markets 100

Timestamp semantics:
- Catalog ingestion writes a run-level ingestion timestamp (audit.ingested_at) for reproducibility and audit.
- As-of gating (event/published <= decision_ts and ingested_at <= decision_ts + epsilon) is enforced later by backtest/forecast selection logic, not by this catalog ingestion job.
Temporary audit namespace:
- markets.rule_parse_json currently stores both parser stub output and audit payload under audit.* (for example audit.gamma_raw).

News ingestion v1 (GDELT + whitelist crawler)

Required env:
- APP_DATABASE_URL or DATABASE_URL
- GDELT_BASE_URL (default: https://api.gdeltproject.org/api/v2/doc/doc)
- GDELT_TIMEOUT_SECONDS (default: 20)
- GDELT_MAX_RETRIES (default: 4)
- GDELT_BACKOFF_SECONDS (default: 0.5)
- GDELT_MAX_RECORDS (default: 250)
- NEWS_PRIMARY_SOURCES_CONFIG (default: config/primary_sources.yaml)
- NEWS_CRAWLER_CONNECT_TIMEOUT_SECONDS (default: 5)
- NEWS_CRAWLER_READ_TIMEOUT_SECONDS (default: 15)
- NEWS_CRAWLER_MAX_RETRIES (default: 3)
- NEWS_CRAWLER_BACKOFF_SECONDS (default: 0.5)

Run:

python -m pmx.jobs.news_ingest --since-published 2026-01-01T00:00:00Z --max-articles 200 --max-per-domain 20 --crawl-primary

Disable crawler:

python -m pmx.jobs.news_ingest --since-published 2026-01-01T00:00:00Z --no-crawl-primary

Notes:
- Primary-source whitelist is loaded from config/primary_sources.yaml and upserted into sources.
- URL canonicalization is conservative and strips tracking params (utm_*, fbclid, gclid, etc.).
- Dedupe policy:
  - hard dedupe on canonical_url;
  - soft dedupe on content_hash updates the existing row (no insert);
  - fallback soft dedupe on title_hash + same domain + published_at within 24h updates the existing row (no insert).
  - every dedupe update logs dedupe_reason in structured logs: canonical_url, content_hash, title_window.
- Raw audit payloads are persisted in articles.raw:
  - raw.gdelt for source payload;
  - raw.crawler for crawler response/extraction metadata;
  - raw.ingest for timestamp policy metadata (published_at_source, unknown_published_at).
- published_at policy is explicit:
  - prefer crawler published_at;
  - fallback to GDELT published_at;
  - if both missing, fallback to run ingested_at and mark raw.ingest.unknown_published_at=true.
- As-of safety is preserved by always storing published_at and run-level ingested_at.
- Market linking is deterministic (score = 2*title_hits + body_hits + 0.5*slug_hits, topK=5, stable tie-break) with a fixed stopword list in tokenization.

CLOB REST ingestion (time-series)

Required env:
- APP_DATABASE_URL or DATABASE_URL (lookup order in code: DATABASE_URL then APP_DATABASE_URL)
- CLOB_BASE_URL (default: https://clob.polymarket.com)
- CLOB_TIMEOUT_SECONDS (default: 20)
- CLOB_RATE_LIMIT_RPS (default: 5)
- CLOB_ORDERBOOK_DEPTH (optional deterministic top-of-book depth)
- CLOB_API_KEY (optional, only if endpoint requires auth)

Run:

python -m pmx.jobs.clob_ingest_rest --max-tokens 100 --since-ts 2026-01-01T00:00:00Z --interval 1m

Notes:
- Token ingestion order is deterministic (token_id sorted).
- --since-ts is inclusive:
  - trades: include rows with event_ts >= since_ts
  - candles: include rows with start_ts >= since_ts
- ingested_at assignment mode is explicit and deterministic:
  - default --ingested-at-mode now: single run-level timestamp applied to orderbook_snapshots, trades, and candles
  - optional --ingested-at-mode event_ts: per-row ingested_at = event_ts + --ingest-latency-seconds (candles anchor on start_ts)
- Orderbook snapshot event_ts policy:
  - use API timestamp when available and parseable
  - fallback to run ingestion timestamp (run_ingested_at) when missing.
- Orderbook normalization is deterministic:
  - bids sorted by price desc, asks sorted by price asc
  - price/size quantized to 8 decimals
  - optional depth cap via CLOB_ORDERBOOK_DEPTH.
- Trade identity fallback:
  - when both seq and upstream hash/id are missing, a deterministic SHA-256 hash is derived from canonical trade fields to stabilize idempotent upsert keys.
- As-of gating is applied downstream by decision_ts + epsilon filters.

CLOB WSS listener + reconcile

Required env:
- APP_DATABASE_URL or DATABASE_URL
- CLOB_WSS_URL (default: wss://clob.polymarket.com/ws)
- CLOB_WSS_TIMEOUT_SECONDS (default: 20)
- CLOB_WSS_MAX_RECONNECTS (default: 8)
- CLOB_WSS_BACKOFF_SECONDS (default: 0.5)
- CLOB_WSS_MAX_BACKOFF_SECONDS (default: 30)
- CLOB_WSS_SEQ_FIELDS (default: seq,sequence,offset)
- CLOB_WSS_SEQ_FIELD (legacy override; if set, forces a single field, empty disables seq-based detection)
- CLOB_RECONCILE_GAP_SECONDS (default: 60)
- CLOB_RECONCILE_MISMATCH_BPS (default: 10)
- REST envs from the previous section (used by reconciler).

Run:

python -m pmx.jobs.clob_wss_listener `
  --max-tokens 100 `
  --reconcile-every-seconds 60 `
  --since-ts 2026-01-01T00:00:00Z `
  --run-seconds 300

Explicit token list:

python -m pmx.jobs.clob_wss_listener --token-ids tokenA,tokenB --reconcile-every-seconds 60

Notes:
- WSS ingestion is best-effort and can be out-of-order or incomplete.
- REST is treated as source of truth during reconcile windows.
- Gap detection strategy:
  - seq-based mode (active when configured seq fields include monotonic fields): extraction order is CLOB_WSS_SEQ_FIELD (legacy, if set) otherwise CLOB_WSS_SEQ_FIELDS (default seq,sequence,offset), and gap when seq != last_seq + 1. If CLOB_WSS_SEQ_FIELD is explicitly set to empty/whitespace, it disables seq-based detection (it is never interpreted as a field name).
  - heuristic mode (no configured seq fields, or event has no seq-like value): gap when stream is stale versus CLOB_RECONCILE_GAP_SECONDS; if no stream trade timestamp is available, reconcile still runs every tick.
  - msg_id/event_id are diagnostic-only and are not used for gap detection.
- Mismatch strategy:
  - top-of-book mismatch when REST mid diverges from stream mid by more than CLOB_RECONCILE_MISMATCH_BPS basis points.
- Per-token state is monotonic and logged periodically:
  - last_seq, last_trade_ts, last_book_ts, last_reconcile_ts
- Reconcile emits structured audit events:
  - reconcile_gap
  - reconcile_mismatch
  - wss_reconnect
- Reconcile gap/mismatch logs include:
  - action_taken, window_start, window_end, rest_calls, rows_upserted
- Repairs are idempotent because writes use DB upsert constraints.

Micro feature store v1 (as-of snapshots)

Required env:
- APP_DATABASE_URL or DATABASE_URL
- INGEST_EPSILON_SECONDS (default: 300)

Run:

python -m pmx.jobs.compute_micro_features --decision-ts 2026-02-11T12:00:00Z --max-tokens 200 --epsilon-seconds 300

Explicit token list:

python -m pmx.jobs.compute_micro_features --token-ids tokenA,tokenB --decision-ts 2026-02-11T12:00:00Z

Notes:
- As-of input gating is enforced on every query:
  - event_ts <= decision_ts
  - ingested_at <= decision_ts + epsilon
- Persisted feature set is micro_v1, token-scoped via deterministic feature_set_version (micro_v1:token:<token_id>) to keep idempotent snapshots per token without schema changes.
- Current microstructure features:
  - mid_price, spread_bps
  - top_depth_bid, top_depth_ask, book_imbalance_1
  - last_trade_price, last_trade_size
  - trade_count_5m, volume_5m, return_5m
  - realized_vol_1h, stale_seconds_last_trade, stale_seconds_last_book

Backtest baselines A/B (walk-forward, as-of)

Baseline A:
- p = price_prob where price_prob is selected as-of from market data (mid from orderbook, fallback to last trade price).
Baseline B:
- deterministic logistic scorer over micro_v1 features (no training step).

Run:

python -m pmx.jobs.backtest_baselines `
  --token-ids tokenA,tokenB `
  --from 2026-01-01T00:00:00Z `
  --to 2026-01-03T00:00:00Z `
  --step-hours 4 `
  --epsilon-seconds 300 `
  --feature-set micro_v1

Artifact:
- artifacts/backtests/<run_id>.json
- includes aggregate/per-token metrics (Brier, ECE, sharpness), plus dataset_hash, config_hash, code_version, and skip counters.
See docs/backtest.md for details.

Forecast baseline ensemble v1 (calibrated + intervals)

Deterministic components:
- Baseline A (p = price_prob)
- Baseline B (deterministic logistic on micro_v1)
- Ensemble stacker (p_raw)
- Calibration (isotonic, Platt fallback)
- Split-conformal uncertainty intervals (50/90)

Run:

python -m pmx.jobs.forecast_baseline_ensemble `
  --token-ids tokenA,tokenB `
  --from 2026-01-01T00:00:00Z `
  --to 2026-01-03T00:00:00Z `
  --step-hours 4 `
  --epsilon-seconds 300 `
  --feature-set micro_v1 `
  --max-tokens 200

Artifact:
- artifacts/forecasts/<run_id>.json
- includes p_raw, p_cal, interval_50, interval_90, deterministic drivers, no_trade_flags, and reproducibility hashes: dataset_hash, model_hash, calibration_hash, uncertainty_hash, forecast_payload_hash.
See docs/forecast.md for details.

Market selector v1 (accuracy-first)

Run selector:

python -m pmx.jobs.select_markets `
  --decision-ts 2026-02-11T12:00:00Z `
  --epsilon-seconds 300 `
  --max-candidates 1500 `
  --k-deep 200

Selector outputs:
- deterministic selector_v1 deep-dive set (final rank by deep_score after diversification constraints)
- baseline comparator sets:
  - baseline_top_volume
  - baseline_random_stratified
Persistence:
- writes selection_runs + selection_items
- writes selector audit artifact in artifacts/selector/<run_id>.json
- candidate score fallback artifact used if dedicated candidate table is absent.

Evaluate selector vs baselines:

python -m pmx.jobs.eval_selector `
  --decision-ts 2026-02-11T12:00:00Z `
  --epsilon-seconds 300 `
  --window-hours 72

writes artifacts/selector_eval/<run_id>.json with comparative Brier/ECE report for selector_v1 and baselines.

See docs/selector.md for scoring components, bucket mix, and constraints.

WSS protocol probe (no DB writes)

Goal:
- capture real WSS message shapes and confirm seq-like fields.
Current expectation from shipped fixtures:
- supports_seq = True based on tests/fixtures/wss/*.json.

Probe with explicit token ids:

python scripts/wss_probe.py --token-ids tokenA,tokenB --max-messages 200 --out wss_probe.jsonl

Probe with TOKEN_IDS env:

# POSIX (bash/zsh)
TOKEN_IDS="tokenA,tokenB" python scripts/wss_probe.py --max-messages 200 --out wss_probe.jsonl

# PowerShell
$env:TOKEN_IDS="tokenA,tokenB"; python scripts/wss_probe.py --max-messages 200 --out wss_probe.jsonl

Probe with custom subscribe payload file:

python scripts/wss_probe.py --subscribe-payload-json path/to/subscribe_payload.json --max-messages 200 --out wss_probe.jsonl

Fixture refresh workflow:
- rerun probe when upstream protocol changes;
- update tests/fixtures/wss/*.json with redacted representative messages;
- rerun tests/test_wss_protocol_fixtures.py;
- if protocol support changes, update expected supports_seq.

Repository layout

src/pmx/
  config/
  db/
  ingest/
  jobs/
  news/
  features/
  forecast/
  models/
  backtest/
  audit/
tests/
migrations/
docs/

Determinism and guardrails

See docs/guardrails.md for as-of policy, anti-leak rules, idempotency, and audit expectations.

Verification contract

If tools are unavailable locally, checks are marked as not run/ skipped.
CI in .github/workflows/ci.yml is the source of truth for merge readiness.
CI must pass before merge (lint, typecheck, tests).

Docker availability note

docker-compose.yml is provided for local/dev environments where Docker is available.
If Docker is unavailable locally, use an external Postgres and set DATABASE_URL or APP_DATABASE_URL.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
.docker_tmp		.docker_tmp
.github		.github
artifacts		artifacts
config		config
docs		docs
migrations		migrations
schemas		schemas
scripts		scripts
src/pmx		src/pmx
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
CONTRIBUTING.md		CONTRIBUTING.md
Makefile		Makefile
README.md		README.md
alembic.ini		alembic.ini
docker-compose.yml		docker-compose.yml
first		first
justfile		justfile
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PMX Research Scaffold

Scope of this milestone

Quickstart

Make / Just shortcuts

Migrations bootstrap

Gamma catalog refresh

News ingestion v1 (GDELT + whitelist crawler)

CLOB REST ingestion (time-series)

CLOB WSS listener + reconcile

Micro feature store v1 (as-of snapshots)

Backtest baselines A/B (walk-forward, as-of)

Forecast baseline ensemble v1 (calibrated + intervals)

Market selector v1 (accuracy-first)

WSS protocol probe (no DB writes)

Repository layout

Determinism and guardrails

Verification contract

Docker availability note

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

PMX Research Scaffold

Scope of this milestone

Quickstart

Make / Just shortcuts

Migrations bootstrap

Gamma catalog refresh

News ingestion v1 (GDELT + whitelist crawler)

CLOB REST ingestion (time-series)

CLOB WSS listener + reconcile

Micro feature store v1 (as-of snapshots)

Backtest baselines A/B (walk-forward, as-of)

Forecast baseline ensemble v1 (calibrated + intervals)

Market selector v1 (accuracy-first)

WSS protocol probe (no DB writes)

Repository layout

Determinism and guardrails

Verification contract

Docker availability note

About

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages