docs: comprehensive TROUBLESHOOTING.md, fix deploy extension, add Databricks section to AGENTS.md #1

Merged
darkmatter2222 merged 5 commits into main from release-v2-dgx-spark
Jun 29, 2026
@darkmatter2222

Owner

Summary

Comprehensive documentation, deployment automation fixes, and validation tools for the Databricks dashboard served behind an nginx reverse proxy.

What Changed

  1. TROUBLESHOOTING.md (new) — Complete troubleshooting guide covering:

    • The nginx /copilot/ reverse proxy blank page bug (root cause analysis)
    • All 3 deployment methods (Deploy button, Sync + Restart, direct docker run)
    • Common mistakes table with fixes
    • Verification commands for each layer
    • Prevention rules to avoid regression
  2. AGENTS.md — Added Databricks dashboard deployment section with:

    • Architecture diagram showing traffic flow
    • Deploy checklist (8 steps, 4 critical gotchas)
    • Troubleshooting quick reference
  3. deploy-dgx/extension.mjs — Fixed two critical tools:

    • validate-databricks-dashboard: Now tests container status, path normalization, __BASE_PATH injection, fetch __bp count, grip icons, live data via both direct port AND nginx /copilot/ endpoints, and container logs for errors
    • sync-dashboard-databricks: Fixed to use --no-cache docker build, PROXY_BACKEND (not PROXY_URL), docucraft_docucraft-network (not host network), and added PROXY_PATH_PREFIX=/copilot env var
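
The sync fixes listed above can be pictured as the argument lists the tool would pass to Docker. This is a minimal Python sketch, not the code in extension.mjs: the image tag, container name, build context, backend URL, and the port mapping are all assumptions chosen for illustration. Only `--no-cache`, `PROXY_BACKEND`, `PROXY_PATH_PREFIX=/copilot`, and the `docucraft_docucraft-network` network name come from this PR.

```python
from typing import List


def build_commands(
    image: str = "copilot-dashboard:latest",   # hypothetical image tag
    container: str = "copilot-dashboard",      # hypothetical container name
    context_dir: str = "dashboard",            # assumed build context
    proxy_backend: str = "http://proxy:8000",  # hypothetical backend URL
    port: int = 3002,                          # assumed same port inside and outside
) -> List[List[str]]:
    """Return the docker commands for a clean rebuild and restart."""
    # --no-cache forces every layer to rebuild, so a stale index.html
    # cannot survive from an earlier image.
    build = ["docker", "build", "--no-cache", "-t", image, context_dir]
    # Remove any previous container with the same name before starting.
    remove = ["docker", "rm", "-f", container]
    run = [
        "docker", "run", "-d",
        "--name", container,
        # Join the named network rather than using host networking.
        "--network", "docucraft_docucraft-network",
        # PROXY_BACKEND is the variable the container reads, not PROXY_URL.
        "-e", f"PROXY_BACKEND={proxy_backend}",
        "-e", "PROXY_PATH_PREFIX=/copilot",
        "-p", f"{port}:{port}",
        image,
    ]
    return [build, remove, run]
```

Each list could be handed to `subprocess.run(cmd, check=True)` in order. Keeping the commands as data makes it easy to assert that the wrong variable name never reappears.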

Bug Summary

The Databricks dashboard showed empty data behind nginx because:

  • JS fetch calls used absolute paths — fixed by injecting window.__BASE_PATH="/copilot" and prefixing all fetch calls with __bp
  • serve.py didn't normalize query strings before matching API prefix routes — fixed with _norm_path = self.path.split("?")[0]
  • Docker builds cached old index.html — fixed by adding --no-cache to build step
  • The wrong env var name (PROXY_URL instead of PROXY_BACKEND) caused 502 "not configured" errors; sync-dashboard-databricks now sets PROXY_BACKEND
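
The first fix relies on the server placing `window.__BASE_PATH` into the page before any fetch runs. One way this injection might look on the Python side is sketched below. The function name `inject_base_path` and the choice to insert right after `<head>` are assumptions for illustration; the actual serve.py may do this differently.

```python
import json


def inject_base_path(html: str, base_path: str) -> str:
    """Insert a script tag that sets window.__BASE_PATH right after <head>."""
    # json.dumps yields a correctly quoted JavaScript string literal.
    snippet = f"<script>window.__BASE_PATH={json.dumps(base_path)};</script>"
    marker = "<head>"
    idx = html.find(marker)
    if idx == -1:
        # No <head> tag found: fall back to prepending the snippet.
        return snippet + html
    cut = idx + len(marker)
    return html[:cut] + snippet + html[cut:]
```

With the value in place, client code can read it into `__bp` and prepend it to every fetch path, so the same page works both on the direct port (empty prefix) and behind `/copilot/`.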

Verification

All nginx endpoints verified working:

  • /copilot/ → dashboard loads
  • /copilot/copilot-dashboard/index.html → 200
  • /copilot/v1/models → proxy models list
  • /copilot/stats → live token stats
  • /copilot/api/stats/summary → MongoDB summary
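
The endpoint list above lends itself to a small repeatable check. The following is a sketch under assumptions: the helper names are hypothetical, the base URL is supplied by the caller, and the status fetcher is injectable so the loop can be exercised without network access.

```python
from typing import Callable, Dict, List
from urllib.error import HTTPError
from urllib.request import urlopen

# Paths taken from the verification list in this PR.
ENDPOINTS: List[str] = [
    "/copilot/",
    "/copilot/copilot-dashboard/index.html",
    "/copilot/v1/models",
    "/copilot/stats",
    "/copilot/api/stats/summary",
]


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; report the code instead.
        return err.code


def check_endpoints(
    base_url: str, fetch: Callable[[str], int] = http_status
) -> Dict[str, int]:
    """Map each endpoint path to the status code it returned."""
    root = base_url.rstrip("/")
    return {path: fetch(root + path) for path in ENDPOINTS}
```

Calling `check_endpoints("https://your-host")` and confirming every value is 200 covers the same ground as the manual list.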

darkmatter2222 and others added 5 commits June 29, 2026 12:31
…on, fix deploy.py unicode

- Add always-visible ☰ grip handle icon to all 6 sortable live-grid panels
- Grip icon appears in panel header with subtle opacity (0.3), highlights on hover
- Sortable.js already wired to .ph.grab handle — grip is visual affordance only
- Create .github/extensions/deploy-dgx with 4 deployment tools:
  - deploy-proxy-dgx: Build and restart proxy container on DGX Spark
  - validate-dgx-proxy: Health check /health, /v1/models, /api/models/running
  - validate-databricks-dashboard: Check Think refs, grip icons, container status
  - sync-dashboard-databricks: SCP + rebuild dashboard container on Databricks
- Fix deploy.py UnicodeEncodeError on Windows cp1252 console (✓ -> [OK])
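
The cp1252 failure in the last bullet happens because the check mark (U+2713) has no representation in that code page. The commit replaces the character with `[OK]`. A slightly different approach, sketched here as an alternative rather than what deploy.py does, keeps the check mark where the console can display it and falls back otherwise. The function names are hypothetical.

```python
import sys


def console_safe(text: str, encoding: str) -> str:
    """Return text unchanged if it encodes cleanly, else an ASCII-friendly form."""
    try:
        text.encode(encoding)
        return text
    except UnicodeEncodeError:
        # Swap the check mark for [OK], then replace anything else
        # the console encoding still cannot represent.
        fallback = text.replace("\u2713", "[OK]")
        return fallback.encode(encoding, errors="replace").decode(encoding)


def print_ok(message: str) -> None:
    """Print a success line that is safe on the current console."""
    encoding = sys.stdout.encoding or "utf-8"
    print(console_safe(f"\u2713 {message}", encoding))
```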

Co-authored-by: Copilot <[email protected]>
… 3002

- Remove sudo from docker commands (user in docker group)
- Use DASHBOARD_PORT=3002 instead of 3000 (seismometer owns 3000)
- Sync serving.py from Databricks to local repo
- Update validation checks to use port 3002

Co-authored-by: Copilot <[email protected]>
… also fix DB red x badge

The dashboard was rendering an empty page with a red X next to the DB badge in the top right when served through susmannet's nginx at /copilot/ path prefix. The root cause was twofold:

1. JavaScript fetch() calls hardcoded absolute paths (e.g., '/stats'), but the page is served under 'https://susmannet.duckdns.org/copilot/', so the requests reached nginx as '/stats' instead of '/copilot/stats'. Added a '__bp' base path variable that dynamically prefixes all 11 fetch endpoints when serve.py injects window.__BASE_PATH='/copilot'.

2. serve.py path normalization did not handle query strings (e.g., /api/stats/summary?days=1 failed to match API_PREFIX because .startswith checks 'self.path', which includes '?...'). Fixed by splitting on '?' first, then stripping the prefix from the clean path before comparing with the API prefixes. The container also now normalizes paths correctly when nginx proxy_pass strips '/copilot/'.
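
The normalization order described in item 2 (drop the query string, strip the path prefix, then match) can be sketched as follows. The prefix tuple and function names are assumptions for illustration; only the `split("?")[0]` step and the `/copilot` prefix come from this PR.

```python
BASE_PATH = "/copilot"
# Illustrative prefixes only; the real list lives in serve.py.
API_PREFIXES = ("/api/", "/stats", "/v1/")


def normalize_path(raw_path: str, base_path: str = BASE_PATH) -> str:
    """Drop the query string, then strip the reverse-proxy prefix if present."""
    path = raw_path.split("?")[0]
    # Match the prefix only on a segment boundary so '/copilotx' is untouched.
    if base_path and (path == base_path or path.startswith(base_path + "/")):
        path = path[len(base_path):] or "/"
    return path


def is_api_request(raw_path: str) -> bool:
    """True when the normalized path falls under one of the API prefixes."""
    return normalize_path(raw_path).startswith(API_PREFIXES)
```

Because the prefix strip is conditional, the same function handles both cases mentioned above: nginx forwarding the full `/copilot/...` path, and nginx stripping it before the request reaches the container.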

Validation: All 6 browser fetch endpoints return live data through the /copilot/ path prefix on Databricks nginx (stats DB models history usage daily cost). Container logs show 200 OK for every request, with no errors.

Co-authored-by: Copilot <[email protected]>
… checklist, fix deploy extension for reliable validation

- TROUBLESHOOTING.md: comprehensive doc of nginx /copilot/ reverse proxy bug
  covering __bp fetch prefix, serve.py _norm_path normalization, deployment workflow,
  common mistakes table, and verification commands
- AGENTS.md: added Databricks dashboard deployment section with architecture diagram,
  deploy checklist, critical gotchas, and extension references
- deploy-dgx/extension.mjs: rewrote validate-databricks-dashboard to test container status,
  path normalization, __BASE_PATH injection, fetch(__bp) count, grip icons, live data via
  nginx /copilot/ endpoints, and container logs for errors. Fixed sync-dashboard-databricks
  to use --no-cache build, PROXY_BACKEND (not PROXY_URL), docucraft_docucraft-network,
  and PROXY_PATH_PREFIX=/copilot
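
Two of the checks named above, `__BASE_PATH` injection and the `fetch(__bp)` count, are static checks on the served HTML. A rough Python sketch of that idea follows. It is not the extension's code, and the function name, result keys, and regular expressions are assumptions.

```python
import re
from typing import Dict


def inspect_dashboard_html(html: str) -> Dict[str, object]:
    """Summarize whether the page is wired for a path-prefixed deployment."""
    return {
        # The server is expected to have injected the base path.
        "has_base_path": "window.__BASE_PATH" in html,
        # fetch calls that start with the __bp prefix variable.
        "bp_fetch_count": len(re.findall(r"fetch\(\s*__bp", html)),
        # fetch calls that still start with a quoted absolute path.
        "raw_fetch_count": len(re.findall(r"fetch\(\s*[\"']/", html)),
    }
```

A nonzero `raw_fetch_count` would flag a call that bypasses the prefix and would break again behind `/copilot/`.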

Co-authored-by: Copilot <[email protected]>
# Conflicts:
#	AGENTS.md
#	dashboard/index.html
#	dashboard/serve.py
@darkmatter2222 darkmatter2222 merged commit 10564c2 into main Jun 29, 2026
1 check passed