This repository contains the Tilebox workflow behind the US data center growth tracker.
The tracker compares satellite imagery around known data center sites, ranks where visible construction change happened, and publishes the evidence needed for a browsable data product: before/after previews, change scores, scene metadata, and a final ranking JSON.
It was built with an agent on Tilebox. The point of the demo is simple: do not just ask an agent a geospatial question and accept a static answer. Give the agent data, compute, workflow code, logs, artifacts, and observability so it can build a product you can inspect and rerun.
- Result: https://datacenterbuildout.com/
- Original agent transcript: https://ampcode.com/threads/T-019eaba7-ef4b-718e-b850-97f5df3baf0f
The root task is:
tilebox.com/datacenters/RankDataCenterBuildout@v1.14
For each candidate site, it:
- loads a CSV of known or proposed data center locations
- filters and merges nearby duplicate points
- finds a clear Sentinel-2 L2A scene before and after the target dates
- crops the imagery around the site
- computes construction-oriented change signals
- compares the before/after imagery with Clay foundation model embeddings
- writes ranked results to outputs/ranking.json in the Tilebox job cache
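As an illustration of the "filters and merges nearby duplicate points" step, here is a minimal sketch. It is not the repository's implementation: the `Site` type, the `merge_nearby` helper, the greedy keep-first strategy, and the 1000 m default radius are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical site record; the real CSV schema may differ."""
    name: str
    lat: float
    lon: float


def haversine_m(a: Site, b: Site) -> float:
    """Great-circle distance between two sites in meters."""
    radius_m = 6_371_000.0  # mean Earth radius
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))


def merge_nearby(sites: list[Site], merge_radius_m: float = 1000.0) -> list[Site]:
    """Greedy merge: keep a site only if no already-kept site lies within the radius.

    The radius default is an assumption, not a value taken from the workflow.
    """
    kept: list[Site] = []
    for site in sites:
        if all(haversine_m(site, other) > merge_radius_m for other in kept):
            kept.append(site)
    return kept
```

A greedy pass like this is order-dependent: the first point in each cluster wins. That is usually acceptable for a candidate list, but a real workflow might average coordinates or prefer the record with the richest metadata.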
You do not need to be a geospatial expert to start. Think of the workflow as: “take a list of places, get satellite images before and after, score which places changed most, and save evidence for review.”
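The "score which places changed most" idea can be pictured with a small sketch. It assumes each site already has one embedding vector for the before image and one for the after image, and it uses cosine distance as the change measure. Both the metric and the function names are assumptions for illustration; the workflow's actual scoring may combine several signals.

```python
import math


def embedding_change(before: list[float], after: list[float]) -> float:
    """Cosine distance (1 - cosine similarity) between two embedding vectors."""
    if len(before) != len(after):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(before, after))
    norm = math.sqrt(sum(x * x for x in before)) * math.sqrt(sum(y * y for y in after))
    if norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / norm


def rank_sites(embeddings: dict[str, tuple[list[float], list[float]]]) -> list[tuple[str, float]]:
    """Return (site, score) pairs sorted from most to least changed."""
    scores = {name: embedding_change(b, a) for name, (b, a) in embeddings.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A score near 0 means the two images look alike to the model, and larger values mean more change. Embedding distance alone does not say what changed, which is why the workflow also keeps previews and scene metadata for review.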
The root task accepts these common parameters:
{
"csv_url": "https://docs.google.com/spreadsheets/d/1JJ6kcVo-NjlAYtznwHOki2DVl4WWV6lhy-eXhFCdKKU/export?format=csv&gid=386766486",
"max_sites": 3,
"random_seed": 1337,
"before_date": "2024-05-01",
"after_date": "2026-05-01",
"window_days": 60,
"crop_size_m": 3000,
"scene_cloud_cover_max": 30.0,
"crop_cloud_cover_max": 1.0,
"status_filter": [
"Approved/Permitted/Under construction",
"Expanding",
"Proposed"
]
}
Notes:
- max_sites is the easiest way to keep early runs cheap and fast.
- before_date and after_date define the comparison period.
- window_days lets the workflow search around those dates for usable low-cloud imagery.
- crop_size_m controls how much area around each site is analyzed.
- If status_filter is omitted, the workflow defaults to approved, expanding, and proposed sites.
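To make the date and crop parameters concrete, the sketch below shows one plausible reading of them. It assumes the search window is symmetric (target date plus or minus window_days) and that crop_size_m is the side length of a square centered on the site. Neither assumption is confirmed by the workflow code, and the helper names are hypothetical.

```python
import math
from datetime import date, timedelta


def search_window(target: str, window_days: int) -> tuple[date, date]:
    """Assumed symmetric window: target date +/- window_days."""
    center = date.fromisoformat(target)
    delta = timedelta(days=window_days)
    return center - delta, center + delta


def crop_bounds(lat: float, lon: float, crop_size_m: float) -> tuple[float, float, float, float]:
    """Approximate (min_lon, min_lat, max_lon, max_lat) of a square crop.

    Uses a flat-earth approximation that is adequate for crops of a few kilometers.
    """
    half = crop_size_m / 2
    meters_per_degree = 111_320.0  # approximate length of one degree of latitude
    dlat = half / meters_per_degree
    dlon = half / (meters_per_degree * math.cos(math.radians(lat)))
    return lon - dlon, lat - dlat, lon + dlon, lat + dlat


start, end = search_window("2024-05-01", 60)
print(start, end)  # 2024-03-02 2024-06-30
```

Under these assumptions, the default parameters search roughly four months of imagery per date and analyze a 3 km by 3 km square around each site.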
This project is meant to be changed by coding agents. Good agent instructions are product-oriented and include how to verify the result.
Try prompts like:
Read this repository and explain the workflow in plain English. Then run a 3-site smoke test, inspect the Tilebox job, and summarize whether the outputs are usable.
Publish and deploy this workflow to my Tilebox cluster. Submit a small job with max_sites=5, wait for it to finish, inspect failures or low-quality results, and make the smallest code changes needed.
Adapt this data center workflow to rank visible construction at solar farm sites. Replace the input CSV schema as needed, keep the Sentinel-2 before/after scene selection, and adjust the scoring so large new bright panel-like areas rank higher.
Make this workflow easier to use for non-geospatial users. Add clearer task display names, better log messages, and output fields that explain why a site ranked highly.
Use the latest completed job to build a small static website from outputs/ranking.json and the cached preview images. Keep the page simple: map, ranked list, before/after evidence, and score details.
Run the workflow on 30 sampled sites, compare the top-ranked results manually from the previews, and propose scoring changes to reduce vegetation-only false positives.
Data centers are just one example. The same pattern works for ports, agriculture, mining, energy, disaster recovery, parking lots, supply chains, or any question that depends on how places change over time.
Tilebox gives the agent the loop it needs to build something real:
prompt → workflow code → deployed compute → observable job → inspectable outputs → iteration → data product
Clone the repo, run a small version, deploy it to a runner, or point your own agent at it and adapt it to a different idea.
Do not just ask a question — build the product.
- Python 3.12 and uv
- Tilebox CLI, and a Tilebox API key from the Tilebox Console (sign up for free)
- Copernicus Data Space S3 credentials
export COPERNICUS_ACCESS_KEY="..."
export COPERNICUS_SECRET_KEY="..."
The workflow lazily downloads the Clay v1.5 checkpoint on first use and caches it under ~/.cache/tilebox/models/.
Install dependencies:
uv sync
Publish and deploy the workflow release:
RELEASE_ID=$(tilebox workflow publish-release --json | jq -r '.id')
tilebox workflow deploy-release --release "$RELEASE_ID" --cluster "<your-cluster>" --json
Submit a small smoke test first:
tilebox job submit \
--name datacenter-buildout-smoke \
--task tilebox.com/datacenters/RankDataCenterBuildout \
--version v1.14 \
--cluster "<your-cluster>" \
--input '{
"max_sites": 3,
"random_seed": 1337,
"before_date": "2024-05-01",
"after_date": "2026-05-01",
"window_days": 60,
"crop_size_m": 3000,
"scene_cloud_cover_max": 30.0,
"crop_cloud_cover_max": 1.0
}' \
--wait \
--json
After the job runs, inspect it in the Tilebox Console. Look at task logs, spans, inputs, cached previews, scene metadata, and outputs/ranking.json. That inspection loop is the important part: the agent can use the same evidence to debug and improve the workflow.
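For scripted or agent-driven runs, the same smoke-test submission can be assembled in Python. This sketch only builds the argument list from the flags shown in the command above and passes it to the CLI. The helper names are hypothetical, and nothing is assumed about the shape of the JSON the CLI prints.

```python
import json
import subprocess

TASK = "tilebox.com/datacenters/RankDataCenterBuildout"


def build_submit_command(
    cluster: str,
    job_input: dict,
    name: str = "datacenter-buildout-smoke",
    version: str = "v1.14",
) -> list[str]:
    """Build the `tilebox job submit` argument list used in the smoke test."""
    return [
        "tilebox", "job", "submit",
        "--name", name,
        "--task", TASK,
        "--version", version,
        "--cluster", cluster,
        "--input", json.dumps(job_input),
        "--wait",
        "--json",
    ]


def submit(cluster: str, job_input: dict) -> str:
    """Run the CLI and return its raw stdout; requires the Tilebox CLI on PATH."""
    cmd = build_submit_command(cluster, job_input)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with the embedded JSON, which is the most common way a hand-typed `--input` value breaks.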