livenson/mxmap

 
 

MX Map — Email Providers of Municipalities Worldwide

An interactive map showing where ~46,000 municipalities across 166 countries host their official email — whether with US hyperscalers (Microsoft, Google, AWS), local providers, or self-hosted solutions.

Coverage spans Europe (48 countries), Africa (54), the Americas (21), Asia (30), Oceania (8), and the Middle East (11).

View the live map

Screenshot of MX Map

How it works

The data pipeline has three steps:

  1. Preprocess — Loads ~46,000 municipalities from curated seed data across 166 countries, performs MX, SPF, CNAME, DKIM, autodiscover, and TXT DNS lookups on their official domains (with domain guessing for missing entries), detects email security gateways (SeppMail, Barracuda, Hornetsecurity, etc.), and classifies each municipality's email provider. TXT domain verification tokens (e.g., MS= for Microsoft 365, google-site-verification= for Google Workspace) serve as tiebreakers when other signals are ambiguous.
  2. Postprocess — Applies manual overrides for edge cases, retries DNS for unresolved domains, checks SMTP banners of independent MX hosts for hidden providers, then scrapes websites of still-unclassified municipalities for email addresses.
  3. Validate — Cross-validates MX and SPF records, assigns a confidence score (0–100) to each entry, and generates a validation report.
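The classification order in step 1 can be illustrated with a minimal sketch. It assumes each DNS signal has already been reduced to a provider label (or `None`), that the first signal with a hint wins in the order MX, CNAME, SPF, autodiscover, DKIM, and that TXT verification tokens are consulted only when nothing else matched. The `Signals` dataclass, the `classify` function, and the provider labels are hypothetical names for illustration and are not taken from the project's code.

```python
from dataclasses import dataclass, field
from typing import Optional

# TXT verification prefixes named in the text, mapped to illustrative labels.
TXT_TOKENS = {
    "MS=": "microsoft",
    "google-site-verification=": "google",
}


@dataclass
class Signals:
    """Provider hint per DNS signal; None means the signal gave no hint."""
    mx: Optional[str] = None
    cname: Optional[str] = None
    spf: Optional[str] = None
    autodiscover: Optional[str] = None
    dkim: Optional[str] = None
    txt_records: list[str] = field(default_factory=list)


def provider_from_txt(records: list[str]) -> Optional[str]:
    """Return a provider only if the TXT tokens point to exactly one."""
    hits = {
        provider
        for record in records
        for prefix, provider in TXT_TOKENS.items()
        if record.startswith(prefix)
    }
    return hits.pop() if len(hits) == 1 else None


def classify(signals: Signals) -> str:
    """Pick the first available hint, falling back to TXT tokens."""
    ordered = (
        signals.mx,
        signals.cname,
        signals.spf,
        signals.autodiscover,
        signals.dkim,
    )
    for hint in ordered:
        if hint is not None:
            return hint
    return provider_from_txt(signals.txt_records) or "unknown"
```

A domain whose TXT records carry tokens for both providers stays `"unknown"` in this sketch, since a tiebreaker that is itself ambiguous should not decide the result.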
flowchart TD
    trigger["Nightly trigger"] --> seed

    subgraph pre ["1 · Preprocess"]
        seed[/"Seed data (166 countries)"/] --> fetch["Load ~46,000 municipalities"]
        fetch --> domains["Extract domains +<br/>guess candidates"]
        domains --> dns["MX + TXT lookups<br/>(3 resolvers)"]
        dns --> spf_resolve["Resolve SPF includes<br/>& redirects"]
        spf_resolve --> cname["Follow CNAME chains"]
        cname --> asn["ASN lookups<br/>(Team Cymru)"]
        asn --> autodiscover["Autodiscover DNS<br/>(CNAME + SRV)"]
        autodiscover --> dkim["DKIM selector<br/>CNAME lookups"]
        dkim --> gateway["Detect gateways<br/>(SeppMail, Barracuda,<br/>Proofpoint, Sophos ...)"]
        gateway --> classify["Classify providers<br/>MX → CNAME → SPF →<br/>Autodiscover → DKIM → TXT"]
    end

    classify --> overrides

    subgraph post ["2 · Postprocess"]
        overrides["Apply manual overrides"] --> retry["Retry DNS<br/>for unknowns"]
        retry --> smtp["SMTP banner check<br/>(EHLO on port 25)"]
        smtp --> scrape_urls["Probe municipal websites<br/>(/kontaktid, /kontakti, /kontaktai …)"]
        scrape_urls --> extract["Extract emails<br/>+ decrypt TYPO3 obfuscation"]
        extract --> scrape_dns["DNS lookup on<br/>email domains"]
        scrape_dns --> reclassify["Reclassify<br/>resolved entries"]
    end

    reclassify --> data[("data.json")]
    data --> score

    subgraph val ["3 · Validate"]
        score["Confidence scoring · 0–100"] --> gwarn["Flag potential<br/>unknown gateways"]
        gwarn --> gate{"Quality gate<br/>avg ≥ 70 · high-conf ≥ 80%"}
    end

    gate -- "Pass" --> deploy["Commit & deploy to Pages"]
    gate -- "Fail" --> issue["Open GitHub issue"]

    style trigger fill:#e8f4fd,stroke:#4a90d9,color:#1a5276
    style seed fill:#e8f4fd,stroke:#4a90d9,color:#1a5276
    style data fill:#d5f5e3,stroke:#27ae60,color:#1e8449
    style deploy fill:#d5f5e3,stroke:#27ae60,color:#1e8449
    style issue fill:#fadbd8,stroke:#e74c3c,color:#922b21
    style gate fill:#fdebd0,stroke:#e67e22,color:#935116
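The quality gate at the end of the flowchart (avg ≥ 70, high-conf ≥ 80%) could be expressed roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and the score at which an entry counts as "high confidence" is not stated here, so it is a required parameter rather than a guessed default.

```python
from statistics import mean


def passes_quality_gate(
    scores: list[float],
    *,
    high_conf_cutoff: float,            # assumption: not specified in this README
    min_average: float = 70.0,          # "avg >= 70" from the flowchart
    min_high_conf_share: float = 0.80,  # "high-conf >= 80%" from the flowchart
) -> bool:
    """Return True if the confidence scores (0-100) clear both thresholds."""
    if not scores:
        return False  # an empty run should never be deployed
    high_conf_share = sum(s >= high_conf_cutoff for s in scores) / len(scores)
    return mean(scores) >= min_average and high_conf_share >= min_high_conf_share
```

A passing result would correspond to the "Commit & deploy to Pages" branch, a failing one to "Open GitHub issue".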

Coverage

| Region | Countries | Municipalities | With domains |
|---|---|---|---|
| Europe | 48 | ~25,000 | ~22,000 |
| Africa | 54 | ~8,500 | ~2,400 |
| Americas | 21 | ~7,700 | ~5,500 |
| Asia | 30 | ~4,200 | ~2,100 |
| Middle East | 11 | ~700 | ~200 |
| Oceania | 8 | ~400 | ~200 |
| Total | 166 | ~46,000 | ~32,000 |

Seed data is sourced from Wikidata (SPARQL queries for municipal entities per country) and supplemented with domain pattern discovery (e.g., {name}.go.ke for Kenya, {name}dc.go.tz for Tanzania, muni{name}.go.cr for Costa Rica, {name}.municipios.gob.pa for Panama).
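As a rough illustration of domain pattern discovery, the sketch below fills the four patterns quoted above with a normalized municipality name. The `slugify` rule (strip accents, lowercase, drop everything except letters and digits) and the function names are assumptions made for this example; the project's actual normalization may differ, for instance in how it treats spaces or hyphens.

```python
import re
import unicodedata
from typing import Optional

# Patterns quoted in the text, keyed by ISO 3166-1 alpha-2 country code.
DOMAIN_PATTERNS = {
    "KE": "{name}.go.ke",
    "TZ": "{name}dc.go.tz",
    "CR": "muni{name}.go.cr",
    "PA": "{name}.municipios.gob.pa",
}


def slugify(name: str) -> str:
    """Strip accents, lowercase, and keep only ASCII letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]", "", ascii_name.lower())


def candidate_domain(country: str, name: str) -> Optional[str]:
    """Return a guessed domain, or None if no pattern or usable name exists."""
    pattern = DOMAIN_PATTERNS.get(country)
    slug = slugify(name)
    if pattern is None or not slug:
        return None
    return pattern.format(name=slug)


print(candidate_domain("KE", "Nakuru"))    # nakuru.go.ke
print(candidate_domain("CR", "San José"))  # munisanjose.go.cr
```

A guessed domain is only a candidate: it would still need the DNS lookups from the preprocess step to confirm that it exists and receives mail.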

Quick start

uv sync

uv run preprocess          # All countries
uv run preprocess DE       # Single country
uv run preprocess DE:BY    # Single Bundesland

uv run postprocess
uv run validate

# Serve the map locally
python -m http.server

Development

uv sync --group dev

# Run tests with coverage
uv run pytest --cov --cov-report=term-missing

# Lint and format the codebase
uv run ruff check src tests
uv run ruff format src tests

Nightly pipeline

A GitHub Actions workflow runs every night at 04:00 UTC:

  • Small/medium countries (<1,000 municipalities) are scanned every night
  • Large countries (>=1,000 municipalities: BR, CA, DZ, MA, etc.) rotate on a 3-day cycle
  • Germany (11K Gemeinden) rotates 3 Bundesländer per night on a 6-day cycle
  • Results are validated, committed, and deployed to GitHub Pages
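The rotation described above could be sketched as follows. How the workflow actually assigns countries and Bundesländer to nights is not described here, so this example simply assumes slots derived from the date's ordinal; the function names and the ordering of the state codes are illustrative.

```python
from datetime import date

# The 16 German states by their ISO 3166-2:DE suffix (as used in "DE:BY").
BUNDESLAENDER = [
    "BW", "BY", "BE", "BB", "HB", "HH", "HE", "MV",
    "NI", "NW", "RP", "SL", "SN", "ST", "SH", "TH",
]


def large_countries_for(day: date, large_countries: list[str]) -> list[str]:
    """Large countries due on `day`: every third one, on a 3-day cycle."""
    slot = day.toordinal() % 3
    return [c for i, c in enumerate(sorted(large_countries)) if i % 3 == slot]


def bundeslaender_for(day: date) -> list[str]:
    """Up to three Bundesländer due on `day`, on a 6-day cycle."""
    slot = day.toordinal() % 6
    return BUNDESLAENDER[slot * 3 : slot * 3 + 3]
```

With 16 states and 3 per night, five nights carry three states each and the sixth carries the remaining one, which matches the stated 6-day cycle.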

Attribution

This project is a fork of mxmap.ch by David Huser, which maps email providers of Swiss municipalities. This fork extends the original to 166 countries worldwide and adds region-specific provider detection, gateway look-through, DKIM/TXT verification-based classification, curated seed data, and per-country TopoJSON geodata.

Related work

Contributing

If you spot a misclassification, please open an issue with the municipality ID and the correct provider. For municipalities where automated detection fails, corrections can be added to the MANUAL_OVERRIDES dict in src/mail_sovereignty/postprocess.py.
