Public IP intel repacked for fast offline use. Three crates build mmap binary
DBs (geo.bin, proxy.bin, geofeed.bin) and plain-text proxy views
(≤ 38 MB each). Sources: IP2Location LITE, MaxMind GeoLite2, RIR geofeeds.

```sh
wget https://github.com/tn3w/IP2X/releases/latest/download/geo.bin
wget https://github.com/tn3w/IP2X/releases/latest/download/proxy.bin
wget https://github.com/tn3w/IP2X/releases/latest/download/geofeed.bin
wget https://github.com/tn3w/IP2X/releases/latest/download/proxy_pub.netset
wget https://github.com/tn3w/IP2X/releases/latest/download/usage.buckets
wget https://github.com/tn3w/IP2X/releases/latest/download/threat.buckets
wget https://github.com/tn3w/IP2X/releases/latest/download/isp.tsv
wget https://github.com/tn3w/IP2X/releases/latest/download/domain.tsv
wget https://github.com/tn3w/IP2X/releases/latest/download/last_seen.tsv
wget https://github.com/tn3w/IP2X/releases/latest/download/provider.tsv
wget https://github.com/tn3w/IP2X/releases/latest/download/fraud_score.tsv
```

Updated daily via GitHub Actions.

| file | role | size |
|---|---|---|
| geo.bin | mmap DB, IP → (lat, lon) at 0.001° | ~42 MB |
| proxy.bin | mmap DB, IP → (isp, domain) | ~12 MB |
| geofeed.bin | mmap DB, IP → (country, region, city, postal, feed) | ~11 MB |
| proxy_pub.netset | CIDR netset, public proxies (proxy_type == PUB) | ~31 MB |
| usage.buckets | IP → usage (bucketed per value) | ~27 MB |
| threat.buckets | IP → threat (bucketed per value) | ~0.5 MB |
| isp.tsv | IP → ISP (dict + ranges) | ~34 MB |
| domain.tsv | IP → domain (dict + ranges) | ~33 MB |
| last_seen.tsv | IP → last-seen days (dict + ranges) | ~38 MB |
| provider.tsv | IP → VPN provider (dict + ranges) | ~0.3 MB |
| fraud_score.tsv | IP → fraud score (dict + ranges) | ~37 MB |

Built by geox/ from IP2Location DB11 LITE (preferred) +
MaxMind GeoLite2-City (fallback). Coordinates quantised to 0.001°
(~111 m, village-scale). Self-describing little-endian, magic GEO1.
24 B header. IPv4 stored as (base u32) + (delta u24) blocks of ≤ 256
rows; IPv6 keyed on the upper 64 bits. Bit-packed point indices into a
deduped (lat, lon) table of i24/1000.
| offset | size | field |
|---|---|---|
| 0 | 4 | magic GEO1 |
| 4 | u8 | version (1) |
| 5 | u8 | minor (3) |
| 6 | u8 | idx_bits |
| 7 | u8 | reserved |
| 8 | u32 | point_count |
| 12 | u32 | v4_row_count |
| 16 | u32 | v6_row_count |
| 20 | u32 | v4_block_count |
Then: points (6 B × point_count), v4 bases (4 B × blocks), v4 offsets
(4 B × (blocks+1)), v4 deltas (3 B × rows), v4 packed idx,
v6 keys (8 B × rows), v6 packed idx.
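
The header table and section order are enough to locate the fixed-size sections with `struct`. A minimal Python sketch, assuming the sections sit back to back with no alignment padding (the text does not confirm this); `parse_geo_header` and the returned key names are illustrative and not part of geox:

```python
import struct

# 4-byte magic, four u8 fields, four u32 counts: 24 bytes, little-endian.
GEO_HEADER = struct.Struct("<4sBBBBIIII")

def parse_geo_header(buf):
    """Decode the 24 B GEO1 header and derive the offsets of the sections
    whose sizes follow directly from the header counts."""
    (magic, version, minor, idx_bits, _reserved,
     points, v4_rows, v6_rows, v4_blocks) = GEO_HEADER.unpack_from(buf, 0)
    if magic != b"GEO1":
        raise ValueError("not a GEO1 file")
    points_off = GEO_HEADER.size                      # 6 B per point
    bases_off = points_off + 6 * points               # 4 B per block
    offsets_off = bases_off + 4 * v4_blocks           # 4 B per (blocks + 1)
    deltas_off = offsets_off + 4 * (v4_blocks + 1)    # 3 B per v4 row
    v4_idx_off = deltas_off + 3 * v4_rows             # packed idx starts here
    return {
        "version": version, "minor": minor, "idx_bits": idx_bits,
        "points": points, "v4_rows": v4_rows, "v6_rows": v6_rows,
        "v4_blocks": v4_blocks, "points_off": points_off,
        "bases_off": bases_off, "offsets_off": offsets_off,
        "deltas_off": deltas_off, "v4_idx_off": v4_idx_off,
    }
```

The sketch stops at the start of the v4 packed idx section because its byte length depends on packing details not spelled out here.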
Lookup v4: bisect v4_bases, bisect deltas inside the matched block,
read packed idx, decode point. Lookup v6: bisect upper-64 keys, read
packed idx, decode point. ~0.2 MB resident at open; pages fault on demand.
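
The two-level v4 search and the point decode can be sketched in Python on already-decoded arrays. This assumes each row marks the start of a range (so the match is the last row at or below the query), that `offsets` holds row numbers, and that a point is an i24 latitude followed by an i24 longitude; all three are readings of the description, not confirmed details. The function names are illustrative:

```python
from bisect import bisect_right

def decode_i24(raw3):
    """Signed 24-bit little-endian integer."""
    return int.from_bytes(raw3, "little", signed=True)

def decode_point(raw6):
    """6 B point: i24 lat + i24 lon, each in units of 0.001 degrees."""
    return decode_i24(raw6[0:3]) / 1000, decode_i24(raw6[3:6]) / 1000

def find_v4_row(ip, bases, offsets, deltas):
    """Return the row number covering `ip`, or None.

    bases[b]   : first address of block b (u32)
    offsets[b] : index of block b's first row; offsets[-1] is the row count
    deltas[r]  : row start relative to its block base (u24)
    """
    b = bisect_right(bases, ip) - 1            # last block starting <= ip
    if b < 0:
        return None
    lo, hi = offsets[b], offsets[b + 1]
    r = bisect_right(deltas, ip - bases[b], lo, hi) - 1
    return r if r >= lo else None
```

The returned row number is what would then index the bit-packed point-index array.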

```sh
cd geox
cargo build --release
./target/release/geox build \
  --ip2l IP2LOCATION-LITE-DB11.IPV6.BIN \
  --mmdb GeoLite2-City.mmdb \
  --out geo.bin
./target/release/geox lookup --db geo.bin 8.8.8.8
# 37.386, -122.084
```

Python lookup (geo_lookup.py)

mmap + numpy searchsorted on v4 bases / v6 upper-64 keys; manual
bit-packed idx + i24 decode. No preload, near-instant startup.

```sh
python3 geo_lookup.py 8.8.8.8 2001:4860:4860::8888
# 8.8.8.8 37.386, -122.084
# 2001:4860:4860::8888 37.386, -122.084
```

`--db PATH` to point at a non-default geo.bin.

Built by proxyx/ from IP2Location IP2PROXY-LITE-PX12.
Compact mmap DB, IP → (isp, domain). Magic PRX2, little-endian, ~12 MB
for the full PX12 dataset (3.88M v4 rows + 7.8k v6 rows after
adjacent-equal merge).
36 B header. Strings interned once into a single offset/blob table; (isp_idx, dom_idx) pairs interned into a pair table, freq-sorted so hot pairs get tiny indices. IPv4 stored as fixed-size blocks of 256 rows with per-block variable bit-width deltas and pair-index packing; IPv6 keyed on the upper 64 bits.
| offset | size | field |
|---|---|---|
| 0 | 4 | magic PRX2 |
| 4 | u8 | version (2) |
| 5 | u8 | block_shift (8 → 256 rows) |
| 6 | u8 | v6_bits |
| 7 | u8 | reserved |
| 8 | u32 | pair_count |
| 12 | u32 | str_count |
| 16 | u32 | v4_row_count |
| 20 | u32 | v6_row_count |
| 24 | u32 | v4_block_count |
| 28 | u32 | v4_delta_blob_len |
| 32 | u32 | v4_idx_blob_len |
Then: pairs (6 B × n_pairs, u24 isp_idx + u24 dom_idx), str offsets
(4 B × (n_strs+1)), str blob, v4 bases (4 B × blocks), per-block
dbits / ibits (1 B × blocks each), v4 delta byte-offsets and
idx byte-offsets (4 B × (blocks+1) each), v4 delta blob + 8 B pad,
v4 idx blob + 8 B pad, v6 keys (8 B × rows), v6 packed idx + 8 B pad.
Avg per-block widths on full PX12: ~14 delta-bits, ~8 idx-bits.
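
Resolving a pair index to its two strings only needs the pair table and the offset/blob string table. A small Python sketch, assuming UTF-8 strings and little-endian u24 indices (consistent with the little-endian header, but not stated for these tables); `build_string_table`, `read_string` and `resolve_pair` are illustrative names:

```python
def build_string_table(strings):
    """Intern strings into (offsets, blob); offsets has len(strings) + 1 entries."""
    offsets, blob = [0], bytearray()
    for s in strings:
        blob += s.encode("utf-8")
        offsets.append(len(blob))
    return offsets, bytes(blob)

def read_string(offsets, blob, i):
    """String i is the blob slice between two consecutive offsets."""
    return blob[offsets[i]:offsets[i + 1]].decode("utf-8")

def resolve_pair(pairs, offsets, blob, pair_idx):
    """Each pair is 6 B: u24 isp_idx followed by u24 dom_idx."""
    rec = pairs[6 * pair_idx:6 * pair_idx + 6]
    isp_idx = int.from_bytes(rec[0:3], "little")
    dom_idx = int.from_bytes(rec[3:6], "little")
    return read_string(offsets, blob, isp_idx), read_string(offsets, blob, dom_idx)
```

Storing `n + 1` offsets means no per-string length or terminator is needed, and an empty string costs nothing in the blob.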
Lookup v4: bisect bases4, bisect deltas in the matched block at that
block's dbits, read packed pair-idx at that block's ibits, resolve
pair → (isp, domain). Lookup v6: bisect upper-64 keys, read packed idx,
resolve pair. Native lookup ~170 ns v4 / ~80 ns v6; load ~10 µs;
resident struct 208 B (mmap shared, paged on demand).
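
Reading a value at a per-block bit width comes down to one shift and one mask. The Python sketch below assumes values are packed LSB-first into a little-endian bitstream, which is a guess about the on-disk order; `pack_bits` and `read_bits` are illustrative helpers. The trailing 8 B pad mirrors the pads in the layout and plausibly exists so an 8-byte read near the end of a blob stays in bounds:

```python
def pack_bits(values, width):
    """Pack unsigned values at `width` bits each, LSB-first, plus an 8 B pad."""
    acc = 0
    for i, v in enumerate(values):
        if v >> width:
            raise ValueError("value does not fit in width")
        acc |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little") + bytes(8)

def read_bits(blob, byte_off, width, i):
    """Read the i-th `width`-bit value of a stream starting at `byte_off`.

    Loads 8 bytes at the containing byte and shifts; valid for width <= 32,
    since the in-byte shift is at most 7 bits.
    """
    bit = i * width
    start = byte_off + (bit >> 3)
    word = int.from_bytes(blob[start:start + 8], "little")
    return (word >> (bit & 7)) & ((1 << width) - 1)
```

For a block with `dbits = 14`, row `i` of that block would be read as `read_bits(delta_blob, block_byte_offset, 14, i)`, where the byte offset comes from the per-block offset array.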

```sh
cd proxyx
cargo build --release
./target/release/proxyx build-db \
  --px12 IP2PROXY-LITE-PX12.BIN \
  --out proxy.bin
./target/release/proxyx lookup --db proxy.bin 1.0.19.98
# isp I2TS Inc.
# domain mediaindex.co.jp
```

Python lookup (proxy_db_lookup.py)

mmap + numpy searchsorted on bases4 / v6 upper-64 keys; manual
bit-packed delta + idx decode against per-block widths. No preload,
near-instant startup.

```sh
python3 proxy_db_lookup.py 1.0.19.98 2001:dead::1
# 1.0.19.98 isp=I2TS Inc. domain=mediaindex.co.jp
# 2001:dead::1 isp=FDCservers.net LLC domain=fdcservers.net
```

`--db PATH` to point at a non-default proxy.bin.

Built by geofeedx/ from operator-published geolocation.
The builder downloads the RIR bulk WHOIS dumps (RIPE, APNIC, AFRINIC),
extracts every geofeed: / remarks: Geofeed reference, fetches each
referenced RFC 8805 feed
concurrently, and merges the LACNIC consolidated feed. Self-describing
little-endian, magic GFD3, IPv4 + IPv6.
Feed rows are accepted only when contained in the authority range of the
RIR object that referenced them. Each row contributes
(country, region, city, postal, feed, rir); feed is the source URL.
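
The containment rule can be illustrated with the standard `ipaddress` module. This sketch assumes the authority range is available as a CIDR prefix (a WHOIS object may instead carry a start-end range, which would need converting first); `accept_row` is a hypothetical helper, not the geofeedx function:

```python
import ipaddress

def accept_row(row_cidr, authority_cidr):
    """True when the feed row lies entirely inside the referencing object's range."""
    row = ipaddress.ip_network(row_cidr, strict=False)
    authority = ipaddress.ip_network(authority_cidr, strict=False)
    # subnet_of() raises TypeError across address families, so compare versions first.
    return row.version == authority.version and row.subnet_of(authority)
```

A feed that lists prefixes outside the block that referenced it therefore contributes nothing for those prefixes.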
28 B header. (country, region, city, postal, feed, rir) tuples are
interned into a freq-sorted record table (hot records get small ids), and
every string is interned once into an offset/blob table. IPv4 and IPv6
ranges are each flattened into a sorted breakpoint array (start → record id); adjacent-equal ids are merged. Id and field-index widths are the
minimum bytes the cardinalities require (typically 2 B each).
| offset | size | field |
|---|---|---|
| 0 | 4 | magic GFD3 |
| 4 | u8 | version (3) |
| 5 | u8 | id_width |
| 6 | u8 | field_count (6) |
| 7 | u8 | field_width |
| 8 | u32 | v4_break_count |
| 12 | u32 | v6_break_count |
| 16 | u32 | record_count |
| 20 | u32 | string_count |
| 24 | u32 | blob_len |
Then: v4 starts (4 B × v4_breaks), v4 ids (id_width × v4_breaks),
v6 starts (16 B × v6_breaks), v6 ids (id_width × v6_breaks),
records (field_count × field_width × records), string offsets
(4 B × (strings+1)), string blob.
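
The frequency-sorted record table, the adjacent-equal merge and the minimum-width rule can be sketched in a few lines of Python. The input here is simplified to sorted `(start, record)` pairs with no gaps between ranges, which the real builder has to handle; all function names are illustrative:

```python
from collections import Counter

def intern_by_frequency(rows):
    """Assign ids so the most frequent record gets id 0."""
    counts = Counter(record for _, record in rows)
    table = [record for record, _ in counts.most_common()]
    ids = {record: i for i, record in enumerate(table)}
    return table, ids

def merge_breakpoints(rows, ids):
    """Drop a breakpoint when it repeats the previous record id."""
    starts, out_ids = [], []
    for start, record in rows:
        rid = ids[record]
        if out_ids and out_ids[-1] == rid:
            continue
        starts.append(start)
        out_ids.append(rid)
    return starts, out_ids

def min_width(cardinality):
    """Fewest whole bytes able to hold ids 0 .. cardinality - 1."""
    largest = max(cardinality - 1, 0)
    return max(1, (largest.bit_length() + 7) // 8)
```

With up to 65,536 distinct records, `min_width` yields 2, matching the "typically 2 B" note.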
Lookup: bisect the matching family's starts, read the packed record id, resolve the tuple. Native load ~6 µs (mmap, ~0 resident); ~120 ns/lookup over ~1.2 M v4 breakpoints.
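
A Python sketch of that lookup, with records shown as plain tuples rather than field indices into the string table, and ids assumed little-endian at `id_width` bytes each. `lookup_breakpoint` is an illustrative name, and the range boundaries in the usage example are made up:

```python
import ipaddress
from bisect import bisect_right

def lookup_breakpoint(ip, starts, id_bytes, id_width, records):
    """Find the last breakpoint at or below `ip` and resolve its record.

    starts   : sorted integer range starts for one address family
    id_bytes : packed record ids, id_width bytes per breakpoint
    """
    key = int(ipaddress.ip_address(ip))
    i = bisect_right(starts, key) - 1
    if i < 0:
        return None  # below the first breakpoint
    rid = int.from_bytes(id_bytes[i * id_width:(i + 1) * id_width], "little")
    return records[rid]
```

```python
starts = [int(ipaddress.ip_address("213.21.192.0")),
          int(ipaddress.ip_address("213.21.224.0"))]
id_bytes = (0).to_bytes(2, "little") + (1).to_bytes(2, "little")
records = [("LV", "LV-RIX", "Riga"), ("", "", "")]
lookup_breakpoint("213.21.192.5", starts, id_bytes, 2, records)
```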

```sh
cd geofeedx
cargo build --release
./target/release/geofeedx fetch --out geofeeds_data.csv
./target/release/geofeedx build --data geofeeds_data.csv --out geofeed.bin
./target/release/geofeedx lookup --db geofeed.bin 213.21.192.5
# country LV
# region LV-RIX
# city Riga
# ...
```

`fetch` caches the RIR bulk dumps under .cache/rir-bulk and re-downloads
only what is missing. geofeeds_data.csv is the intermediate
cidr,country,region,city,postal,feed,rir join, regenerated on each fetch.

Python lookup (geofeed_lookup.py)

mmap + bisect on the v4 / v6 start arrays; variable-width record and
field decode. No preload, near-instant startup. v4 + v6 in one call.

```sh
python3 geofeed_lookup.py 213.21.192.5 2001:ad0::1
```

`--db PATH` to point at a non-default geofeed.bin.

Built by proxyx/ from IP2Location IP2PROXY-LITE-PX12.
All files plain UTF-8, #-prefixed metadata header, ≤ 38 MB each
(no compression, no splitting). Empty source fields dropped; adjacent
ranges with identical value merged.
Three shapes used across the files:

.netset: standard CIDR list, one network per line, single IPs as bare addresses.
#-prefixed metadata header. Drop-in for ipset hash:net,
iptables/nftables, ufw, pfSense and similar.

```sh
ipset create proxy_pub hash:net family inet
awk '!/^#/ && /\./' proxy_pub.netset | xargs -n1 ipset add proxy_pub
```

.buckets: one section per value, with that value's ranges listed beneath it.

```
[VALUE]
<start_ip>[+<span>]
<start_ip>[+<span>]
[NEXT_VALUE]
...
```

For low-cardinality categorical fields. IP → value = scan sections, bisect ranges. The string is written once per category, not per range.
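
A Python sketch of reading this shape. It assumes `+<span>` is the number of addresses that follow `start_ip` (so the range ends at `start + span`, and a bare address is a single IP); the span semantics are not defined here, so treat that as a guess. It also flattens all sections into one sorted list rather than scanning section by section. `parse_buckets` and `lookup_bucket` are illustrative names:

```python
import ipaddress
from bisect import bisect_right

def parse_buckets(text):
    """Return sorted (version, start, end, value) rows from a .buckets body."""
    rows, value = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # metadata header or blank line
        if line.startswith("[") and line.endswith("]"):
            value = line[1:-1]            # start of a new value section
            continue
        ip, _, span = line.partition("+")
        addr = ipaddress.ip_address(ip)
        start = int(addr)
        rows.append((addr.version, start, start + int(span or 0), value))
    rows.sort()
    return rows

def lookup_bucket(rows, ip):
    """Value of the range containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    key = (addr.version, int(addr))
    keys = [row[:2] for row in rows]      # rebuilt per call for brevity
    i = bisect_right(keys, key) - 1
    if i < 0:
        return None
    version, _start, end, value = rows[i]
    return value if version == addr.version and key[1] <= end else None
```

Keying on `(version, start)` keeps IPv4 and IPv6 integers from colliding in the same sorted list.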

.tsv: a dictionary section followed by a data section of ranges.

```
#dict
<idx>\t<value>
<idx>\t<value>
#data
<start_ip>[+<span>]\t<idx>
```

#dict is frequency-sorted (smaller idx = more common, so popular
values cost 1-2 chars per row). #data is v4 block then v6, ascending.
Lookup: load the dict into a Vec<String>, bisect #data by start_ip,
index into the dict.
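
The same lookup in Python, as a hedged sketch: the dict is keyed by the raw `idx` token so no assumption is made about how indices are encoded, `+<span>` is again assumed to mean "this many further addresses", and `#data` is trusted to be sorted as described. `parse_dict_tsv` and `lookup_tsv` are illustrative names:

```python
import ipaddress
from bisect import bisect_right

def parse_dict_tsv(text):
    """Return (values, rows): idx -> string, and (version, start, end, idx) rows."""
    values, rows, section = {}, [], None
    for line in text.splitlines():
        if line in ("#dict", "#data"):
            section = line                # section marker
        elif not line or line.startswith("#"):
            continue                      # metadata header or blank line
        elif section == "#dict":
            idx, value = line.split("\t", 1)
            values[idx] = value
        elif section == "#data":
            rng, idx = line.split("\t", 1)
            ip, _, span = rng.partition("+")
            addr = ipaddress.ip_address(ip)
            start = int(addr)
            rows.append((addr.version, start, start + int(span or 0), idx))
    return values, rows

def lookup_tsv(values, rows, ip):
    """Dictionary value of the range containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    key = (addr.version, int(addr))
    i = bisect_right([row[:2] for row in rows], key) - 1
    if i < 0:
        return None
    version, _start, end, idx = rows[i]
    if version != addr.version or key[1] > end:
        return None
    return values[idx]
```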
PX12 columns kept by proxyx (others ignored):

| file | PX12 column |
|---|---|
| proxy_pub.netset | proxy_type filtered to PUB |
| usage.buckets | usage_type |
| threat.buckets | threat |
| isp.tsv | isp |
| domain.tsv | domain |
| last_seen.tsv | last_seen (days) |
| provider.tsv | provider |
| fraud_score.tsv | fraud_score (0-99) |

Country/region/city/ASN/AS-name are intentionally omitted — geo.bin
already covers location, ASN lives elsewhere.

```sh
cd proxyx
cargo build --release
./target/release/proxyx build \
  --px12 IP2PROXY-LITE-PX12.BIN \
  --out out/
ls -lh out/
```

Python lookup (proxy_lookup.py)

Parses all 8 outputs once into sorted (start, end, val) arrays; bisects per file on query. v4 + v6 in one call. Load ~8 s for the full bundle, lookup O(log n) per file thereafter.

```sh
python3 proxy_lookup.py 1.0.19.98
# proxy_pub True
# isp I2TS Inc.
# domain mediaindex.co.jp
# last_seen 30
# fraud_score 80
# usage DCH
# ...
```

`--dir PATH` to point at a directory other than `.`.

Maps a cloud datacenter region to an ISO 3166-1 alpha-2 country code.
Covers AWS, GCP and Azure naming (ap-east-1, europe-west3, eastasia,
…) via a built-in region→country table, then falls back to parsing the
region string: ISO codes, country names (pycountry)
and city names (geonamescache),
with cardinal/ordinal suffixes (north, west, trailing digits) stripped.

```python
from region_country import country
country("ap-east-1")     # HK
country("europe-west3")  # DE
country("eastasia")      # HK
country("us-frankfurt")  # DE
```

Returns None when no country can be inferred.
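
The table-then-parse fallback can be sketched with the standard library alone. Everything below is a stand-in: the three tables are tiny hypothetical excerpts (the real module uses pycountry and geonamescache), and checking city names before ISO codes is a guess made so that `us-frankfurt` resolves to DE as in the example. `country_sketch` is not the module's function:

```python
import re

# Hypothetical excerpts; the real tables are far larger.
REGION_TABLE = {"ap-east-1": "HK", "europe-west3": "DE", "eastasia": "HK"}
CITY_TABLE = {"frankfurt": "DE"}          # stand-in for geonamescache
ISO_CODES = {"US", "DE", "HK"}            # stand-in for pycountry
DIRECTIONS = {"north", "south", "east", "west"}

def country_sketch(region):
    """Built-in table first, then token-by-token parsing of the region string."""
    region = region.strip().lower()
    if region in REGION_TABLE:
        return REGION_TABLE[region]
    # Split on separators, strip trailing digits, drop cardinal words.
    tokens = [re.sub(r"\d+$", "", t) for t in re.split(r"[-_\s]+", region)]
    tokens = [t for t in tokens if t and t not in DIRECTIONS]
    for t in tokens:                      # assumed order: cities first
        if t in CITY_TABLE:
            return CITY_TABLE[t]
    for t in tokens:                      # then bare ISO codes
        if t.upper() in ISO_CODES:
            return t.upper()
    return None
```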

```sh
pip install pycountry geonamescache
```

```mermaid
flowchart LR
    D1[IP2Location DB11 LITE] --> G[geox/]
    D2[GeoLite2-City] --> G
    G --> GB[geo.bin]
    D3[IP2Location PX12 LITE] --> P[proxyx/]
    P --> PB[proxy.bin]
    P --> R[proxy_pub.netset]
    P --> U[usage.buckets]
    P --> T[threat.buckets]
    P --> TSV[isp / domain / last_seen / provider / fraud_score .tsv]
    D4[RIR bulk WHOIS] --> F[geofeedx/]
    D5[RFC 8805 feeds + LACNIC] --> F
    F --> FB[geofeed.bin]
```

- Loops over IP2Location LITE downloads (`DB11LITEBINIPV6`, `PX12LITEBIN`) using `IP2LOCATION_TOKEN`.
- Pulls `GeoLite2-City.mmdb` from a public mirror.
- Builds `geo.bin` with `geox`, plus `proxy.bin` and the eight plain-text views with `proxyx`.
- Runs `geofeedx fetch` (RIR bulk + RFC 8805 feeds) then `geofeedx build` to produce `geofeed.bin`.
- Publishes a timestamped release with all eleven assets; prunes to the latest 5.

Geo data: IP2Location LITE DB11 + MaxMind GeoLite2. Proxy data: IP2Location LITE PX12. Geofeed data: RIR bulk WHOIS (RIPE, APNIC, AFRINIC, LACNIC) + operator-published RFC 8805 feeds.