/txs/compare: multi-currency transaction summaries & wallet fingerprinting #98

Open
frederik-raphael wants to merge 15 commits into develop from feature/fingerprinting
Conversation

@frederik-raphael

Contributor

TL;DR

A new read-only endpoint, GET /{currency}/txs/compare, takes 2 to 100
transaction hashes and ships two features behind one call:

  1. Transaction summary (all chains) — an aggregate rollup over the whole
    set: total value in native units and in USD, total fee, tx count, and the
    block/timestamp range (plus input/output counts for UTXO). Works for UTXO
    and account chains (ETH/TRX). Because it sums USD across every
    transfer (incl. tokens), mixed-asset sets stay comparable. This answers
    "tell me about this set of txs" and is always returned.

  2. Wallet fingerprinting (UTXO-only, opt-in): answers "are these
    transactions likely produced by the same actor?" by extracting per-tx
    fingerprint characteristics, running pairwise signals, and rolling them
    into a single relation verdict (linked ... unlinked) with confidence
    and notes. On-chain spending links between the compared txs are returned as
    lineage edges.

The two are gated by include_analysis. With include_analysis=false the
endpoint skips all the expensive cluster/spending/exchange lookups and returns
only the summary — the lightweight, multi-currency path. With
include_analysis=true (UTXO only) it additionally runs the full
fingerprinting analysis.

Currency support. The summary is multi-currency (UTXO + ETH/TRX). The full
fingerprinting analysis (signals, lineage, verdict) is UTXO-only; requesting
the analysis on an account chain returns a 400.

Feature 1 — transaction summary (all chains)

| field | meaning |
| --- | --- |
| total_value | native-unit sum (satoshi / wei / sun); native transfers only (token transfers carry no native amount) |
| total_value_usd | USD fiat summed across all transfers incl. tokens, so it is comparable across assets; partial + flagged in notes when a rate is missing |
| total_fee | summed fee in the chain's base unit |
| total_inputs / total_outputs | UTXO-only, omitted for account chains |
| tx_count, block/timestamp range | always present |
| notes | flags caveats (tokens excluded from total_value, partial USD totals) |

Feature 2 — wallet fingerprinting (UTXO-only, opt-in)

| area | what it does |
| --- | --- |
| characteristics | script types, witness presence, tx version, RBF, locktime pattern, BIP69 output ordering, coinjoin flag |
| signals | 13 pairwise signals across 3 kinds: discriminator / score / linkage |
| verdict | 7-tier relation from linked to unlinked, plus confidence + notes |
| lineage | output_spent_by_input edges between the compared txs (with output/input indices) |

The scoring spec (signal weights and the verdict decision tree) is maintained
internally as the single source of truth for the rules implemented here.

Calibration note. The signal weights, the verdict confidence, and
score_total are tentative. They are tuned to the spec, not yet calibrated
against ground-truth data. The verdict tiers are stable; treat the
numbers as heuristic for now.


Request / response shape

GET /{currency}/txs/compare
    ?tx_hash=<h1>&tx_hash=<h2>[&tx_hash=...]      # 2..100 hashes
    &include_details=false                         # embed full per-tx details
    &include_characteristics=true                  # embed per-tx characteristics
    &include_signals=true                          # embed the signals table
    &include_analysis=true                         # run the fingerprinting analysis

Response (TransactionComparison):

  • txs[] : per-tx items, each with optional characteristics and details
  • signals[] : the pairwise signal table (body omitted when include_signals=false)
  • lineage[] : output_spent_by_input edges between compared txs
  • summary : aggregate stats, always present and currency-aware:
    • total_value / total_fee in the chain's base unit (satoshi for UTXO,
      wei/sun for account). total_value sums native transfers only, since
      token transfers carry no native-unit amount.
    • total_value_usd sums USD fiat across all transfers (incl. tokens), so
      it is comparable across assets; partial-summed and flagged in notes when
      some txs lack a rate.
    • total_inputs / total_outputs are UTXO-only, omitted for account chains.
    • notes flags caveats (token transfers excluded from total_value, partial
      USD totals).
    • plus tx_count and the block/timestamp range.
  • verdict : the relation/confidence rollup, omitted when include_analysis=false
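
As a client-side illustration of the request shape above, the following minimal sketch assembles the query string with repeated tx_hash parameters. The helper name build_compare_url and the base URL are assumptions made for this example; only the path and the parameter names come from this PR.

```python
from urllib.parse import urlencode


def build_compare_url(base_url, currency, tx_hashes, *,
                      include_details=False,
                      include_characteristics=True,
                      include_signals=True,
                      include_analysis=True):
    """Build a GET URL for /{currency}/txs/compare (hypothetical helper)."""
    if not 2 <= len(tx_hashes) <= 100:
        raise ValueError("expected 2..100 transaction hashes")
    # tx_hash is repeated once per hash rather than comma-joined.
    params = [("tx_hash", h) for h in tx_hashes]
    flags = {
        "include_details": include_details,
        "include_characteristics": include_characteristics,
        "include_signals": include_signals,
        "include_analysis": include_analysis,
    }
    # Booleans are serialized as lowercase true/false, as in the request above.
    params += [(name, str(value).lower()) for name, value in flags.items()]
    return f"{base_url}/{currency}/txs/compare?{urlencode(params)}"


# Summary-only request for two placeholder hashes.
url = build_compare_url("https://api.example.org", "btc", ["aa", "bb"],
                        include_analysis=False)
```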

In the docs, /txs/compare is intentionally ordered after the regular
/txs/{tx_hash} endpoints (it has to be registered before them so Starlette
does not match compare as a tx_hash; the order is corrected in the OpenAPI
post-processing).
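
The registration-order constraint can be illustrated without a web framework. The sketch below is a simplified stand-in for first-match routing, not Starlette's actual matcher; the route names and regex patterns are assumptions chosen for the example.

```python
import re


def first_match(routes, path):
    """Return the name of the first route whose pattern fully matches path."""
    for name, pattern in routes:
        if re.fullmatch(pattern, path):
            return name
    return None


# Hypothetical patterns: a parameterized tx route and the literal compare route.
TX_BY_HASH = ("get_tx", r"/[^/]+/txs/(?P<tx_hash>[^/]+)")
COMPARE = ("compare_txs", r"/[^/]+/txs/compare")

# Parameterized route first: "compare" is swallowed as a tx_hash value.
shadowed = first_match([TX_BY_HASH, COMPARE], "/btc/txs/compare")
# Literal route first: the compare handler wins, ordinary hashes still resolve.
correct = first_match([COMPARE, TX_BY_HASH], "/btc/txs/compare")
regular = first_match([COMPARE, TX_BY_HASH], "/btc/txs/1f8f3416")
```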


How it works

The request flows through the standard four-layer REST pipeline:

  1. Route (web/routes/txs.py) declares the query params and delegates.
  2. Web service (web/service/txs_service.py) calls the DB-layer engine and
    runs the translator.
  3. DB service (db/asynchronous/services/comparison_service.py) is the
    engine: fetch txs, extract characteristics, compute signals, aggregate the
    verdict, build lineage.
  4. Translator (web/translators.py) maps the internal models to the slim
    API models.

The summary (always computed, all chains)

build_summary is currency-aware and runs for every request regardless of
include_analysis. For account txs (ETH/TRX) it sums the flat transfer
value, takes fee straight off each tx, and leaves the UTXO input/output
counts unset (no UTXO IO is fetched). UTXO txs keep the input/output
decomposition (value = summed outputs, fee = inputs − outputs on non-coinbase
txs). total_value_usd is summed across all transfers incl. tokens via the
rates service, so mixed-asset sets stay comparable; it is partial-summed and
flagged in notes when some txs lack a rate.
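
The two aggregation rules described above can be sketched as follows. This is a simplified illustration, not the build_summary implementation: the function names, argument shapes, and note wording are assumptions.

```python
def utxo_value_and_fee(input_values, output_values, coinbase=False):
    """UTXO rule: value = summed outputs, fee = inputs - outputs (non-coinbase)."""
    value = sum(output_values)
    # Coinbase txs have no spendable inputs, so no fee is derived for them.
    fee = 0 if coinbase else sum(input_values) - value
    return value, fee


def sum_usd(usd_values):
    """Sum USD over all transfers; None marks a transfer without a rate."""
    known = [v for v in usd_values if v is not None]
    notes = []
    missing = len(usd_values) - len(known)
    if missing:
        # Hypothetical note wording; the real text lives in build_summary.
        notes.append(f"total_value_usd is partial: {missing} transfer(s) lack a rate")
    return sum(known), notes


value, fee = utxo_value_and_fee([50_000, 30_000], [70_000, 9_000])
total_usd, notes = sum_usd([10.0, None, 5.5])
```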

A missing tx hash — including a hash from another chain that is absent from the
queried keyspace — aborts the whole comparison with a 404
(TransactionNotFoundException) rather than returning a partial summary.

The fingerprinting analysis (UTXO-only, include_analysis=true)

The four stages below run only when the analysis is requested.

Stage 1: per-tx characteristics

extract_characteristics(tx) produces one TxCharacteristicsInternal per tx
(pure, no extra DB calls): script types, witness presence (ground-truth
has_witness preferred, script-type inference as fallback), tx version, RBF
signaling, locktime classification, BIP69 output ordering, and the coinjoin
flag.
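
Two of these characteristics follow public specifications and can be sketched directly: BIP125 opt-in RBF signaling (any input with nSequence below 0xfffffffe) and BIP69 output ordering (amount ascending, then scriptPubKey bytes). The Output type and function names below are assumptions for illustration; they are not the fields of TxCharacteristicsInternal.

```python
from dataclasses import dataclass

# BIP125: a tx signals replaceability if any input has nSequence < 0xfffffffe.
MAX_NON_RBF_SEQUENCE = 0xFFFFFFFE


@dataclass(frozen=True)
class Output:
    value: int             # amount in satoshi
    script_pubkey: bytes   # raw scriptPubKey


def signals_rbf(input_sequences):
    """True if at least one input opts in to replace-by-fee."""
    return any(seq < MAX_NON_RBF_SEQUENCE for seq in input_sequences)


def outputs_bip69_sorted(outputs):
    """True if outputs are ordered by amount, then by scriptPubKey bytes."""
    keys = [(o.value, o.script_pubkey) for o in outputs]
    return keys == sorted(keys)


sorted_outs = [Output(1_000, b"\x76\xa9"), Output(1_000, b"\xa9\x14"), Output(5_000, b"\x00\x14")]
unsorted_outs = [Output(5_000, b"\x00\x14"), Output(1_000, b"\x76\xa9")]
```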

Stage 2: signals

Each signal_* function takes the whole list of characteristics and
returns one ComparisonSignalInternal whose verdict is the comparison
result and whose per_tx column shows each tx's value. Signals come in three
kinds:

| signal | kind | match / mismatch weight |
| --- | --- | --- |
| script_type | discriminator | +5 / -80 |
| tx_version | discriminator | +5 / -30 |
| rbf | discriminator | +3 / -25 |
| locktime_pattern | discriminator | +4 / -15 |
| witness_present | score | +3 / -20 |
| output_count_shape | score | +3 / -10 |
| bip69_outputs_sorted | score | +2 / -10 |
| shared_cluster | linkage | gate (weight 0) |
| direct_input_overlap | linkage | gate (weight 0) |
| change_chain | linkage | gate (weight 0) |
| common_ancestor | linkage | gate (weight 0) |
| utxo_linkage | linkage | gate (weight 0) |
| exchange_input_overlap | linkage | qualifier (weight 0) |

Discriminators and scores contribute to a weighted mismatch/match sum; linkage
signals are categorical gates (counted, not weighted). exchange_input_overlap
is a demoting qualifier: shared exchange-tagged inputs weaken cluster overlap
as evidence.
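
The weighting can be sketched with the weights from the table above. The function name score_total and the dict-based input are assumptions for this example; the authoritative rules are in the internal scoring spec. The two inputs mirror the signal verdicts in the captured BTC responses in this PR.

```python
# (match weight, mismatch weight) per discriminator/score signal, from the table.
WEIGHTS = {
    "script_type": (5, -80),
    "tx_version": (5, -30),
    "rbf": (3, -25),
    "locktime_pattern": (4, -15),
    "witness_present": (3, -20),
    "output_count_shape": (3, -10),
    "bip69_outputs_sorted": (2, -10),
}


def score_total(verdicts):
    """Sum match/mismatch weights; linkage gates carry weight 0 and are skipped."""
    total = 0
    for name, verdict in verdicts.items():
        if name not in WEIGHTS:
            continue  # linkage signals are counted as gates, not weighted
        match_w, mismatch_w = WEIGHTS[name]
        if verdict == "match":
            total += match_w
        elif verdict == "mismatch":
            total += mismatch_w
    return total


different_actors = {
    "script_type": "mismatch", "witness_present": "mismatch",
    "tx_version": "mismatch", "rbf": "match", "locktime_pattern": "match",
    "bip69_outputs_sorted": "match", "output_count_shape": "match",
    "shared_cluster": "mismatch",
}
same_actor = {name: "match" for name in WEIGHTS}
```

With these inputs, score_total(different_actors) gives -118 and score_total(same_actor) gives 25, matching the score_total values in the captured responses.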

Stage 3: verdict

aggregate_verdict(...) selects one of seven tiers from (linkage gates, weighted mismatch sum, weighted match sum, cluster verdict):

linked > likely_linked > potential_link > inconclusive
       > potential_unlink > likely_unlinked > unlinked

cluster_verdict (same / different / unknown) is the single source of
truth for the shared_cluster gate. Thresholds on the weighted mismatch sum:
<= -60 reaches likely_unlinked, < -30 reaches potential_unlink.
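
Only the two mismatch thresholds are stated here, so the sketch below covers just that slice of the decision tree. The function name mismatch_tier is hypothetical, and how this result combines with the linkage gates and cluster_verdict is defined by the internal spec, not by this example.

```python
# Tiers from most to least linked, as listed above.
TIERS = [
    "linked", "likely_linked", "potential_link", "inconclusive",
    "potential_unlink", "likely_unlinked", "unlinked",
]


def mismatch_tier(mismatch_sum):
    """Tier reached by the weighted mismatch sum alone, or None if neither threshold is crossed."""
    if mismatch_sum <= -60:
        return "likely_unlinked"
    if mismatch_sum < -30:
        return "potential_unlink"
    return None
```

Note the asymmetry in the bounds: -60 itself already reaches likely_unlinked, while -30 itself does not reach potential_unlink.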

Stage 4: lineage

output_spent_by_input edges are built from the get_spending_txs references
already fetched during orchestration, restricted to pairs within the compared
set. Each edge carries from_idx/to_idx (positions in the compared list) and
out_index/in_index (the spent output and the spending input on those txs).
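
A minimal sketch of the edge construction, assuming the spending references are available as simple records. The SpendRef type and build_lineage name are hypothetical; the edge field names follow the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpendRef:
    """Hypothetical shape of one spending reference."""
    spent_tx: str      # tx whose output is spent
    out_index: int     # index of the spent output
    spending_tx: str   # tx that spends it
    in_index: int      # index of the spending input


def build_lineage(tx_hashes, spend_refs):
    """Keep only references where both txs are in the compared set."""
    position = {h: i for i, h in enumerate(tx_hashes)}
    edges = []
    for ref in spend_refs:
        if ref.spent_tx in position and ref.spending_tx in position:
            edges.append({
                "kind": "output_spent_by_input",
                "from_idx": position[ref.spent_tx],
                "to_idx": position[ref.spending_tx],
                "out_index": ref.out_index,
                "in_index": ref.in_index,
            })
    return edges


refs = [SpendRef("a", 1, "b", 0), SpendRef("a", 0, "zz", 3)]
edges = build_lineage(["a", "b"], refs)
```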

Skipping the analysis (include_analysis=false)

include_analysis=false short-circuits the engine: it skips
_fetch_input_address_clusters, _fetch_parent_refs, and
_fetch_input_address_exchange_flags, computes no signals/verdict/lineage, and
returns only the summary. If include_characteristics=false as well, the
per-tx fetch drops IO and heuristics entirely (header-only), which is the
cheapest path. This is also the only mode allowed for account chains (ETH/TRX).


Worked examples

All hashes below are real, public chain data (ORBITAAL-tagged entities and a
public Kraken exchange address); none are from any live investigation. The two
BTC examples embed captured responses from the live backend; the ETH summary
is illustrative (see the note under that example).

Different actors, correctly separated (BTC)

Two txs from unrelated tagged entities, from the cross-entity benchmark:

  • 1f8f3416e06984f5d4470dad4e637ab55caeb421dc6f04a7e75ede2c5f8779aa (HappyCoins.com, exchange)
  • ef5d192f3cc3e2a746671f3402707d5baacaa393408009544a7df05aef31ba98 (AlphaBayMarket, darknet market)
GET /btc/txs/compare?tx_hash=1f8f3416…&tx_hash=ef5d192f…
→ relation: unlinked, confidence 95, cluster_verdict: different

Different input clusters, and the script_type (-80) and tx_version (-30)
discriminators contradict, so the weighted score is -118 and the verdict is
unlinked.

Captured response (verdict + signals)
{
  "verdict": {
    "relation": "unlinked", "confidence": 95, "cluster_verdict": "different",
    "discriminator_hits": ["script_type", "tx_version"], "score_total": -118.0,
    "notes": ["Cluster splits these txs and discriminators contradict — strong evidence of separate actors."]
  },
  "signals": [
    {"name": "script_type",          "kind": "discriminator", "verdict": "mismatch",     "weight": -80, "per_tx": ["P2PKH,P2SH", "P2PKH"]},
    {"name": "witness_present",       "kind": "score",         "verdict": "mismatch",     "weight": -20, "per_tx": ["true", "false"]},
    {"name": "tx_version",            "kind": "discriminator", "verdict": "mismatch",     "weight": -30, "per_tx": ["v2", "v1"]},
    {"name": "rbf",                   "kind": "discriminator", "verdict": "match",        "weight": 3,   "per_tx": ["final", "final"]},
    {"name": "locktime_pattern",      "kind": "discriminator", "verdict": "match",        "weight": 4,   "per_tx": ["anti_sniping", "anti_sniping"]},
    {"name": "bip69_outputs_sorted",  "kind": "score",         "verdict": "match",        "weight": 2,   "per_tx": ["sorted", "sorted"]},
    {"name": "output_count_shape",    "kind": "score",         "verdict": "match",        "weight": 3,   "per_tx": ["pay_plus_change", "pay_plus_change"]},
    {"name": "shared_cluster",        "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": ["5963948", "92291698"]},
    {"name": "exchange_input_overlap","kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": ["exchange", "non_exchange"]},
    {"name": "direct_input_overlap",  "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]},
    {"name": "change_chain",          "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]},
    {"name": "common_ancestor",       "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]},
    {"name": "utxo_linkage",          "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]}
  ]
}

Same actor, correctly linked (BTC)

Two txs that both spend from the same input address
1EVvJ9uKhHPtWodk1QFvESmLUeB2RimzfA (cluster 34430663), so they are
genuinely the same actor:

  • 121b74f14630ed865899f07bb7d53bb48510604d9a68cd7671edc54600570567
  • 056d557b39ff137935c6051aa1807efd55188b549ee00e62c1483bc7fba231d3
GET /btc/txs/compare?tx_hash=121b74f1…&tx_hash=056d557b…
→ relation: linked, confidence 96, cluster_verdict: same

Same input cluster (shared_cluster and direct_input_overlap both match) and
every discriminator agrees, so the weighted score is +25 and the verdict is
linked.

Captured response (verdict + signals)
{
  "verdict": {
    "relation": "linked", "confidence": 96, "cluster_verdict": "same",
    "discriminator_hits": [], "score_total": 25.0,
    "notes": ["All compared txs share at least one input cluster."]
  },
  "signals": [
    {"name": "script_type",          "kind": "discriminator", "verdict": "match",    "weight": 5, "per_tx": ["P2PKH", "P2PKH"]},
    {"name": "witness_present",       "kind": "score",         "verdict": "match",    "weight": 3, "per_tx": ["false", "false"]},
    {"name": "tx_version",            "kind": "discriminator", "verdict": "match",    "weight": 5, "per_tx": ["v1", "v1"]},
    {"name": "rbf",                   "kind": "discriminator", "verdict": "match",    "weight": 3, "per_tx": ["final", "final"]},
    {"name": "locktime_pattern",      "kind": "discriminator", "verdict": "match",    "weight": 4, "per_tx": ["anti_sniping", "anti_sniping"]},
    {"name": "bip69_outputs_sorted",  "kind": "score",         "verdict": "match",    "weight": 2, "per_tx": ["sorted", "sorted"]},
    {"name": "output_count_shape",    "kind": "score",         "verdict": "match",    "weight": 3, "per_tx": ["pay_plus_change", "pay_plus_change"]},
    {"name": "shared_cluster",        "kind": "linkage",       "verdict": "match",    "weight": 0, "per_tx": ["34430663", "34430663"]},
    {"name": "exchange_input_overlap","kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": ["non_exchange", "non_exchange"]},
    {"name": "direct_input_overlap",  "kind": "linkage",       "verdict": "match",    "weight": 0, "per_tx": ["1EVvJ9uKhHPtWodk1QFvESmLUeB2RimzfA", "1EVvJ9uKhHPtWodk1QFvESmLUeB2RimzfA"]},
    {"name": "change_chain",          "kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": [null, null]},
    {"name": "common_ancestor",       "kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": [null, null]},
    {"name": "utxo_linkage",          "kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": [null, null]}
  ]
}

Multi-asset summary (ETH, summary-only)

Four ETH transactions, one of them a USDT token transfer, in summary-only mode
(account chains do not run the fingerprinting analysis):

GET /eth/txs/compare?include_analysis=false
  &tx_hash=e1fc854dfa3f02064085e81fa47d5cdba66df7be6f2d58d86388210f089f08c4   # USDT transfer (token)
  &tx_hash=66fbddbd450a81117de6b6372d685e6074556c3020d3da18d897d7eb1140efd6   # 1.11 ETH
  &tx_hash=b86c1ed6d7c6aab36b2037adc504f44e47c3dd19fcb048b6cc843913a165de88   # 17,810 ETH
  &tx_hash=5f32cdd351000604551c78b91a1b3ebc26aa2ea1d66f73ccb70d3cb5a717f71b   # contract call, 0 ETH

The USDT transfer adds 0 to total_value (it moves no native ETH), but its
fiat value still counts toward total_value_usd, keeping that figure comparable
across native and token activity; notes records the exclusion. UTXO-only
fields (total_inputs / total_outputs) and any unset native-unit fields
(total_fee when no rate was applied) are omitted on account chains.
tx_count counts every aggregated leg, so 4 input hashes become 5 legs (4
native bases + 1 USDT token transfer).

Summary response
{
  "summary": {
    "tx_count": 5,
    "currency": "eth",
    "total_value": 17811110000000000000000,
    "total_value_usd": 22365601.2,
    "block_min": 15954101,
    "block_max": 25142863,
    "timestamp_min": 1668258119,
    "timestamp_max": 1779357383,
    "notes": [
      "total_value covers native transfers only; 1 token transfer(s) excluded (their value is in total_value_usd)"
    ]
  }
}
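
As a sanity check on the illustrative total_value, the native amounts annotated in the request (1.11 ETH and 17,810 ETH, with the token transfer and the contract call contributing 0) can be converted to wei. This assumes the annotated amounts are exact rather than rounded.

```python
from decimal import Decimal

WEI_PER_ETH = Decimal(10) ** 18

# Native ETH moved per hash, in request order (token transfer and contract call move 0).
native_eth = [Decimal("0"), Decimal("1.11"), Decimal("17810"), Decimal("0")]

# Decimal keeps the conversion exact; float arithmetic would not.
total_value_wei = int(sum(native_eth) * WEI_PER_ETH)
```

The result equals the total_value shown in the summary response.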

Benchmarks

Same-cluster consistency (should be linked)

Spends sampled across known small clusters (50-200 addresses); every pair
should read as linked (2026-05-26, k=20 spends/cluster, 8 workers).

| metric | value |
| --- | --- |
| clusters evaluated | 18 (0 skipped) |
| spend-txs sampled | 246 |
| pairs compared | 2,142 |
| linked + likely_linked | 1,325 + 817 = 2,142 (100%) |
| inconclusive / likely_unlinked / unlinked | 0 / 0 / 0 |

Every evaluated same-cluster pair reads linked or likely_linked, with
zero false negatives across 2,142 pairs spanning 18 clusters. The minimum
verdict on any pair was likely_linked — never inconclusive.

Cross-entity separation (should be unlinked)

Pairs an address from entity A with one from entity B (A != B) — known
negatives (e.g. Binance vs Kraken), so anything more positive than
likely_unlinked is a false positive (2026-05-21; 100 pairs, 30 entities,
5 addrs/entity, seed 42, 0 errors).

| verdict.relation | value |
| --- | --- |
| unlinked | 82 (82%) |
| likely_unlinked | 18 (18%) |
| likely_linked (false positive) | 0 (0%) |

All 100 known-negative pairs land on the correct negative side, with zero
false-positive likely_linked.

Performance

Measured against a live instance (May 2026). Sweeps draw random tx hashes
from random blocks, group them into N-sized
hash-sets, and call /{currency}/txs/compare repeatedly. The four include_
tiers are reused across the same hash-sets so the tiers are comparable on
identical inputs.

| tier | flags |
| --- | --- |
| summary | analysis off, characteristics off (cheapest) |
| characteristics | analysis off, per-tx characteristics on |
| full | full fingerprinting analysis + signals (default) |
| full_details | full plus embedded per-tx details |

BTC — summary and characteristics tiers (median / mean / p95 ms, 100 calls per cell)

| tier | N=2 | N=5 | N=10 | N=20 | N=50 | N=100 |
| --- | --- | --- | --- | --- | --- | --- |
| summary | 18 / 18 / 24 | 20 / 20 / 25 | 27 / 27 / 34 | 33 / 34 / 45 | 45 / 54 / 93 | 66 / 85 / 241 |
| characteristics | 46 / 50 / 77 | 54 / 59 / 82 | 66 / 73 / 103 | 91 / 105 / 175 | 158 / 248 / 883 | 256 / 379 / 882 |

summary and characteristics scale near-linearly with N up to N=100. The
full N-sweep ran end-to-end with 0 errors.

BTC — full tier, before and after the cliff fix (median / mean / p95 / max ms)

The original full-tier implementation paid get_best_cluster_tag per input
address, which scales with cluster size: any compared tx whose input belonged
to a huge cluster (e.g. a major exchange, ~23M addresses) added ~2 s. Latency
was strongly bimodal rather than long-tailed: each compare call landed in
either a ~150 ms fast path or a ~2.5-3.5 s cliff path. This PR replaces the
expensive ranked best-cluster-tag digest with a cheap correlated EXISTS
query (get_clusters_with_concept / which_clusters_have_concept,
src/graphsenselib/tagstore/db/queries.py), making the cost independent of
per-cluster tag count. See cliff fix below.

Same sweep shape (30 distinct hash-sets × 3 calls per N, fresh random seed,
540 calls total, 0 errors):

| N | before | after | median speedup |
| --- | --- | --- | --- |
| 2 | 141 / 357 / 2228 / 2446 | 96 / 102 / 157 / 184 | 1.5× |
| 5 | 182 / 957 / 2859 / 3008 | 114 / 118 / 153 / 187 | 1.6× |
| 10 | 3361 / 2238 / 3554 / 3678 | 131 / 154 / 280 / 363 | 26× |
| 20 | | 158 / 181 / 279 / 387 | |
| 50 | | 261 / 316 / 560 / 1275 | |
| 100 | | 458 / 465 / 697 / 857 | |

(The max values in the after column are the maximum over each N's entire
90-call distribution, not the p95.) full_details ≈ full: it adds <1%, since
per-tx details are already-fetched data attached at the response layer.

The cliff fix — bimodal distribution gone

Per-set medians, threshold 1000 ms ("slow"):

| | N=2 | N=5 | N=10 | N=20 | N=50 | N=100 |
| --- | --- | --- | --- | --- | --- | --- |
| before, slow / total sets | 3 / 30 | 9 / 30 | 19 / 30 | | | |
| after, slow / total sets | 0 / 30 | 0 / 30 | 0 / 30 | 0 / 30 | 1 / 30 | 0 / 30 |

The bimodal distribution is gone. The one slow N=50 set in the after-run
peaked at 1.3 s (vs the old cliff plateau at ~2.5-3.5 s) and is consistent with
ordinary tail noise, not a cluster-size cliff. p95 and max now sit close to
the median across all N — the latency distribution is unimodal.
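
The slow-set count above can be reproduced from raw latencies with a per-set median and a 1000 ms cutoff. The sketch below uses made-up latencies purely to show the classification; the function name count_slow_sets is hypothetical and the numbers are not measurements from this PR.

```python
from statistics import median


def count_slow_sets(latencies_by_set, threshold_ms=1000):
    """Count hash-sets whose median call latency is at or above the threshold."""
    return sum(1 for calls in latencies_by_set if median(calls) >= threshold_ms)


# Synthetic example: three hash-sets, three calls each (ms).
synthetic = [
    [100, 120, 2500],   # one slow call, median stays fast
    [2400, 2600, 150],  # median on the cliff path
    [90, 95, 99],
]
slow = count_slow_sets(synthetic)
```

Using the per-set median rather than individual calls keeps a single tail outlier from marking an otherwise fast hash-set as slow.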

Summary tier across chains (median / mean / p95 ms, 0 errors)

| N | BTC (100 calls/cell) | ETH (25 calls/cell) | TRX (9 calls/cell) |
| --- | --- | --- | --- |
| 2 | 18 / 18 / 24 | 73 / 81 / 136 | 35 / 35 / 39 |
| 5 | 20 / 20 / 25 | 95 / 97 / 112 | 52 / 52 / 59 |
| 10 | 27 / 27 / 34 | 109 / 120 / 169 | 59 / 59 / 69 |
| 20 | 33 / 34 / 45 | 157 / 160 / 203 | 68 / 67 / 74 |
| 50 | 45 / 54 / 93 | 240 / 241 / 303 | |
| 100 | 66 / 85 / 241 | | |

BTC < TRX < ETH at the summary tier. By median, ETH is ~2× slower than TRX and
~4-5× slower than BTC; the overhead is the account-chain trace fetch and
multi-asset USD aggregation, not fingerprinting itself (analysis is off here).


Notes for reviewers

  • Two model sets are updated in lockstep (internal db/.../models and API
    web/models/compare.py) plus the translator, per the project's dual-model
    convention.
  • The OpenAPI post-processing in web/app.py is extended with a paths reorder
    for /txs/compare; the snake_case/union/example post-processing is preserved.
  • The Python client must be regenerated (make run-codegen) and committed
    alongside this change for the codegen pre-commit hook to pass. (The generator
    now runs as the host user via --user, so it no longer leaves root-owned
    files.)
  • Currency-aware summary. build_summary and ComparisonSummary were
    generalized beyond BTC: total_output_sat became total_value (native,
    native-transfers only); total_value_usd (USD across all transfers incl.
    tokens) and total_fee were added; total_inputs / total_outputs are now
    optional (UTXO-only); and a notes list flags caveats. A missing or
    cross-chain tx hash aborts with 404 (no partial summary). This is what enables
    summary-only support for account chains.

Plumb the new fields added by raw_utxo schema migration 2->3 through the
async service layer. TxValue gains per-input sequence; TxUtxo gains
transaction-level version and lock_time. Read sites in io_from_rows and
std_tx_from_row pull them off the row with safe getattr/get fallbacks so
older keyspaces still load.

Internal-only - no REST model changes; FastAPI response_model filtering
keeps /txs/{tx_hash} unchanged.

Implemented signals, lineage and tests, and tuned the weights. The comparison is now optional, so requesting only the summary is possible.

Extend /txs/compare to handle account chains (ETH/TRX) in summary-only
mode (include_analysis=false). The fingerprinting analysis stays
UTXO-only; account currencies are rejected only when analysis is
requested.

Reshape ComparisonSummary to be currency- and asset-aware:
- total_value: native base unit (satoshi/wei); sums native transfers
  only, since token transfers carry no native-unit amount
- total_value_usd: USD fiat summed across all transfers (incl. tokens),
  so it is comparable across assets; partial-summed and flagged when
  some txs lack a rate
- total_fee: native unit (gas is always native)
- total_inputs/total_outputs: now optional, null for account chains
- notes: flags excluded token transfers and partial/unavailable USD totals

Renames total_output_sat -> total_value. A missing tx hash (including a
hash from another chain, absent from the queried keyspace) aborts the
whole comparison with TransactionNotFoundException rather than yielding a
partial summary.

Regenerates the Python client for the new ComparisonSummary schema.

A repeated hash was fetched twice, double-counted in the summary, and
trivially compared as linked to itself. Dedup the hash list (order-preserving)
up front and require 2+ distinct hashes, returning 400 otherwise.
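
A minimal sketch of the order-preserving dedup described in this commit message. The function name is hypothetical, and ValueError stands in for the 400 response the endpoint returns.

```python
def dedupe_hashes(tx_hashes):
    """Drop repeated hashes while keeping first-seen order; require 2+ distinct."""
    # dict preserves insertion order, so fromkeys gives an ordered dedup.
    distinct = list(dict.fromkeys(tx_hashes))
    if len(distinct) < 2:
        # The endpoint answers 400 here; ValueError is a stand-in for this sketch.
        raise ValueError("at least 2 distinct transaction hashes are required")
    return distinct


deduped = dedupe_hashes(["b", "a", "b", "c", "a"])
```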

The account summary fetched only the base/native tx per hash (get_tx returns
the first trace), so token transfers (e.g. USDT) never reached build_summary:
their USD was missing from total_value_usd and the token-exclusion note never
fired. For account chains in summary-only mode, fetch the full asset-flow set
per hash (get_asset_flows_within_tx: base + token legs) so build_summary folds
token USD into total_value_usd and flags the excluded token transfers.

Adds a regression test that feeds the orchestration a token leg and asserts the
token USD is summed and the note is emitted.

get_asset_flows_within_tx(include_internal_txs=False) synthesized the
base leg from trace[0]. ETH writes a synthetic outermost trace for every
tx so this worked, but TRX only emits trace rows for internal contract
calls -- plain native transfers and most TRC20 transfers have none. The
compare endpoint therefore 404'd on those with "Found no traces in tx
X:None", which made any multi-asset TRX summary that mixed a native or
WTRX leg unusable.

Delegate the base-leg lookup to get_tx, which already encodes the
ETH-vs-TRX split (trace[0] for ETH, transaction row for TRX). Switch
get_tx's branch from `currency == "eth"` to
`currency_to_schema_type[currency] == "account"` so the rule is tied to
a config invariant rather than a literal name.

Adds a regression test that wires db.fetch_transaction_trace to raise if
hit and asserts the TRX base leg comes back from the tx row.

The fingerprinting analysis was advertised as UTXO-only, but several
signals (wasabi/whirlpool/joinmarket coinjoin detection, exchange-input
overlap, change-address heuristics) are tuned to BTC. BCH/LTC have only
the change heuristics wired up; ZEC has none. Running the analysis on
those chains passed the gate but produced degraded, partially-empty
fingerprints rather than honest "not supported" feedback.

Reject include_analysis=true for any non-BTC currency with a 400. The
summary-only mode (include_analysis=false) continues to work for every
supported chain. Updates the endpoint description, the 400 description,
and the docstring to match.

Expands the existing eth/trx rejection tests to also cover bch/ltc/zec.
Regenerates the Python client to pick up the new endpoint description.

…uery

_fetch_input_address_exchange_flags paid get_best_cluster_tag per unique
input cluster (through get_tag_summaries_by_subject_ids with
include_best_cluster_tag=true). The underlying tagstore query ranks all
cluster-definer tags by confidence and hydrates the winner -- cost
scales with per-cluster tag count, so any tx with an input in a huge
cluster (e.g. a major exchange, ~23M addresses) added ~2 s. Latency on
the full tier was strongly bimodal: each compare call landed in either a
~150 ms fast path or a ~2.5-3.5 s cliff path depending on whether any
input belonged to a heavy cluster.

The exchange flag is consumed only as a boolean
(broad_category == "exchange") to drive the signal_exchange_input_overlap
demoting qualifier; we never use the ranked best tag itself. Replace the
digest path with a focused existence check:

- _get_clusters_with_concept_stmt: correlated EXISTS subquery driven by
  unnest(cluster_ids) so Postgres short-circuits at the first matching
  tag per cluster. Cost is bounded by len(cluster_ids), not by
  per-cluster tag count.
- Tagstore.get_clusters_with_concept: thin wrapper returning the subset
  of input cluster_ids that match.
- TagsService.which_clusters_have_concept: service-layer wrapper.
- _fetch_input_address_exchange_flags now consumes addr_to_cluster from
  the caller instead of re-resolving clusters, and uses the new path.
  compare_txs chains the exchange-flag fetch after the parallel
  cluster/parent-ref resolution; the lost parallelism costs tens of ms
  vs the seconds saved.

Semantic shift: the previous check fired only when the weighted-most-
common concept across an address's tags was "exchange"; the new one
fires when any cluster-definer tag on the cluster carries the concept.
More inclusive, which is the right direction for the demoting qualifier:
if there is meaningful evidence the cluster is an exchange, the
shared-cluster linkage evidence should be weakened.
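
The existence check can be illustrated with a self-contained sketch. It uses SQLite from the Python standard library rather than Postgres, so a VALUES list stands in for unnest(cluster_ids); the tag table, its columns, and the function name are assumptions for the example, not the tagstore schema.

```python
import sqlite3


def clusters_with_concept(conn, cluster_ids, concept):
    """Return the subset of cluster_ids that have at least one tag with the concept."""
    ids = sorted(set(cluster_ids))  # dedup so each cluster is probed once
    if not ids:
        return set()
    values = ",".join("(?)" for _ in ids)
    sql = f"""
        WITH wanted(cluster_id) AS (VALUES {values})
        SELECT w.cluster_id
        FROM wanted AS w
        WHERE EXISTS (              -- can stop at the first matching tag
            SELECT 1 FROM tag AS t
            WHERE t.cluster_id = w.cluster_id AND t.concept = ?
        )
    """
    return {row[0] for row in conn.execute(sql, [*ids, concept])}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (cluster_id INTEGER, concept TEXT)")
conn.executemany(
    "INSERT INTO tag VALUES (?, ?)",
    [(1, "exchange"), (1, "service"), (2, "mixer")],
)
exchange_clusters = clusters_with_concept(conn, [1, 2, 3, 1], "exchange")
```

The query only asks whether a matching tag exists, so nothing is ranked or hydrated, which is the property the commit relies on.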

Measured impact (scripts/fingerprint_perf.py, 30 hash-sets x 3 calls per
N, tier=full, fresh seed):

  N    before (median/p95 ms)    after (median/p95 ms)
  2    141 / 2228                96 / 157
  5    182 / 2859                114 / 153
  10   3361 / 3554               131 / 280     (~26x at median)
  20   (not in prior sweep)      158 / 279
  50   (not in prior sweep)      261 / 560
  100  (not in prior sweep)      458 / 697

Per-set bimodal split (set median >= 1 s): before 3/9/19 of 30 sets at
N=2/5/10; after 0 of 30 at every tested N (one outlier at N=50 peaked at
1.27 s, consistent with ordinary tail noise rather than a cluster-size
cliff). The latency distribution is now unimodal and the cliff is gone.

Adds 5 unit tests in tests/db/test_comparison_service.py covering: no
tags_service, no addresses, exchange/non-exchange mix, unresolved
clusters (-1), cluster-id dedup.

@frederik-raphael frederik-raphael changed the title from "# /txs/compare: multi-currency transaction summaries & wallet fingerprinting" to "/txs/compare: multi-currency transaction summaries & wallet fingerprinting" on May 26, 2026

@soad003 soad003 left a comment

Member

Here are some high-level notes:

  • the summary fiat currency should be configurable
  • maybe consider using a list of features to activate rather than a bool per feature (to be consistent with tx_heuristics)
  • Scope: since we talked about introducing a details view for multi-select, I would consider extending the scope of the endpoint, e.g. moving it from the /txs/ path to something more generic like /{network}/subgraph/ (not sure), so that in the longer run we can push subgraphs, or just a list of txs and addresses, to the endpoint and get a summary to display in the UI, including the tx compare signals. Let's discuss next week. New naming suggestion: /{network}/graph_summary

Swap the four boolean query params (include_details,
include_characteristics, include_signals, include_analysis) on
GET /txs/compare for a single include list, mirroring the
include_heuristics pattern: include=characteristics|details|signals|
lineage|verdict, or include=all. Defaults to characteristics, signals,
lineage and verdict (details excluded). Signals, lineage and verdict are
always computed internally (the verdict depends on the signals); the
list only controls what is returned.

Drop the summary-only mode: the summary and include_analysis are removed
from the compare response, so compare is now analysis-only and therefore
BTC-only. Chain-agnostic aggregate stats move to a forthcoming
POST /{currency}/subgraph/summary endpoint. Regenerate the python client.
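
The include list semantics described in this commit message can be sketched as below. The function name resolve_include and the error handling are assumptions; only the part names, the all shortcut, and the defaults come from the message.

```python
INCLUDE_PARTS = ("characteristics", "details", "signals", "lineage", "verdict")
# Default: everything except the embedded per-tx details.
DEFAULT_INCLUDE = frozenset(INCLUDE_PARTS) - {"details"}


def resolve_include(include=None):
    """Resolve the requested include list into the set of parts to return."""
    if not include:
        return set(DEFAULT_INCLUDE)
    if "all" in include:
        return set(INCLUDE_PARTS)
    unknown = set(include) - set(INCLUDE_PARTS)
    if unknown:
        raise ValueError(f"unknown include value(s): {sorted(unknown)}")
    return set(include)


default_parts = resolve_include()
only_verdict = resolve_include(["verdict"])
```

Because signals, lineage and verdict are always computed internally, a resolver like this would only filter the response rather than change what work is done.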

Add a chain-agnostic aggregate-stats endpoint over a set of transactions,
relocating the summary that /txs/compare used to return. The POST body
{ txs, addresses } defines the subgraph; the node set must hold 2-100
distinct nodes. addresses is reserved for a future extension and rejected
(400) for now, so the field name is locked in the API contract. Unlike
/txs/compare the summary works for every supported chain: UTXO header
aggregation, and account asset-flow USD folding (token legs included).

Move build_summary (+ _usd_fiat) out of comparison_service into the new
subgraph DB service; rename ComparisonSummaryInternal ->
SubgraphSummaryInternal and drop the now-orphaned ComparisonSummary API
model. Share tx-model test builders via tests/db/helpers.py.
Regenerate the python client (new SubgraphApi).

Add a fiat_currency parameter (Literal["usd","eur"], default usd) to the
POST /{currency}/subgraph/summary request body. Rename the response field
total_value_usd -> total_value_fiat and add a fiat_currency field echoing
which currency the total is in, so the name no longer hard-codes USD.

build_summary/_fiat now look the rate up by the requested code and word the
notes accordingly; the value is summed from each tx's fiat_values (usd/eur
are the only rates GraphSense stores). Regenerate the python client.

Restructure the subgraph/summary response from a flat tx summary into an
envelope { currency, txs, addresses }. The tx aggregates move under a new
SubgraphTxSummary block (currency lifted to the top level); addresses is a
reserved, null-until-implemented slot so adding per-address stats later is
not a breaking change. Regenerate the python client.
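
Taken together, the subgraph/summary commits above imply a few request-body checks, sketched below. The function name, the ValueError stand-in for the 400 response, and the choice to reject only a non-empty addresses value are assumptions made for this example.

```python
FIAT_CURRENCIES = ("usd", "eur")


def validate_summary_request(txs, addresses=None, fiat_currency="usd"):
    """Validate a POST /{currency}/subgraph/summary body (hypothetical helper)."""
    if addresses:
        # Reserved for a future extension; assumed here that only a
        # non-empty value is rejected.
        raise ValueError("addresses is reserved and not supported yet")
    distinct = list(dict.fromkeys(txs))  # order-preserving dedup
    if not 2 <= len(distinct) <= 100:
        raise ValueError("the node set must hold 2-100 distinct nodes")
    if fiat_currency not in FIAT_CURRENCIES:
        raise ValueError(f"fiat_currency must be one of {FIAT_CURRENCIES}")
    return {"txs": distinct, "addresses": None, "fiat_currency": fiat_currency}


body = validate_summary_request(["a", "b", "a"], fiat_currency="eur")
```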
@frederik-raphael

Contributor Author

Quick overview of the changes:

  • Summary is now under the new subgraph endpoint (POST /{currency}/subgraph/summary)
  • Summary currency is now configurable
  • extended the summary for future address summarization
  • fingerprinting features are now a list
