`/txs/compare`: multi-currency transaction summaries & wallet fingerprinting by frederik-raphael · Pull Request #98 · graphsense/graphsense-lib

frederik-raphael · 2026-05-26T12:02:47Z

TL;DR

A new read-only endpoint, GET /{currency}/txs/compare, takes 2 to 100
transaction hashes and ships two features behind one call:

- Transaction summary (all chains). An aggregate rollup over the whole
  set: total value in native units and in USD, total fee, tx count, and the
  block/timestamp range (plus input/output counts for UTXO). Works for UTXO
  and account chains (ETH/TRX). Because it sums USD across every
  transfer (incl. tokens), mixed-asset sets stay comparable. This answers
  "tell me about this set of txs" and is always returned.
- Wallet fingerprinting (UTXO-only, opt-in). Answers "are these
  transactions likely produced by the same actor?" by extracting per-tx
  fingerprint characteristics, running pairwise signals, and rolling them
  into a single relation verdict (linked ... unlinked) with confidence
  and notes. On-chain spending links between the compared txs are returned as
  lineage edges.

The two are gated by include_analysis. With include_analysis=false the
endpoint skips all the expensive cluster/spending/exchange lookups and returns
only the summary — the lightweight, multi-currency path. With
include_analysis=true (UTXO only) it additionally runs the full
fingerprinting analysis.

Currency support. The summary is multi-currency (UTXO + ETH/TRX). The full
fingerprinting analysis (signals, lineage, verdict) is UTXO-only; requesting
the analysis on an account chain returns a 400.

Feature 1 — transaction summary (all chains)

| field | meaning |
| --- | --- |
| `total_value` | native-unit sum (satoshi / wei / sun); native transfers only (token transfers carry no native amount) |
| `total_value_usd` | USD fiat summed across all transfers incl. tokens, so it is comparable across assets; partial + flagged in `notes` when a rate is missing |
| `total_fee` | summed fee in the chain's base unit |
| `total_inputs` / `total_outputs` | UTXO-only, omitted for account chains |
| `tx_count`, block/timestamp range | always present |
| `notes` | flags caveats (tokens excluded from `total_value`, partial USD totals) |

Feature 2 — wallet fingerprinting (UTXO-only, opt-in)

| area | what it does |
| --- | --- |
| characteristics | script types, witness presence, tx version, RBF, locktime pattern, BIP69 output ordering, coinjoin flag |
| signals | 13 pairwise signals across 3 kinds: discriminator / score / linkage |
| verdict | 7-tier relation from `linked` to `unlinked`, plus confidence + notes |
| lineage | `output_spent_by_input` edges between the compared txs (with output/input indices) |

The scoring spec (signal weights and the verdict decision tree) is maintained
internally as the single source of truth for the rules implemented here.

Calibration note. The signal weights, the verdict confidence, and
score_total are tentative. They are tuned to the spec, not yet calibrated
against ground-truth data. The verdict tiers are stable; treat the
numbers as heuristic for now.

Request / response shape

GET /{currency}/txs/compare
    ?tx_hash=<h1>&tx_hash=<h2>[&tx_hash=...]      # 2..100 hashes
    &include_details=false                         # embed full per-tx details
    &include_characteristics=true                  # embed per-tx characteristics
    &include_signals=true                          # embed the signals table
    &include_analysis=true                         # run the fingerprinting analysis

Response (TransactionComparison):

txs[] : per-tx items, each with optional characteristics and details
signals[] : the pairwise signal table (body omitted when include_signals=false)
lineage[] : output_spent_by_input edges between compared txs
summary : aggregate stats, always present and currency-aware:
- total_value / total_fee in the chain's base unit (satoshi for UTXO,
  wei/sun for account). total_value sums native transfers only, since
  token transfers carry no native-unit amount.
- total_value_usd sums USD fiat across all transfers (incl. tokens), so
  it is comparable across assets; partial-summed and flagged in notes when
  some txs lack a rate.
- total_inputs / total_outputs are UTXO-only, omitted for account chains.
- notes flags caveats (token transfers excluded from total_value, partial
  USD totals).
- plus tx_count and the block/timestamp range.
verdict : the relation/confidence rollup, omitted when include_analysis=false
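A minimal sketch of how a client might assemble the query string for the shape above, assuming repeated `tx_hash` keys and lowercase boolean flags as shown. The helper name `build_compare_url` and the base URL are placeholders for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_compare_url(base_url, currency, tx_hashes, *, include_details=False,
                      include_characteristics=True, include_signals=True,
                      include_analysis=True):
    """Assemble a GET /{currency}/txs/compare URL (hypothetical helper)."""
    if not 2 <= len(tx_hashes) <= 100:
        raise ValueError("expected 2..100 transaction hashes")
    # One tx_hash key per hash, in the order given.
    params = [("tx_hash", h) for h in tx_hashes]
    flags = {
        "include_details": include_details,
        "include_characteristics": include_characteristics,
        "include_signals": include_signals,
        "include_analysis": include_analysis,
    }
    # Booleans are rendered as "true"/"false", matching the shape above.
    params += [(name, str(value).lower()) for name, value in flags.items()]
    return f"{base_url.rstrip('/')}/{currency}/txs/compare?{urlencode(params)}"


# Summary-only call for two (shortened, made-up) hashes.
url = build_compare_url("https://api.example.invalid", "btc", ["aa", "bb"],
                        include_analysis=False)
```

Keeping the parameters as a list of pairs (rather than a dict) is what lets `tx_hash` repeat.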

In the docs, /txs/compare is intentionally ordered after the regular
/txs/{tx_hash} endpoints (it has to be registered before them so Starlette
does not match compare as a tx_hash; the order is corrected in the OpenAPI
post-processing).

How it works

The request flows through the standard four-layer REST pipeline:

Route (web/routes/txs.py) declares the query params and delegates.
Web service (web/service/txs_service.py) calls the DB-layer engine and
runs the translator.
DB service (db/asynchronous/services/comparison_service.py) is the
engine: fetch txs, extract characteristics, compute signals, aggregate the
verdict, build lineage.
Translator (web/translators.py) maps the internal models to the slim
API models.

The summary (always computed, all chains)

build_summary is currency-aware and runs for every request regardless of
include_analysis. For account txs (ETH/TRX) it sums the flat transfer
value, takes fee straight off each tx, and leaves the UTXO input/output
counts unset (no UTXO IO is fetched). UTXO txs keep the input/output
decomposition (value = summed outputs, fee = inputs − outputs on non-coinbase
txs). total_value_usd is summed across all transfers incl. tokens via the
rates service, so mixed-asset sets stay comparable; it is partial-summed and
flagged in notes when some txs lack a rate.
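The aggregation rules above can be sketched as follows. This is a simplified illustration, not the actual `build_summary`: the `Leg` shape, the `summarize` name, and the note wording are assumptions, and only the native-only `total_value`, the all-transfer USD sum, and the partial-total flag are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leg:
    """One aggregated transfer leg (hypothetical shape, not the library model)."""
    value: int            # amount in the asset's base unit
    is_token: bool        # True for token transfers (e.g. USDT)
    usd: Optional[float]  # fiat value, or None when no rate is available


def summarize(legs):
    # Native total: token legs carry no native-unit amount, so skip them.
    total_value = sum(leg.value for leg in legs if not leg.is_token)
    # USD total: every leg with a known rate counts, tokens included.
    priced = [leg.usd for leg in legs if leg.usd is not None]
    notes = []
    tokens = sum(1 for leg in legs if leg.is_token)
    if tokens:
        notes.append(f"{tokens} token transfer(s) excluded from total_value")
    missing = len(legs) - len(priced)
    if missing:
        notes.append(f"total_value_usd is partial; {missing} leg(s) lack a rate")
    return {
        "tx_count": len(legs),
        "total_value": total_value,
        "total_value_usd": sum(priced) if priced else None,
        "notes": notes,
    }


# Made-up legs: one native transfer, one token transfer, one unpriced call.
summary = summarize([
    Leg(10**18, False, 2000.0),
    Leg(5_000_000, True, 5.0),
    Leg(0, False, None),
])
```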

A missing tx hash — including a hash from another chain that is absent from the
queried keyspace — aborts the whole comparison with a 404
(TransactionNotFoundException) rather than returning a partial summary.

The fingerprinting analysis (UTXO-only, `include_analysis=true`)

The four stages below run only when the analysis is requested.

Stage 1: per-tx characteristics

extract_characteristics(tx) produces one TxCharacteristicsInternal per tx
(pure, no extra DB calls): script types, witness presence (ground-truth
has_witness preferred, script-type inference as fallback), tx version, RBF
signaling, locktime classification, BIP69 output ordering, and the coinjoin
flag.
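Two of these characteristics follow directly from public BIPs and can be sketched without any DB access. The snippet below is an illustration only: the `Output` type and both function names are hypothetical, and the real `extract_characteristics` may classify differently. It uses the BIP125 rule (an input sequence below 0xFFFFFFFE signals replaceability) and the BIP69 output order (amount ascending, then scriptPubKey bytes).

```python
from dataclasses import dataclass

# BIP125: a tx signals RBF if any input has a sequence number below this value.
MAX_NON_RBF_SEQUENCE = 0xFFFFFFFE


@dataclass(frozen=True)
class Output:
    """Hypothetical minimal output model."""
    value: int            # amount in satoshi
    script_pubkey: bytes  # raw locking script


def signals_rbf(input_sequences):
    """True if at least one input sequence opts in to replacement."""
    return any(seq < MAX_NON_RBF_SEQUENCE for seq in input_sequences)


def outputs_bip69_sorted(outputs):
    """True if outputs are ordered by (amount, scriptPubKey bytes) ascending."""
    keys = [(o.value, o.script_pubkey) for o in outputs]
    return keys == sorted(keys)
```

Both checks are pure functions of the transaction, which matches the "no extra DB calls" property described above.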

Stage 2: signals

Each signal_* function takes the whole list of characteristics and
returns one ComparisonSignalInternal whose verdict is the comparison
result and whose per_tx column shows each tx's value. Signals come in three
kinds:

| signal | kind | match / mismatch weight |
| --- | --- | --- |
| `script_type` | discriminator | +5 / -80 |
| `tx_version` | discriminator | +5 / -30 |
| `rbf` | discriminator | +3 / -25 |
| `locktime_pattern` | discriminator | +4 / -15 |
| `witness_present` | score | +3 / -20 |
| `output_count_shape` | score | +3 / -10 |
| `bip69_outputs_sorted` | score | +2 / -10 |
| `shared_cluster` | linkage | gate (weight 0) |
| `direct_input_overlap` | linkage | gate (weight 0) |
| `change_chain` | linkage | gate (weight 0) |
| `common_ancestor` | linkage | gate (weight 0) |
| `utxo_linkage` | linkage | gate (weight 0) |
| `exchange_input_overlap` | linkage | qualifier (weight 0) |

Discriminators and scores contribute to a weighted mismatch/match sum; linkage
signals are categorical gates (counted, not weighted). exchange_input_overlap
is a demoting qualifier: shared exchange-tagged inputs weaken cluster overlap
as evidence.
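As a worked check of the weighted sums, the sketch below adds up the per-signal weights reported in the two captured BTC responses in the worked examples. Splitting the sums by the sign of the weight is an assumption about how the engine separates the match and mismatch totals, and `score_sums` is a hypothetical helper; linkage signals carry weight 0 and therefore drop out.

```python
def score_sums(signals):
    """Return (match_sum, mismatch_sum, total) over the signal weights."""
    weights = [s["weight"] for s in signals]
    match_sum = sum(w for w in weights if w > 0)
    mismatch_sum = sum(w for w in weights if w < 0)
    return match_sum, mismatch_sum, match_sum + mismatch_sum


# Weights copied from the "different actors" captured response.
different_actors = [
    {"name": "script_type", "weight": -80},
    {"name": "witness_present", "weight": -20},
    {"name": "tx_version", "weight": -30},
    {"name": "rbf", "weight": 3},
    {"name": "locktime_pattern", "weight": 4},
    {"name": "bip69_outputs_sorted", "weight": 2},
    {"name": "output_count_shape", "weight": 3},
    {"name": "shared_cluster", "weight": 0},  # linkage gate, not weighted
]

# Weights copied from the "same actor" captured response (all matches).
same_actor = [{"name": n, "weight": w} for n, w in [
    ("script_type", 5), ("witness_present", 3), ("tx_version", 5), ("rbf", 3),
    ("locktime_pattern", 4), ("bip69_outputs_sorted", 2), ("output_count_shape", 3),
]]

print(score_sums(different_actors))  # (12, -130, -118)
print(score_sums(same_actor))        # (25, 0, 25)
```

The totals agree with the `score_total` values in the captured responses (-118.0 and 25.0).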

Stage 3: verdict

aggregate_verdict(...) selects one of seven tiers from (linkage gates, weighted mismatch sum, weighted match sum, cluster verdict):

linked > likely_linked > potential_link > inconclusive
       > potential_unlink > likely_unlinked > unlinked

cluster_verdict (same / different / unknown) is the single source of
truth for the shared_cluster gate. Thresholds on the weighted mismatch sum:
<= -60 reaches likely_unlinked, < -30 reaches potential_unlink.
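The two documented thresholds can be written down as below. This is only a sketch of that one input to the decision tree: the function name is hypothetical, reading "reaches" as "the tier this threshold allows" is an assumption, and the linkage gates, match sum, and cluster verdict (which decide the final tier, e.g. `unlinked` in the first worked example) are deliberately not modeled.

```python
from typing import Optional


def mismatch_tier(mismatch_sum: float) -> Optional[str]:
    """Tier suggested by the weighted mismatch sum alone, or None if neither
    documented threshold fires."""
    if mismatch_sum <= -60:
        return "likely_unlinked"
    if mismatch_sum < -30:
        return "potential_unlink"
    return None
```

Note the asymmetry in the source thresholds: -60 itself qualifies for `likely_unlinked`, while -30 itself does not qualify for `potential_unlink`.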

Stage 4: lineage

output_spent_by_input edges are built from the get_spending_txs references
already fetched during orchestration, restricted to pairs within the compared
set. Each edge carries from_idx/to_idx (positions in the compared list) and
out_index/in_index (the spent output and the spending input on those txs).
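A minimal sketch of the edge construction, assuming the spending references are available as simple tuples. `SpendRef` and `build_lineage` are hypothetical names, and treating `from_idx` as the spent tx and `to_idx` as the spending tx is an assumption based on the field descriptions above.

```python
from typing import NamedTuple


class SpendRef(NamedTuple):
    """Hypothetical reference: output `out_index` of `spent_tx` is consumed
    by input `in_index` of `spending_tx`."""
    spent_tx: str
    out_index: int
    spending_tx: str
    in_index: int


def build_lineage(tx_hashes, refs):
    """Keep only references whose both ends are in the compared set."""
    position = {h: i for i, h in enumerate(tx_hashes)}
    edges = []
    for ref in refs:
        if ref.spent_tx in position and ref.spending_tx in position:
            edges.append({
                "kind": "output_spent_by_input",
                "from_idx": position[ref.spent_tx],
                "to_idx": position[ref.spending_tx],
                "out_index": ref.out_index,
                "in_index": ref.in_index,
            })
    return edges


# Made-up hashes: only the a -> b reference stays inside the compared set.
edges = build_lineage(
    ["a", "b", "c"],
    [SpendRef("a", 1, "b", 0), SpendRef("a", 0, "zzz", 2), SpendRef("x", 0, "c", 0)],
)
```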

Skipping the analysis (`include_analysis=false`)

include_analysis=false short-circuits the engine: it skips
_fetch_input_address_clusters, _fetch_parent_refs, and
_fetch_input_address_exchange_flags, computes no signals/verdict/lineage, and
returns only the summary. If include_characteristics=false as well, the
per-tx fetch drops IO and heuristics entirely (header-only), which is the
cheapest path. This is also the only mode allowed for account chains (ETH/TRX).
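The gate described here can be sketched as a small pre-check. The mapping, the `BadRequest` exception, and the function name below are stand-ins for illustration (the real code path and error type may differ); the point is only that account chains pass when the analysis is off and are rejected with a 400 when it is requested.

```python
# Hypothetical local mapping; the library keeps its own currency configuration.
SCHEMA_TYPE = {"btc": "utxo", "ltc": "utxo", "eth": "account", "trx": "account"}


class BadRequest(Exception):
    """Stand-in for whatever the web layer maps to HTTP 400."""
    status_code = 400


def check_analysis_allowed(currency, include_analysis):
    """Return the schema type, rejecting analysis requests on account chains."""
    schema = SCHEMA_TYPE[currency.lower()]
    if include_analysis and schema != "utxo":
        raise BadRequest(
            f"fingerprinting analysis is not available for {currency}; "
            "retry with include_analysis=false"
        )
    return schema
```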

Worked examples

All hashes below are real, public chain data (ORBITAAL-tagged entities and a
public Kraken exchange address); none are from any live investigation. The two
BTC examples embed captured responses from the live backend; the ETH summary
is illustrative (see the note under that example).

Different actors, correctly separated (BTC)

Two txs from unrelated tagged entities, from the cross-entity benchmark:

1f8f3416e06984f5d4470dad4e637ab55caeb421dc6f04a7e75ede2c5f8779aa (HappyCoins.com, exchange)
ef5d192f3cc3e2a746671f3402707d5baacaa393408009544a7df05aef31ba98 (AlphaBayMarket, darknet market)

GET /btc/txs/compare?tx_hash=1f8f3416…&tx_hash=ef5d192f…
→ relation: unlinked, confidence 95, cluster_verdict: different

The input clusters differ, and the script_type (-80) and tx_version (-30)
discriminators contradict. Together with the witness_present mismatch (-20)
and the four small matches (+3, +4, +2, +3), the weighted score is -118 and
the verdict is unlinked.

Captured response (verdict + signals)

{
  "verdict": {
    "relation": "unlinked", "confidence": 95, "cluster_verdict": "different",
    "discriminator_hits": ["script_type", "tx_version"], "score_total": -118.0,
    "notes": ["Cluster splits these txs and discriminators contradict — strong evidence of separate actors."]
  },
  "signals": [
    {"name": "script_type",          "kind": "discriminator", "verdict": "mismatch",     "weight": -80, "per_tx": ["P2PKH,P2SH", "P2PKH"]},
    {"name": "witness_present",       "kind": "score",         "verdict": "mismatch",     "weight": -20, "per_tx": ["true", "false"]},
    {"name": "tx_version",            "kind": "discriminator", "verdict": "mismatch",     "weight": -30, "per_tx": ["v2", "v1"]},
    {"name": "rbf",                   "kind": "discriminator", "verdict": "match",        "weight": 3,   "per_tx": ["final", "final"]},
    {"name": "locktime_pattern",      "kind": "discriminator", "verdict": "match",        "weight": 4,   "per_tx": ["anti_sniping", "anti_sniping"]},
    {"name": "bip69_outputs_sorted",  "kind": "score",         "verdict": "match",        "weight": 2,   "per_tx": ["sorted", "sorted"]},
    {"name": "output_count_shape",    "kind": "score",         "verdict": "match",        "weight": 3,   "per_tx": ["pay_plus_change", "pay_plus_change"]},
    {"name": "shared_cluster",        "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": ["5963948", "92291698"]},
    {"name": "exchange_input_overlap","kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": ["exchange", "non_exchange"]},
    {"name": "direct_input_overlap",  "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]},
    {"name": "change_chain",          "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]},
    {"name": "common_ancestor",       "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]},
    {"name": "utxo_linkage",          "kind": "linkage",       "verdict": "mismatch",     "weight": 0,   "per_tx": [null, null]}
  ]
}

Same actor, correctly linked (BTC)

Two txs that both spend from the same input address
1EVvJ9uKhHPtWodk1QFvESmLUeB2RimzfA (cluster 34430663), so they are
genuinely the same actor:

121b74f14630ed865899f07bb7d53bb48510604d9a68cd7671edc54600570567
056d557b39ff137935c6051aa1807efd55188b549ee00e62c1483bc7fba231d3

GET /btc/txs/compare?tx_hash=121b74f1…&tx_hash=056d557b…
→ relation: linked, confidence 96, cluster_verdict: same

Same input cluster (shared_cluster and direct_input_overlap both match) and
every discriminator agrees, so the weighted score is +25 and the verdict is
linked.

Captured response (verdict + signals)

{
  "verdict": {
    "relation": "linked", "confidence": 96, "cluster_verdict": "same",
    "discriminator_hits": [], "score_total": 25.0,
    "notes": ["All compared txs share at least one input cluster."]
  },
  "signals": [
    {"name": "script_type",          "kind": "discriminator", "verdict": "match",    "weight": 5, "per_tx": ["P2PKH", "P2PKH"]},
    {"name": "witness_present",       "kind": "score",         "verdict": "match",    "weight": 3, "per_tx": ["false", "false"]},
    {"name": "tx_version",            "kind": "discriminator", "verdict": "match",    "weight": 5, "per_tx": ["v1", "v1"]},
    {"name": "rbf",                   "kind": "discriminator", "verdict": "match",    "weight": 3, "per_tx": ["final", "final"]},
    {"name": "locktime_pattern",      "kind": "discriminator", "verdict": "match",    "weight": 4, "per_tx": ["anti_sniping", "anti_sniping"]},
    {"name": "bip69_outputs_sorted",  "kind": "score",         "verdict": "match",    "weight": 2, "per_tx": ["sorted", "sorted"]},
    {"name": "output_count_shape",    "kind": "score",         "verdict": "match",    "weight": 3, "per_tx": ["pay_plus_change", "pay_plus_change"]},
    {"name": "shared_cluster",        "kind": "linkage",       "verdict": "match",    "weight": 0, "per_tx": ["34430663", "34430663"]},
    {"name": "exchange_input_overlap","kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": ["non_exchange", "non_exchange"]},
    {"name": "direct_input_overlap",  "kind": "linkage",       "verdict": "match",    "weight": 0, "per_tx": ["1EVvJ9uKhHPtWodk1QFvESmLUeB2RimzfA", "1EVvJ9uKhHPtWodk1QFvESmLUeB2RimzfA"]},
    {"name": "change_chain",          "kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": [null, null]},
    {"name": "common_ancestor",       "kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": [null, null]},
    {"name": "utxo_linkage",          "kind": "linkage",       "verdict": "mismatch", "weight": 0, "per_tx": [null, null]}
  ]
}

Multi-asset summary (ETH, summary-only)

Four ETH transactions, one of them a USDT token transfer, in summary-only mode
(account chains do not run the fingerprinting analysis):

GET /eth/txs/compare?include_analysis=false
  &tx_hash=e1fc854dfa3f02064085e81fa47d5cdba66df7be6f2d58d86388210f089f08c4   # USDT transfer (token)
  &tx_hash=66fbddbd450a81117de6b6372d685e6074556c3020d3da18d897d7eb1140efd6   # 1.11 ETH
  &tx_hash=b86c1ed6d7c6aab36b2037adc504f44e47c3dd19fcb048b6cc843913a165de88   # 17,810 ETH
  &tx_hash=5f32cdd351000604551c78b91a1b3ebc26aa2ea1d66f73ccb70d3cb5a717f71b   # contract call, 0 ETH

The USDT transfer adds 0 to total_value (it moves no native ETH), but its
fiat value still counts toward total_value_usd, keeping that figure comparable
across native and token activity; notes records the exclusion. UTXO-only
fields (total_inputs / total_outputs) and any unset native-unit fields
(total_fee when no rate was applied) are omitted on account chains.
tx_count counts every aggregated leg, so 4 input hashes become 5 legs (4
native bases + 1 USDT token transfer).

Summary response

{
  "summary": {
    "tx_count": 5,
    "currency": "eth",
    "total_value": 17811110000000000000000,
    "total_value_usd": 22365601.2,
    "block_min": 15954101,
    "block_max": 25142863,
    "timestamp_min": 1668258119,
    "timestamp_max": 1779357383,
    "notes": [
      "total_value covers native transfers only; 1 token transfer(s) excluded (their value is in total_value_usd)"
    ]
  }
}

Benchmarks

Same-cluster consistency (should be `linked`)

Spends sampled across known small clusters (50-200 addresses); every pair
should read as linked (2026-05-26, k=20 spends/cluster, 8 workers).

| metric | value |
| --- | --- |
| clusters evaluated | 18 (0 skipped) |
| spend-txs sampled | 246 |
| pairs compared | 2,142 |
| `linked` + `likely_linked` | 1,325 + 817 = 2,142 (100%) |
| `inconclusive` / `likely_unlinked` / `unlinked` | 0 / 0 / 0 |

Every evaluated same-cluster pair reads linked or likely_linked, with
zero false negatives across 2,142 pairs spanning 18 clusters. The minimum
verdict on any pair was likely_linked — never inconclusive.

Cross-entity separation (should be `unlinked`)

Pairs an address from entity A with one from entity B (A != B) — known
negatives (e.g. Binance vs Kraken), so anything more positive than
likely_unlinked is a false positive (2026-05-21; 100 pairs, 30 entities,
5 addrs/entity, seed 42, 0 errors).

| `verdict.relation` | value |
| --- | --- |
| `unlinked` | 82 (82%) |
| `likely_unlinked` | 18 (18%) |
| `likely_linked` (false positive) | 0 (0%) |

All 100 known-negative pairs land on the correct negative side, with zero
false-positive likely_linked.

Performance

Measured against a live instance (May 2026). Sweeps draw random tx hashes
from random blocks, group them into N-sized
hash-sets, and call /{currency}/txs/compare repeatedly. The four include_
tiers are reused across the same hash-sets so the tiers are comparable on
identical inputs.

| tier | flags |
| --- | --- |
| `summary` | analysis off, characteristics off (cheapest) |
| `characteristics` | analysis off, per-tx characteristics on |
| `full` | full fingerprinting analysis + signals (default) |
| `full_details` | `full` plus embedded per-tx details |

BTC — `summary` and `characteristics` tiers (`median / mean / p95` ms, 100 calls per cell)

| tier | N=2 | N=5 | N=10 | N=20 | N=50 | N=100 |
| --- | --- | --- | --- | --- | --- | --- |
| `summary` | 18 / 18 / 24 | 20 / 20 / 25 | 27 / 27 / 34 | 33 / 34 / 45 | 45 / 54 / 93 | 66 / 85 / 241 |
| `characteristics` | 46 / 50 / 77 | 54 / 59 / 82 | 66 / 73 / 103 | 91 / 105 / 175 | 158 / 248 / 883 | 256 / 379 / 882 |

summary and characteristics scale near-linearly with N up to N=100. The
full N-sweep ran end-to-end with 0 errors.

BTC — `full` tier, before and after the cliff fix (`median / mean / p95 / max` ms)

The original full-tier implementation paid get_best_cluster_tag per input
address, which scales with cluster size: any compared tx whose input belonged
to a huge cluster (e.g. a major exchange, ~23M addresses) added ~2 s. Latency
was strongly bimodal rather than long-tailed: each compare call landed in
either a ~150 ms fast path or a ~2.5-3.5 s cliff path. This PR replaces the
expensive ranked best-cluster-tag digest with a cheap correlated EXISTS
query (get_clusters_with_concept → which_clusters_have_concept,
src/graphsenselib/tagstore/db/queries.py), making the cost independent of
per-cluster tag count. See cliff fix below.
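To illustrate why an existence check is cheaper than ranking, here is a self-contained sketch using SQLite from the standard library. It is not the tagstore query: the `cluster_tags` table, its columns, and the function name are invented for the example, and a `VALUES` list stands in for the Postgres `unnest(cluster_ids)` mentioned in the commit message. The idea it shows is the same: one correlated `EXISTS` per candidate cluster, which can stop at the first matching tag.

```python
import sqlite3


def clusters_with_concept(conn, cluster_ids, concept):
    """Return the subset of cluster_ids that have at least one tag with `concept`."""
    ids = list(dict.fromkeys(cluster_ids))  # dedup, keep order
    if not ids:
        return set()
    values = ", ".join("(?)" for _ in ids)
    sql = f"""
        WITH wanted(cluster_id) AS (VALUES {values})
        SELECT w.cluster_id
        FROM wanted AS w
        WHERE EXISTS (
            SELECT 1 FROM cluster_tags AS t
            WHERE t.cluster_id = w.cluster_id AND t.concept = ?
        )
    """
    return {row[0] for row in conn.execute(sql, [*ids, concept])}


# Tiny in-memory fixture with made-up tags.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cluster_tags (cluster_id INTEGER, concept TEXT)")
conn.executemany(
    "INSERT INTO cluster_tags VALUES (?, ?)",
    [(1, "exchange"), (1, "exchange"), (2, "market")],
)
exchange_clusters = clusters_with_concept(conn, [1, 2, 3, 1], "exchange")
```

The caller only needs a boolean per cluster, so nothing is ranked or hydrated; the result is just the set of matching IDs.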

Same sweep shape (30 distinct hash-sets × 3 calls per N, fresh random seed,
540 calls total, 0 errors):

| N | before | after | median speedup |
| --- | --- | --- | --- |
| 2 | 141 / 357 / 2228 / 2446 | 96 / 102 / 157 / 184 | 1.5× |
| 5 | 182 / 957 / 2859 / 3008 | 114 / 118 / 153 / 187 | 1.6× |
| 10 | 3361 / 2238 / 3554 / 3678 | 131 / 154 / 280 / 363 | 26× |
| 20 | n/a | 158 / 181 / 279 / 387 | n/a |
| 50 | n/a | 261 / 316 / 560 / 1275 | n/a |
| 100 | n/a | 458 / 465 / 697 / 857 | n/a |

(The max values in the after column are the maximum over the entire 90-call
distribution for that N.) full_details ≈ full (adds <1%, since per-tx details
are already-fetched data attached at the response layer).
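The median speedup column is simply the before-median divided by the after-median. A short check using the medians from the table:

```python
# Median latencies in ms, copied from the before/after table (N = 2, 5, 10).
before_median = {2: 141, 5: 182, 10: 3361}
after_median = {2: 96, 5: 114, 10: 131}

speedup = {n: before_median[n] / after_median[n] for n in before_median}
for n, factor in speedup.items():
    print(f"N={n}: {factor:.1f}x")
```

This gives roughly 1.5×, 1.6×, and 25.7× (reported as 26× in the table).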

The cliff fix — bimodal distribution gone

Per-set medians, threshold 1000 ms ("slow"):

| | N=2 | N=5 | N=10 | N=20 | N=50 | N=100 |
| --- | --- | --- | --- | --- | --- | --- |
| before, slow / total sets | 3 / 30 | 9 / 30 | 19 / 30 | n/a | n/a | n/a |
| after, slow / total sets | 0 / 30 | 0 / 30 | 0 / 30 | 0 / 30 | 1 / 30 | 0 / 30 |

The bimodal distribution is gone. The one slow N=50 set in the after-run
peaked at 1.3 s (vs the old cliff path at ~2.5-3.5 s) and is consistent with
ordinary tail noise, not a cluster-size cliff. p95 and max now sit close to
the median across all N: the latency distribution is unimodal.
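The "slow set" classification used for this table can be sketched as follows, assuming each hash-set contributes its per-call latencies and a set counts as slow when its median is at least 1000 ms. The function name and the sample latencies are made up for illustration and are not taken from the benchmark script.

```python
from statistics import median

SLOW_MS = 1000  # threshold on the per-set median, in milliseconds


def count_slow_sets(latencies_by_set, threshold_ms=SLOW_MS):
    """latencies_by_set holds one list of per-call latencies (ms) per hash-set."""
    return sum(1 for calls in latencies_by_set if median(calls) >= threshold_ms)


# Illustrative numbers only: a fast set, a cliff set, and a borderline set.
sample = [[150, 160, 140], [2500, 3100, 150], [900, 1000, 1200]]
slow = count_slow_sets(sample)
```

Using the per-set median rather than the per-call maximum keeps a single noisy call from marking an otherwise fast set as slow.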

Summary tier across chains (`median / mean / p95` ms, 0 errors)

| N | BTC (100 calls/cell) | ETH (25 calls/cell) | TRX (9 calls/cell) |
| --- | --- | --- | --- |
| 2 | 18 / 18 / 24 | 73 / 81 / 136 | 35 / 35 / 39 |
| 5 | 20 / 20 / 25 | 95 / 97 / 112 | 52 / 52 / 59 |
| 10 | 27 / 27 / 34 | 109 / 120 / 169 | 59 / 59 / 69 |
| 20 | 33 / 34 / 45 | 157 / 160 / 203 | 68 / 67 / 74 |
| 50 | 45 / 54 / 93 | 240 / 241 / 303 | n/a |
| 100 | 66 / 85 / 241 | n/a | n/a |

BTC < TRX < ETH at the summary tier. By median, ETH is roughly 2× slower than
TRX and ~4-5× slower than BTC. The overhead is the account-chain trace fetch
and multi-asset USD aggregation, not fingerprinting itself (analysis is off
here).

Notes for reviewers

Two model sets are updated in lockstep (internal db/.../models and API
web/models/compare.py) plus the translator, per the project's dual-model
convention.
The OpenAPI post-processing in web/app.py is extended with a paths reorder
for /txs/compare; the snake_case/union/example post-processing is preserved.
The Python client must be regenerated (make run-codegen) and committed
alongside this change for the codegen pre-commit hook to pass. (The generator
now runs as the host user via --user, so it no longer leaves root-owned
files.)
Currency-aware summary. build_summary and ComparisonSummary were
generalized beyond BTC: total_output_sat became total_value (native,
native-transfers only); total_value_usd (USD across all transfers incl.
tokens) and total_fee were added; total_inputs / total_outputs are now
optional (UTXO-only); and a notes list flags caveats. A missing or
cross-chain tx hash aborts with 404 (no partial summary). This is what enables
summary-only support for account chains.

Plumb the new fields added by raw_utxo schema migration 2->3 through the async service layer. TxValue gains per-input sequence; TxUtxo gains transaction-level version and lock_time. Read sites in io_from_rows and std_tx_from_row pull them off the row with safe getattr/get fallbacks so older keyspaces still load. Internal-only - no REST model changes; FastAPI response_model filtering keeps /txs/{tx_hash} unchanged.

Implemented signals, lineage, tests, tuned weights. Comparison is now optional, so requesting only the summary is possible.

Extend /txs/compare to handle account chains (ETH/TRX) in summary-only mode (include_analysis=false). The fingerprinting analysis stays UTXO-only; account currencies are rejected only when analysis is requested.

Reshape ComparisonSummary to be currency- and asset-aware:
- total_value: native base unit (satoshi/wei); sums native transfers only, since token transfers carry no native-unit amount
- total_value_usd: USD fiat summed across all transfers (incl. tokens), so it is comparable across assets; partial-summed and flagged when some txs lack a rate
- total_fee: native unit (gas is always native)
- total_inputs/total_outputs: now optional, null for account chains
- notes: flags excluded token transfers and partial/unavailable USD totals

Renames total_output_sat -> total_value.

A missing tx hash (including a hash from another chain, absent from the queried keyspace) aborts the whole comparison with TransactionNotFoundException rather than yielding a partial summary.

Regenerates the Python client for the new ComparisonSummary schema.

A repeated hash was fetched twice, double-counted in the summary, and trivially compared as linked to itself. Dedup the hash list (order-preserving) up front and require 2+ distinct hashes, returning 400 otherwise.
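An order-preserving dedup of this kind can be sketched in a few lines. The function name is hypothetical, raising `ValueError` stands in for the 400 response, and applying the upper bound of 100 after deduplication is an assumption.

```python
def distinct_hashes(tx_hashes, minimum=2, maximum=100):
    """Drop repeated hashes (first occurrence wins) and enforce the size bounds."""
    unique = list(dict.fromkeys(tx_hashes))  # dicts preserve insertion order
    if not minimum <= len(unique) <= maximum:
        raise ValueError(
            f"need {minimum}..{maximum} distinct tx hashes, got {len(unique)}"
        )
    return unique
```

With this in front of the fetch, a request such as `["aa", "bb", "aa"]` is compared as two transactions, and `["aa", "aa"]` is rejected instead of being reported as linked to itself.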

The account summary fetched only the base/native tx per hash (get_tx returns the first trace), so token transfers (e.g. USDT) never reached build_summary: their USD was missing from total_value_usd and the token-exclusion note never fired. For account chains in summary-only mode, fetch the full asset-flow set per hash (get_asset_flows_within_tx: base + token legs) so build_summary folds token USD into total_value_usd and flags the excluded token transfers. Adds a regression test that feeds the orchestration a token leg and asserts the token USD is summed and the note is emitted.

get_asset_flows_within_tx(include_internal_txs=False) synthesized the base leg from trace[0]. ETH writes a synthetic outermost trace for every tx so this worked, but TRX only emits trace rows for internal contract calls -- plain native transfers and most TRC20 transfers have none. The compare endpoint therefore 404'd on those with "Found no traces in tx X:None", which made any multi-asset TRX summary that mixed a native or WTRX leg unusable. Delegate the base-leg lookup to get_tx, which already encodes the ETH-vs-TRX split (trace[0] for ETH, transaction row for TRX). Switch get_tx's branch from `currency == "eth"` to `currency_to_schema_type[currency] == "account"` so the rule is tied to a config invariant rather than a literal name. Adds a regression test that wires db.fetch_transaction_trace to raise if hit and asserts the TRX base leg comes back from the tx row.

The fingerprinting analysis was advertised as UTXO-only, but several signals (wasabi/whirlpool/joinmarket coinjoin detection, exchange-input overlap, change-address heuristics) are tuned to BTC. BCH/LTC have only the change heuristics wired up; ZEC has none. Running the analysis on those chains passed the gate but produced degraded, partially-empty fingerprints rather than honest "not supported" feedback. Reject include_analysis=true for any non-BTC currency with a 400. The summary-only mode (include_analysis=false) continues to work for every supported chain. Updates the endpoint description, the 400 description, and the docstring to match. Expands the existing eth/trx rejection tests to also cover bch/ltc/zec. Regenerates the Python client to pick up the new endpoint description.

…uery

_fetch_input_address_exchange_flags paid get_best_cluster_tag per unique input cluster (through get_tag_summaries_by_subject_ids with include_best_cluster_tag=true). The underlying tagstore query ranks all cluster-definer tags by confidence and hydrates the winner -- cost scales with per-cluster tag count, so any tx with an input in a huge cluster (e.g. a major exchange, ~23M addresses) added ~2 s. Latency on the full tier was strongly bimodal: each compare call landed in either a ~150 ms fast path or a ~2.5-3.5 s cliff path depending on whether any input belonged to a heavy cluster.

The exchange flag is consumed only as a boolean (broad_category == "exchange") to drive the signal_exchange_input_overlap demoting qualifier; we never use the ranked best tag itself.

Replace the digest path with a focused existence check:
- _get_clusters_with_concept_stmt: correlated EXISTS subquery driven by unnest(cluster_ids) so Postgres short-circuits at the first matching tag per cluster. Cost is bounded by len(cluster_ids), not by per-cluster tag count.
- Tagstore.get_clusters_with_concept: thin wrapper returning the subset of input cluster_ids that match.
- TagsService.which_clusters_have_concept: service-layer wrapper.
- _fetch_input_address_exchange_flags now consumes addr_to_cluster from the caller instead of re-resolving clusters, and uses the new path.

compare_txs chains the exchange-flag fetch after the parallel cluster/parent-ref resolution; the lost parallelism costs tens of ms vs the seconds saved.

Semantic shift: the previous check fired only when the weighted-most-common concept across an address's tags was "exchange"; the new one fires when any cluster-definer tag on the cluster carries the concept. More inclusive, which is the right direction for the demoting qualifier: if there is meaningful evidence the cluster is an exchange, the shared-cluster linkage evidence should be weakened.

Measured impact (scripts/fingerprint_perf.py, 30 hash-sets x 3 calls per N, tier=full, fresh seed):

| N | before (median / p95 ms) | after (median / p95 ms) |
| --- | --- | --- |
| 2 | 141 / 2228 | 96 / 157 |
| 5 | 182 / 2859 | 114 / 153 |
| 10 | 3361 / 3554 | 131 / 280 (~26x at median) |
| 20 | (not in prior sweep) | 158 / 279 |
| 50 | (not in prior sweep) | 261 / 560 |
| 100 | (not in prior sweep) | 458 / 697 |

Per-set bimodal split (set median >= 1 s): before 3/9/19 of 30 sets at N=2/5/10; after 0 of 30 at every tested N (one outlier at N=50 peaked at 1.27 s, consistent with ordinary tail noise rather than a cluster-size cliff). The latency distribution is now unimodal and the cliff is gone.

Adds 5 unit tests in tests/db/test_comparison_service.py covering: no tags_service, no addresses, exchange/non-exchange mix, unresolved clusters (-1), cluster-id dedup.

soad003

Here are some high-level notes:

- The summary fiat currency should be configurable.
- Maybe consider using a list of features to activate rather than a bool per feature (to be consistent with tx_heuristics).
- Scope: since we talked about introducing a details view for multi-select, I would consider extending the scope of the endpoint, e.g. moving it from the /txs/ path to something more generic like /{network}/subgraph/ (not sure), so that in the longer run we can push subgraphs, or just a list of txs and addresses, to the endpoint and get a summary to display in the UI, including the tx compare signals. Let's discuss next week. New suggestion for naming: /{network}/graph_summary

Swap the four boolean query params (include_details, include_characteristics, include_signals, include_analysis) on GET /txs/compare for a single include list, mirroring the include_heuristics pattern: include=characteristics|details|signals|lineage|verdict, or include=all. Defaults to characteristics, signals, lineage and verdict (details excluded). Signals, lineage and verdict are always computed internally (the verdict depends on the signals); the list only controls what is returned. Drop the summary-only mode: the summary and include_analysis are removed from the compare response, so compare is now analysis-only and therefore BTC-only. Chain-agnostic aggregate stats move to a forthcoming POST /{currency}/subgraph/summary endpoint. Regenerate the python client.
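The include semantics in this commit message can be sketched as a small resolver. The names below are hypothetical and the handling of unknown values (raising an error) is an assumption; the allowed values, the `all` shorthand, and the default set come from the message.

```python
ALLOWED = ("characteristics", "details", "signals", "lineage", "verdict")
DEFAULT = ("characteristics", "signals", "lineage", "verdict")  # details excluded


def resolve_include(include=None):
    """Map the requested include values to the set of response sections."""
    if not include:
        return set(DEFAULT)
    if "all" in include:
        return set(ALLOWED)
    unknown = set(include) - set(ALLOWED)
    if unknown:
        raise ValueError(f"unknown include value(s): {sorted(unknown)}")
    return set(include)
```

Since the signals, lineage, and verdict are always computed, a resolver like this would only decide which sections are serialized into the response.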

Add a chain-agnostic aggregate-stats endpoint over a set of transactions, relocating the summary that /txs/compare used to return. The POST body { txs, addresses } defines the subgraph; the node set must hold 2-100 distinct nodes. addresses is reserved for a future extension and rejected (400) for now, so the field name is locked in the API contract. Unlike /txs/compare the summary works for every supported chain: UTXO header aggregation, and account asset-flow USD folding (token legs included). Move build_summary (+ _usd_fiat) out of comparison_service into the new subgraph DB service; rename ComparisonSummaryInternal -> SubgraphSummaryInternal and drop the now-orphaned ComparisonSummary API model. Share tx-model test builders via tests/db/helpers.py. Regenerate the python client (new SubgraphApi).

Add a fiat_currency parameter (Literal["usd","eur"], default usd) to the POST /{currency}/subgraph/summary request body. Rename the response field total_value_usd -> total_value_fiat and add a fiat_currency field echoing which currency the total is in, so the name no longer hard-codes USD. build_summary/_fiat now look the rate up by the requested code and word the notes accordingly; the value is summed from each tx's fiat_values (usd/eur are the only rates GraphSense stores). Regenerate the python client

Restructure the subgraph/summary response from a flat tx summary into an envelope { currency, txs, addresses }. The tx aggregates move under a new SubgraphTxSummary block (currency lifted to the top level); addresses is a reserved, null-until-implemented slot so adding per-address stats later is not a breaking change. Regenerate the python client.
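Taken together, the three commit messages above describe a request body of the form `{ txs, addresses, fiat_currency }`. A minimal validation sketch under those descriptions follows; the function name is hypothetical, `ValueError` stands in for the 400 response, and rejecting only a non-empty `addresses` value is an assumption.

```python
def validate_summary_request(body):
    """Check a subgraph-summary request body and return its normalized form."""
    if body.get("addresses"):
        # Reserved for a future extension, so it is refused for now.
        raise ValueError("addresses is reserved and not accepted yet")
    txs = list(dict.fromkeys(body.get("txs") or []))  # distinct, order kept
    if not 2 <= len(txs) <= 100:
        raise ValueError("the node set must hold 2-100 distinct nodes")
    fiat = body.get("fiat_currency", "usd")
    if fiat not in ("usd", "eur"):
        raise ValueError("fiat_currency must be 'usd' or 'eur'")
    return {"txs": txs, "fiat_currency": fiat}


# Made-up hashes; the repeated "aa" collapses to one node.
request = validate_summary_request({"txs": ["aa", "bb", "aa"], "fiat_currency": "eur"})
```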

frederik-raphael · 2026-06-08T11:27:38Z

Quick overview of the changes:

Summary is now under the new subgraph endpoint (POST /{currency}/subgraph/summary)
Summary currency is now configurable
Extended the summary for future address summarization
Fingerprinting features are now a list

frederik-raphael added 11 commits May 26, 2026 12:38

adding has_witness boolean for fingerprinting

6728e40

feat(web): add /txs/compare transaction comparison API

e9d1fb5

chore(client): regenerate python client for /txs/compare

65bde80

fix(web): dedup tx hashes in /txs/compare before aggregating

89a2231

fix: minor improvements from code review

fc3bfe3

frederik-raphael requested review from Tommel71 and soad003 May 26, 2026 12:02

frederik-raphael changed the title ~~# /txs/compare: multi-currency transaction summaries & wallet fingerprinting~~ /txs/compare: multi-currency transaction summaries & wallet fingerprinting May 26, 2026

soad003 requested changes May 29, 2026

frederik-raphael added 4 commits June 8, 2026 10:58

frederik-raphael wants to merge 15 commits into develop from feature/fingerprinting


2 participants
