feat: providers result limits for JSON and NDJSON#149

Open
lidel wants to merge 1 commit into main from fix/json-records-limit-and-overfetch

Member

@lidel lidel commented May 14, 2026

Someguy supports JSON (buffered) and NDJSON (streaming) response types.

Problem

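JSON delegated routing responses under-delivered: if I understand correctly, boxo's DefaultRecordsLimit of 20 acted as a pre-filter (the limit was applied before filtering), and cacheFallbackIter then silently dropped records without addresses, so fewer than 20 records reached the caller.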

Anecdotally (cc @aschmahmann), production at delegated-ipfs.dev returned ~4 providers on JSON (when opened in a browser) vs ~17 on NDJSON for the same content path (/ipns/ipfs.tech).

As a side problem, the NDJSON stream had no cap, which is a bad default.

Fix (This PR)

  • raise the JSON cap to 100 and add a streaming cap of 1000, matching HTTP Routing v1 section 4.1.5; expose both as --records-limit and --streaming-records-limit with SOMEGUY_* env vars; reject negative input, 0 disables the cap
  • over-fetch from the underlying router by 3x (cap 3000) so that, after cacheFallbackIter drops addr-less records, the surfaced count is close to the caller's limit; cap the final stream via a new limitedIter wrapper
    • not a fan of over-fetching, but these are mostly theoretical caps: most CIDs will not hit 1k any time soon, and if one ever does, that should be more than enough for retrieval (and a good problem to have)
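
For reviewers who want the two mechanisms above in one place, here is a minimal sketch of the over-fetch clamp and the capping wrapper. It is not the code in this PR: the `Iter` interface, `sliceIter`, `overfetchMultiplier`, `overfetchMax`, and the bodies of `overfetchLimit` and `limitedIter` are assumptions written for illustration, and only the 3x multiplier, the 3000 cap, and the "0 disables the cap" rule come from the description.

```go
package main

import "fmt"

// Values taken from the PR description; the constant names are hypothetical.
const (
	overfetchMultiplier = 3
	overfetchMax        = 3000
)

// overfetchLimit returns the limit to request from the underlying router.
// Assumption: 0 means "no cap" and is passed through unchanged, and the
// result is clamped to overfetchMax even when the caller asked for more.
func overfetchLimit(limit int) int {
	if limit <= 0 {
		return 0
	}
	// Compare before multiplying so a huge limit cannot overflow.
	if limit > overfetchMax/overfetchMultiplier {
		return overfetchMax
	}
	return limit * overfetchMultiplier
}

// Iter is a hypothetical pull-style iterator, defined here only to keep
// the sketch self-contained.
type Iter[T any] interface {
	Next() bool
	Val() T
}

// sliceIter is a stand-in for the filtered upstream iterator.
type sliceIter[T any] struct {
	items []T
	pos   int
}

func (s *sliceIter[T]) Next() bool {
	if s.pos >= len(s.items) {
		return false
	}
	s.pos++
	return true
}

// Val must only be called after Next has returned true.
func (s *sliceIter[T]) Val() T { return s.items[s.pos-1] }

// limitedIter stops after limit values; a limit of 0 disables the cap.
type limitedIter[T any] struct {
	inner Iter[T]
	limit int
	seen  int
}

func (l *limitedIter[T]) Next() bool {
	if l.limit > 0 && l.seen >= l.limit {
		return false
	}
	if !l.inner.Next() {
		return false
	}
	l.seen++
	return true
}

func (l *limitedIter[T]) Val() T { return l.inner.Val() }

func main() {
	src := &sliceIter[string]{items: []string{"a", "b", "c", "d"}}
	capped := &limitedIter[string]{inner: src, limit: 2}
	for capped.Next() {
		fmt.Println(capped.Val())
	}
	fmt.Println(overfetchLimit(100), overfetchLimit(1000), overfetchLimit(0))
}
```

With the defaults from this PR, the sketch requests 300 records for the JSON path and 3000 for the streaming path, then trims the filtered result back to the caller's limit.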

@lidel lidel requested a review from a team May 14, 2026 22:44
Comment thread server_cached_router.go
Comment on lines +57 to +61
// cacheFallbackOverfetchMax caps the over-fetched limit so a large
// caller-side limit cannot blow up the DHT walk. Sized at 3x
// DefaultStreamingRecordsLimit so the multiplier applies on the
// streaming path too. The routing timeout bounds wall-clock.
cacheFallbackOverfetchMax = 3000
Contributor


What is meant by blow up the DHT walk?

IIRC the limit passed to FindProviders() is just an upper-bound allowing the DHT client to terminate the DHT walk early if it has found enough providers. The DHT client will not do extra work if it didn't find enough providers, it will simply stop and return after it has queried the 20 closest peers.
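
To make the point concrete, here is a toy model of the behavior described above. It is not the DHT client's code: `walk` and its inputs are hypothetical, and it assumes the walk visits a fixed list of closest peers and checks the limit after each reply.

```go
package main

import "fmt"

// walk models a provider lookup over a fixed list of peers.
// providersPerPeer[i] is how many providers peer i returns.
// The limit only allows an early exit (0 means no limit); it never
// adds peers to the walk.
func walk(providersPerPeer []int, limit int) (found, queried int) {
	for _, n := range providersPerPeer {
		queried++
		found += n
		if limit > 0 && found >= limit {
			found = limit // surface at most limit providers
			break
		}
	}
	return found, queried
}

func main() {
	// 20 closest peers, each knowing one provider (illustrative numbers).
	peers := make([]int, 20)
	for i := range peers {
		peers[i] = 1
	}
	for _, limit := range []int{5, 300, 3000} {
		found, queried := walk(peers, limit)
		fmt.Printf("limit=%d found=%d queried=%d\n", limit, found, queried)
	}
}
```

In this model a limit of 300 and a limit of 3000 query the same 20 peers, so raising the limit adds no work once it exceeds the number of providers available.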
