Skip to content

IVF: add IDFilter support and set_intra_query_threads in cpp runtime#322

Open
ibhati wants to merge 7 commits intomainfrom
ib/ivf_intra_filter
Open

IVF: add IDFilter support and set_intra_query_threads in cpp runtime#322
ibhati wants to merge 7 commits intomainfrom
ib/ivf_intra_filter

Conversation

@ibhati
Copy link
Copy Markdown
Member

@ibhati ibhati commented Apr 28, 2026

Brings two FAISS-required runtime features to the IVF C++ bindings:

IDFilter-based search to support SearchParameters::sel selectors in IndexSVSIVF.
The ability to change the intra-query thread count after train() / load() (mirrors Vamana).

…tch heuristic

- bindings/cpp ivf_index/dynamic_ivf_index: SearchParams gains
  filter_stop and filter_estimate_batch; search() accepts IDFilter*;
  filter loop mirrors PR #309 adaptive batch heuristic.
- Manager classes expose get/set_num_intra_query_threads.
- IVFIndex / DynamicIVFIndex: set_threadpool() also rebuilds intra-query
  thread pools; add set_num_intra_query_threads().
- Tests: 4 new cases mirroring Vamana PR#309 (restrictive filter +
  filter_stop early-exit, static & dynamic).
ibhati added 5 commits May 4, 2026 14:32
- Use std::span for batch_iterator (public API signature)
- Remove calls to set/get_num_intra_query_threads which don't exist in v0.3.0
- Store intra_query_threads locally; value used at construction time
- Update to nightly build (2026-05-06) which includes IVF intra-query threading API
- Restore calls to set/get_num_intra_query_threads on impl_
- Remove fallback logic since nightly build has these methods
Comment thread bindings/cpp/CMakeLists.txt
}

///// Intra-query (cluster-level) parallelism
[[nodiscard]] size_t get_num_intra_query_threads() const override {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious on why [[nodiscard]] is used here and not elsewhere - I am not particularly familiar with what nodiscard does

std::tie(sampled, sample_hits) = sample_filter_hits(
*filter,
max_batch_size,
[this](size_t id) { return impl_->has_id(id); },
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is nearly identical diff to the non-dynamic except this line - why is it different here?

Side note - are these classes something that we should look into having inheritance for?

Comment on lines +260 to +262
intra_query_threads_ = intra_query_threads;
// Note: impl_ uses the value set at construction time,
// changing it here requires rebuilding/reloading the index
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems not ideal - should it call impl_->set_num_intra_query_threads(intra_query_threads) ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth testing case where filter_estimate_batch=false?
And what about comparison of results for runs with different number of intra_query_threads?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants