feat: add asynchronous support to LLMRanker with run_async method #11841
Open
GovindhKishore wants to merge 2 commits into
Related Issues
Proposed Changes:
Added native asynchronous support (`run_async`) to `LLMRanker`. This lets reranking pipelines run concurrently inside asynchronous environments (such as FastMCP or FastAPI) without stalling the main event loop during LLM network requests.
How it works:
- `run_async` checks `hasattr(self._chat_generator, "run_async")`. If the chat generator only supports synchronous execution, the call falls back to a worker thread via `asyncio.to_thread` instead of blocking the event loop.
- Same behavior as `run`: deduplication, the empty-query/empty-document guards, and the ranking/parsing post-processing (`_get_reply_text`, `_rank_documents_from_reply`) are unchanged and shared between the sync and async paths. Only the chat generation call itself differs.
- Error handling follows the existing flag (`raise_on_failure`): when ranking or generation fails, the original document order is returned if the flag is `False`, and the exception is re-raised if it is `True`.
How did you test it?
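For context on what the tests exercise, here is a minimal, self-contained sketch of the two dispatch paths and the failure fallback described under "How it works". It does not import Haystack and is not the code in this PR: the classes `SyncOnlyGenerator`, `AsyncCapableGenerator`, and `FailingGenerator`, the functions `generate` and `rerank`, and the reversed-order "ranking" are hypothetical stand-ins chosen only to illustrate the pattern.

```python
import asyncio
from typing import Any


class SyncOnlyGenerator:
    """Hypothetical stand-in for a chat generator that only implements run()."""

    def run(self, messages: list[str]) -> dict[str, Any]:
        return {"replies": [f"sync:{len(messages)}"]}


class AsyncCapableGenerator(SyncOnlyGenerator):
    """Hypothetical stand-in that also implements run_async()."""

    async def run_async(self, messages: list[str]) -> dict[str, Any]:
        return {"replies": [f"async:{len(messages)}"]}


class FailingGenerator:
    """Hypothetical stand-in whose generation call always fails."""

    def run(self, messages: list[str]) -> dict[str, Any]:
        raise RuntimeError("generation failed")


async def generate(chat_generator: Any, messages: list[str]) -> dict[str, Any]:
    """Prefer the native coroutine; otherwise run the sync call in a worker thread."""
    if hasattr(chat_generator, "run_async"):
        return await chat_generator.run_async(messages=messages)
    # asyncio.to_thread keeps the blocking call off the event loop thread.
    return await asyncio.to_thread(chat_generator.run, messages=messages)


async def rerank(
    chat_generator: Any, documents: list[str], raise_on_failure: bool = False
) -> list[str]:
    """Return reranked documents, or the original order if generation fails."""
    try:
        await generate(chat_generator, messages=documents)
    except Exception:
        if raise_on_failure:
            raise
        return list(documents)  # fall back to the original order
    # Placeholder for real reply parsing: reverse the order so that a
    # successful "ranking" is distinguishable from the fallback.
    return list(reversed(documents))


async def main() -> None:
    docs = ["d1", "d2", "d3"]
    print(await generate(AsyncCapableGenerator(), docs))
    print(await generate(SyncOnlyGenerator(), docs))
    print(await rerank(FailingGenerator(), docs))


if __name__ == "__main__":
    asyncio.run(main())
```

The `hasattr` check happens per call rather than at construction time, so the same function works with either kind of generator. Whether the actual component resolves this once or on every call is not stated here.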
Added an async test suite (`test_run_async_*`) mirroring the existing sync test coverage in `test_llm_ranker.py`. The tests cover:
- `run_async` is used when the chat generator supports it.
- `run_async` is not used (with `asyncio.to_thread` calling sync `run` instead) when the generator lacks `run_async`.
Ran `hatch run test:unit test/components/rankers/test_llm_ranker.py` (all tests pass), `hatch run ruff format`, and `hatch run ruff check --fix`.
Notes for the reviewer
The implementation of `run_async` mirrors the structure of the synchronous `run` method to keep the component easy to maintain.
Checklist
feat: add run_async support to LLMRanker