Problem Description
Summary
This issue tracks two additions: reranker support for the DashScope and Xinference platforms, and native reasoning support, including streaming, for DashScope's Qwen models.
Features
1. Document Reranking Support
- DashScope Reranker: Implement a reranker for the DashScope platform to improve document relevance scoring
- Xinference Reranker: Implement a reranker for the Xinference platform, aimed at local/enterprise deployments
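As a rough illustration of what a shared reranker interface could look like, here is a minimal sketch. All names (`Document`, `Reranker`, `ScoringReranker`, `word_overlap`) are hypothetical and not taken from the codebase, and the toy scoring function stands in for the actual DashScope or Xinference rerank call, which is not shown.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Document:
    # Hypothetical document type; the project's real class may differ.
    content: str
    reranking_score: Optional[float] = None


class Reranker(ABC):
    """Hypothetical base interface: score documents against a query and reorder them."""

    @abstractmethod
    def rerank(self, query: str, documents: List[Document]) -> List[Document]:
        ...


class ScoringReranker(Reranker):
    """Reranker driven by an injected scoring function.

    A platform-specific reranker would replace `score_fn` with a call to
    the provider's rerank endpoint (assumption: not shown here).
    """

    def __init__(self, score_fn: Callable[[str, str], float], top_n: Optional[int] = None):
        self.score_fn = score_fn
        self.top_n = top_n

    def rerank(self, query: str, documents: List[Document]) -> List[Document]:
        # Score copies so the caller's documents are left untouched.
        scored = [Document(d.content, self.score_fn(query, d.content)) for d in documents]
        scored.sort(key=lambda d: d.reranking_score, reverse=True)
        return scored[: self.top_n] if self.top_n is not None else scored


def word_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0
```

With this shape, a DashScope or Xinference implementation would only need to supply its own scoring step; the sorting and `top_n` truncation stay the same.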
2. DashScope Native Reasoning Models
- Add support for Qwen models with native reasoning capabilities
- Implement streaming support for reasoning models
- Stream reasoning output as it is generated, so users can follow how the model reaches its answer
3. Reasoning Manager Updates
- Update reasoning manager to auto-detect DashScope reasoning models
- Improve model detection logic so that requests are routed to the matching reasoning path
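One way the auto-detection could work is sketched below. This is only an assumption about the approach: `ModelInfo`, `is_dashscope_reasoning_model`, the `enable_thinking` field, and the name hints are all hypothetical, and the actual list of DashScope reasoning models is not specified in this issue.

```python
from dataclasses import dataclass

# Assumed name fragments, for illustration only; the real set of
# reasoning-capable model IDs is not given in this issue.
REASONING_NAME_HINTS = ("qwq", "thinking")


@dataclass
class ModelInfo:
    # Hypothetical description of a configured model.
    provider: str
    model_id: str
    enable_thinking: bool = False  # assumed opt-in flag for native reasoning


def is_dashscope_reasoning_model(model: ModelInfo) -> bool:
    """Return True if the model looks like a DashScope native reasoning model."""
    if model.provider.lower() != "dashscope":
        return False
    if model.enable_thinking:
        return True
    model_id = model.model_id.lower()
    return any(hint in model_id for hint in REASONING_NAME_HINTS)
```

A reasoning manager could call a check like this once per request and fall back to its default reasoning path when it returns `False`.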
Motivation
- Reranking: Improves retrieval quality by re-scoring documents based on query relevance
- Native Reasoning: Qwen models with built-in reasoning provide better transparency and explainability
- Streaming: Real-time reasoning output improves user experience and allows progressive results
Components Affected
- Rerankers (DashScope, Xinference)
- LLM Models (DashScope reasoning models)
- Reasoning Manager
- Streaming infrastructure
Proposed Solution
- Reranker implementations follow the standard reranker interface
- DashScope reasoning models integrate with existing reasoning infrastructure
- Streaming support uses async generators for efficient token delivery
- Auto-detection reduces manual configuration overhead
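The async-generator approach to streaming can be sketched as follows. This is a minimal illustration, not the proposed implementation: `StreamChunk`, `fake_model_stream`, `stream_events`, and `collect` are hypothetical names, and the fake stream stands in for the provider's streaming response.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional, Tuple


@dataclass
class StreamChunk:
    # Hypothetical chunk shape: a chunk carries reasoning text, answer text, or both.
    reasoning: Optional[str] = None
    content: Optional[str] = None


async def fake_model_stream() -> AsyncIterator[StreamChunk]:
    """Stand-in for a provider stream: reasoning tokens first, then the answer."""
    for piece in ("Compare ", "both scores. "):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield StreamChunk(reasoning=piece)
    for piece in ("Document B ", "ranks first."):
        await asyncio.sleep(0)
        yield StreamChunk(content=piece)


async def stream_events(chunks: AsyncIterator[StreamChunk]) -> AsyncIterator[Tuple[str, str]]:
    """Re-emit each chunk as (kind, text) events as soon as it arrives."""
    async for chunk in chunks:
        if chunk.reasoning:
            yield ("reasoning", chunk.reasoning)
        if chunk.content:
            yield ("content", chunk.content)


async def collect() -> Tuple[str, str]:
    """Consume the stream, keeping reasoning and answer text separate."""
    reasoning: List[str] = []
    answer: List[str] = []
    async for kind, text in stream_events(fake_model_stream()):
        (reasoning if kind == "reasoning" else answer).append(text)
    return "".join(reasoning), "".join(answer)
```

Running `asyncio.run(collect())` drains the stream. A UI would instead render each event inside the `async for` loop, which is what lets reasoning appear before the final answer is complete.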
Alternatives Considered
No response
Additional Context
No response