Rerank Modes Comparison¶
Lexilux supports two rerank modes: OpenAI-compatible and DashScope. This document provides a comprehensive comparison of their data formats, request/response structures, internal processing, and how they are unified into a standard RerankResult format.
Overview¶
Feature |
OpenAI-Compatible Mode |
DashScope Mode |
|---|---|---|
Mode Identifier |
|
|
API Endpoint |
|
|
Request Format |
Direct JSON object |
Wrapped in |
Response Format |
Direct JSON object |
Wrapped in |
Parameter Names |
|
|
Score Field Name |
|
|
Document Format |
Nested object: |
Nested object: |
Use Case |
Standard rerank APIs (Jina, Cohere, etc.) |
Alibaba Cloud DashScope |
Request Format Comparison¶
OpenAI-Compatible Mode¶
Endpoint: .. code-block:: http
POST {base_url}/rerank
Request Headers: .. code-block:: http
Content-Type: application/json Authorization: Bearer {api_key}
Request Body:
{
"model": "jina-reranker-v3",
"query": "python http library",
"documents": [
"urllib is a built-in Python library for HTTP requests",
"requests is a popular third-party HTTP library for Python",
"httpx is a modern async HTTP client for Python"
],
"top_n": 3,
"return_documents": true
}
Key Characteristics:
- Direct JSON object structure at top level
- Uses documents array (not candidates)
- Uses top_n parameter (not top_k)
- Uses return_documents boolean (not include_docs)
- All fields are at the top level
Internal Processing:
- Handler: OpenAICompatibleHandler
- Request building: Maps top_k → top_n, include_docs → return_documents
- Endpoint detection: Checks if /rerank is in base_url, otherwise appends it
DashScope Mode¶
Endpoint: .. code-block:: http
POST {base_url}/text-rerank/text-rerank
Request Headers: .. code-block:: http
Content-Type: application/json Authorization: Bearer {api_key}
Request Body:
{
"model": "qwen3-rerank",
"input": {
"query": "python http library",
"documents": [
"urllib is a built-in Python library for HTTP requests",
"requests is a popular third-party HTTP library for Python",
"httpx is a modern async HTTP client for Python"
]
},
"parameters": {
"top_n": 3,
"return_documents": true
}
}
Key Characteristics:
- Query and documents wrapped in input object
- Additional parameters in parameters object
- Uses same parameter names as OpenAI (top_n, return_documents)
- Endpoint is typically the full path (not appended)
Internal Processing:
- Handler: DashScopeHandler
- Request building: Wraps query/docs in input, puts options in parameters
- Endpoint: Uses base_url as-is (full path expected)
Response Format Comparison¶
OpenAI-Compatible Mode Response¶
Response Structure:
{
"results": [
{
"index": 0,
"relevance_score": 0.95,
"document": {
"text": "requests is a popular third-party HTTP library for Python"
}
},
{
"index": 1,
"relevance_score": 0.85,
"document": {
"text": "httpx is a modern async HTTP client for Python"
}
},
{
"index": 2,
"relevance_score": 0.70,
"document": {
"text": "urllib is a built-in Python library for HTTP requests"
}
}
],
"usage": {
"total_tokens": 150
}
}
Key Characteristics:
- Results array with objects containing index, relevance_score, and document
- Document is nested object with text field
- Usage information at top level
- Results already sorted by relevance (descending)
Internal Parsing:
- Extracts index and relevance_score from each result
- Extracts document text from document.text or document.content
- Handles both dict and string document formats
DashScope Mode Response¶
Response Structure:
{
"output": {
"results": [
{
"index": 0,
"relevance_score": 0.95,
"document": {
"text": "requests is a popular third-party HTTP library for Python"
}
},
{
"index": 1,
"relevance_score": 0.85,
"document": {
"text": "httpx is a modern async HTTP client for Python"
}
}
]
},
"usage": {
"total_tokens": 150
}
}
Key Characteristics:
- Results wrapped in output object
- Same structure as OpenAI inside output.results
- Usage information at top level
- Results format matches OpenAI standard
Internal Parsing:
- Extracts results from output.results
- Same parsing logic as OpenAI mode (inherits from OpenAICompatibleHandler)
- Handles document extraction identically
Unified Result Format¶
Both modes are unified into a standard RerankResult format, hiding provider differences from users.
Standard RerankResult Format¶
Result Structure:
RerankResult(
results=[
(0, 0.95), # (index, score) when include_docs=False
(1, 0.85),
(2, 0.70)
],
usage=Usage(
input_tokens=100,
output_tokens=50,
total_tokens=150
),
raw={} # Optional: full raw response if return_raw=True
)
With Documents:
RerankResult(
results=[
(0, 0.95, "requests is a popular third-party HTTP library for Python"),
(1, 0.85, "httpx is a modern async HTTP client for Python"),
(2, 0.70, "urllib is a built-in Python library for HTTP requests")
],
usage=Usage(...),
raw={}
)
Key Characteristics:
- Results are always tuples: (index, score) or (index, score, document)
- Results are sorted by score in descending order (higher is better)
- Index refers to original document position in input array
- Score is always a float (handles both positive and negative)
- Documents are optional strings (only if include_docs=True)
Unification Process¶
The unification process happens in three stages:
Request Building (
build_request): - Maps unified parameters (top_k,include_docs) to provider-specific names - Adapts request structure to provider format - Handles endpoint path differencesResponse Parsing (
parse_response): - Extracts results from provider-specific response structure - Normalizes field names (score/relevance_score→score) - Extracts document text from various formats - Maps results to original document indices - Returns intermediate format:List[Tuple[int, float, Optional[str]]]Result Normalization (
_normalize_results): - Sorts results by score (descending) - Appliestop_kfiltering if specified - Formats results based oninclude_docsflag - Returns final unified format:List[Tuple[int, float]]orList[Tuple[int, float, str]]
Unification Flow Diagram:
User Call
↓
Rerank.__call__()
↓
Handler.build_request() → Provider-specific request format
↓
Handler.make_request() → HTTP POST to provider
↓
Handler.parse_response() → Intermediate format: List[Tuple[int, float, Optional[str]]]
↓
Handler._normalize_results() → Unified format: List[Tuple[int, float]] or List[Tuple[int, float, str]]
↓
RerankResult → User receives consistent format
Internal Data Format Differences¶
Request Data Transformation¶
Unified Input (User API):
rerank(
query="python http library",
docs=["urllib", "requests", "httpx"],
top_k=2,
include_docs=True
)
OpenAI-Compatible Transformation:
{
"model": "jina-reranker-v3",
"query": "python http library",
"documents": ["urllib", "requests", "httpx"],
"top_n": 2,
"return_documents": true
}
DashScope Transformation:
{
"model": "qwen3-rerank",
"input": {
"query": "python http library",
"documents": ["urllib", "requests", "httpx"]
},
"parameters": {
"top_n": 2,
"return_documents": true
}
}
Response Data Transformation¶
OpenAI-Compatible Response:
{
"results": [
{"index": 0, "relevance_score": 0.95, "document": {"text": "requests"}},
{"index": 1, "relevance_score": 0.85, "document": {"text": "httpx"}}
],
"usage": {"total_tokens": 150}
}
DashScope Response:
{
"output": {
"results": [
{"index": 0, "relevance_score": 0.95, "document": {"text": "requests"}},
{"index": 1, "relevance_score": 0.85, "document": {"text": "httpx"}}
]
},
"usage": {"total_tokens": 150}
}
Unified Output (Both Modes):
RerankResult(
results=[
(0, 0.95, "requests"),
(1, 0.85, "httpx")
],
usage=Usage(total_tokens=150)
)
Key Differences Summary¶
Aspect |
OpenAI-Compatible |
DashScope |
|---|---|---|
Request Wrapping |
Direct JSON object |
Wrapped in |
Parameter Names |
|
|
Document Field Name |
|
|
Response Wrapping |
Direct |
Wrapped in |
Score Field Name |
|
|
Document Format |
|
|
Result Format Flexibility |
Fixed structure |
Fixed structure |
Usage Examples¶
Both modes use the same unified API:
from lexilux import Rerank
# OpenAI-compatible mode (Jina) - default
rerank_openai = Rerank(
base_url="https://api.jina.ai/v1",
api_key="jina_xxx",
model="jina-reranker-v3",
mode="openai"
)
# DashScope mode
rerank_dashscope = Rerank(
base_url="https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank",
api_key="sk-xxx",
model="qwen3-rerank",
mode="dashscope"
)
# Both modes return the same format
query = "python http library"
docs = ["urllib", "requests", "httpx"]
result_openai = rerank_openai(query, docs, top_k=2)
result_dashscope = rerank_dashscope(query, docs, top_k=2)
# Both results have the same structure
assert isinstance(result_openai.results[0], tuple)
assert isinstance(result_dashscope.results[0], tuple)
# Both results are sorted by score (descending)
assert result_openai.results[0][1] >= result_openai.results[1][1]
assert result_dashscope.results[0][1] >= result_dashscope.results[1][1]
Benefits of Unification¶
Consistent API: Users don’t need to learn different APIs for different providers
Easy Provider Switching: Change mode without changing business code
Unified Result Format: Same data structure regardless of backend
Score Consistency: All scores sorted descending (higher is better)
Index Mapping: Original document indices preserved correctly
Error Handling: Consistent error messages across all modes
Usage Statistics: Unified
Usageobject format
Migration Guide¶
From OpenAI-Compatible to DashScope:
- Change mode="openai" to mode="dashscope"
- Update base_url to DashScope endpoint
- Update model to DashScope model name
- No code changes needed (same API)
From Any Mode to Another: - Only change initialization parameters - Business logic remains identical - Result format is always the same