IXRag Package

RAG (Retrieval-Augmented Generation) with LightRAG integration, document limiting, and reranking.

Overview

The ixrag package provides the retrieval layer for the Rose platform. It uses LightRAG for hybrid retrieval that combines three document sources:

  • Text Chunks: Traditional vector similarity search
  • Entities: Knowledge graph entities extracted from documents
  • Relationships: Connections between entities in the knowledge graph

Key Components

  • LangGraph Retriever: Main retrieval orchestration with reranking
  • Document Limiter: Intelligent document allocation and limiting
  • Document Processor: Converts LightRAG responses to LangChain Documents
  • Multi-Tenant Support: Isolated RAG instances per tenant

Reranking & Limiting Configuration

Configuration is defined in environment config files (e.g., development.toml, staging.toml, production.toml).

Config Options

Option                 Type    Default        Description
mode                   string  "mix"          RAG mode: global, local, hybrid, or mix
rerank_enabled         bool    false          Enable Cohere/Jina reranking
limiter_enabled        bool    true           Enable type-based limiting (fallback when reranking is disabled)
rerank_top_k           int     20             Max documents returned (used by both reranking and limiting)
rerank_provider        string  "cohere"       Reranking provider: cohere or jina
rerank_model           string  "rerank-v3.5"  Model name for reranking
relationships_enabled  bool    true           Include relationship documents in results
graph_top_k            int     60             Initial retrieval limit for entities/relationships
chunk_top_k            int     20             Initial retrieval limit for chunks

Example Configuration

[lightrag]
mode = "mix"
rerank_enabled = true
limiter_enabled = true
rerank_top_k = 12
rerank_provider = "cohere"
rerank_model = "rerank-v3.5"
relationships_enabled = true
graph_top_k = 10
chunk_top_k = 15

Behavior Matrix

rerank_enabled  limiter_enabled  Behavior
true            any              Cohere/Jina reranks all documents together and returns the top rerank_top_k by relevance
false           true             Type-based allocation by RAG mode (see below)
false           false            All documents returned without limiting
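
The matrix above amounts to a two-step check in which rerank_enabled takes precedence. A minimal sketch, assuming a hypothetical select_strategy helper and illustrative strategy labels (neither is taken from the package):

```python
def select_strategy(rerank_enabled: bool, limiter_enabled: bool) -> str:
    """Return which post-retrieval step applies, following the behavior matrix."""
    if rerank_enabled:
        # Reranking wins regardless of limiter_enabled.
        return "rerank"
    if limiter_enabled:
        # Fallback: split the rerank_top_k budget by document type.
        return "type_allocation"
    # Neither enabled: every retrieved document is passed through.
    return "unlimited"


print(select_strategy(rerank_enabled=True, limiter_enabled=False))   # rerank
print(select_strategy(rerank_enabled=False, limiter_enabled=True))   # type_allocation
print(select_strategy(rerank_enabled=False, limiter_enabled=False))  # unlimited
```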

Type-Based Allocation

When rerank_enabled=false and limiter_enabled=true, documents are allocated by type based on the RAG mode:

Mode          Chunks  Entities  Relationships
global        30%     35%       35%
local         60%     20%       20%
hybrid / mix  30%     30%       40%

The allocation percentages determine how the rerank_top_k budget is distributed across document types.
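
To make the budget split concrete, the sketch below turns the percentages into per-type slot counts. The rounding rule is an assumption: the source does not say how fractional slots are handled, so this example uses the largest-remainder method so that the counts always sum to rerank_top_k. The allocate_budget name is hypothetical.

```python
# Percent shares per mode, copied from the allocation table.
ALLOCATION = {
    "global": {"chunk": 30, "entity": 35, "relationship": 35},
    "local": {"chunk": 60, "entity": 20, "relationship": 20},
    "hybrid": {"chunk": 30, "entity": 30, "relationship": 40},
    "mix": {"chunk": 30, "entity": 30, "relationship": 40},
}


def allocate_budget(mode: str, top_k: int) -> dict[str, int]:
    """Split top_k slots across document types for the given RAG mode."""
    shares = ALLOCATION[mode]
    # Whole slots each type is guaranteed (integer floor of its share).
    slots = {doc_type: pct * top_k // 100 for doc_type, pct in shares.items()}
    leftover = top_k - sum(slots.values())
    # Assumed rounding rule: give leftover slots to the types with the largest
    # fractional remainder; ties keep table order because sorted() is stable.
    by_remainder = sorted(
        shares, key=lambda doc_type: (shares[doc_type] * top_k) % 100, reverse=True
    )
    for doc_type in by_remainder[:leftover]:
        slots[doc_type] += 1
    return slots


print(allocate_budget("local", 20))  # {'chunk': 12, 'entity': 4, 'relationship': 4}
print(allocate_budget("mix", 12))    # {'chunk': 4, 'entity': 3, 'relationship': 5}
```

With rerank_top_k = 12 in mix mode the raw shares are 3.6, 3.6, and 4.8, so two slots are left after flooring; under the assumed rule they go to relationships and then chunks.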

Technical Details

Why We Call the Reranker Ourselves

LightRAG has built-in reranking support, but it's bypassed when using only_need_context=True (which we use to get raw context without LLM generation). The retrieval flow is:

  1. LightRAG Query: Call with only_need_context=True to get raw documents
  2. Document Processing: Convert LightRAG response to LangChain Documents
  3. Reranking (if enabled): Call Cohere/Jina to rerank all documents by query relevance
  4. Limiting (fallback): Apply type-based allocation if reranking is disabled and limiter_enabled is true

This approach gives us:

  • Full control over the reranking process
  • Unified reranking across all document types (chunks, entities, relationships)
  • Ability to use the latest Cohere/Jina models
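
The unified reranking in step 3 can be sketched as follows. This is a toy illustration only: the Doc class stands in for a LangChain Document, and overlap_score is a simple word-overlap scorer standing in for the Cohere/Jina API call, which is not shown here. All names in the block are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Doc:
    """Stand-in for a LangChain Document (page_content plus metadata)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / max(len(query_words), 1)


def rerank(
    query: str,
    docs: list[Doc],
    top_k: int,
    score_fn: Callable[[str, str], float] = overlap_score,
) -> list[Doc]:
    """Score all documents together, regardless of type, and keep the top_k."""
    scored = [
        Doc(d.page_content, {**d.metadata, "rerank_score": score_fn(query, d.page_content)})
        for d in docs
    ]
    # Stable sort: documents with equal scores keep their retrieval order.
    scored.sort(key=lambda d: d.metadata["rerank_score"], reverse=True)
    return scored[:top_k]


docs = [
    Doc("Our pricing plans start at the basic tier", {"document_type": "chunk"}),
    Doc("Acme Corp is a company", {"document_type": "entity"}),
    Doc("Acme Corp offers pricing plans", {"document_type": "relationship"}),
]
top = rerank("pricing plans", docs, top_k=2)
print([d.metadata["document_type"] for d in top])  # ['chunk', 'relationship']
```

Because chunks, entities, and relationships compete in a single ranked list, no per-type quota applies on this path; the only cap is top_k.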

Document Types

Each document returned has a document_type in its metadata:

Type          Source           Description
chunk         Vector search    Text chunks from indexed documents
entity        Knowledge graph  Extracted entities (people, companies, concepts)
relationship  Knowledge graph  Connections between entities
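
Since the type lives in metadata, downstream code can bucket results by document_type. A minimal sketch, using SimpleNamespace objects as stand-ins for the returned documents; the group_by_type helper and the "unknown" fallback key are assumptions made for this example:

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_type(documents) -> dict[str, list]:
    """Bucket documents by metadata["document_type"], preserving input order."""
    groups = defaultdict(list)
    for doc in documents:
        # "unknown" is a fallback chosen for this sketch, not a documented type.
        groups[doc.metadata.get("document_type", "unknown")].append(doc)
    return dict(groups)


# Stand-ins with the same attributes as the returned documents.
results = [
    SimpleNamespace(page_content="Plans start at ...", metadata={"document_type": "chunk"}),
    SimpleNamespace(page_content="Acme Corp", metadata={"document_type": "entity"}),
    SimpleNamespace(page_content="Pricing FAQ ...", metadata={"document_type": "chunk"}),
]

grouped = group_by_type(results)
print({doc_type: len(docs) for doc_type, docs in grouped.items()})  # {'chunk': 2, 'entity': 1}
```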

Key Files

File                                   Description
ixrag/lightrag/langgraph_retriever.py  Main retriever with reranking logic
ixrag/lightrag/document_limiter.py     Document limiting and allocation strategies
ixrag/lightrag/document_processor.py   Converts LightRAG responses to Documents
ixrag/lightrag/lightrag_llm.py         LLM and reranking function factories
ixrag/lightrag/manager.py              Multi-tenant RAG instance management

Integration Points

System       Purpose
LightRAG     Hybrid retrieval (vector + graph)
MongoDB      Vector storage backend
Neo4j        Graph storage backend
Cohere/Jina  Document reranking
LangFuse     Observability and tracing

Usage

The retriever is typically accessed through ixchat, but can be used directly:

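import asyncio

from ixrag.lightrag.langgraph_retriever import LangGraphRetriever


async def main() -> None:
    # Create a retriever for one site in "mix" mode
    retriever = LangGraphRetriever(
        site_name="example-site",
        rag_mode="mix",
    )

    # Retrieve documents (ainvoke is a coroutine, so it must be awaited)
    documents = await retriever.ainvoke("What are your pricing plans?")

    # Each document has:
    # - page_content: The text content
    # - metadata: {document_type, source, rerank_score (if reranked)}
    for doc in documents:
        print(doc.metadata["document_type"], doc.page_content[:80])


asyncio.run(main())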

Data Consistency

The package includes tools for monitoring and maintaining consistency between MongoDB and Neo4j storage backends. See the ixrag/lightrag/ directory for:

  • cli_consistency_check.py - Check data consistency
  • reconcile_entities.py - Sync missing entities
  • diagnose_entity_mapping.py - Diagnose entity name issues
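
At its core, a consistency check between two stores is a set comparison. The sketch below illustrates that idea only; it does not reflect how the scripts listed above are implemented. The diff_entities function and the sample entity names are hypothetical, and fetching the names from MongoDB and Neo4j is left out.

```python
def diff_entities(mongo_names: set[str], neo4j_names: set[str]) -> dict[str, set[str]]:
    """Report entity names present in one store but missing from the other."""
    return {
        "missing_in_neo4j": mongo_names - neo4j_names,
        "missing_in_mongodb": neo4j_names - mongo_names,
    }


# Hypothetical entity names, as they might be read from each backend.
mongo_names = {"Acme Corp", "Pricing Plan", "Support Team"}
neo4j_names = {"Acme Corp", "Pricing Plan", "Enterprise Tier"}

report = diff_entities(mongo_names, neo4j_names)
print(report["missing_in_neo4j"])    # {'Support Team'}
print(report["missing_in_mongodb"])  # {'Enterprise Tier'}
```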