EraRAG: Revolutionizing Retrieval for Dynamic and Expanding Data with Multi-Layered Graphs
EraRAG introduces a scalable retrieval framework optimized for dynamic, growing datasets by performing efficient localized updates on a multi-layered graph structure, significantly improving retrieval efficiency and accuracy.
Addressing Limitations of Large Language Models
Large Language Models (LLMs) have transformed natural language processing but struggle to stay current with facts, incorporate domain-specific knowledge, and perform complex multi-hop reasoning. Retrieval-Augmented Generation (RAG) techniques mitigate these issues by letting models consult external information at inference time. However, most graph-based RAG systems are designed for static datasets and struggle with efficiency and scalability on dynamic, growing corpora such as news streams or research databases.
Introducing EraRAG: Efficient Updates for Evolving Corpora
EraRAG is a novel RAG framework developed by researchers from Huawei, The Hong Kong University of Science and Technology, and WeBank. It is specifically designed for dynamic and continuously expanding data collections. Instead of rebuilding the entire retrieval graph with each update, EraRAG performs localized updates affecting only the relevant graph segments.
Key Innovations of EraRAG
Hyperplane-Based Locality-Sensitive Hashing (LSH)
The corpus is divided into small text chunks, each embedded as a vector. EraRAG projects these vectors onto randomly sampled hyperplanes and reads off the sign of each projection to form a binary hash code. Chunks with similar embeddings receive the same code and land in the same bucket, preserving semantic cohesion while keeping bucket assignment cheap.
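The hashing step can be sketched in a few lines. This is a minimal illustration of sign-based hyperplane LSH, not EraRAG's actual implementation; the function name `lsh_bucket`, the dimensions, and the number of hyperplanes are all assumptions for the example.

```python
import numpy as np

def lsh_bucket(embeddings: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Project each embedding onto k random hyperplanes and pack the
    sign bits into an integer bucket id (a k-bit hash code).

    embeddings:  (n, d) chunk-embedding matrix
    hyperplanes: (k, d) random normal vectors defining the hyperplanes
    Returns an (n,) array of integer bucket ids.
    """
    bits = (embeddings @ hyperplanes.T) >= 0           # (n, k) sign bits
    weights = 1 << np.arange(hyperplanes.shape[0])     # 1, 2, 4, ... for bit packing
    return (bits * weights).sum(axis=1)

rng = np.random.default_rng(seed=42)       # fixed seed -> reproducible hyperplanes
planes = rng.standard_normal((8, 384))     # 8 hyperplanes over 384-d embeddings
chunks = rng.standard_normal((100, 384))   # stand-in for real chunk embeddings
buckets = lsh_bucket(chunks, planes)
```

Because nearby vectors tend to fall on the same side of each hyperplane, they share sign bits and thus hash into the same bucket with high probability.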
Hierarchical Multi-Layered Graph Construction
EraRAG builds a multi-layered graph where each layer summarizes text segments using a language model. Large segments are split, and small ones merged, maintaining semantic consistency and balanced granularity. Higher layers provide summarized representations for efficient retrieval of both detailed and abstract queries.
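The split-and-merge balancing described above can be sketched as a simple pass over segments. This is a hypothetical illustration: the function name `rebalance`, the size thresholds, and the policy of folding a final undersized leftover into the previous segment are assumptions, and the LLM summarization of each resulting segment is omitted entirely.

```python
def rebalance(segments: list[list], min_size: int = 2, max_size: int = 6) -> list[list]:
    """Split oversized segments and merge undersized ones so each graph
    layer keeps a roughly uniform granularity (illustrative thresholds)."""
    sized = []
    for seg in segments:
        while len(seg) > max_size:          # split large segments into chunks
            sized.append(seg[:max_size])
            seg = seg[max_size:]
        sized.append(seg)

    merged, carry = [], []
    for seg in sized:                       # merge adjacent small segments
        carry = carry + seg
        if len(carry) >= min_size:
            merged.append(carry)
            carry = []
    if carry:                               # a leftover smaller than min_size
        if merged:                          # folds into the previous segment
            merged[-1].extend(carry)
        else:
            merged.append(carry)
    return merged
```

In the full pipeline, each balanced segment would then be summarized by a language model to form the next layer up, repeating until a single root summary remains.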
Incremental and Localized Updates
Embeddings of new data are hashed with the original hyperplanes, keeping bucket assignments consistent with the existing graph. Only the buckets affected by new entries are updated, merged, split, or re-summarized. Updates propagate up the graph hierarchy but remain localized, greatly reducing computation and token costs.
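The update path can be illustrated with a small sketch: new chunks are routed into existing buckets using the original hyperplanes, and only the touched buckets are reported for re-summarization. The names `hash_codes` and `incremental_insert` and the dict-based index are assumptions made for this example, not EraRAG's API.

```python
import numpy as np

def hash_codes(embeds: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Sign-bit LSH codes; the same scheme used at initial construction."""
    bits = (embeds @ hyperplanes.T) >= 0
    return (bits * (1 << np.arange(hyperplanes.shape[0]))).sum(axis=1)

def incremental_insert(index: dict, new_embeds, new_ids, hyperplanes) -> set:
    """Route new chunks into existing buckets using the ORIGINAL hyperplanes
    and return the ids of buckets that changed. Only those segments (and
    their ancestors in the hierarchy) would need re-summarization."""
    touched = set()
    for cid, code in zip(new_ids, hash_codes(new_embeds, hyperplanes)):
        index.setdefault(int(code), []).append(cid)
        touched.add(int(code))
    return touched

rng = np.random.default_rng(0)
planes = rng.standard_normal((4, 32))
index = {}  # bucket id -> list of chunk ids
incremental_insert(index, rng.standard_normal((20, 32)), list(range(20)), planes)
touched = incremental_insert(index, rng.standard_normal((3, 32)), [100, 101, 102], planes)
```

Adding three chunks touches at most three buckets; every other bucket, and every summary above it, is left untouched.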
Reproducibility and Determinism
EraRAG preserves the initial hyperplanes used for hashing, making bucket assignments deterministic and reproducible. This is critical for consistent and efficient incremental updates over time.
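The reproducibility guarantee reduces to persisting the hyperplane matrix and reusing it verbatim. A minimal sketch, assuming the planes are stored with NumPy's `np.save` (a choice made for this example, not stated in the source):

```python
import numpy as np
import os
import tempfile

rng = np.random.default_rng(7)
planes = rng.standard_normal((8, 64))   # the initial, never-regenerated hyperplanes
emb = rng.standard_normal((50, 64))     # stand-in chunk embeddings

def codes(e: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Sign-bit LSH bucket codes for embeddings e under hyperplanes p."""
    bits = (e @ p.T) >= 0
    return (bits * (1 << np.arange(p.shape[0]))).sum(axis=1)

# Persist the original hyperplanes so every later update hashes identically.
path = os.path.join(tempfile.mkdtemp(), "hyperplanes.npy")
np.save(path, planes)
reloaded = np.load(path)
```

Hashing with `reloaded` yields exactly the same bucket assignments as the original `planes`, which is what makes incremental updates deterministic across runs.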
Performance Highlights
Experiments across multiple question-answering benchmarks show that EraRAG:
- Cuts graph reconstruction time and token usage by up to 95% compared to leading graph-based RAG methods like GraphRAG, RAPTOR, and HippoRAG.
- Consistently delivers higher accuracy and recall across static, growing, and abstract task settings.
- Supports versatile retrieval needs, efficiently handling both fine-grained factual details and high-level semantic summaries.
Practical Applications
EraRAG is ideal for real-world scenarios with continuously growing data such as live news feeds, academic archives, and user-generated content platforms. It balances retrieval efficiency and adaptability, enhancing the factual accuracy and responsiveness of LLM-powered applications in rapidly changing environments.
For more details, check the original paper and GitHub repository.