MiniRAG: A simple retrieval-augmented generation framework that enables small models to achieve good RAG performance through heterogeneous graph indexing and lightweight topology-enhanced retrieval.

MiniRAG

Overview :

MiniRAG is a retrieval-augmented generation (RAG) system designed for small language models. It aims to simplify the RAG pipeline and make it more efficient. It addresses the performance limitations of small models in traditional RAG frameworks with two components: a semantically aware heterogeneous graph index and a lightweight topology-enhanced retrieval method. The system is particularly well suited to resource-constrained scenarios such as mobile devices and edge computing environments. Because it is open source, the developer community can adopt and improve it easily.
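To make the idea of a heterogeneous graph index more concrete, here is a minimal sketch in Python. It is not MiniRAG's actual implementation: the function names (`extract_entities`, `build_index`), the vocabulary-lookup "entity extractor", and the sample sentences are all assumptions made for illustration. It only shows the general shape of a graph that holds two node types, text chunks and entities, in one structure.

```python
import re
from collections import defaultdict


def extract_entities(text, vocabulary):
    """Toy entity extractor (assumption): keep words found in a known vocabulary."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & vocabulary)


def build_index(chunks, vocabulary):
    """Build an undirected graph with two node types: ("chunk", i) and ("entity", name)."""
    graph = defaultdict(set)
    for i, text in enumerate(chunks):
        chunk_node = ("chunk", i)
        entities = extract_entities(text, vocabulary)
        for name in entities:
            entity_node = ("entity", name)
            # Chunk-entity edge: the entity is mentioned in this chunk.
            graph[chunk_node].add(entity_node)
            graph[entity_node].add(chunk_node)
        # Entity-entity edges: two entities co-occur in the same chunk.
        for a in entities:
            for b in entities:
                if a != b:
                    graph[("entity", a)].add(("entity", b))
    return graph


# Hypothetical sample data, not taken from the LiHua-World dataset.
chunks = [
    "LiHua booked a table at the restaurant.",
    "Wolfgang met LiHua at the gym.",
]
vocabulary = {"lihua", "wolfgang", "gym", "restaurant"}
graph = build_index(chunks, vocabulary)

if __name__ == "__main__":
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

Because entities are matched by simple lookup and linked by co-occurrence, building this kind of index does not depend on deep semantic understanding, which is the property the overview attributes to MiniRAG's indexing.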

Target Users :

The target audience is primarily researchers and developers in natural language processing, along with academics and industry professionals interested in lightweight RAG systems. MiniRAG is a good fit for anyone deploying RAG in resource-constrained environments and for teams that need rapid prototyping and experimentation.

Use Cases

Deploying a RAG system on mobile devices to provide users with fast and accurate question-answering services.

Utilizing MiniRAG for real-time text generation tasks in edge computing environments, such as automatic summarization and content creation.

Using MiniRAG as a baseline for lightweight RAG systems in academic research, with a focus on algorithm optimization and performance evaluation.

Features

Provides a heterogeneous graph indexing mechanism that combines text chunks and named entities in a single structure, reducing reliance on complex semantic understanding.

Employs a lightweight topology-enhanced retrieval method that uses the graph structure for efficient knowledge discovery without requiring advanced language capabilities.

Achieves performance comparable to RAG methods built on large language models, even when using small language models.

Requires only 25% of the storage space of those methods, significantly reducing deployment costs.

Offers a comprehensive benchmark dataset, LiHua-World, for evaluating lightweight RAG systems in realistic on-device scenarios.

Supports both source code installation and installation via PyPI, facilitating quick onboarding for developers.

Has a clear code structure that is easy to understand and extend, so developers can customize it or build on top of it.
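The retrieval feature above can be illustrated with a small sketch. This is a simplified stand-in, not MiniRAG's actual algorithm: the hard-coded graph, the `retrieve` function, the one-hop expansion, and the `hop_weight` parameter are all assumptions chosen for illustration. The point it shows is that chunks can be ranked using only graph structure (which entities appear in the query, and which chunks lie near them), without asking a language model to judge relevance.

```python
import re

# Hypothetical toy index: entity -> chunk ids that mention it.
ENTITY_TO_CHUNKS = {
    "lihua": {0, 1},
    "restaurant": {0},
    "wolfgang": {1},
    "gym": {1, 2},
    "sunday": {2},
}

# Hypothetical entity-entity edges (entities that co-occur in a chunk).
ENTITY_NEIGHBOURS = {
    "lihua": {"restaurant", "wolfgang", "gym"},
    "restaurant": {"lihua"},
    "wolfgang": {"lihua", "gym"},
    "gym": {"lihua", "wolfgang", "sunday"},
    "sunday": {"gym"},
}


def retrieve(query, top_k=2, hop_weight=0.5):
    """Rank chunk ids by graph proximity to the entities named in the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    seeds = words & set(ENTITY_TO_CHUNKS)
    scores = {}
    for entity in seeds:
        # Chunks that mention a query entity directly get full weight.
        for chunk_id in ENTITY_TO_CHUNKS[entity]:
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0
        # Chunks reached through a neighbouring entity get a reduced weight.
        for neighbour in ENTITY_NEIGHBOURS[entity] - seeds:
            for chunk_id in ENTITY_TO_CHUNKS[neighbour]:
                scores[chunk_id] = scores.get(chunk_id, 0.0) + hop_weight
    # Highest score first; ties are broken by chunk id for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_k]


if __name__ == "__main__":
    print(retrieve("Where did Wolfgang train?"))
```

In this toy setup, a query that mentions only "Wolfgang" still pulls in chunks about the gym and LiHua through one-hop neighbours, at a lower weight than the chunk that names Wolfgang directly.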

How to Use

1. Clone the MiniRAG repository from GitHub to your local machine.

2. From the repository root, install MiniRAG from source with `pip install -e .`, or install from PyPI with `pip install lightrag-hku`.

3. Download the required LiHua-World dataset and place it in the `./dataset/LiHua-World/data/` directory.

4. Index the dataset using the command `python ./reproduce/Step_0_index.py`.

5. Run `python ./reproduce/Step_1_QA.py` for question-answering tasks, or use the code in `main.py` to initialize MiniRAG.

6. Adjust parameters and configurations as needed for model training and optimization.
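As a convenience, the dataset and script steps above can be wrapped in a small pre-flight check. The sketch below is an assumption-laden helper, not part of MiniRAG: the `plan` function and its messages are hypothetical, and only the dataset path and the two script paths are taken from the steps above. It checks that the dataset directory exists and is non-empty, then returns the commands to run in order.

```python
from pathlib import Path

# Paths taken from the steps above, relative to the repository root.
DATASET_DIR = Path("dataset/LiHua-World/data")
STEPS = [
    ["python", "./reproduce/Step_0_index.py"],  # build the index
    ["python", "./reproduce/Step_1_QA.py"],     # run question answering
]


def plan(repo_root="."):
    """Return the commands to run, or a single message if the dataset is missing."""
    dataset = Path(repo_root) / DATASET_DIR
    if not dataset.is_dir() or not any(dataset.iterdir()):
        return [f"Dataset missing: place LiHua-World under {dataset}"]
    return [" ".join(cmd) for cmd in STEPS]


if __name__ == "__main__":
    for line in plan():
        print(line)
```

Run it from the cloned repository root. If the dataset is in place, each printed command can be executed as-is in a shell.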
