Meta-RAG: A Metadata-Driven Retrieval-Augmented Generation Framework for the Power Industry
Source: China Scientific Data. Updated: 2026-02-09. Indexed: 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.19678/j.issn.1000-3428.0070415
Resource description:
Large Language Models (LLMs) have made significant progress in dialogue, reasoning, and knowledge retention. On knowledge-intensive tasks in the electricity sector, however, they still struggle with factual accuracy and knowledge updates, and high-quality domain datasets are scarce. This study addresses these challenges with an improved Retrieval-Augmented Generation (RAG) strategy that combines hybrid retrieval with a fine-tuned generative model to capture and update knowledge efficiently. It proposes the Metadata-driven RAG framework (Meta-RAG) for knowledge Question Answering (QA) tasks in the electricity domain. The framework comprises three stages: data preparation, model fine-tuning, and retrieval reasoning. The data-preparation stage covers document conversion, metadata extraction and enhancement, and document parsing, which together ensure efficient indexing and structured processing of power regulation documents. The Electricity Question Answering (EleQA) dataset, consisting of 19,560 QA pairs, is constructed specifically for this sector. The model fine-tuning stage uses multi-question generation, chain-of-thought prompting, and supervised instruction fine-tuning to optimize reasoning on specific tasks. The retrieval reasoning stage employs mixed encoding and re-ranking strategies, combining retrieval and generation modules to improve answer accuracy and relevance. Experiments validate the effectiveness of Meta-RAG: compared with baselines such as Self-RAG, Corrective-RAG, Adaptive-RAG, and RA-ISF, it achieves higher answer accuracy and retrieval hit rates. With the Qwen1.5-14B-Chat model, Meta-RAG reaches an overall accuracy of 0.8043, surpassing the other methods. Ablation and document recall experiments indicate that document retrieval strongly affects framework performance: accuracy drops by 0.2928 when the retrieval capability is removed.
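The retrieval stage described above (metadata-aware indexing, mixed encoding, re-ranking) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy corpus, the `doc_type` metadata field, and all function names are hypothetical. Keyword overlap and bag-of-words cosine stand in for real sparse and dense encoders. Reciprocal rank fusion is swapped in as one common way to merge two rankings, since the abstract does not say how Meta-RAG fuses or re-ranks results.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the metadata fields are illustrative, not taken from EleQA.
DOCS = [
    {"id": "d1", "meta": {"doc_type": "regulation"},
     "text": "transformer maintenance interval inspection rules"},
    {"id": "d2", "meta": {"doc_type": "regulation"},
     "text": "safety distance for live line work on transmission lines"},
    {"id": "d3", "meta": {"doc_type": "news"},
     "text": "transformer prices rise in the market"},
]


def tokenize(text):
    return text.lower().split()


def sparse_score(query, doc_text):
    """Keyword overlap count: a stand-in for a sparse retriever such as BM25."""
    query_tokens = set(tokenize(query))
    return sum(1 for tok in tokenize(doc_text) if tok in query_tokens)


def dense_score(query, doc_text):
    """Bag-of-words cosine similarity: a stand-in for a neural text encoder."""
    a, b = Counter(tokenize(query)), Counter(tokenize(doc_text))
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank(docs, score_fn, query):
    """Return document ids ordered from best to worst under one scorer."""
    ordered = sorted(docs, key=lambda d: score_fn(query, d["text"]), reverse=True)
    return [d["id"] for d in ordered]


def hybrid_retrieve(query, docs, meta_filter=None, k=60, top_n=2):
    """Filter by metadata, rank with both scorers, merge by reciprocal rank fusion."""
    candidates = [
        d for d in docs
        if not meta_filter
        or all(d["meta"].get(key) == val for key, val in meta_filter.items())
    ]
    fused = Counter()
    for score_fn in (sparse_score, dense_score):
        for position, doc_id in enumerate(rank(candidates, score_fn, query), start=1):
            fused[doc_id] += 1.0 / (k + position)  # k damps the weight of top ranks
    return [doc_id for doc_id, _ in fused.most_common(top_n)]


result = hybrid_retrieve(
    "transformer maintenance interval", DOCS, meta_filter={"doc_type": "regulation"}
)
```

In this sketch the metadata filter removes the `news` document before any scoring, so a lexically similar but off-domain passage cannot reach the generator. A real system would pass the fused candidates to a dedicated re-ranking model before generation.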
Created:
2026-02-09



