MCINext/LongRag-Fa
Source: Hugging Face | Updated: 2025-09-23 | Indexed: 2025-10-18
Download link:
https://hf-mirror.com/datasets/MCINext/LongRag-Fa
Resource description:
The Translated LONG2RAG dataset is a Persian translation of the LONG2RAG benchmark (Qi et al., EMNLP Findings 2024), adapted to the MTEB-style retrieval format for evaluating multilingual retrieval-augmented generation (RAG) and long-context retrieval systems. It contains 280 complex questions spanning 10 domains and 8 question categories, each paired with 5 retrieved documents (average length about 2,444 words). The dataset is compatible with the MTEB evaluation framework.
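To make the "MTEB-style retrieval format" concrete, here is a minimal sketch of the layout such tasks conventionally use: a corpus of documents, a set of queries, and relevance judgments (qrels) linking the two. All ids, texts, and the `recall_at_k` helper below are hypothetical illustrations, not taken from LongRag-Fa or from the MTEB library, and the actual field and split names in this dataset may differ.

```python
from typing import Dict, List

# Toy stand-ins for the three tables of an MTEB-style retrieval task.
# Every id and text here is hypothetical, not drawn from LongRag-Fa.
corpus: Dict[str, str] = {
    "d1": "a long document about health",
    "d2": "a long document about finance",
    "d3": "a long document about law",
}
queries: Dict[str, str] = {
    "q1": "a question about health",
    "q2": "a question about law",
}
# qrels: query id -> {document id: relevance score}
qrels: Dict[str, Dict[str, int]] = {
    "q1": {"d1": 1},
    "q2": {"d3": 1},
}


def recall_at_k(ranked: Dict[str, List[str]],
                qrels: Dict[str, Dict[str, int]],
                k: int) -> float:
    """Mean fraction of relevant documents found in each query's top-k ranking."""
    scores = []
    for qid, judged in qrels.items():
        gold = {doc_id for doc_id, rel in judged.items() if rel > 0}
        if not gold:
            continue  # skip queries without any relevant document
        hits = gold & set(ranked.get(qid, [])[:k])
        scores.append(len(hits) / len(gold))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical system output: ranked document ids per query.
ranked = {"q1": ["d1", "d2", "d3"], "q2": ["d2", "d3", "d1"]}

print(recall_at_k(ranked, qrels, k=1))
print(recall_at_k(ranked, qrels, k=2))
```

In a real evaluation the ranking would come from a retrieval model scored over the full corpus; the toy ranking here only shows how qrels connect queries to their paired documents.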
Provider:
MCINext



