lsz05/JQaRARerankingLite
Source: Hugging Face | Updated: 2025-12-12 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/lsz05/JQaRARerankingLite
Description:
JQaRARerankingLite is a reranking dataset for Japanese Question Answering with Retrieval Augmentation, distributed as part of the Massive Text Embedding Benchmark (MTEB). Its questions come from JAQKET and its corpus from Japanese Wikipedia. This lightweight version uses a reduced corpus of 172,897 documents, constructed using hard negatives from 5 high-performance models. Task categories: text-ranking, multiple-choice QA, and question-answering. Language: Japanese (jpn). License: cc-by-sa-4.0. Source dataset: sbintuitions/JMTEB-lite.
Provider:
lsz05
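
Example (illustrative only): the description above frames the dataset as a reranking task, where a model scores candidate documents for each question and is judged on how high the relevant ones land. The sketch below shows one common way such a ranking could be scored, nDCG@k, on made-up toy data. The choice of metric, the function names `dcg_at_k` and `ndcg_at_k`, and the toy scores and labels are all assumptions for illustration; they are not taken from this listing and may not match the benchmark's official evaluation code.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items of an already ranked list."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(scores, labels, k=10):
    """nDCG@k for one query.

    scores: model scores for each candidate document (higher = more relevant).
    labels: ground-truth relevance for each candidate (here 0 or 1).
    """
    # Rank candidates by descending model score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked_labels = [labels[i] for i in order]

    # The ideal ranking places all relevant documents first.
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_labels, k) / ideal_dcg


# Toy query with four candidates (hypothetical values, not from the dataset).
toy_scores = [0.2, 0.9, 0.5, 0.1]
toy_labels = [0, 1, 0, 1]
toy_ndcg = ndcg_at_k(toy_scores, toy_labels, k=10)
print(f"nDCG@10 on toy data: {toy_ndcg:.4f}")
```

In this toy case one relevant document is ranked first and the other last, so the score falls below 1.0; a ranking that places both relevant documents on top would score exactly 1.0.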



