mteb/JaqketRetrievalLite
Source: Hugging Face | Updated: 2025-12-13 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/mteb/JaqketRetrievalLite
Description:
JaqketRetrievalLite is a Japanese question-answering dataset built from quiz questions and included in MTEB (Massive Text Embedding Benchmark). It is the lightweight version, with a reduced corpus of 65,802 documents constructed using hard negatives from 5 high-performance models. The dataset has three main configurations: corpus, qrels (query relevance judgments), and queries. Its task categories are text retrieval, multiple-choice QA, and question answering. The language is Japanese, and the license is cc-by-sa-4.0.
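The three configurations are typically used together: qrels links each query to its relevant corpus documents. The sketch below is a minimal illustration, not the dataset's documented schema. It assumes the column names `query-id`, `corpus-id`, and `score`, which are common in MTEB retrieval datasets but are not confirmed by this listing, and it uses toy rows in place of real data. The helper names `load_configs` and `build_relevance_map` are hypothetical.

```python
from collections import defaultdict

DATASET_ID = "mteb/JaqketRetrievalLite"
CONFIGS = ("corpus", "queries", "qrels")  # configuration names from the description


def load_configs(dataset_id=DATASET_ID):
    """Load each configuration with the Hugging Face `datasets` library.

    Requires network access and `pip install datasets`; not called in this sketch.
    """
    from datasets import load_dataset  # imported lazily: third-party dependency

    return {name: load_dataset(dataset_id, name) for name in CONFIGS}


def build_relevance_map(qrels_rows, query_key="query-id",
                        doc_key="corpus-id", score_key="score"):
    """Map each query id to the set of document ids judged relevant.

    The default key names are assumptions, not confirmed by the listing.
    """
    relevant = defaultdict(set)
    for row in qrels_rows:
        if row[score_key] > 0:  # keep positive judgments only
            relevant[row[query_key]].add(row[doc_key])
    return dict(relevant)


# Toy rows standing in for the real qrels configuration.
toy_qrels = [
    {"query-id": "q1", "corpus-id": "d1", "score": 1},
    {"query-id": "q1", "corpus-id": "d2", "score": 0},
    {"query-id": "q2", "corpus-id": "d3", "score": 1},
]
relevance = build_relevance_map(toy_qrels)
```

With real data, the rows of the qrels split returned by `load_configs()` could be passed to `build_relevance_map` in place of `toy_qrels`, once the actual column names have been checked against the loaded dataset.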
Provider:
mteb



