mteb/WikipediaRetrievalMultilingual
Source: Hugging Face | Updated: 2025-05-04 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/mteb/WikipediaRetrievalMultilingual
Description:
This dataset provides corpora, query relevance judgments (qrels), and queries for 16 languages. Each language configuration has three parts: corpus, qrels, and queries. Each corpus entry has a text and a title; each qrels entry has a query ID, a corpus ID, and a score; each query entry has a query ID and the query text. The languages covered are Bulgarian, Bengali, Czech, Danish, German, English, Persian, Finnish, Hindi, Italian, Dutch, Norwegian, Portuguese, Romanian, Serbian, and Swedish. Detailed data file paths for each configuration are also provided.
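To make the relationship between the three parts concrete, here is a minimal Python sketch that joins qrels to queries and corpus entries. It uses small in-memory toy records rather than the real data, and the field names (`id`, `title`, `text`, `query_id`, `corpus_id`, `score`) and helper functions are assumptions chosen for illustration; the actual column names in the dataset files may differ.

```python
from collections import defaultdict

# Toy records standing in for one language configuration.
# Field names are hypothetical; check the real files for the actual columns.
corpus = [
    {"id": "d1", "title": "Berlin", "text": "Berlin is the capital of Germany."},
    {"id": "d2", "title": "Paris", "text": "Paris is the capital of France."},
]
queries = [
    {"id": "q1", "text": "What is the capital of Germany?"},
]
qrels = [
    {"query_id": "q1", "corpus_id": "d1", "score": 1},
]


def build_relevance_map(qrels_rows):
    """Group qrels as {query_id: {corpus_id: score}}."""
    relevance = defaultdict(dict)
    for row in qrels_rows:
        relevance[row["query_id"]][row["corpus_id"]] = row["score"]
    return dict(relevance)


def relevant_documents(query_id, relevance, corpus_by_id):
    """Return corpus entries judged relevant (score > 0) for a query."""
    judged = relevance.get(query_id, {})
    return [corpus_by_id[doc_id] for doc_id, score in judged.items() if score > 0]


relevance = build_relevance_map(qrels)
corpus_by_id = {doc["id"]: doc for doc in corpus}

for query in queries:
    docs = relevant_documents(query["id"], relevance, corpus_by_id)
    print(query["text"], "->", [doc["title"] for doc in docs])
```

A nested dictionary keyed by query ID is a common in-memory shape for retrieval evaluation, since scoring a ranked list only needs a lookup of the judged documents for each query.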
Provided by:
mteb