antonioloison/en-msmarco-corpus-queries
Source: Hugging Face. Updated 2025-04-09; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/antonioloison/en-msmarco-corpus-queries
Description:
The dataset has two configurations, corpus and queries. The corpus configuration holds text data, with fields for id, text content, URL, and language. The queries configuration holds query data, with fields for id, query content, positive passages, negative passages, answers, query type, well-formed answers, and language. The data is divided into training, validation, and small validation splits, each listed with its byte size and example count. The total download size is approximately 1.73 GB, and the total dataset size is approximately 3.45 GB.
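
Because the download is over a gigabyte, it can be convenient to look at a few records before fetching everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the training split is named `train` (the listing does not give the exact split names or column names, so check the dataset page). `peek` is a hypothetical helper introduced here for illustration, not part of the dataset.

```python
from itertools import islice

DATASET_ID = "antonioloison/en-msmarco-corpus-queries"
CONFIGS = ("corpus", "queries")  # the two configurations named in the description


def peek(config, split="train", n=3):
    """Stream the first n records of one configuration without a full download."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so this file can be loaded without the library installed.
    from datasets import load_dataset

    # split="train" is an assumed split name, not confirmed by the listing.
    stream = load_dataset(DATASET_ID, config, split=split, streaming=True)
    return list(islice(stream, n))


# Example use (requires network access):
#   for record in peek("queries"):
#       print(sorted(record))  # prints the column names of each record
```

Streaming keeps the first look cheap. Once the split and column names are confirmed, dropping `streaming=True` downloads and caches the full configuration.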
Provider:
antonioloison



