Colino23/Wikipedia_Passages_Sample
Source: Hugging Face. Updated: 2025-04-16. Indexed: 2025-08-30.
Download link:
https://hf-mirror.com/datasets/Colino23/Wikipedia_Passages_Sample
Description:
This is an English text retrieval dataset with two fields, id and contents, both of type string. It has a single training split with over 118 million samples and a total size of approximately 22.3 billion bytes. The dataset is released under the MIT license.
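The schema described above can be sketched in Python as follows. This is a minimal sketch, not an official loader: it assumes the third-party `datasets` library is installed, that the Hugging Face repository id matches the listing title, and that the training split is named "train". The helper names `is_valid_record`, `iter_valid`, and `stream_passages` are hypothetical and exist only for illustration.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, Mapping

# Repository id copied from the listing title (assumed to be the Hub id).
REPO_ID = "Colino23/Wikipedia_Passages_Sample"


def is_valid_record(record: Mapping[str, Any]) -> bool:
    """Return True if the record has string-valued 'id' and 'contents' fields."""
    return isinstance(record.get("id"), str) and isinstance(
        record.get("contents"), str
    )


def iter_valid(records: Iterable[Mapping[str, Any]]) -> Iterator[Mapping[str, Any]]:
    """Yield only the records that match the two-string-field schema."""
    return (r for r in records if is_valid_record(r))


def stream_passages(limit: int = 5) -> Iterator[Mapping[str, Any]]:
    """Stream the first `limit` rows without downloading the whole dataset.

    Assumption: the split is named "train". Requires network access and
    the third-party `datasets` package, imported lazily here.
    """
    from datasets import load_dataset

    ds = load_dataset(REPO_ID, split="train", streaming=True)
    return iter_valid(islice(ds, limit))
```

Streaming is used here because the listing reports a size in the tens of gigabytes, so iterating over a few rows is cheaper than a full download. A call such as `for row in stream_passages(3): print(row["id"], row["contents"][:80])` would print the first rows, assuming the repository is reachable.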
Provider:
Colino23



