
yuan19/ms_marco

Source: Hugging Face | Updated: 2025-12-18 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/yuan19/ms_marco
Description:

MS MARCO (Microsoft Machine Reading Comprehension Dataset) is a collection of question answering and natural language generation datasets focused on deep learning in search. It was first released at NIPS 2016 and is available in two versions, v1.1 and v2.1. Version 1.1 contains 100,000 real Bing questions with human-generated answers; v2.1 expands this to over 1,000,000 queries and improves quality. The dataset supports Question Answering (QnA) and Natural Language Generation (NLGEN) tasks and is applicable to machine reading comprehension, question answering systems, and smart speaker and voice assistant applications. It is organized into train, validation, and test splits, and each data instance contains fields such as query, answers, and passages.
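
As a rough illustration of the instance structure described above, the sketch below walks one record and pulls out its query, answers, and passages. The description only names the top-level fields; the layout of `passages` as parallel lists keyed by `passage_text` and `is_selected` is an assumption, the helper names `selected_passages` and `summarize` are hypothetical, and the sample record is made up for illustration rather than taken from the dataset.

```python
from typing import Any


def selected_passages(instance: dict[str, Any]) -> list[str]:
    """Return the texts of passages flagged as selected in one record.

    Assumption: `passages` is a dict of parallel lists with the keys
    `passage_text` and `is_selected`; the description does not confirm this.
    """
    passages = instance["passages"]
    return [
        text
        for text, flag in zip(passages["passage_text"], passages["is_selected"])
        if flag
    ]


def summarize(instance: dict[str, Any]) -> dict[str, Any]:
    """Collect a few basic facts about one record (hypothetical helper)."""
    return {
        "query": instance["query"],
        "num_answers": len(instance["answers"]),
        "num_passages": len(instance["passages"]["passage_text"]),
        "selected": selected_passages(instance),
    }


# Made-up record for illustration only; not taken from the dataset.
example = {
    "query": "what is ms marco",
    "answers": ["A machine reading comprehension dataset."],
    "passages": {
        "is_selected": [0, 1],
        "passage_text": [
            "An unrelated passage.",
            "MS MARCO is a machine reading comprehension dataset.",
        ],
    },
}

print(summarize(example))
```

If the actual records use a different passage layout (for example, a list of per-passage dicts), only `selected_passages` and the passage count would need to change.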
Provider:
yuan19