
PaDaS-Lab/webfaq-en-test

Source: Hugging Face | Updated: 2024-11-21 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/PaDaS-Lab/webfaq-en-test
Description:
This is a monolingual (English) dataset for text retrieval tasks, derived from the MS MARCO dataset. It includes three configurations: default, corpus, and queries, each with 52,160 examples.

The default configuration serves as the development set for evaluating retrieval performance and has the features query-id, corpus-id, and score. The corpus configuration holds the documents that make up the retrieval collection, with the features _id, title, and text. The queries configuration holds the query set, with the features _id and text. Each configuration has corresponding data files and splits, such as dev, corpus, and queries.
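The relationship between the three configurations can be illustrated with a minimal Python sketch. It assumes each configuration is available as a list of rows keyed by the feature names given above (query-id, corpus-id, score, _id, title, text). The helper names `build_qrels` and `resolve_pairs` and the toy rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def build_qrels(judgments):
    """Group default-config rows into {query_id: {corpus_id: score}}."""
    qrels = defaultdict(dict)
    for row in judgments:
        qrels[row["query-id"]][row["corpus-id"]] = row["score"]
    return dict(qrels)


def resolve_pairs(judgments, queries, corpus):
    """Join each judgment with its query text and document by ID."""
    query_text = {q["_id"]: q["text"] for q in queries}
    docs = {d["_id"]: d for d in corpus}
    pairs = []
    for row in judgments:
        doc = docs[row["corpus-id"]]
        pairs.append(
            (query_text[row["query-id"]], doc["title"], doc["text"], row["score"])
        )
    return pairs


# Toy rows invented for illustration only (assumption: real rows share this shape).
toy_queries = [{"_id": "q1", "text": "How do I reset my password?"}]
toy_corpus = [
    {"_id": "d1", "title": "", "text": "Use the reset link on the login page."}
]
toy_judgments = [{"query-id": "q1", "corpus-id": "d1", "score": 1}]

qrels = build_qrels(toy_judgments)
pairs = resolve_pairs(toy_judgments, toy_queries, toy_corpus)
```

With the Hugging Face `datasets` library, rows of this shape could presumably be obtained by calling `load_dataset("PaDaS-Lab/webfaq-en-test", "corpus")` and likewise for the other configuration names. The exact split names should be checked against the dataset card before relying on them.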
Provider:
PaDaS-Lab