Re-DocRED

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/tonytan48/re-docred
Overview:
Re-DocRED is a large-scale, multi-label, general-domain dataset for relation extraction. It re-annotates the DocRED dataset to resolve the incomplete-annotation problem in the original. Relation mentions are classified into 97 types, including one 'na' type. The training set used for fine-tuning contains 3,053 documents, and the pre-training set contains 101,873 documents.
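To illustrate what "multi-label" means in practice, here is a minimal Python sketch that summarizes documents in a DocRED-style layout. It assumes each document is a dict whose `labels` list holds entries with head index `h`, tail index `t`, and relation ID `r`. That schema, the `summarize` helper, and the toy document are assumptions for illustration, not taken from this page, so check them against the files in the repository before relying on them.

```python
from collections import Counter, defaultdict


def summarize(docs):
    """Summarize DocRED-style documents (assumed schema: doc["labels"] is a
    list of dicts with keys "h", "t", "r")."""
    relation_counts = Counter()
    multi_label_pairs = 0
    for doc in docs:
        # Collect the set of relations annotated for each (head, tail) pair.
        pair_relations = defaultdict(set)
        for label in doc.get("labels", []):
            relation_counts[label["r"]] += 1
            pair_relations[(label["h"], label["t"])].add(label["r"])
        # A pair is multi-label when it carries more than one relation.
        multi_label_pairs += sum(
            1 for rels in pair_relations.values() if len(rels) > 1
        )
    return {
        "documents": len(docs),
        "relation_types": len(relation_counts),
        "relation_instances": sum(relation_counts.values()),
        "multi_label_pairs": multi_label_pairs,
    }


# Hypothetical toy document; entity names and relation IDs are made up.
toy_docs = [
    {
        "title": "Toy document",
        "vertexSet": [
            [{"name": "Entity A"}],
            [{"name": "Entity B"}],
            [{"name": "Entity C"}],
        ],
        "labels": [
            {"h": 0, "t": 1, "r": "R1"},
            {"h": 0, "t": 1, "r": "R2"},  # same pair, second relation
            {"h": 1, "t": 2, "r": "R3"},
        ],
    }
]

summary = summarize(toy_docs)
print(summary)
```

If the schema assumption holds, the same function could be applied to a split loaded with `json.load` to check the document count against the figures above.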