DFKI-SLT/SemEval2018_Task7
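Dataset Overview
Basic Dataset Information
- Dataset name: SemEval2018Task7
- Dataset description: This dataset supports semantic relation extraction and classification in scientific papers.
- Language: English
- Dataset size: 1K<n<10K
- Task category: Text classification
- Task ID: entity-linking-classification
- Tags: relation classification, relation extraction, scientific papers, research papers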
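Dataset Structure
Data Instances
Subtask_1.1
- Data file size: 714 KB
- Data fields:
  - id: instance ID, string
  - title: title, string
  - abstract: abstract, string
  - entities: list of entities, each with an entity ID and start and end character positions
  - relations: list of relations, each with a relation label, entity IDs, and whether the relation is reversible
Subtask_1.2
- Data file size: 1.00 MB
- Data fields:
  - id: instance ID, string
  - title: title, string
  - abstract: abstract, string
  - entities: list of entities, each with an entity ID and start and end character positions
  - relations: list of relations, each with a relation label, entity IDs, and whether the relation is reversible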
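The field layout above can be illustrated with a minimal Python sketch that resolves entity offsets into text and turns each relation into a (head, label, tail) triple. The record below is invented for illustration and is not taken from the dataset. The key names `char_start`, `char_end`, `arg1`, `arg2`, and `reverse`, the end-exclusive offsets relative to the abstract, and the reading of the flag as "swap the argument order" are all assumptions, so check them against the actual data before relying on this.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical record: values and key names are assumptions for illustration only.
example: Dict[str, Any] = {
    "id": "X00-0000",
    "title": "A toy title",
    "abstract": "We apply a parser to a treebank.",
    "entities": [
        {"id": "X00-0000.1", "char_start": 11, "char_end": 17},
        {"id": "X00-0000.2", "char_start": 23, "char_end": 31},
    ],
    "relations": [
        {"label": "USAGE", "arg1": "X00-0000.1", "arg2": "X00-0000.2", "reverse": False},
    ],
}


def relation_triples(record: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (head text, label, tail text) for every relation in a record."""
    abstract = record["abstract"]
    # Map each entity ID to its surface text (assumes end-exclusive offsets
    # measured from the start of the abstract).
    spans = {
        ent["id"]: abstract[ent["char_start"]:ent["char_end"]]
        for ent in record["entities"]
    }
    triples = []
    for rel in record["relations"]:
        head, tail = spans[rel["arg1"]], spans[rel["arg2"]]
        # Assumption: a true flag means the relation holds from arg2 to arg1.
        if rel["reverse"]:
            head, tail = tail, head
        triples.append((head, rel["label"], tail))
    return triples


if __name__ == "__main__":
    for triple in relation_triples(example):
        print(triple)
```

Keeping the offsets and resolving them on demand, rather than storing the entity strings, makes it easy to re-tokenize the abstract later without losing the alignment.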
Data Splits
| Task | Type | Train | Test |
|---|---|---|---|
| Subtask_1.1 | Text | 2807 | 3326 |
| | Relations | 1228 | 1248 |
| Subtask_1.2 | Text | 1196 | 1193 |
| | Relations | 335 | 355 |
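Dataset Creation
Source Data
Initial Data Collection and Normalization
- More information needed
Source Language Producers
- More information needed
Annotations
Annotation Process
- More information needed
Annotators
- More information needed
Personal and Sensitive Information
- More information needed
Considerations for Using the Data
Social Impact of the Dataset
- More information needed
Discussion of Biases
- More information needed
Other Known Limitations
- More information needed
Additional Information
Dataset Curators
- More information needed
Licensing Information
- More information needed
Citation Information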
@inproceedings{gabor-etal-2018-semeval,
    title = "{S}em{E}val-2018 Task 7: Semantic Relation Extraction and Classification in Scientific Papers",
    author = {G{\'a}bor, Kata and Buscaldi, Davide and Schumann, Anne-Kathrin and QasemiZadeh, Behrang and Zargayouna, Ha{\"i}fa and Charnois, Thierry},
    booktitle = "Proceedings of the 12th International Workshop on Semantic Evaluation",
    month = jun,
    year = "2018",
    address = "New Orleans, Louisiana",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/S18-1111",
    doi = "10.18653/v1/S18-1111",
    pages = "679--688",
    abstract = "This paper describes the first task on semantic relation extraction and classification in scientific paper abstracts at SemEval 2018. The challenge focuses on domain-specific semantic relations and includes three different subtasks. The subtasks were designed so as to compare and quantify the effect of different pre-processing steps on the relation classification results. We expect the task to be relevant for a broad range of researchers working on extracting specialized knowledge from domain corpora, for example but not limited to scientific or bio-medical information extraction. The task attracted a total of 32 participants, with 158 submissions across different scenarios.",
}
Contributions
- More information needed




