
MiRoR11 - P2 - Annotated corpus for primary and reported outcomes extraction

NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://zenodo.org/record/3234810
Description:
Annotated corpus for outcome extraction

This folder contains 2 subfolders:

1. Primary_outcomes - a corpus annotated for primary outcomes

The folder contains the following files:
  po_sent_marked_p1_1000.txt - sentences 1 - 1000 of the annotated corpus, ConstruKT format; coordinated outcomes annotated as a single entity
  po_sent_marked_p2_1000.txt - sentences 1001 - 2000 of the annotated corpus, ConstruKT format; coordinated outcomes annotated as a single entity
  po_sent_marked_col_p1.txt - sentences 1 - 1000 of the annotated corpus, tabulated format; coordinated outcomes annotated as a single entity
  po_sent_marked_col_p2.txt - sentences 1001 - 2000 of the annotated corpus, tabulated format; coordinated outcomes annotated as a single entity
  po_sent_marked_col_p1_coord.txt - sentences 1 - 1000 of the annotated corpus, tabulated format; coordinated outcomes annotated as separate entities
  po_sent_marked_col_p2_coord.txt - sentences 1001 - 2000 of the annotated corpus, tabulated format; coordinated outcomes annotated as separate entities

Subfolders:
  po - the corpus for 10-fold cross-validation (10 subfolders with train/dev/test sets); coordinated outcomes annotated as separate entities
  po_coord - the corpus for 10-fold cross-validation (10 subfolders with train/dev/test sets); coordinated outcomes annotated as separate entities

2. Reported_outcomes - a corpus annotated for reported outcomes

The corpus contains sentences from the Results and Conclusions sections of the articles for which primary outcomes were annotated. The first part of the reported outcomes corpus contains sentences from the articles behind the first half of the primary outcomes corpus (sentences 1 - 1000). The second part contains sentences from the articles behind the second half of the primary outcomes corpus (sentences 1001 - 2000).

The folder contains the following files:
  res_sent_marked_p1.txt - first part of the annotated corpus, ConstruKT format
  res_sent_marked_p2.txt - second part of the annotated corpus, ConstruKT format
  res_sent_marked_p1_col.txt - first part of the annotated corpus, tabulated format
  res_sent_marked_p2_col.txt - second part of the annotated corpus, tabulated format

Subfolders:
  rep - the corpus for 10-fold cross-validation (10 subfolders with train/dev/test sets)
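The cross-validation subfolders (po, po_coord, rep) each hold 10 fold subfolders with train/dev/test sets. A minimal sketch of how one might enumerate them in Python is given here. The listing does not state how the fold subfolders or the files inside them are named, so the function name collect_folds and the rule "a file belongs to a split if its name contains train, dev, or test" are assumptions made for illustration, not part of the dataset's documentation.

```python
from pathlib import Path

# Split names taken from the description ("train/dev/test sets").
SPLITS = ("train", "dev", "test")


def collect_folds(corpus_dir):
    """Map each fold subfolder name to {split name: [file paths]}.

    Assumption (not confirmed by the dataset description): a file
    belongs to a split when its lowercased name contains the split name.
    """
    folds = {}
    fold_dirs = sorted(p for p in Path(corpus_dir).iterdir() if p.is_dir())
    for fold_dir in fold_dirs:
        splits = {name: [] for name in SPLITS}
        for path in sorted(fold_dir.iterdir()):
            if not path.is_file():
                continue
            for name in SPLITS:
                if name in path.name.lower():
                    splits[name].append(path)
                    break  # assign each file to at most one split
        folds[fold_dir.name] = splits
    return folds
```

Calling collect_folds("Primary_outcomes/po") on an unpacked copy of the archive should then return one entry per fold subfolder. If the actual file names do not contain the split names, the matching rule would need to be adapted after inspecting the folder.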
Created:
2020-02-17