
BioEsCorpus

Source: NIAID Data Ecosystem, indexed 2026-03-13
Download link:
https://zenodo.org/record/6699942
Description:
This folder contains the files and resources obtained while annotating 18 Spanish clinical reports from the Spanish Clinical Case Corpus (SPACCC) (https://doi.org/10.5281/zenodo.2560316) with biomedical entities and semantic relations.

Three annotators were asked to identify eleven types of entities: Gen, Proteína, Glúcido, Lípido, Enfermedad, Síntoma, Signo, Medicamento, Alias, Abreviatura and Sigla. They were also asked to identify eight semantic relations: "Implicado en", "Activa", "Inhibe", "Interacciona con", "Previene", "Alivia", "Cura" and "Refiere a". In total, 324 entities were identified, covering ten of the eleven entity types, and 170 relations, covering five of the eight relation types.

Content:
- brat_annotations: contains three folders, one per annotator. Each holds that annotator's eighteen annotations in brat format.
- Clinical_Reports_SPACCC: contains the 18 original Spanish clinical reports (.txt) from SPACCC.
- Pub_Annotations: contains three folders, one per annotator. Each holds eighteen JSON files with the annotations in PubAnnotation format, which is the original output from TextAE.
- Annotation_guideline_Tool_Usage_Guide.pdf: a PDF file containing, first, a guide in Spanish on how to annotate with TextAE and, second, the annotation guideline, also in Spanish, that told the annotators how to proceed with the annotations.

The scripts used to produce this data can be found in the GitHub repository: https://github.com/LuciaSG99/BioEsCorpus.git

These resources are freely distributed under a Creative Commons Attribution 4.0 International License. The author of this project is Lucía Sánchez González, and it has been supervised by Carlos Badenes Olmedo and María Poveda Villalón.
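For readers who want to inspect the brat_annotations files, the following is a minimal sketch of reading brat standoff annotations in Python. It assumes the files follow the usual brat conventions (tab-separated lines, `T` identifiers for entities, `R` identifiers for relations). The function name `parse_brat` and the sample text are hypothetical illustrations, not taken from the corpus or from the project's own scripts.

```python
from collections import Counter


def parse_brat(ann_text):
    """Split brat standoff text into entity and relation records.

    Hypothetical helper: only T (entity) and R (relation) lines are read;
    any other annotation kinds are skipped.
    """
    entities, relations = {}, []
    for line in ann_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T") and len(fields) >= 3:
            # Second field is "Type start end"; discontinuous spans use ';'.
            etype, span = fields[1].split(" ", 1)
            entities[ann_id] = {"type": etype, "span": span, "text": fields[2]}
        elif ann_id.startswith("R") and len(fields) >= 2:
            # Second field is "Type Arg1:Tx Arg2:Ty".
            rtype, arg1, arg2 = fields[1].split()[:3]
            relations.append({
                "id": ann_id,
                "type": rtype,
                "arg1": arg1.split(":", 1)[1],
                "arg2": arg2.split(":", 1)[1],
            })
    return entities, relations


# Invented sample in brat standoff layout (NOT an excerpt from the corpus).
SAMPLE = (
    "T1\tMedicamento 0 11\tparacetamol\n"
    "T2\tSíntoma 20 26\tfiebre\n"
    "R1\tAlivia Arg1:T1 Arg2:T2\n"
)

entities, relations = parse_brat(SAMPLE)
# Count entities per type, e.g. to compare annotators.
type_counts = Counter(e["type"] for e in entities.values())
```

Running the same function over each annotator's folder and comparing the resulting `type_counts` would be one simple way to check counts such as those reported in the description. How the corpus spells multi-word relation names such as "Implicado en" inside the files is not stated here, so check the actual .ann files before relying on specific type strings.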
Created:
2022-06-23