fmmolina/eHealth-KD-Adaptation
Source: Hugging Face. Updated 2022-04-11, indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/fmmolina/eHealth-KD-Adaptation
Resource description:
---
license: afl-3.0
---
## Description
An adaptation of the [eHealth-KD Challenge 2020 dataset](https://knowledge-learning.github.io/ehealthkd-2020/), filtered for the NER task only. The following adaptations of the original dataset have been made:
- BIO annotations
- Error fixes
- Overlapping entities have been processed as a single entity
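
The adaptations listed above can be illustrated with a minimal sketch of how overlapping entity spans might be collapsed into one entity and then written out as BIO tags. This is not the author's conversion script: the token-index span format, the example sentence, the helper names `merge_overlapping` and `to_bio`, and the rule that a merged entity keeps the label of its earliest span are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def merge_overlapping(spans: List[Span]) -> List[Span]:
    """Merge spans that share at least one token into a single span.

    Assumption: the merged span keeps the label of the earliest span.
    """
    merged: List[Span] = []
    for start, end, label in sorted(spans, key=lambda s: (s[0], s[1])):
        if merged and start < merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def to_bio(num_tokens: int, spans: List[Span]) -> List[str]:
    """Return one BIO tag per token; tokens outside every span get 'O'."""
    tags = ["O"] * num_tokens
    for start, end, label in merge_overlapping(spans):
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Illustrative sentence and annotations (not taken from the dataset files).
tokens = ["El", "asma", "afecta", "las", "vías", "respiratorias"]
spans = [(1, 2, "Concept"), (2, 3, "Action"), (4, 6, "Concept"), (5, 6, "Concept")]
bio_tags = to_bio(len(tokens), spans)

for token, tag in zip(tokens, bio_tags):
    print(f"{token}\t{tag}")
```

Here the two overlapping `Concept` spans over "vías respiratorias" end up as a single `B-Concept`/`I-Concept` sequence, while adjacent but non-overlapping spans stay separate.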
## Dataset loading
    from datasets import load_dataset
    datasets = load_dataset('json', data_files={'train': ['@YOUR_PATH@/training_anns_bio.json'], 'testing': ['@YOUR_PATH@/testing_anns_bio.json'], 'validation': ['@YOUR_PATH@/development_anns_bio.json']})
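
As a hedged variant of the loading call above, the sketch below builds the `data_files` mapping from a single base directory and checks that the three JSON files exist before calling `load_dataset`. The helper names `build_data_files` and `load_splits` and the `"data"` directory are hypothetical; only the file names and split keys come from this card. The `datasets` import is deferred so that the path helper works without the library installed.

```python
from pathlib import Path
from typing import Dict, List

# File names and split keys as given in this dataset card.
SPLIT_FILES = {
    "train": "training_anns_bio.json",
    "testing": "testing_anns_bio.json",
    "validation": "development_anns_bio.json",
}


def build_data_files(base_dir: str) -> Dict[str, List[str]]:
    """Map each split name to a one-element list with its JSON file path."""
    base = Path(base_dir)
    return {split: [str(base / name)] for split, name in SPLIT_FILES.items()}


def load_splits(base_dir: str):
    """Load all three splits with the Hugging Face `datasets` JSON loader."""
    from datasets import load_dataset  # third-party: pip install datasets

    data_files = build_data_files(base_dir)
    missing = [p for paths in data_files.values() for p in paths
               if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing dataset files: {missing}")
    return load_dataset("json", data_files=data_files)


# "data" is a placeholder for the directory that holds the downloaded files.
data_files = build_data_files("data")
print(data_files)
```

Calling `load_splits("data")` would then return a `DatasetDict` keyed by `train`, `testing`, and `validation`, provided the files are present in that directory.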
Provider:
fmmolina
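Summary of original information
Dataset overview
Dataset source and description
- This dataset is an adaptation of the eHealth-KD Challenge 2020 dataset, filtered and adjusted specifically for the named entity recognition (NER) task.
- The adjustments include:
  - Use of the BIO annotation format.
  - Fixes for errors in the original dataset.
  - Overlapping entities processed as a single entity.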
Dataset loading
- The dataset comprises training, testing, and validation sets, each stored in its own JSON file:
  - Training set: training_anns_bio.json
  - Testing set: testing_anns_bio.json
  - Validation set: development_anns_bio.json
- Example code for loading the dataset: `datasets = load_dataset('json', data_files={'train': ['@YOUR_PATH@/training_anns_bio.json'], 'testing': ['@YOUR_PATH@/testing_anns_bio.json'], 'validation': ['@YOUR_PATH@/development_anns_bio.json']})`



