JoelMba/MantraGSC_emea_CAS_annotations_combined_v2
Source: Hugging Face | Updated: 2026-04-27 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/JoelMba/MantraGSC_emea_CAS_annotations_combined_v2
Resource description:
MantraGSC_emea_CAS_annotations_combined_v2 is a Named Entity Recognition (NER) dataset for the medical and health domain, annotated specifically for disease and disorder entities. It has three splits: train (56 examples), validation (12 examples), and test (7 examples). Each example contains four fields: id, tokens (the token sequence), ner_tags (per-token annotations in BIO format with the classes B-Disorders, I-Disorders, and O), and ner_tag_labels (the corresponding label strings). The dataset is approximately 49.8 KB in total, with a download size of 18.3 KB. It is likely intended for training and evaluating entity recognition models on medical text, in support of information extraction applications in healthcare.
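To make the BIO scheme concrete, here is a minimal Python sketch that groups per-token labels into entity spans. The example record is invented for illustration and only mirrors the four fields described above; the integer ids in ner_tags are an assumed mapping, since the listing does not state the real id order. The helper name bio_to_spans is likewise hypothetical and not part of the dataset or any library.

```python
from typing import List, Tuple

# Hypothetical record shaped like the fields described above (not real data).
example = {
    "id": "0",
    "tokens": ["Patients", "with", "renal", "failure", "or", "hepatitis", "."],
    # Assumed ids: 0 = O, 1 = B-Disorders, 2 = I-Disorders (not confirmed by the listing).
    "ner_tags": [0, 0, 1, 2, 0, 1, 0],
    "ner_tag_labels": ["O", "O", "B-Disorders", "I-Disorders", "O", "B-Disorders", "O"],
}


def bio_to_spans(tokens: List[str], labels: List[str]) -> List[Tuple[int, int, str]]:
    """Group BIO labels into (start, end_exclusive, text) entity spans."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the currently open entity began, if any
    for i, label in enumerate(labels):
        if label.startswith("B-"):
            # A new entity begins; close the previous one first.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif label.startswith("I-") and start is not None:
            # Continuation of the open entity.
            pass
        else:
            # "O", or a stray "I-" with no open entity (ignored here).
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


spans = bio_to_spans(example["tokens"], example["ner_tag_labels"])
print(spans)  # [(2, 4, 'renal failure'), (5, 6, 'hepatitis')]
```

The sketch reads the string labels in ner_tag_labels rather than the integer ner_tags, so it does not depend on the assumed id order. Dropping a stray I- tag is one possible design choice; some evaluation tools instead treat it as the start of a new entity.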
Provider:
JoelMba



