JoelMba/MantraGSC_emea_CAS_annotations_combined_v2

Name: JoelMba/MantraGSC_emea_CAS_annotations_combined_v2
Creator: JoelMba
Published: 2026-04-27 22:21:50
License: not specified

Source: Hugging Face. Updated: 2026-04-27. Indexed: 2026-05-03.

Download link:

https://hf-mirror.com/datasets/JoelMba/MantraGSC_emea_CAS_annotations_combined_v2

Description:

MantraGSC_emea_CAS_annotations_combined_v2 is a Named Entity Recognition (NER) dataset for the medical and health domain, annotated for disease and disorder entities. It has three splits: train (56 examples), validation (12 examples), and test (7 examples), for 75 examples in total. Each example contains four fields: id, tokens (the token sequence), ner_tags (per-token annotations in BIO format with the classes B-Disorders, I-Disorders, and O), and ner_tag_labels (the corresponding label strings). The dataset is approximately 49.8 KB in total, with a download size of 18.3 KB. It is likely intended for training and evaluating entity recognition models on medical text, in support of information extraction in healthcare.
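To make the BIO scheme described above concrete, here is a minimal Python sketch that groups B-Disorders and I-Disorders tags back into entity spans. The function name `extract_disorder_spans` and the sample row are hypothetical and written only for illustration; the row is not taken from the dataset. Only the field names and label strings come from the description.

```python
from typing import List, Tuple

# To load the real data (not run here), one would typically use the
# Hugging Face `datasets` library:
#   from datasets import load_dataset
#   ds = load_dataset("JoelMba/MantraGSC_emea_CAS_annotations_combined_v2")


def extract_disorder_spans(
    tokens: List[str], labels: List[str]
) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, text) for each Disorders span.

    Hypothetical helper: assumes `labels` holds the strings
    "B-Disorders", "I-Disorders", or "O", one per token.
    """
    spans = []
    start = None  # index where the currently open span began
    for i, label in enumerate(labels):
        if label == "B-Disorders":
            # A new entity begins; close any span that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif label == "I-Disorders":
            # Continue the open span; tolerate an I- tag with no B- before it.
            if start is None:
                start = i
        else:
            # "O" ends any open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Invented sample row that mirrors the described fields.
example = {
    "id": "0",
    "tokens": ["Patients", "with", "renal", "failure", "or", "hypertension", "."],
    "ner_tag_labels": ["O", "O", "B-Disorders", "I-Disorders", "O", "B-Disorders", "O"],
}

spans = extract_disorder_spans(example["tokens"], example["ner_tag_labels"])
print(spans)  # [(2, 4, 'renal failure'), (5, 6, 'hypertension')]
```

The same function could be applied to a real row by passing its tokens and ner_tag_labels fields, assuming the labels are stored as plain strings as the description suggests.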

Provided by:

JoelMba
