Safidy/breastCancer

Name: Safidy/breastCancer
Creator: Safidy
Published: 2024-06-17 09:47:04
License: No description provided

Hugging Face, updated 2024-06-17, indexed 2024-06-12

Download link:

https://hf-mirror.com/datasets/Safidy/breastCancer

Resource overview:

This dataset provides four configurations for named entity recognition. Each configuration has three features: id, tokens, and ner_tags, where ner_tags labels each token with an entity category such as participants, intervention, control, or outcomes. The Default_split configurations are divided into train, valid, and test splits, while the Data-v1 configurations contain a single train split; the byte size and number of examples are listed for each split.

Provider:

Safidy

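Original information summary

Dataset overview

Configuration name: Data-v1

Features:
- id: string.
- tokens: sequence of strings.
- ner_tags: sequence of class labels, with the following label ids: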
  - 0: O
  - 1: B-participants
  - 2: I-participants
  - 3: B-intervention
  - 4: I-intervention
  - 5: B-control
  - 6: I-control
  - 7: B-outcomes
  - 8: I-outcomes
Splits:
- train: 6313125 bytes, 1011 examples.
Download size: 831609 bytes
Dataset size: 6313125 bytes
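
The label ids listed for Data-v1 follow a BIO scheme (B- opens an entity, I- continues it, O is outside any entity). As a minimal sketch of how the integer ner_tags could be turned back into entity spans, the helper below groups tagged tokens by type. The function name `decode_entities` and the sample sentence are hypothetical and are not taken from the dataset; treating a stray I- tag as the start of a new span is also an assumption.

```python
# Label ids as listed for the Data-v1 configuration.
ID2LABEL = {
    0: "O",
    1: "B-participants",
    2: "I-participants",
    3: "B-intervention",
    4: "I-intervention",
    5: "B-control",
    6: "I-control",
    7: "B-outcomes",
    8: "I-outcomes",
}


def decode_entities(tokens, ner_tags):
    """Group BIO-tagged tokens into (entity_type, text) spans (hypothetical helper)."""
    entities = []
    current_type, current_tokens = None, []
    for token, tag_id in zip(tokens, ner_tags):
        # "O" partitions into ("O", "", ""); "B-control" into ("B", "-", "control").
        prefix, _, entity_type = ID2LABEL[tag_id].partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # Start a new span; a stray I- tag is treated as a span start (assumption).
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I":
            # Continue the span that is currently open.
            current_tokens.append(token)
        else:
            # "O" closes any open span.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Made-up example sentence, not a record from the dataset.
sample_tokens = ["Women", "with", "breast", "cancer", "received", "tamoxifen", "or", "placebo"]
sample_tags = [1, 2, 2, 2, 0, 3, 0, 5]
print(decode_entities(sample_tokens, sample_tags))
```

Joining tokens with a single space is only for display; the original spacing of the source text is not recoverable from the tokens field alone.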

Configuration name: Data-v1_I-C

Features:
- id: string.
- tokens: sequence of strings.
- ner_tags: sequence of class labels, with the following label ids:
  - 0: O
  - 1: B-participants
  - 2: I-participants
  - 3: B-I-C
  - 4: I-I-C
  - 5: B-outcomes
  - 6: I-outcomes
Splits:
- train: 6313125 bytes, 1011 examples.
Download size: 831265 bytes
Dataset size: 6313125 bytes
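
The page does not say how the I-C labels were produced. One plausible reading, given that both Data-v1 configurations report the same train size and example count, is that the intervention and control categories of the nine-label scheme are merged into a single I-C category. The sketch below illustrates that assumed mapping only; the names `to_ic_label`, `V1_TO_IC`, and `convert_tags` are hypothetical.

```python
# Label lists in id order, as given for the two Data-v1 configurations.
V1_LABELS = [
    "O",
    "B-participants", "I-participants",
    "B-intervention", "I-intervention",
    "B-control", "I-control",
    "B-outcomes", "I-outcomes",
]
IC_LABELS = [
    "O",
    "B-participants", "I-participants",
    "B-I-C", "I-I-C",
    "B-outcomes", "I-outcomes",
]


def to_ic_label(label):
    """Map a nine-label tag to the seven-label scheme (assumed merge rule)."""
    if label == "O":
        return label
    # Split on the first hyphen only, so the "I-C" type name stays intact elsewhere.
    prefix, entity_type = label.split("-", 1)
    if entity_type in ("intervention", "control"):
        entity_type = "I-C"  # assumption: intervention and control are merged
    return f"{prefix}-{entity_type}"


# Lookup table from Data-v1 label ids to Data-v1_I-C label ids.
V1_TO_IC = [IC_LABELS.index(to_ic_label(label)) for label in V1_LABELS]


def convert_tags(ner_tags):
    """Convert a sequence of Data-v1 tag ids to Data-v1_I-C tag ids."""
    return [V1_TO_IC[tag] for tag in ner_tags]


print(V1_TO_IC)
```

Because the type name I-C itself contains a hyphen, code that parses these labels should split on the first hyphen only, as `label.split("-", 1)` does here.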

Configuration name: Default_split-v1

Features:
- id: string.
- tokens: sequence of strings.
- ner_tags: sequence of class labels, with the following label ids:
  - 0: O
  - 1: B-participants
  - 2: I-participants
  - 3: B-intervention
  - 4: I-intervention
  - 5: B-control
  - 6: I-control
  - 7: B-outcomes
  - 8: I-outcomes
Splits:
- train: 5058132 bytes, 808 examples.
- valid: 620600 bytes, 101 examples.
- test: 634162 bytes, 102 examples.
Download size: 857892 bytes
Dataset size: 6312894 bytes
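
The split figures for Default_split-v1 can be cross-checked with simple arithmetic: the three byte sizes should add up to the reported dataset size, and the example counts should add up to the 1011 examples of Data-v1. The short sketch below does that check and derives the split proportions; the variable names are illustrative.

```python
# (bytes, examples) per split, as reported for Default_split-v1.
SPLITS = {
    "train": (5058132, 808),
    "valid": (620600, 101),
    "test": (634162, 102),
}

total_bytes = sum(size for size, _ in SPLITS.values())
total_examples = sum(count for _, count in SPLITS.values())

# Share of examples in each split.
fractions = {name: count / total_examples for name, (_, count) in SPLITS.items()}

print(total_bytes, total_examples)
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The totals come to 6312894 bytes and 1011 examples, which matches the reported dataset size and corresponds to roughly an 80/10/10 split.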

Configuration name: Default_split-v1_I-C

Features:
- id: string.
- tokens: sequence of strings.
- ner_tags: sequence of class labels, with the following label ids:
  - 0: O
  - 1: B-participants
  - 2: I-participants
  - 3: B-I-C
  - 4: I-I-C
  - 5: B-outcomes
  - 6: I-outcomes
Splits:
- train: 5021698 bytes, 808 examples.
- valid: 640991 bytes, 101 examples.
- test: 650205 bytes, 102 examples.
Download size: 847026 bytes
Dataset size: 6312894 bytes
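
As a rough sketch of how one of the four configurations above might be loaded, the snippet below wraps `load_dataset` from the Hugging Face `datasets` library. It assumes that library is installed and that the repository is reachable under the id Safidy/breastCancer; the `CONFIG_SPLITS` table and the `load_split` helper are hypothetical conveniences built from the configuration and split names listed on this page.

```python
# Split names available in each configuration, as listed on this page.
CONFIG_SPLITS = {
    "Data-v1": ["train"],
    "Data-v1_I-C": ["train"],
    "Default_split-v1": ["train", "valid", "test"],
    "Default_split-v1_I-C": ["train", "valid", "test"],
}


def load_split(config, split="train"):
    """Load one split of one configuration (hypothetical helper)."""
    if split not in CONFIG_SPLITS.get(config, []):
        raise ValueError(f"unknown config/split combination: {config!r}/{split!r}")
    # Imported lazily so the table above is usable without the library installed.
    from datasets import load_dataset

    return load_dataset("Safidy/breastCancer", config, split=split)


if __name__ == "__main__":
    # Requires network access and the `datasets` package.
    validation = load_split("Default_split-v1", "valid")
    print(validation.features)
    print(validation[0]["tokens"][:10])
```

Checking the split name before calling the library gives an early, readable error for combinations such as Data-v1 with test, which this page does not list.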
