siddharthtumre/Revised-JNLPBA
Source: Hugging Face · Updated: 2023-04-12 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/siddharthtumre/Revised-JNLPBA
Resource description:
---
annotations_creators:
- expert-generated
language_creators:
- expert-generated
language:
- en
license:
- unknown
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
task_categories:
- token-classification
task_ids:
- named-entity-recognition
pretty_name: IASL-BNER Revised JNLPBA
dataset_info:
  features:
  - name: id
    dtype: string
  - name: tokens
    sequence: string
  - name: ner_tags
    sequence:
      class_label:
        names:
          '0': O
          '1': B-DNA
          '2': I-DNA
          '3': B-RNA
          '4': I-RNA
          '5': B-cell_line
          '6': I-cell_line
          '7': B-cell_type
          '8': I-cell_type
          '9': B-protein
          '10': I-protein
  config_name: revised-jnlpba
---
# Dataset Card for IASL-BNER Revised JNLPBA
## Dataset Description
- **Homepage:**
- **Repository:**
- **Paper:**
- **Leaderboard:**
- **Point of Contact:**
### Dataset Summary
This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).
### Supported Tasks and Leaderboards
[More Information Needed]
### Languages
[More Information Needed]
## Dataset Structure
### Data Instances
[More Information Needed]
### Data Fields
[More Information Needed]
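The card's front matter declares three fields (`id`, `tokens`, `ner_tags`) and an 11-label BIO tag set. The following is a minimal sketch, assuming each record follows that declared shape, of how the integer tags could be decoded into entity spans. The helper `decode_entities` and the sample record are hypothetical illustrations and are not taken from the dataset itself.

```python
# Label names in the order declared in the card's front matter (ids 0-10).
LABEL_NAMES = [
    "O",
    "B-DNA", "I-DNA",
    "B-RNA", "I-RNA",
    "B-cell_line", "I-cell_line",
    "B-cell_type", "I-cell_type",
    "B-protein", "I-protein",
]


def decode_entities(tokens, ner_tags):
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag_id in zip(tokens, ner_tags):
        label = LABEL_NAMES[tag_id]
        if label == "O":
            flush()
            continue
        prefix, entity_type = label.split("-", 1)
        # A B- tag, or an I- tag that does not continue the open entity,
        # starts a new span.
        if prefix == "B" or entity_type != current_type:
            flush()
            current_type = entity_type
        current_tokens.append(token)
    flush()
    return entities


# Hypothetical record shaped like the declared features; not real dataset content.
example = {
    "id": "0",
    "tokens": ["IL-2", "gene", "expression", "in", "T", "cells"],
    "ner_tags": [1, 2, 0, 0, 7, 8],
}
print(decode_entities(example["tokens"], example["ner_tags"]))
# → [('DNA', 'IL-2 gene'), ('cell_type', 'T cells')]
```

Treating a stray `I-` tag as the start of a new span is a lenient design choice made for this sketch; a stricter decoder could instead raise an error on malformed tag sequences.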
### Data Splits
[More Information Needed]
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
[More Information Needed]
### Citation Information
[More Information Needed]
### Contributions
[More Information Needed]
Provider:
siddharthtumre
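Summary of original information
Dataset overview
Basic information
- Dataset name: IASL-BNER Revised JNLPBA
- Language: English (en)
- License: unknown
- Multilinguality: monolingual
- Size: 10K<n<100K
- Task category: token classification
- Specific task: named entity recognition
Dataset structure
- Features:
  - id: string
  - tokens: sequence of strings
  - ner_tags: sequence of class labels, with the following classes:
    - 0: O
    - 1: B-DNA
    - 2: I-DNA
    - 3: B-RNA
    - 4: I-RNA
    - 5: B-cell_line
    - 6: I-cell_line
    - 7: B-cell_type
    - 8: I-cell_type
    - 9: B-protein
    - 10: I-protein
Dataset creation
- Annotation creators: expert-generated
- Language creators: expert-generated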
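Collected summary
Dataset introduction

Background and challenges
Background overview
Revised-JNLPBA is an English-language text dataset for named entity recognition (NER) in the biomedical domain. It contains about 22,400 instances, split into a training set and a test set. The data is provided in JSON format and annotates entities such as genes, proteins, and cell types, which makes it suitable for sequence labeling tasks in natural language processing.
The content above was collected and summarized by 遇见数据集.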



