iSemantics/conllpp-ner-ar
Hugging Face | Updated 2024-04-28 | Indexed 2025-04-26
Download link:
https://hf-mirror.com/datasets/iSemantics/conllpp-ner-ar
Resource description:
---
dataset_info:
  features:
  - name: tokens
    sequence: string
  - name: ner_tags
    sequence:
      class_label:
        names:
          '0': O
          '1': B-PER
          '2': I-PER
          '3': B-ORG
          '4': I-ORG
          '5': B-LOC
          '6': I-LOC
          '7': B-MISC
          '8': I-MISC
  splits:
  - name: train
    num_bytes: 2780353
    num_examples: 10250
  - name: validation
    num_bytes: 698574
    num_examples: 2383
  - name: test
    num_bytes: 641032
    num_examples: 2572
  download_size: 1089320
  dataset_size: 4119959
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
license: mit
task_categories:
- token-classification
language:
- ar
size_categories:
- 1K<n<10K
---
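The `ner_tags` feature stores integer ids that map to BIO labels in the order listed under `class_label`. The following is a minimal sketch of decoding those ids and grouping them into entity spans. The helper names `decode_tags` and `extract_spans` are hypothetical, the example sentence is invented for illustration and is not taken from the dataset, and treating a stray `I-` tag as the start of a new span is a lenient assumption rather than anything the card specifies.

```python
from typing import List, Tuple

# Label names in id order, copied from the class_label block of the card.
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


def decode_tags(tag_ids: List[int]) -> List[str]:
    """Map integer tag ids to their BIO label strings."""
    return [LABELS[i] for i in tag_ids]


def extract_spans(tag_ids: List[int]) -> List[Tuple[str, int, int]]:
    """Group BIO tags into (entity_type, start, end) spans; end is exclusive."""
    spans = []
    current_type = None
    start = 0
    for i, label in enumerate(decode_tags(tag_ids)):
        # A span continues only on an I- tag of the same entity type.
        continues = label.startswith("I-") and label[2:] == current_type
        if current_type is not None and not continues:
            spans.append((current_type, start, i))
            current_type = None
        if label.startswith("B-"):
            current_type, start = label[2:], i
        elif label.startswith("I-") and current_type is None:
            # Assumption: a stray I- tag opens a new span instead of being dropped.
            current_type, start = label[2:], i
    if current_type is not None:
        spans.append((current_type, start, len(tag_ids)))
    return spans


# Invented example sentence (not from the dataset): "Mohamed Salah visited Cairo".
tokens = ["زار", "محمد", "صلاح", "القاهرة"]
ner_tags = [0, 1, 2, 5]

for entity_type, begin, end in extract_spans(ner_tags):
    print(entity_type, " ".join(tokens[begin:end]))
```

Spans use an exclusive end index so that `tokens[begin:end]` slices out the entity text directly.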
dataset_info (dataset information):
  features:
  - name: tokens
    sequence: string (a sequence of strings)
  - name: ner_tags (named-entity recognition tags)
    sequence:
      class_label:
        names:
          '0': O (not part of an entity)
          '1': B-PER (beginning of a person entity)
          '2': I-PER (inside a person entity)
          '3': B-ORG (beginning of an organization entity)
          '4': I-ORG (inside an organization entity)
          '5': B-LOC (beginning of a location entity)
          '6': I-LOC (inside a location entity)
          '7': B-MISC (beginning of a miscellaneous entity)
          '8': I-MISC (inside a miscellaneous entity)
  splits:
  - name: train (training set)
    num_bytes (size in bytes): 2780353
    num_examples (number of examples): 10250
  - name: validation (validation set)
    num_bytes (size in bytes): 698574
    num_examples (number of examples): 2383
  - name: test (test set)
    num_bytes (size in bytes): 641032
    num_examples (number of examples): 2572
  download_size (download size in bytes): 1089320
  dataset_size (total dataset size in bytes): 4119959
configs (configurations):
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
license: MIT License
task_categories:
- token-classification (Token Classification)
language:
- Arabic (ar)
size_categories (sample-size category):
- 1K<n<10K
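The split figures in the metadata can be cross-checked with a little arithmetic. The sketch below hard-codes the numbers from the card and computes totals and per-split proportions. The function names are hypothetical, and `check_against_hub` is an optional helper that assumes the Hugging Face `datasets` library is installed and the Hub is reachable; it is defined but not called here.

```python
REPO_ID = "iSemantics/conllpp-ner-ar"

# Figures copied from the dataset card metadata.
SPLITS = {
    "train": {"num_bytes": 2780353, "num_examples": 10250},
    "validation": {"num_bytes": 698574, "num_examples": 2383},
    "test": {"num_bytes": 641032, "num_examples": 2572},
}
DATASET_SIZE = 4119959


def total_examples() -> int:
    """Sum of examples across all declared splits."""
    return sum(meta["num_examples"] for meta in SPLITS.values())


def total_bytes() -> int:
    """Sum of per-split byte sizes; expected to match dataset_size."""
    return sum(meta["num_bytes"] for meta in SPLITS.values())


def split_fractions() -> dict:
    """Share of all examples that falls in each split."""
    total = total_examples()
    return {name: meta["num_examples"] / total for name, meta in SPLITS.items()}


def check_against_hub() -> dict:
    """Compare declared counts with the hosted data (needs network access)."""
    from datasets import load_dataset  # imported lazily; optional dependency

    ds = load_dataset(REPO_ID)
    return {name: ds[name].num_rows == meta["num_examples"]
            for name, meta in SPLITS.items()}


print("examples:", total_examples())
print("bytes match dataset_size:", total_bytes() == DATASET_SIZE)
for name, share in split_fractions().items():
    print(f"{name}: {share:.1%}")
```

The per-split byte counts add up to the declared `dataset_size` of 4119959. The three splits total 15205 examples, which sits above the declared `1K<n<10K` bucket, so that tag may be worth double-checking against the hosted data.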
Provider:
iSemantics



