davanstrien/unsilence_voc
Source: Hugging Face. Updated 2023-11-16; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/davanstrien/unsilence_voc
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: tokens
    sequence: string
  - name: NE-MAIN
    sequence:
      class_label:
        names:
          '0': B-Organization
          '1': B-Organization,B-Place
          '2': B-Organization,I-Person
          '3': B-Organization,I-Place
          '4': B-Person
          '5': B-Person,B-Place
          '6': B-Person,I-Place
          '7': B-Place
          '8': I-Organization
          '9': I-Organization,B-Place
          '10': I-Organization,I-Person
          '11': I-Organization,I-Person,B-Place
          '12': I-Organization,I-Person,I-Place
          '13': I-Organization,I-Place
          '14': I-Person
          '15': I-Person,B-Place
          '16': I-Person,I-Place
          '17': I-Place
          '18': O
  - name: NE-PER-NAME
    sequence:
      class_label:
        names:
          '0': I-ProperName
          '1': O
          '2': B-ProperName
          '3': ''
  - name: NE-PER-GENDER
    sequence:
      class_label:
        names:
          '0': B-Group
          '1': B-Man
          '2': B-Man,B-Unspecified
          '3': B-Man,I-Woman
          '4': B-Unspecified
          '5': B-Unspecified,I-Woman
          '6': B-Woman
          '7': I-Group
          '8': I-Man
          '9': I-Man,I-Unspecified
          '10': I-Man,I-Woman
          '11': I-Unspecified
          '12': I-Unspecified,I-Woman
          '13': I-Woman
          '14': NE-PER-GENDER
          '15': O
  - name: NE-PER-LEGAL-STATUS
    sequence:
      class_label:
        names:
          '0': B-Enslaved
          '1': B-Freed
          '2': B-Unspecified
          '3': I-Enslaved
          '4': I-Freed
          '5': I-Unspecified
          '6': NE-PER-LEGAL-STATUS
          '7': O
  - name: NE-PER-ROLE
    sequence:
      class_label:
        names:
          '0': B-Acting_Notary
          '1': B-Beneficiary
          '2': B-Notary
          '3': B-Other
          '4': B-Testator
          '5': B-Testator_Beneficiary
          '6': B-Witness
          '7': I-Acting_Notary
          '8': I-Beneficiary
          '9': I-Beneficiary,B-Other
          '10': I-Beneficiary,I-Other
          '11': I-Notary
          '12': I-Other
          '13': I-Testator
          '14': I-Testator_Beneficiary
          '15': I-Witness
          '16': NE-PER-ROLE
          '17': O
  - name: NE-ORG-BENEFICIARY
    sequence:
      class_label:
        names:
          '0': B-No
          '1': B-Yes
          '2': I-No
          '3': I-Yes
          '4': NE-ORG-BENEFICIARY
          '5': O
  - name: MISC
    dtype: string
  - name: document_id
    dtype: string
  splits:
  - name: train
    num_bytes: 31436367
    num_examples: 2199
  download_size: 2148172
  dataset_size: 31436367
---
# Dataset Card for "unsilence_voc"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
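The metadata above declares a single `default` configuration with one `train` split and stores each tag column as a sequence of integer class ids. A minimal sketch of loading that split and turning ids back into label strings follows. It assumes the Hugging Face `datasets` library is installed and that the dataset is reachable under the repository id given in the title. The helper names `load_train_split`, `ids_to_names`, and `preview_first_example` are illustrative and not part of the dataset.

```python
from typing import List, Sequence


def load_train_split():
    """Load the single `train` split of the `default` configuration.

    Assumes `pip install datasets` and network access. The import is kept
    local so the rest of this sketch works without the library.
    """
    from datasets import load_dataset

    return load_dataset("davanstrien/unsilence_voc", split="train")


def ids_to_names(ids: Sequence[int], names: Sequence[str]) -> List[str]:
    """Map integer class ids to their label strings by list position."""
    return [names[i] for i in ids]


# Label names for NE-PER-LEGAL-STATUS, copied from the metadata above.
LEGAL_STATUS_NAMES = [
    "B-Enslaved", "B-Freed", "B-Unspecified",
    "I-Enslaved", "I-Freed", "I-Unspecified",
    "NE-PER-LEGAL-STATUS", "O",
]


def preview_first_example():
    """Pair the tokens of the first example with decoded legal-status labels."""
    ds = load_train_split()
    # Sequence-of-ClassLabel columns expose their label names on `.feature`.
    names = ds.features["NE-PER-LEGAL-STATUS"].feature.names
    first = ds[0]
    return list(zip(first["tokens"], ids_to_names(first["NE-PER-LEGAL-STATUS"], names)))
```

Calling `preview_first_example()` downloads the data, so it is left uncalled here. `ids_to_names` works on any of the tag columns once it is given the matching list of names.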
Provider:
davanstrien
Summary of original information
Dataset overview
Configuration
- Default configuration (default)
  - Data file path: data/train-*
  - Split: train
Features
- tokens: sequence of strings
- NE-MAIN: sequence of main named-entity labels
  - Class label names:
    - 0: B-Organization
    - 1: B-Organization,B-Place
    - 2: B-Organization,I-Person
    - 3: B-Organization,I-Place
    - 4: B-Person
    - 5: B-Person,B-Place
    - 6: B-Person,I-Place
    - 7: B-Place
    - 8: I-Organization
    - 9: I-Organization,B-Place
    - 10: I-Organization,I-Person
    - 11: I-Organization,I-Person,B-Place
    - 12: I-Organization,I-Person,I-Place
    - 13: I-Organization,I-Place
    - 14: I-Person
    - 15: I-Person,B-Place
    - 16: I-Person,I-Place
    - 17: I-Place
    - 18: O
- NE-PER-NAME: sequence of person-name entity labels
  - Class label names:
    - 0: I-ProperName
    - 1: O
    - 2: B-ProperName
    - 3: (empty string)
- NE-PER-GENDER: sequence of person-gender entity labels
  - Class label names:
    - 0: B-Group
    - 1: B-Man
    - 2: B-Man,B-Unspecified
    - 3: B-Man,I-Woman
    - 4: B-Unspecified
    - 5: B-Unspecified,I-Woman
    - 6: B-Woman
    - 7: I-Group
    - 8: I-Man
    - 9: I-Man,I-Unspecified
    - 10: I-Man,I-Woman
    - 11: I-Unspecified
    - 12: I-Unspecified,I-Woman
    - 13: I-Woman
    - 14: NE-PER-GENDER
    - 15: O
- NE-PER-LEGAL-STATUS: sequence of person legal-status entity labels
  - Class label names:
    - 0: B-Enslaved
    - 1: B-Freed
    - 2: B-Unspecified
    - 3: I-Enslaved
    - 4: I-Freed
    - 5: I-Unspecified
    - 6: NE-PER-LEGAL-STATUS
    - 7: O
- NE-PER-ROLE: sequence of person-role entity labels
  - Class label names:
    - 0: B-Acting_Notary
    - 1: B-Beneficiary
    - 2: B-Notary
    - 3: B-Other
    - 4: B-Testator
    - 5: B-Testator_Beneficiary
    - 6: B-Witness
    - 7: I-Acting_Notary
    - 8: I-Beneficiary
    - 9: I-Beneficiary,B-Other
    - 10: I-Beneficiary,I-Other
    - 11: I-Notary
    - 12: I-Other
    - 13: I-Testator
    - 14: I-Testator_Beneficiary
    - 15: I-Witness
    - 16: NE-PER-ROLE
    - 17: O
- NE-ORG-BENEFICIARY: sequence of organization-beneficiary entity labels
  - Class label names:
    - 0: B-No
    - 1: B-Yes
    - 2: I-No
    - 3: I-Yes
    - 4: NE-ORG-BENEFICIARY
    - 5: O
- MISC: string
- document_id: string
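Several label names in the lists above join more than one tag with commas, for example `B-Person,B-Place`, which suggests that a token can belong to overlapping entities. The sketch below is one way to split such a label into its individual tags and to group consecutive tokens into spans per entity type. It assumes the usual BIO reading (`B-` opens a span, `I-` continues it, `O` is outside), which the card itself does not spell out. `split_label` and `extract_spans` are hypothetical helper names, and the label sequence in the usage lines is invented for illustration.

```python
from typing import Dict, List, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def split_label(label: str) -> List[Tuple[str, str]]:
    """Split a label such as 'B-Person,I-Place' into (prefix, type) pairs.

    'O', the empty string, and names without a B-/I- prefix (for example
    the 'NE-PER-ROLE' entry in the label list) yield an empty list.
    """
    pairs = []
    for part in label.split(","):
        prefix, sep, entity_type = part.partition("-")
        if sep and prefix in ("B", "I"):
            pairs.append((prefix, entity_type))
    return pairs


def extract_spans(labels: List[str]) -> List[Span]:
    """Group token-level labels into spans, tracking each type separately."""
    open_spans: Dict[str, int] = {}  # entity type -> start index
    spans: List[Span] = []
    for i, label in enumerate(labels):
        tags = {entity_type: prefix for prefix, entity_type in split_label(label)}
        # Close every open span whose type is absent here or restarts with B-.
        for entity_type in list(open_spans):
            if tags.get(entity_type) != "I":
                spans.append((entity_type, open_spans.pop(entity_type), i))
        # Open a span for each B- tag, and for an I- tag with no open span.
        for entity_type in tags:
            if entity_type not in open_spans:
                open_spans[entity_type] = i
    # Close whatever is still open at the end of the sequence.
    for entity_type, start in open_spans.items():
        spans.append((entity_type, start, len(labels)))
    return sorted(spans, key=lambda s: (s[1], s[2], s[0]))


# Invented label sequence: a Person span whose last token also opens a Place.
example = ["B-Person", "I-Person", "I-Person,B-Place", "I-Place", "O"]
print(extract_spans(example))
```

Tracking one open span per entity type is a design choice that keeps overlapping entities apart. It cannot represent two simultaneous spans of the same type, which the label inventory does not appear to need.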
Splits
- Training set (train)
  - Size in bytes: 31436367
  - Number of examples: 2199
Dataset size
- Download size: 2148172 bytes
- Dataset size: 31436367 bytes
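Two derived numbers follow from these figures: the average size per example and the ratio between the dataset size and the download size. The short calculation below uses only the numbers reported on this page; the variable names are illustrative.

```python
NUM_BYTES = 31_436_367      # dataset size in bytes (train split)
DOWNLOAD_BYTES = 2_148_172  # download size in bytes
NUM_EXAMPLES = 2_199        # examples in the train split

# Average size of one example once the data is loaded.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# How much larger the loaded dataset is than the downloaded files.
size_ratio = NUM_BYTES / DOWNLOAD_BYTES

print(f"average bytes per example: {avg_bytes_per_example:.0f}")
print(f"dataset size / download size: {size_ratio:.1f}")
```

This works out to roughly 14.3 kB per example and a size ratio of about 14.6, so each example is a fairly long token sequence and the download is much smaller than the loaded data.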



