hfvladkon/WiNER
Source: Hugging Face. Updated 2023-11-03; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/hfvladkon/WiNER
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: text
    dtype: string
  - name: tokens
    sequence: string
  - name: pos_tags
    sequence: string
  - name: ner_tags
    sequence: string
  splits:
  - name: train
    num_bytes: 133047685
    num_examples: 203286
  download_size: 46621835
  dataset_size: 133047685
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Dataset Card for "WiNER"
## WiNER: A Wikipedia Annotated Corpus for Named Entity Recognition
## Sample
```json
{
  "id": "1",
  "text": "В договоре среди 5 старших князей упоминается Миндовг .",
  "tokens": ["В", "договоре", "среди", "5", "старших", "князей", "упоминается", "Миндовг", "."],
  "pos_tags": ["PR", "S", "PR", "NUM", "A", "S", "V", "S", "SENT"],
  "ner_tags": ["O", "O", "O", "O", "O", "O", "O", "I-PER", "O"]
}
```
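To show how `tokens` and `ner_tags` line up, here is a minimal sketch that groups tagged tokens into entity mentions. It assumes the tags follow an IO/IOB-style scheme, which the sample suggests (a single-token entity is tagged `I-PER`, with no `B-` prefix) but the card does not state outright. The function name `extract_entities` is hypothetical and not part of the dataset or any library.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], ner_tags: List[str]) -> List[Tuple[str, str]]:
    """Group consecutive tokens that share an entity type into (text, type) pairs.

    Assumption: tags look like "O", "I-PER", or "B-PER"; the tagging scheme
    is inferred from the sample record, not documented in the card.
    """
    if len(tokens) != len(ner_tags):
        raise ValueError("tokens and ner_tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, ner_tags):
        # Strip an optional "B-" / "I-" prefix; "O" means outside any entity.
        ent_type = None if tag == "O" else tag.split("-", 1)[-1]
        starts_new = tag.startswith("B-") or ent_type != current_type
        if current_type is not None and starts_new:
            # Close the entity that was being built.
            entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        if ent_type is not None:
            current_tokens.append(token)
            current_type = ent_type

    if current_type is not None:
        entities.append((" ".join(current_tokens), current_type))
    return entities
```

Applied to the sample record, this would yield the single mention `Миндовг` with type `PER`. Note that under a pure IO scheme two adjacent entities of the same type cannot be told apart, so they would be merged into one mention.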
## Citation
[WiNER: A Wikipedia Annotated Corpus for Named Entity Recognition](https://aclanthology.org/I17-1042/)
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
hfvladkon
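Original information summary
Dataset card "WiNER"
WiNER: A Wikipedia Annotated Corpus for Named Entity Recognition
Dataset information
Features
- id: string
- text: string
- tokens: sequence of strings
- pos_tags: sequence of strings
- ner_tags: sequence of strings
Splits
- train:
  - Bytes: 133047685
  - Examples: 203286
Size
- Download size: 46621835 bytes
- Dataset size: 133047685 bytes
Configs
- default:
  - Data files:
    - Split: train
    - Path: data/train-*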
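
The schema and configuration listed above can be checked against a record with a short sketch like the one below. The helper names `check_record` and `load_winer_train` are hypothetical. The requirement that `tokens`, `pos_tags`, and `ner_tags` have equal length is an assumption based on the sample record, not something the card states. `load_dataset` is the standard entry point of the Hugging Face `datasets` library; it is imported lazily so the checker runs without that package or network access.

```python
from typing import Any, Dict

# Field names taken from the feature list in the dataset card.
EXPECTED_FIELDS = ("id", "text", "tokens", "pos_tags", "ner_tags")


def check_record(record: Dict[str, Any]) -> None:
    """Raise ValueError if a record does not match the listed schema."""
    missing = [name for name in EXPECTED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for name in ("id", "text"):
        if not isinstance(record[name], str):
            raise ValueError(f"{name} must be a string")
    # Assumption: the three sequences are aligned token by token.
    lengths = {name: len(record[name]) for name in ("tokens", "pos_tags", "ner_tags")}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"sequence lengths differ: {lengths}")


def load_winer_train():
    """Load the train split from the Hugging Face Hub (needs network access)."""
    from datasets import load_dataset  # lazy import; requires `pip install datasets`
    return load_dataset("hfvladkon/WiNER", split="train")
```

A typical use would be to call `check_record` on a few rows returned by `load_winer_train()` before training on the data.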