---
annotations_creators:
- crowdsourced
language:
- fa
language_creators:
- machine-generated
license:
- other
multilinguality:
- monolingual
pretty_name: conll2003-persian
size_categories:
- 10K<n<100K
source_datasets:
- extended|conll2003
tags:
- named entity recognition
task_categories:
- token-classification
task_ids:
- named-entity-recognition
train-eval-index:
- col_mapping:
    ner_tags: tags
    tokens: tokens
  config: conll2003
  metrics:
  - name: seqeval
    type: seqeval
  splits:
    eval_split: test
    train_split: train
  task: token-classification
  task_id: entity_extraction
---
# Dataset Card for conll2003-persian
## Dataset Description
- **Homepage:**
- **Repository:**
- **Paper:**
- **Leaderboard:**
- **Point of Contact:**
### Dataset Summary
This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).
### Supported Tasks and Leaderboards
[More Information Needed]
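
The card metadata declares a `token-classification` task scored with `seqeval`, which evaluates whole entity spans rather than individual tokens. The following is a minimal pure-Python sketch of that idea, not the `seqeval` implementation itself. It assumes IOB2 tags as in the original CoNLL-2003 data; the tag names (`PER`, `LOC`) and the helper names `extract_spans` and `entity_f1` are hypothetical and chosen for illustration only.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one sentence of IOB2 tags (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at sentence end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # Stray "I-" tags with no open span are ignored in this strict sketch;
        # seqeval's default handling of such tags may differ.
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exactly matching (type, start, end) spans."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical tag sequences, not taken from the dataset.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
print(f"{entity_f1(gold, pred):.3f}")
```

In this example the prediction recovers one of the two gold entities, so precision is 1.0 and recall is 0.5. A partially matched span counts as an error, which is why span-level scores are usually lower than token-level accuracy.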
### Languages
Persian (`fa`), monolingual.
## Dataset Structure
### Data Instances
[More Information Needed]
### Data Fields
[More Information Needed]
### Data Splits
[More Information Needed]
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
[More Information Needed]
### Citation Information
If you use the dataset or models in this repository, please cite the following paper.
```bibtex
@misc{https://doi.org/10.48550/arxiv.2302.09611,
doi = {10.48550/ARXIV.2302.09611},
url = {https://arxiv.org/abs/2302.09611},
author = {Sartipi, Amir and Fatemi, Afsaneh},
keywords = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
title = {Exploring the Potential of Machine Translation for Generating Named Entity Datasets: A Case Study between Persian and English},
publisher = {arXiv},
year = {2023},
copyright = {arXiv.org perpetual, non-exclusive license}
}
```
### Contributions
[More Information Needed]