IsmatS/azerbaijani-ner-benchmark

Name: IsmatS/azerbaijani-ner-benchmark
Creator: IsmatS
Published: 2026-03-19 07:18:15
License: MIT

Source: Hugging Face (updated 2026-03-19, indexed 2026-03-29)

Download link:

https://hf-mirror.com/datasets/IsmatS/azerbaijani-ner-benchmark

Resource description:

---
language:
- az
license: mit
tags:
- ner
- token-classification
- azerbaijani
- benchmark
pretty_name: Azerbaijani NER Benchmark
size_categories:
- 1K<n<10K
task_categories:
- token-classification
dataset_info:
  features:
  - name: tokens
    sequence: string
  - name: ner_tags
    sequence:
      class_label:
        names:
          '0': O
          '1': B-PERSON
          '2': I-PERSON
          '3': B-LOCATION
          '4': I-LOCATION
          '5': B-ORGANISATION
          '6': I-ORGANISATION
          '7': B-DATE
          '8': I-DATE
  splits:
  - name: test
    num_examples: 2915
---

# Azerbaijani NER Benchmark

This dataset is the **evaluation benchmark** used to test and compare four Azerbaijani Named Entity Recognition (NER) models trained on the [LocalDoc/azerbaijani-ner-dataset](https://huggingface.co/datasets/LocalDoc/azerbaijani-ner-dataset).

## Dataset Description

- **Source:** Test split of [LocalDoc/azerbaijani-ner-dataset](https://huggingface.co/datasets/LocalDoc/azerbaijani-ner-dataset)
- **Language:** Azerbaijani (az)
- **Task:** Token classification / Named Entity Recognition
- **Annotation format:** IOB2 (Inside-Outside-Beginning)
- **Number of examples:** 2,915 sentences
- **Entity types:** 4 categories, encoded as 9 IOB2 labels including `O` (see below)

## Entity Types

The dataset uses IOB2 annotation with 4 entity categories. Each category has a `B-` (beginning) and an `I-` (inside) tag, and `O` marks tokens outside any entity:

| Tag | Description |
|-----|-------------|
| O | Outside (non-entity token) |
| B-PERSON / I-PERSON | Person names (e.g., İlham Əliyev) |
| B-LOCATION / I-LOCATION | Geographic locations (e.g., Bakı, Azərbaycan) |
| B-ORGANISATION / I-ORGANISATION | Organizations (e.g., universitetlər, şirkətlər) |
| B-DATE / I-DATE | Date expressions (e.g., 2014-cü il, yanvar ayı) |

## Model Comparison

The following four models were evaluated on this benchmark:

| Model | Parameters | F1-Score | Hugging Face |
|-------|------------|----------|--------------|
| [mBERT Azerbaijani NER](https://huggingface.co/IsmatS/mbert-az-ner) | 180M | 67.70% | IsmatS/mbert-az-ner |
| [XLM-RoBERTa Base Azerbaijani NER](https://huggingface.co/IsmatS/xlm-roberta-az-ner) | 125M | 75.22% | IsmatS/xlm-roberta-az-ner |
| [XLM-RoBERTa Large Azerbaijani NER](https://huggingface.co/IsmatS/xlm_roberta_large_az_ner) | 355M | **75.48%** | IsmatS/xlm_roberta_large_az_ner |
| [Azerbaijani-Turkish BERT Base NER](https://huggingface.co/IsmatS/azeri-turkish-bert-ner) | 110M | 73.55% | IsmatS/azeri-turkish-bert-ner |

**XLM-RoBERTa Large** achieves the highest F1-score of **75.48%** and is used in the production deployment at [named-entity-recognition.fly.dev](https://named-entity-recognition.fly.dev/).

## How to Use for Evaluation

### Quick Start

```python
from datasets import load_dataset

dataset = load_dataset("IsmatS/azerbaijani-ner-benchmark", split="test")
print(dataset)
# Dataset({features: ['tokens', 'ner_tags'], num_rows: 2915})
```

### Evaluate a Model

Use the provided `evaluate_models.py` script to reproduce benchmark results:

```bash
pip install transformers datasets seqeval
python evaluate_models.py
```

Or evaluate a single model programmatically:

```python
from transformers import pipeline
from datasets import load_dataset
from seqeval.metrics import f1_score

# Load benchmark
dataset = load_dataset("IsmatS/azerbaijani-ner-benchmark", split="test")

# Load model
ner_pipeline = pipeline(
    "token-classification",
    model="IsmatS/xlm-roberta-az-ner",
    aggregation_strategy="simple"
)

# Run evaluation
# See evaluate_models.py for the full evaluation loop
```

### Evaluation Script

The full evaluation script (`evaluate_models.py`) in this repository:

1. Loads each of the 4 Azerbaijani NER models from Hugging Face Hub
2. Runs inference on all 2,915 benchmark sentences
3. Computes precision, recall, and F1-score using `seqeval`
4. Prints a comparison table with all results

## Dataset Loading

```python
from datasets import load_dataset

# Load test split (the full benchmark)
benchmark = load_dataset("IsmatS/azerbaijani-ner-benchmark", split="test")

# Inspect a sample
print(benchmark[0])
# {
#   'tokens': ['2014-cü', 'ildə', 'Azərbaycan', ...],
#   'ner_tags': [7, 8, 3, ...]
# }
```

## Citation

If you use this benchmark in your research, please cite it as follows:

```bibtex
@dataset{azerbaijani_ner_benchmark,
  title     = {Azerbaijani NER Benchmark},
  author    = {Ismat Samadov},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/IsmatS/azerbaijani-ner-benchmark},
  note      = {Derived from LocalDoc/azerbaijani-ner-dataset}
}
```

## Related Resources

- [LocalDoc/azerbaijani-ner-dataset](https://huggingface.co/datasets/LocalDoc/azerbaijani-ner-dataset): original training/test data
- [IsmatS/xlm-roberta-az-ner](https://huggingface.co/IsmatS/xlm-roberta-az-ner): production NER model
- [Named Entity Recognition Demo](https://named-entity-recognition.fly.dev/): live demo application

## License

MIT License
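
## Worked Example: Decoding IOB2 Tags

The `ner_tags` field stores integer IDs, while entity-level metrics such as the F1-score reported above are computed over whole entity spans. The sketch below shows one way to map IDs to label strings, group them into spans, and score exact-match spans. It is an illustration under stated assumptions, not the repository's `evaluate_models.py`: the helper names `decode_tags`, `extract_spans`, and `span_f1` are hypothetical, the label order is copied from the card's `class_label` metadata, the predicted sequence is invented, and `span_f1` is a simplified stand-in for the kind of score `seqeval` computes, not its implementation.

```python
from typing import List, Tuple

# Label order copied from the dataset card's `class_label` names (IDs 0-8).
LABELS = [
    "O",
    "B-PERSON", "I-PERSON",
    "B-LOCATION", "I-LOCATION",
    "B-ORGANISATION", "I-ORGANISATION",
    "B-DATE", "I-DATE",
]

Span = Tuple[str, int, int]  # (entity type, start index, exclusive end index)


def decode_tags(tag_ids: List[int]) -> List[str]:
    """Map integer `ner_tags` to their IOB2 label strings."""
    return [LABELS[i] for i in tag_ids]


def extract_spans(labels: List[str]) -> List[Span]:
    """Collect entity spans from an IOB2 label sequence.

    Assumption: an `I-` tag with no matching open span starts a new span
    instead of being discarded (a lenient reading of malformed sequences).
    """
    spans: List[Span] = []
    start, current = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes a trailing span
        prefix, _, etype = label.partition("-")
        continues = prefix == "I" and etype == current
        if current is not None and not continues:
            spans.append((current, start, i))
            start, current = None, None
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, etype
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact-match (type, start, end) spans."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        g_spans, p_spans = set(extract_spans(g)), set(extract_spans(p))
        tp += len(g_spans & p_spans)
        n_gold += len(g_spans)
        n_pred += len(p_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# The three tag IDs visible in the card's sample: '2014-cü', 'ildə', 'Azərbaycan'.
gold_labels = decode_tags([7, 8, 3])
print(extract_spans(gold_labels))  # [('DATE', 0, 2), ('LOCATION', 2, 3)]

# Hypothetical prediction that finds the date but misses the location.
pred_labels = ["B-DATE", "I-DATE", "O"]
print(f"{span_f1([gold_labels], [pred_labels]):.4f}")  # 0.6667
```

In this toy case the prediction recovers one of two gold spans with no false positives, so precision is 1.0, recall is 0.5, and F1 is 2/3. Scoring spans rather than individual tokens means a partially tagged entity counts as a miss, which is why entity-level F1 is typically lower than token accuracy.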

Provider:

IsmatS
