
fadhel-alobaidi/AraFact

Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/fadhel-alobaidi/AraFact
Resource description:
---
language:
- ar
license: mit
task_categories:
- text-classification
- question-answering
task_ids:
- fact-checking
pretty_name: AraFact
size_categories:
- 10K<n<100K
tags:
- arabic
- synthetic
- fact-checking
- validation
- msa
- agentic
- json
- documents
---

# AraFact-Synth: Arabic Fact Validation Dataset

`dataset_ar.jsonl` is a synthetic Arabic-language fact-checking dataset generated by a two-agent pipeline (Agent A + Agent B). It is designed for training and evaluating language models on the task of **validating factual claims** in Arabic text.

- **Format**: JSON Lines (one record per line)
- **Size**: 10,008 records / 60,250 items
- **Language**: Modern Standard Arabic (MSA), 100%

---

## Statistics

| Metric | Value |
|---|---|
| Total records | 10,008 |
| Total items | 60,250 |
| Valid items | 29,865 (49.6%) |
| Invalid items | 28,128 (46.7%) |
| Batch type: json | 5,131 records (51.3%) |
| Batch type: documents | 4,877 records (48.7%) |

### Flaw Type Distribution (invalid items)

| Flaw Type | Count | Description |
|---|---|---|
| `wrong_spec` | 9,003 | Wrong entity, location, organization, or specification |
| `subtle_number` | 8,721 | Slightly altered numbers, percentages, or dates |
| `contradictory` | 6,206 | Content that contradicts itself or the source |
| `chronological` | 4,243 | Wrong dates, year, or sequence of events |

---

## Knowledge Categories

The dataset spans **159 categories** and **1,131 subcategories** across diverse knowledge domains, covering both mainstream fields (Mathematics, Medicine, Law, Physics) and highly specialised niches (Lunar Geology, Ethnobotany, Traditional Boatbuilding, etc.).

---

## Record Structure

Each line is a JSON object with four top-level fields: `instruction`, `input`, `output`, and `meta`.

```
{
  "instruction": "...",
  "input": { ... },
  "output": { ... },
  "meta": { ... }
}
```

---

### `instruction` (string)

The task prompt for the model.
It takes one of two values, depending on the batch type:

- `"Validate the following batch of json. For each item, provide a label and a detailed reason."`
- `"Validate the following batch of documents. For each item, provide a label and a detailed reason."`

---

### `input` (object)

The batch to be validated.

| Field | Type | Description |
|---|---|---|
| `type` | string | `"json"` or `"documents"` |
| `count` | integer | Number of items in the batch |
| `items` | array | List of items to validate (see below) |

#### Item structure

| Field | Type | Description |
|---|---|---|
| `id` | integer | Item identifier (1-based) |
| `content` | string or object | The claim to validate: a string for `documents` batches, an object for `json` batches |

**Example (documents type):**

```json
{
  "id": 1,
  "content": "تُعد شبكات الجيل الخامس نقلة نوعية في عالم الاتصالات ..."
}
```

**Example (json type):**

```json
{
  "id": 2,
  "content": {
    "الاسم": "سوق المشتقات المالية السعودي",
    "تاريخ_الانطلاق": "الربع الثالث من عام 2020",
    "أول_منتج": "العقود المستقبلية لمؤشر (إم تي 30)"
  }
}
```

---

### `output` (object)

Agent B's validation results for the batch.

| Field | Type | Description |
|---|---|---|
| `results` | array | Per-item validation results (see below) |
| `summary` | string | Arabic summary, e.g. `"عنصر واحد صحيح، وعنصر واحد خاطئ من أصل عنصرين"` |

#### Result item structure

| Field | Type | Description |
|---|---|---|
| `id` | integer | Matches the input item `id` |
| `label` | string | `"valid"` or `"invalid"` |
| `flaw_detected` | string | `"none"`, `"wrong_spec"`, `"subtle_number"`, `"contradictory"`, or `"chronological"` |
| `reason` | string | Arabic explanation of the validation decision |

**Example:**

```json
{
  "id": 2,
  "label": "invalid",
  "flaw_detected": "wrong_spec",
  "reason": "يحتوي العنصر على ادعاء رقمي غير دقيق ..."
}
```

---

### `meta` (object)

Metadata about the record for analysis and filtering.

| Field | Type | Description |
|---|---|---|
| `batch_type` | string | `"json"` or `"documents"` |
| `batch_size` | integer | Number of items in the batch |
| `valid_count` | integer | Number of items Agent B labeled as valid |
| `invalid_count` | integer | Number of items Agent B labeled as invalid |
| `language_distribution` | object | Count of items per language, e.g. `{"ar": 6}` |
| `all_hints_confirmed` | boolean | `true` if Agent B's labels match all intended labels |
| `intended_labels` | array | The ground-truth plan used to generate the batch (see below) |

#### `intended_labels` structure

Each entry reflects the **intended** validity for that item as decided at generation time. Agent A was instructed to follow this plan, but may not always have done so perfectly. Agent B's `label` field reflects its independent assessment.

| Field | Type | Description |
|---|---|---|
| `id` | integer | Item identifier |
| `is_valid` | boolean | Whether the item was intended to be valid |
| `flaw_type` | string | Intended flaw type if invalid, `"none"` if valid |

**Example:**

```json
{
  "intended_labels": [
    {"id": 1, "is_valid": true, "flaw_type": "none"},
    {"id": 2, "is_valid": false, "flaw_type": "subtle_number"}
  ]
}
```

> **Note**: `all_hints_confirmed: false` does not always mean Agent B made an error. It may also mean Agent A did not faithfully inject the intended flaw, in which case Agent B's independent label is the more reliable signal.

---

## Generation Pipeline

Records were generated by a two-agent pipeline:

1. **Agent A** (producer): Given a category, subcategory, and a validity plan (specifying which items should be valid or invalid and what flaw type to inject), Agent A searches the web for a real source, extracts facts, and generates a batch of items: some faithful to the source, others containing the specified flaw.
2. **Agent B** (validator): Given the batch items and hints (the intended labels), Agent B independently validates each item using web search and reasoning.
It treats hints as guidance but verifies independently, and may disagree with the hint if the content does not match.

---

## Usage Notes

- Use `intended_labels` as **ground truth** for training the producer/generation side.
- Use `output.results` (`label`, `flaw_detected`, `reason`) as **ground truth** for training a validator model.
- When `all_hints_confirmed` is `false`, inspect both `intended_labels` and `output.results`: the disagreement may reflect Agent A non-compliance or a genuine Agent B catch.
- The `reason` field in `output.results` is always in Arabic and typically references specific facts or search results that support the decision.
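
The advice to inspect both `intended_labels` and `output.results` when they disagree can be sketched in Python. This is a minimal sketch, assuming records follow the structure documented in this card; the helper names `iter_records` and `compare_labels` are hypothetical and not part of the dataset or any library. Because the card does not state whether `all_hints_confirmed` also compares flaw types, the sketch reports label agreement and flaw-type agreement separately.

```python
import json
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def compare_labels(record: dict) -> list[dict]:
    """Pair each intended label with Agent B's result for the same item id."""
    results = {r["id"]: r for r in record["output"]["results"]}
    rows = []
    for plan in record["meta"]["intended_labels"]:
        result = results.get(plan["id"])
        if result is None:
            continue  # no Agent B result recorded for this item id
        intended = "valid" if plan["is_valid"] else "invalid"
        rows.append({
            "id": plan["id"],
            "intended_label": intended,
            "agent_b_label": result["label"],
            "label_agrees": intended == result["label"],
            "flaw_agrees": plan["flaw_type"] == result["flaw_detected"],
        })
    return rows


# Hand-built illustrative record (not taken from the dataset).
sample = {
    "instruction": "...",
    "input": {"type": "documents", "count": 2, "items": []},
    "output": {
        "results": [
            {"id": 1, "label": "valid", "flaw_detected": "none", "reason": "..."},
            {"id": 2, "label": "invalid", "flaw_detected": "wrong_spec", "reason": "..."},
        ],
        "summary": "...",
    },
    "meta": {
        "all_hints_confirmed": False,
        "intended_labels": [
            {"id": 1, "is_valid": True, "flaw_type": "none"},
            {"id": 2, "is_valid": False, "flaw_type": "subtle_number"},
        ],
    },
}

rows = compare_labels(sample)
# Items where Agent B departs from the plan in label or in flaw type.
disagreements = [r for r in rows if not (r["label_agrees"] and r["flaw_agrees"])]
```

To apply the same check to the full file, iterate with `for record in iter_records("dataset_ar.jsonl")` and collect the rows where either flag is false; those are the items worth reviewing by hand before choosing which side to trust.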
Provider:
fadhel-alobaidi