large-traversaal/commonsenseqa_urdu_final
Source: Hugging Face. Updated 2026-03-05. Indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/large-traversaal/commonsenseqa_urdu_final
Resource description:
# Dataset Card: **CommonSenseQA Urdu**
## Dataset Summary
`commonsenseqa_urdu_cleaned` is a cleaned Urdu version of the **CommonsenseQA** benchmark — a multiple-choice question-answering dataset that tests **commonsense reasoning** across diverse everyday scenarios. It provides English questions and answer choices alongside high-quality Urdu translations for both.
This dataset enables evaluation and training of **multilingual and Urdu-native language models** on commonsense reasoning tasks. The translations and structure are curated to preserve semantic quality and reasoning difficulty when moving from English to Urdu. ([Hugging Face][1])
---
## Dataset Details
* **Dataset Name:** `commonsenseqa_urdu_cleaned`
* **Publisher:** `large-traversaal` (Traversaal.ai)
* **Modality:** Text (English + Urdu)
* **Format:** Parquet
* **Total Examples:** ~12.1K ([Hugging Face][1])
### Splits
| Split        | Count                       |
| ------------ | --------------------------- |
| `train`      | ~9.74K                      |
| `validation` | ~1.22K                      |
| `test`       | ~1.14K                      |
| **Total**    | ~12.1K ([Hugging Face][1])  |
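A minimal loading sketch follows. It assumes the Hugging Face `datasets` library is installed and that the repository id matches the one in the download link above (`large-traversaal/commonsenseqa_urdu_final`); the helper names `load_splits` and `describe` are hypothetical and not part of the dataset.

```python
# Assumption: repo id taken from the download link; the card text itself
# refers to the dataset as `commonsenseqa_urdu_cleaned`.
REPO_ID = "large-traversaal/commonsenseqa_urdu_final"
SPLITS = ("train", "validation", "test")


def load_splits(repo_id=REPO_ID):
    """Download the dataset and return its three splits as a plain dict."""
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    dataset = load_dataset(repo_id)  # returns a DatasetDict keyed by split
    return {name: dataset[name] for name in SPLITS}


def describe(splits):
    """Return the number of examples in each split."""
    return {name: len(rows) for name, rows in splits.items()}
```

Calling `describe(load_splits())` requires network access and should report counts close to those in the table above.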
---
## Source & Motivation
CommonsenseQA is a widely recognized benchmark for evaluating **commonsense reasoning** in NLP — requiring models to go beyond pattern matching and draw on world knowledge to answer questions correctly. The original CommonsenseQA dataset was created using ConceptNet and human-generated multiple-choice questions. `commonsenseqa_urdu_cleaned` brings this challenge to **Urdu speakers and multilingual models** by providing Urdu translations for both questions and answer choices, preserving task difficulty and semantics. ([ACL Anthology][2])
This dataset fills a significant gap in Urdu NLP by enabling researchers to **benchmark and improve Urdu-capable language models** on reasoning tasks. ([Hugging Face][1])
---
## 📚 Data Fields
Each record contains the following fields ([Hugging Face][1]):
| Field              | Type   | Description                             |
| ------------------ | ------ | --------------------------------------- |
| `id`               | string | Unique example identifier               |
| `question_concept` | string | Concept used to generate the question   |
| `question`         | string | English multiple-choice question        |
| `choices`          | dict   | English answer options (A–E)            |
| `urdu_question`    | string | Urdu translation of the question        |
| `urdu_choices`     | dict   | Urdu translations of the answer options |
| `answerKey`        | string | The correct answer (A, B, C, D, or E)   |
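To illustrate how these fields might be combined, the sketch below renders one record as a multiple-choice prompt. It assumes that `choices` and `urdu_choices` follow the layout of the original CommonsenseQA release on the Hub (a dict with parallel `label` and `text` lists); the card does not confirm this, so check a real record first. The sample record is invented for illustration and is not taken from the dataset, and `format_prompt` is a hypothetical helper.

```python
def format_prompt(record, use_urdu=True):
    """Render one record as a question followed by labelled options."""
    question = record["urdu_question"] if use_urdu else record["question"]
    # Assumption: choices are stored as {"label": [...], "text": [...]}.
    choices = record["urdu_choices"] if use_urdu else record["choices"]
    lines = [question]
    for label, text in zip(choices["label"], choices["text"]):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)


# Invented example record, NOT an actual row from the dataset.
sample = {
    "id": "example-0",
    "question_concept": "fish",
    "question": "Where does a fish live?",
    "choices": {
        "label": ["A", "B", "C", "D", "E"],
        "text": ["river", "book", "chair", "shoe", "clock"],
    },
    "urdu_question": "مچھلی کہاں رہتی ہے؟",
    "urdu_choices": {
        "label": ["A", "B", "C", "D", "E"],
        "text": ["دریا", "کتاب", "کرسی", "جوتا", "گھڑی"],
    },
    "answerKey": "A",
}

print(format_prompt(sample))
print(format_prompt(sample, use_urdu=False))
```

Keeping the labels identical across both languages means a single `answerKey` can score either version of the prompt.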
---
## Intended Use
This dataset is ideal for:
* Evaluating **Urdu and multilingual language models** on commonsense reasoning.
* Training models to learn cross-lingual reasoning between English and Urdu.
* Benchmarking performance on **multiple-choice question answering** in a low-resource language.
* Research on **cross-lingual transfer and reasoning capabilities** of NLP systems.
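For the evaluation use case, a minimal accuracy sketch is shown below. It assumes a model is wrapped as a `predict` callable that takes one record and returns a letter label; `accuracy`, `always_a`, and the toy records are hypothetical and only demonstrate the scoring logic against `answerKey`.

```python
def accuracy(records, predict):
    """Fraction of records where predict(record) equals the answerKey."""
    if not records:
        return 0.0
    correct = sum(1 for record in records if predict(record) == record["answerKey"])
    return correct / len(records)


def always_a(record):
    """Trivial baseline that ignores the question and answers 'A'."""
    return "A"


# Toy records carrying only the field the scorer needs.
toy_records = [
    {"answerKey": "A"},
    {"answerKey": "B"},
    {"answerKey": "A"},
    {"answerKey": "C"},
]

print(accuracy(toy_records, always_a))  # 0.5
```

With five options per question, a uniform random guesser is expected to score about 20%, which gives a reference point when reading model results.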
---
## Licensing & Usage
The dataset is hosted on the Hugging Face Hub under `large-traversaal`. License details and usage terms are available on the dataset page. ([Hugging Face][1])
---
## 📄 Citation
If you use this dataset in your research, please cite the **UrduBench paper**:
```bibtex
@misc{shafique2026urdubenchurdureasoningbenchmark,
  title={UrduBench: An Urdu Reasoning Benchmark using Contextually Ensembled Translations with Human-in-the-Loop},
  author={Muhammad Ali Shafique and Areej Mehboob and Layba Fiaz and Muhammad Usman Qadeer and Hamza Farooq},
  year={2026},
  eprint={2601.21000},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2601.21000}
}
```
Provider: large-traversaal



