large-traversaal/mgsm_urdu_cleaned

Name: large-traversaal/mgsm_urdu_cleaned
Creator: large-traversaal
Published: 2025-11-26 17:17:09
License: No description available

Hugging Face · Updated 2025-11-26 · Indexed 2026-02-07

Download link:

https://hf-mirror.com/datasets/large-traversaal/mgsm_urdu_cleaned

Resource description:

# 📘 Dataset Card: **mgsm_urdu_cleaned**

## 📝 **Dataset Summary**

`mgsm_urdu_cleaned` is a high-quality Urdu translation of the Multilingual Grade School Math (MGSM) benchmark, originally derived from GSM8K. The dataset has been cleaned and refined through native-speaker review, ensuring accurate linguistic quality, natural Urdu phrasing, and faithful preservation of the original mathematical intent. It contains English math word problems, their Urdu translations, Urdu answers, and numerical solutions, enabling consistent evaluation of math reasoning in Urdu.

It is designed to evaluate **Urdu-capable math reasoning**, **multilingual chain-of-thought**, and **cross-lingual generalization** for large language models.

---

## 📂 **Dataset Details**

* **Dataset Name:** `mgsm_urdu_cleaned`
* **Publisher:** `large-traversaal` (Traversaal.ai)
* **Modality:** Text (English + Urdu)
* **Format:** Parquet
* **Total Examples:** 258
* **Splits:**
  * `train`: 8
  * `test`: 250
* **Data Size:** ~96.9 KB (auto-converted Parquet)

---

## 🧩 **Source & Motivation**

This dataset is a refined subset of MGSM (built from GSM8K) translated into Urdu using high-quality translation pipelines for evaluating:

* Mathematical reasoning in Urdu
* Cross-lingual reasoning (English → Urdu)
* Step-by-step chain-of-thought in multilingual contexts
* Performance of Urdu-language LLMs

It fills a major gap in **Urdu NLP benchmarking**, enabling reliable evaluation of reasoning-heavy tasks.
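As a quick sanity check on the split sizes listed under Dataset Details, the sketch below loads the dataset and compares row counts against the card. It assumes the Hugging Face `datasets` package is installed and that the repository loads with its default configuration; the helper names `load_split_sizes` and `mismatched_splits` are hypothetical and not part of the dataset or any library.

```python
# Split sizes as stated on the dataset card (8 + 250 = 258 examples).
EXPECTED_SPLIT_SIZES = {"train": 8, "test": 250}


def load_split_sizes(repo_id="large-traversaal/mgsm_urdu_cleaned"):
    """Download the dataset and return {split_name: row_count}.

    Needs network access and the `datasets` package. Loading with the
    default configuration is an assumption, not something the card states.
    """
    from datasets import load_dataset  # imported lazily so the checker below works offline

    dataset = load_dataset(repo_id)
    return {name: split.num_rows for name, split in dataset.items()}


def mismatched_splits(actual, expected=EXPECTED_SPLIT_SIZES):
    """Return {split: (expected_count, actual_count)} for every split that differs."""
    return {
        name: (count, actual.get(name))
        for name, count in expected.items()
        if actual.get(name) != count
    }
```

If the hosted copy matches the card, `mismatched_splits(load_split_sizes())` would be expected to return an empty dictionary; a non-empty result would point to a split whose size has changed since the card was written.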
---

## 🔧 **Data Fields**

Each record contains:

| Field               | Type   | Description                       |
| ------------------- | ------ | --------------------------------- |
| `question`          | string | Math word problem in English      |
| `urd_question`      | string | Urdu translation of the problem   |
| `answer`            | string | English answer text               |
| `urd_answer`        | string | Urdu answer text                  |
| `answer_number`     | int64  | Numerical answer                  |
| `equation_solution` | string | Step-by-step reasoning / equation |

These fields allow researchers to evaluate both direct QA and chain-of-thought reasoning quality.

---

## 🎯 **Intended Use**

This dataset is ideal for:

* Evaluating Urdu LLMs on math reasoning
* Benchmarking multilingual reasoning models
* Fine-tuning small models for Urdu chain-of-thought
* Research on cross-lingual consistency
* Translation-robust reasoning tasks

It is primarily an **evaluation dataset**, not a large-scale training set.

---
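One way the `answer_number` field could be used for direct-QA evaluation is exact-match scoring on the final number in a model response. The sketch below is a minimal illustration under stated assumptions, not the publisher's evaluation procedure: the sample records are invented to mirror the documented schema, the "last number in the response" rule is a common convention rather than something the card specifies, and the helpers `extract_final_number` and `exact_match_accuracy` are hypothetical names.

```python
import re

# Hypothetical records mirroring the documented schema; the values are
# invented for illustration and are NOT taken from the dataset.
SAMPLE_RECORDS = [
    {"question": "Ali has 3 apples and buys 4 more. How many apples does he have?",
     "answer_number": 7},
    {"question": "A book costs 1,250 rupees. How much do 2 books cost?",
     "answer_number": 2500},
]

# Matches ASCII-digit numbers with optional sign, thousands separators, and
# decimals. Urdu-script digits would need extra normalization first.
_NUMBER = re.compile(r"-?[0-9][0-9,]*(?:\.[0-9]+)?")


def extract_final_number(text):
    """Return the last number found in a model response as a float, or None."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def exact_match_accuracy(records, responses):
    """Fraction of responses whose final number equals `answer_number`."""
    if len(records) != len(responses):
        raise ValueError("records and responses must have the same length")
    if not records:
        return 0.0
    correct = sum(
        extract_final_number(response) == float(record["answer_number"])
        for record, response in zip(records, responses)
    )
    return correct / len(records)
```

With real data, `records` would be rows of the `test` split and `responses` the model outputs for the corresponding `urd_question` prompts. Scoring the reasoning itself against `equation_solution` would need a separate, looser comparison that this sketch does not attempt.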

Provider:

large-traversaal
