
"BanglaEdu_MCQ"

DataCite Commons | Updated 2026-03-14 | Indexed 2026-05-03
Download link:
https://ieee-dataport.org/documents/banglaedumcq
Resource description:

## Overview
BanglaEdu-MCQ is a large-scale, high-quality multiple-choice question (MCQ) dataset designed for Bengali language education and computational linguistics research. The dataset comprises 6,261 carefully curated questions sourced from NCTB (National Curriculum and Textbook Board) educational materials, covering diverse subject areas including science, history, literature, technology, and general knowledge.

## Key Features
Each question entry includes a contextual passage, a question stem, four multiple-choice options, the correct answer, and a detailed explanation. The dataset uses passage-disjoint splitting with 5,631 training, 727 validation, and 675 test samples, so that no passage appears in more than one partition and no data leaks across partitions. All content is in Bangla, with over 98% language purity after comprehensive cleaning.

## Applications
This dataset serves as a benchmark for developing and evaluating Bengali reading comprehension systems, transformer-based models, and educational AI applications. It supports machine learning research in Bangla NLP, curriculum development, and question-answering systems for Bengali-speaking learners worldwide.

## Dataset Specifications
Total Questions: 7,033
Unique Passages: 5,920
Language: Bangla (Bengali)
Format: CSV
Data Quality: 100% complete, zero missing values
License: Educational use

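The passage-disjoint splitting mentioned under Key Features can be illustrated with a minimal Python sketch. It is not the dataset authors' procedure: the column name `passage`, the 80/10/10 fractions, and the toy rows are assumptions made only for illustration. The sketch assigns whole passages to a partition, then checks that no passage is shared between partitions. It also confirms that the stated split sizes (5,631 + 727 + 675) sum to 7,033, the Total Questions figure in the specifications.

```python
import random
from collections import defaultdict

# Split sizes as stated in the dataset description.
STATED_SPLITS = {"train": 5631, "validation": 727, "test": 675}
STATED_TOTAL = sum(STATED_SPLITS.values())  # 7033


def passage_disjoint_split(rows, fractions=(0.8, 0.1, 0.1), seed=0, key="passage"):
    """Assign whole passages, not individual questions, to the three partitions.

    `key` is a hypothetical column name; the real CSV header may differ.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    # Shuffle the unique passages deterministically, then cut by fraction.
    passages = sorted(groups)
    random.Random(seed).shuffle(passages)
    n_train = int(len(passages) * fractions[0])
    n_val = int(len(passages) * fractions[1])
    buckets = {
        "train": passages[:n_train],
        "validation": passages[n_train:n_train + n_val],
        "test": passages[n_train + n_val:],
    }
    return {name: [r for p in ps for r in groups[p]] for name, ps in buckets.items()}


def shared_passages(splits, key="passage"):
    """Return the passages found in more than one partition (empty set = no leakage)."""
    first_seen = {}
    leaked = set()
    for name, rows in splits.items():
        for passage in {r[key] for r in rows}:
            if passage in first_seen and first_seen[passage] != name:
                leaked.add(passage)
            first_seen.setdefault(passage, name)
    return leaked


# Toy data: 10 made-up passages with 2 questions each.
toy_rows = [{"passage": f"p{i // 2}", "question": f"q{i}"} for i in range(20)]
splits = passage_disjoint_split(toy_rows)
assert shared_passages(splits) == set()
assert sum(len(rows) for rows in splits.values()) == len(toy_rows)
```

To run the same check on the published files, the rows could be read with `csv.DictReader` (opening the file with `encoding="utf-8"`) and passed to `shared_passages`, after replacing `passage` with whatever the actual passage column is called.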
Provider:
IEEE DataPort
Created:
2026-03-14