SSC-BanglaTutor: A Curriculum-Aligned Bengali Dataset for Intelligent Tutoring Systems

NIAID Data Ecosystem, indexed 2026-05-10

Download link:

https://data.mendeley.com/datasets/krn9bzypsn

Resource description:

This dataset comprises a Bengali-language educational corpus specifically curated to support the fine-tuning and evaluation of AI-driven, hint-based tutoring systems aligned with the Secondary School Certificate (SSC) science curriculum of Bangladesh. It contains a total of 11,286 structured question–answer–hint entries, distributed across three core science subjects:

- Biology: 4,859 entries (14 chapters)
- Chemistry: 3,034 entries (12 chapters)
- Physics: 3,393 entries (14 chapters)

Each entry includes:

- A question written in Bengali
- Five progressively ranked hints guiding learners from general to specific concepts
- A convergence metric estimating the probability of a correct response at each hint
- Correct and distractor answers based on common student misconceptions
- Curriculum-aligned topic tags mapped to the SSC syllabus

All data are encoded in UTF-8 JSON Lines (.jsonl) format, ensuring compatibility with Bengali NLP tools and large-scale AI training pipelines. The dataset’s structured design supports personalized feedback, enabling adaptive learning, retrieval-augmented generation (RAG), and fine-tuning of large language models (LLMs) for education in low-resource languages.
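Because the entries are distributed as UTF-8 JSON Lines, each line can be parsed independently as one JSON object. The following is a minimal sketch of reading such a file with the Python standard library. The description does not publish the field names, so the keys `subject`, `question`, and `hints` and the two sample records are hypothetical placeholders, not the dataset's documented schema.

```python
import io
import json
from collections import Counter


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSON Lines text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Two made-up records standing in for real entries. The keys "subject",
# "question", and "hints" are assumptions for illustration only.
sample = io.StringIO(
    json.dumps(
        {"subject": "Physics", "question": "বল কী?", "hints": ["h1", "h2", "h3", "h4", "h5"]},
        ensure_ascii=False,  # keep Bengali text as UTF-8 instead of \u escapes
    )
    + "\n"
    + json.dumps(
        {"subject": "Biology", "question": "কোষ কী?", "hints": ["h1", "h2", "h3", "h4", "h5"]},
        ensure_ascii=False,
    )
    + "\n"
)

records = list(read_jsonl(sample))

# Count entries per subject and check that every entry carries five hints.
per_subject = Counter(r["subject"] for r in records)
all_have_five_hints = all(len(r["hints"]) == 5 for r in records)
```

For a downloaded file, the same function should work on a handle opened with `open(path, encoding="utf-8")`, where `path` points at one of the `.jsonl` files. The keys would need to be adjusted to whatever the actual files contain.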

Created:

2025-10-27