
SSC-BanglaTutor: A Curriculum-Aligned Bengali Dataset for Intelligent Tutoring Systems

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/krn9bzypsn
Description:
This dataset comprises a Bengali-language educational corpus specifically curated to support the fine-tuning and evaluation of AI-driven, hint-based tutoring systems aligned with the Secondary School Certificate (SSC) science curriculum of Bangladesh. It contains a total of 11,286 structured question–answer–hint entries, distributed across three core science subjects:

- Biology: 4,859 entries (14 chapters)
- Chemistry: 3,034 entries (12 chapters)
- Physics: 3,393 entries (14 chapters)

Each entry includes:

- A question written in Bengali
- Five progressively ranked hints guiding learners from general to specific concepts
- A convergence metric estimating the probability of a correct response at each hint
- Correct and distractor answers based on common student misconceptions
- Curriculum-aligned topic tags mapped to the SSC syllabus

All data are encoded in UTF-8 JSON Lines (.jsonl) format, ensuring compatibility with Bengali NLP tools and large-scale AI training pipelines. The dataset’s structured design supports personalized feedback, enabling adaptive learning, retrieval-augmented generation (RAG), and fine-tuning of large language models (LLMs) for education in low-resource languages.
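The description gives the file format (UTF-8 JSON Lines) and the content of each entry, but not the field names. The sketch below shows one way such a file might be read and sanity-checked in Python. The keys `subject`, `question`, `hints`, and `convergence`, and the two sample records, are assumptions made up for illustration; the real schema should be checked against the downloaded files.

```python
import io
import json
from collections import Counter

# Per-subject entry counts as stated in the dataset description.
STATED_COUNTS = {"Biology": 4859, "Chemistry": 3034, "Physics": 3393}

# Two made-up records standing in for a real .jsonl file.
# All field names here are hypothetical, not the dataset's documented schema.
SAMPLE_JSONL = (
    '{"subject": "Physics", "question": "বল কী?", '
    '"hints": ["h1", "h2", "h3", "h4", "h5"], '
    '"convergence": [0.2, 0.35, 0.5, 0.7, 0.9]}\n'
    '{"subject": "Biology", "question": "কোষ কী?", '
    '"hints": ["h1", "h2", "h3", "h4", "h5"], '
    '"convergence": [0.1, 0.3, 0.55, 0.8, 0.95]}\n'
)


def load_entries(stream):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def is_well_formed(entry, n_hints=5):
    """Check for five hints and one probability in [0, 1] per hint."""
    hints = entry.get("hints", [])
    conv = entry.get("convergence", [])
    return (
        len(hints) == n_hints
        and len(conv) == n_hints
        and all(0.0 <= p <= 1.0 for p in conv)
    )


# For a real file, replace the StringIO with: open(path, encoding="utf-8")
entries = list(load_entries(io.StringIO(SAMPLE_JSONL)))
counts = Counter(e["subject"] for e in entries)
total_stated = sum(STATED_COUNTS.values())
```

Run against the full dataset, `counts` could be compared with `STATED_COUNTS` to confirm that the download is complete; the stated per-subject figures do sum to the stated total of 11,286.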
Created:
2025-10-27