
DataOrigin/k12-video-solutions-india

Source: Hugging Face | Updated: 2026-04-06 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/DataOrigin/k12-video-solutions-india
Resource description:
---
license: other
task_categories:
- video-classification
- question-answering
language:
- hi
- en
- ta
- te
- ml
- bn
- mr
- gu
- or
- pa
- as
tags:
- education
- k12
- india
- multilingual
- mathematics
- science
- curriculum
- indic
pretty_name: K-12 Video Solutions India
size_categories:
- 1M<n<10M
---

# K-12 Video Solutions India

## Dataset Description

A large-scale collection of step-by-step video solutions to K-12 curriculum questions, solved by subject matter experts. Produced by Prepp, India's largest K-12 learning platform, operated by Collegedunia Web Private Limited.

## Dataset Summary

- **Total videos:** 3.8 million (38 lakh)
- **Content type:** Step-by-step question-answer video solutions
- **Subjects:** Mathematics, Science, Social Studies, English, Regional Languages
- **Grades:** Class 1 through Class 12
- **Boards:** 33 Indian educational boards including CBSE, ICSE, and all major state boards
- **Languages:** Hindi, English, Tamil, Telugu, Kannada, Malayalam, Marathi, Bengali, Gujarati, Odia, Punjabi, Assamese
- **Format:** Video (MP4) with audio narration and visual problem-solving

## Sample Data

Three sample videos are available in this repository demonstrating:

- Sample 1: Mathematics problem solving (Class 10, CBSE)
- Sample 2: Science concept explanation (Class 8, CBSE)
- Sample 3: Regional language content (Hindi medium)

## Key Features

- **Human-verified:** All solutions reviewed by qualified subject experts
- **Curriculum-mapped:** Each video tagged to specific board, grade, subject, chapter, and topic
- **Multimodal:** Rich combination of audio narration, handwritten working, diagrams, and step-by-step visual explanation
- **Low-resource languages:** Includes Odia, Assamese, and Punjabi, which are genuinely low-resource for AI training globally

## Intended Uses

- Training multimodal AI models for educational reasoning
- STEM problem-solving model development
- Indic language speech and audio model training
- Video understanding and question-answering model fine-tuning
- Multilingual educational AI applications

## Data Collection and Rights

All content is proprietary, produced by Prepp's in-house content team and contracted subject matter experts under work-for-hire agreements. Content is ethically sourced and curriculum-mapped. Full dataset licensing is available for commercial AI training purposes.

## Licensing and Commercial Access

This repository contains sample data only. The full dataset of 3.8 million videos is available for commercial AI training licensing.

**For licensing inquiries contact:**
Ankit Dubey, Head of AI Data Partnerships, Collegedunia
ankit.dubey@collegedunia.com

## Dataset Curator

[Collegedunia Web Private Limited](https://collegedunia.com) | [Prepp](https://prepp.in)
Gurugram, Haryana, India
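
## Example: Filtering by Curriculum Tags

The card states that each video is tagged to a board, grade, subject, chapter, and topic, but it does not publish a metadata schema. The sketch below is only an illustration of how such tags might be represented and filtered in Python. The `VideoRecord` class, its field names, the `filter_records` helper, and the `sample_N.mp4` file names are all assumptions made for this example, not the dataset's actual format. The three records reflect only what the card says about the samples; anything the card does not state is left as `None`.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class VideoRecord:
    """Hypothetical per-video metadata; field names are assumed, not official."""
    path: str                       # assumed file name, not taken from the repository
    board: Optional[str] = None     # e.g. "CBSE"
    grade: Optional[int] = None     # Class 1 through Class 12
    subject: Optional[str] = None
    chapter: Optional[str] = None   # not stated for any sample on the card
    topic: Optional[str] = None     # not stated for any sample on the card
    language: Optional[str] = None  # ISO 639-1 code, as in the card's front matter


def filter_records(
    records: Iterable[VideoRecord],
    board: Optional[str] = None,
    grade: Optional[int] = None,
    subject: Optional[str] = None,
    language: Optional[str] = None,
) -> List[VideoRecord]:
    """Return the records that match every criterion that is not None."""
    wanted = {"board": board, "grade": grade, "subject": subject, "language": language}
    active = {key: value for key, value in wanted.items() if value is not None}
    return [
        record
        for record in records
        if all(getattr(record, key) == value for key, value in active.items())
    ]


# The three samples as described on the card; unknown tags stay None.
SAMPLES = [
    VideoRecord("sample_1.mp4", board="CBSE", grade=10, subject="Mathematics"),
    VideoRecord("sample_2.mp4", board="CBSE", grade=8, subject="Science"),
    VideoRecord("sample_3.mp4", language="hi"),
]

cbse_paths = [record.path for record in filter_records(SAMPLES, board="CBSE")]
print(cbse_paths)
```

Leaving unstated tags as `None` rather than guessing keeps the example faithful to the card; a real loader would need the schema from the licensed dataset.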
Provider:
DataOrigin