CanisAI/teach-r3-multilingual
Source: Hugging Face. Updated 2026-04-29. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/CanisAI/teach-r3-multilingual
Resource overview:
Canis TEACH Multilingual Tutoring R3 is a multilingual, register-realistic Socratic tutoring dataset for K–12 education, built so that small educational language models learn from how students actually chat rather than from polished one-shot prompts. It covers six languages (English, German, Ukrainian, French, Spanish, Italian) and subjects such as math, science, language arts, and the humanities. The data is synthetic and privacy-safe, and it is intended for supervised fine-tuning of small open models (≤3B) that power local, privacy-preserving tutoring UIs. The dataset places particular emphasis on informal student language and multilingual pedagogy, and it comes with detailed statistics and a documented file structure.
Provider:
CanisAI



