itsazza/KTT-Day3

Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/itsazza/KTT-Day3
Resource description:
This dataset is the orchestration and adaptive logic layer for an offline, CPU-bound AI math tutor. It has three components: an ASR component (openai/whisper-tiny, 39M parameters) for primary transcription and number extraction; an Adaptive Brain, a Bayesian Knowledge Tracing (BKT) engine that tracks P(mastery) in real time across 5 distinct mathematical sub-skills and uses it to filter question difficulty dynamically; and Visual Grounding, which generates geometry procedurally to keep the disk footprint small and inference latency under 2.5 s. The orchestrator also layers custom logic over the ASR to detect and gracefully handle language code-switching. The BKT model was evaluated against an Elo-style baseline on a held-out replay of 100 historical simulated student interactions, where its AUC for predicting next-response correctness was consistently higher than the baseline's. Because of the strict 75 MB footprint constraint, the pipeline does not use quantized LLM adapters (QLoRA) for language generation and relies instead on high-quality, pre-generated localized stems and rule-based regex extraction.
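The Bayesian Knowledge Tracing step described above can be illustrated with the standard textbook BKT update. This is a minimal sketch and not the dataset's own code: the parameter values, the difficulty thresholds, the sub-skill labels, and the names `BKTParams`, `bkt_update`, `predict_correct`, and `pick_difficulty` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    # Hypothetical values for illustration only; not taken from the dataset.
    p_init: float = 0.2    # P(L0): prior probability the skill is already mastered
    p_learn: float = 0.15  # P(T): probability of acquiring the skill after an attempt
    p_slip: float = 0.1    # P(S): probability of a wrong answer despite mastery
    p_guess: float = 0.25  # P(G): probability of a correct answer without mastery


def bkt_update(p_mastery: float, correct: bool, params: BKTParams) -> float:
    """Return P(mastery) after observing one response (standard BKT update)."""
    if correct:
        num = p_mastery * (1 - params.p_slip)
        den = num + (1 - p_mastery) * params.p_guess
    else:
        num = p_mastery * params.p_slip
        den = num + (1 - p_mastery) * (1 - params.p_guess)
    posterior = num / den  # Bayes step: condition on the observed response
    # Learning step: the student may acquire the skill during this attempt.
    return posterior + (1 - posterior) * params.p_learn


def predict_correct(p_mastery: float, params: BKTParams) -> float:
    """Predicted probability that the next response is correct."""
    return p_mastery * (1 - params.p_slip) + (1 - p_mastery) * params.p_guess


def pick_difficulty(p_mastery: float, low: float = 0.4, high: float = 0.8) -> str:
    """Map P(mastery) to a difficulty band (thresholds are assumptions)."""
    if p_mastery < low:
        return "easy"
    if p_mastery < high:
        return "medium"
    return "hard"


if __name__ == "__main__":
    params = BKTParams()
    # One mastery estimate per sub-skill; these five labels are hypothetical.
    skills = ["counting", "addition", "subtraction", "comparison", "shapes"]
    mastery = {skill: params.p_init for skill in skills}

    # Replay a short, made-up response sequence for one sub-skill.
    for correct in [True, True, False, True]:
        mastery["addition"] = bkt_update(mastery["addition"], correct, params)
        band = pick_difficulty(mastery["addition"])
        print(f"correct={correct} P(mastery)={mastery['addition']:.3f} next={band}")
```

The value returned by `predict_correct` is the kind of score an AUC comparison against an Elo-style baseline would use, since both models output a probability that the next response is correct.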
Provided by:
itsazza