AllyArc/allyarc_oai_format
收藏Hugging Face2024-04-09 更新2024-06-11 收录
下载链接:
https://hf-mirror.com/datasets/AllyArc/allyarc_oai_format
下载链接
链接失效反馈官方服务:
资源简介:
---
license: apache-2.0
task_categories:
- question-answering
language:
- en
pretty_name: AllyArc OpenAI Dataset Format
size_categories:
- 1K<n<10K
---
# Dataset Card for AllyArc/allyarc_oai_format
This dataset card provides a structured overview of the AllyArc/allyarc_oai_format dataset, designed for training conversational AI models tailored for educational purposes, with a special focus on supporting students with diverse learning needs, including those in Special Educational Needs (SEN) education.
## Dataset Details
### Dataset Description
The AllyArc/allyarc_oai_format dataset is comprised of conversational exchanges formatted to support the training of AI models for educational dialogues. It includes interactions that cover a wide range of educational support tasks, such as providing detailed explanations (breakdowns), adapting to various learning styles, incorporating student interests into lessons, and managing classroom dynamics tailored to SEN education.
- **Curated by:** Zainab Fahim
- **Language(s) (NLP):** English
- **License:** MIT License
### Dataset Sources
- **Repository:** Hugging Face Datasets
## Uses
### Direct Use
The dataset is intended for direct use in training conversational AI models to:
- Understand and respond to educational queries.
- Personalize interactions based on student needs and learning styles.
- Provide breakdowns of complex educational content.
- Engage students with tailored educational strategies.
### Out-of-Scope Use
This dataset is not intended for uses beyond educational support. Specifically, it should not be used for:
- Commercial advertising.
- Non-educational chatbot training.
- Any form of decision-making that could negatively impact students' wellbeing.
## Dataset Structure
The dataset is structured into dialogues, each containing multiple turns with roles (`system`, `user`, `assistant`) indicating the speaker. It includes fields for dialogue ID, turns, education level, subject matter, and feedback mechanisms, facilitating comprehensive educational dialogues.
## Dataset Creation
### Curation Rationale
The dataset was curated to address the nuanced needs of SEN education, focusing on creating a supportive, interactive, and adaptive learning environment through AI-driven dialogues.
### Source Data
#### Data Collection and Processing
Data collection involved simulating educational dialogues that reflect typical interactions between students and an educational AI. The process emphasized personalization, adaptability, and inclusivity, considering the diverse needs of SEN students.
#### Who are the source data producers?
The data was produced by educational specialists, SEN teachers, and AI developers, with input from SEN students to ensure authenticity and relevance.
### Annotations
#### Annotation process
The dialogues were annotated with educational intent, subject matter tags, and personalized learning strategies to facilitate model training on educational tasks.
#### Who are the annotators?
Educational specialists and SEN teachers annotated the dataset, ensuring that the dialogues accurately reflect educational best practices and SEN considerations.
## Bias, Risks, and Limitations
The dataset aims to minimize bias by including diverse educational needs and learning styles. However, users should be aware of the limitations in scope and ensure models trained on this dataset are used ethically and considerately in educational contexts.
## Citation
**APA:**
AllyArc Educational Team. (2023). AllyArc/allyarc_oai_format Dataset. Hugging Face. URL
**BibTeX:**
```bibtex
@misc{allyarc2023dataset,
title={AllyArc/allyarc_oai_format Dataset},
author={AllyArc Educational Team},
year={2023},
publisher={Hugging Face},
howpublished={\url{}},
}
```
## Dataset Card Authors
Zainab Fahim
## Dataset Card Contact
For inquiries related to the AllyArc/allyarc_oai_format dataset, please contact: [Zainab Fahim](mailto:shafna.zainab.fahim@gmail.com)
提供机构:
AllyArc
原始信息汇总
数据集概述
基本信息
- 数据集名称: AllyArc/allyarc_oai_format
- 许可证: MIT License
- 语言: 英语
- 任务类别: 问答
- 数据集大小: 1K<n<10K
- 数据集格式: 对话交换格式,支持AI模型训练
- 数据集目的: 训练针对教育目的的对话AI模型,特别关注特殊教育需求(SEN)学生
数据集详情
数据集描述
- 内容: 包含多种教育支持任务的对话,如提供详细解释、适应不同学习风格、结合学生兴趣和适应SEN教育。
- 创建者: Zainab Fahim
数据集结构
- 结构: 对话形式,包含对话ID、回合、教育水平、科目和反馈机制。
- 角色:
system,user,assistant
数据集创建
- 采集与处理: 模拟教育对话,强调个性化、适应性和包容性。
- 数据生产者: 教育专家、SEN教师和AI开发者。
- 标注过程: 由教育专家和SEN教师进行,确保对话反映教育最佳实践和SEN考虑。
使用限制
- 直接用途: 用于训练AI模型理解并响应教育查询,个性化交互,提供教育内容分解,以及采用定制教育策略。
- 非预期用途: 不应用于商业广告、非教育性聊天机器人训练或可能对学生福祉产生负面影响的决策。
偏差、风险和限制
- 目标: 通过包含多样化的教育需求和学习风格来最小化偏差。
- 注意事项: 用户应意识到数据集的局限性,并确保模型在教育环境中使用时遵循伦理和考虑周到。
联系信息
- 作者: Zainab Fahim
- 联系方式: Zainab Fahim



