EleutherAI/muInstruct
Hugging Face · Updated 2024-03-12 · Indexed 2024-04-19
Download link:
https://hf-mirror.com/datasets/EleutherAI/muInstruct
Resource description:
---
license: apache-2.0
task_categories:
- text2text-generation
language:
- en
tags:
- math
size_categories:
- 1K<n<10K
---
**μInstruct** is a dataset of 1600 instruction-response pairs collected from highly-rated Stack Exchange answers, the Khan Academy subset of [AMPS](https://github.com/hendrycks/math), and the [MATH](https://huggingface.co/datasets/hendrycks/competition_math) training set. All training examples are valid Markdown and have been manually reviewed for quality.
The μInstruct dataset is most useful when mixed in with larger instruction or chat datasets, such as [OpenHermes](https://huggingface.co/datasets/teknium/OpenHermes-2.5). Because μInstruct is especially high-quality, you may consider oversampling it in your training mixture.
μInstruct was used to train [`llemma_7b_muinstruct_camelmath`](https://huggingface.co/EleutherAI/llemma_7b_muinstruct_camelmath).
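The oversampling suggestion above can be sketched in a few lines. This is a minimal illustration, not the recipe used to train any released model: the helper name `build_mixture`, the oversampling factor of 3, and the toy records standing in for μInstruct and a larger chat dataset are all assumptions made for the example.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def build_mixture(
    small: Sequence[T],
    large: Sequence[T],
    oversample_factor: int = 3,  # hypothetical value; tune for your mixture
    seed: int = 0,
) -> List[T]:
    """Repeat `small` `oversample_factor` times, append `large`, and shuffle."""
    if oversample_factor < 1:
        raise ValueError("oversample_factor must be at least 1")
    mixture = list(small) * oversample_factor + list(large)
    # A seeded RNG keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(mixture)
    return mixture


# Toy stand-ins for the real datasets (assumed record shape, not the actual schema).
mu_instruct = [{"source": "muInstruct", "id": i} for i in range(4)]
chat_data = [{"source": "chat", "id": i} for i in range(10)]

mixture = build_mixture(mu_instruct, chat_data, oversample_factor=3)
```

With these toy inputs, each μInstruct record appears three times in the mixture while each chat record appears once. An appropriate factor for a real training run depends on the relative dataset sizes and is best chosen empirically.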
Provider:
EleutherAI
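Summary of original information
The μInstruct dataset contains 1600 instruction-response pairs drawn from highly-rated Stack Exchange answers, the Khan Academy subset, and the MATH training set. All training examples are valid Markdown and have been manually reviewed for quality.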



