crumb/flan-ul2-tinystories-complex
Source: Hugging Face. Updated 2023-07-08. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/crumb/flan-ul2-tinystories-complex
Resource description:
---
license: mit
language:
- en
---
Around a quarter of a million examples generated by Flan-UL2 (20B) with the prompt "Write a complex short story using the vocabulary of a third-grader.", intended for use in an experimental curriculum learning setting. I had to checkpoint every 1024 examples to mitigate the program slowing down due to memory usage. Generation was run in bf16 on an RTX A6000 with the following settings:
```
top_k = random between (40, 128)
temperature = random between (0.6, 0.95)
max_length = 128
batch_size = 32
```
I wanted to avoid a uniform, boring set that repeats the same exact patterns, so I randomly modulated the temperature and top_k values to get a good mix. This cost about $6 USD to create on RunPod.
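As a rough illustration of the randomization described above, the sketch below draws a fresh `top_k` and `temperature` from the listed ranges. It is not the script used to build this dataset: the helper name `sample_generation_settings`, the fixed seed, the choice to resample once per batch, and the inclusive integer bounds for `top_k` are all assumptions. The dictionary keys match the sampling arguments accepted by `generate` in Hugging Face `transformers`, so the result could plausibly be passed as keyword arguments to such a call.

```python
import random


def sample_generation_settings(rng: random.Random) -> dict:
    """Draw one set of sampling settings within the ranges listed above.

    Hypothetical helper: the original generation script is not published here.
    """
    return {
        "do_sample": True,
        # Assumption: top_k is an integer drawn with inclusive bounds.
        "top_k": rng.randint(40, 128),
        "temperature": rng.uniform(0.6, 0.95),
        "max_length": 128,
    }


# Assumption: settings are resampled once per batch of 32 prompts.
rng = random.Random(0)
settings_per_batch = [sample_generation_settings(rng) for _ in range(4)]
for settings in settings_per_batch:
    print(settings)
```

Resampling per batch rather than per example keeps every sequence in a batch under the same sampling settings, which is the simpler arrangement when a whole batch goes through one generation call.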
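Provider:
crumb
Summary of original information
Dataset overview
- Dataset name: Complex short stories generated by Flan-UL2 (20B)
- Size: about 250,000 examples
- Purpose: for use in an experimental curriculum learning setting
- Generation prompt: "Write a complex short story using the vocabulary of a third-grader."
- Generation parameters:
  - top_k: chosen at random between 40 and 128
  - temperature: chosen at random between 0.6 and 0.95
  - max_length: 128
  - batch_size: 32
- Runtime environment: run in bf16 on an RTX A6000
- Storage: a checkpoint is saved every 1024 examples to keep the program from slowing down due to memory usage
- Cost: about $6 USD
- License: MIT
- Language: English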
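
The checkpointing scheme (a save every 1024 examples) could look roughly like the sketch below. This is a guess at the general shape, not the author's code: `fake_generate_batch` is a hypothetical stand-in for the real model call, and the JSON Lines shard format and file names are assumptions. The idea shown is only that finished examples are written to disk and dropped from memory at a fixed interval.

```python
import json
import tempfile
from pathlib import Path

CHECKPOINT_EVERY = 1024  # examples per checkpoint, as stated above
BATCH_SIZE = 32          # batch size, as stated above


def fake_generate_batch(batch_size: int) -> list[str]:
    """Hypothetical stand-in for a batched model generation call."""
    return [f"story {i}" for i in range(batch_size)]


def write_shard(examples: list[str], out_dir: Path, index: int) -> Path:
    """Write one checkpoint shard as JSON Lines (format is an assumption)."""
    path = out_dir / f"shard_{index:05d}.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for text in examples:
            f.write(json.dumps({"text": text}) + "\n")
    return path


def run(total_examples: int, out_dir: str) -> list[Path]:
    """Generate examples in batches, flushing to disk every CHECKPOINT_EVERY."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    buffer: list[str] = []
    shards: list[Path] = []
    produced = 0
    while produced < total_examples:
        batch = fake_generate_batch(min(BATCH_SIZE, total_examples - produced))
        buffer.extend(batch)
        produced += len(batch)
        if len(buffer) >= CHECKPOINT_EVERY:
            shards.append(write_shard(buffer, out, len(shards)))
            buffer = []  # release the finished examples from memory
    if buffer:  # flush whatever is left at the end
        shards.append(write_shard(buffer, out, len(shards)))
    return shards


with tempfile.TemporaryDirectory() as tmp:
    paths = run(2100, tmp)
    print(len(paths), "shards written")
```

Because 1024 is an exact multiple of the batch size of 32, each full shard in this sketch holds exactly 1024 examples, and only the final shard can be smaller.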



