genepistudios/tinystoriesclean
Source: Hugging Face. Updated 2025-11-29. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/genepistudios/tinystoriesclean
Description:
---
license: cdla-sharing-1.0
task_categories:
- text-generation
language:
- en
pretty_name: TinyStories - Cleaned dataset
size_categories:
- 1M<n<10M
---
A cleaned-up version of the TinyStories dataset (roneneldan/TinyStories) with various artifacts removed (e.g., non-Latin characters and non-standard punctuation).
Stories are separated by tags of this format: `<START><STORY></STORY><END>`
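
As a rough illustration of how the tag format above might be used, the following sketch splits a raw text blob into individual stories. It assumes that each story's text sits between `<STORY>` and `</STORY>` and that each record is wrapped in `<START>` and `<END>`; the card does not spell this out, so check it against the actual files. The function name `split_stories` and the sample text are hypothetical and not part of the dataset.

```python
import re

# Assumption: one record looks like <START><STORY>story text</STORY><END>.
# Whitespace between the outer and inner tags is tolerated.
STORY_PATTERN = re.compile(
    r"<START>\s*<STORY>(.*?)</STORY>\s*<END>",
    re.DOTALL,  # let "." match newlines, since stories may span several lines
)


def split_stories(raw_text: str) -> list[str]:
    """Return the stripped story bodies found in raw_text (hypothetical helper)."""
    return [body.strip() for body in STORY_PATTERN.findall(raw_text)]


# Made-up sample text, not taken from the dataset.
sample = (
    "<START><STORY>Tom had a red ball.</STORY><END>\n"
    "<START><STORY>Lily saw a cat.\nThe cat ran away.</STORY><END>"
)

stories = split_stories(sample)
for index, story in enumerate(stories):
    print(index, repr(story))
```

The non-greedy `(.*?)` keeps each match inside a single record, so adjacent stories are not merged into one.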
Provider:
genepistudios



