teleprint-me/tinypairs
Source: Hugging Face. Updated 2025-03-12. Indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/teleprint-me/tinypairs
Description:
TinyPairs is an English dataset of 1,000 preprocessed input-target pairs intended for training small language models. It is stored in JSON format, and each entry consists of one input text and one target text. The data is derived from roneneldan/TinyStories, with text normalization and intelligent sentence splitting applied to produce structured pairs for model training.
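As a rough illustration of the entry structure described above, the following Python sketch parses a JSON array of pairs and checks that each record has both fields. The key names `input` and `target`, the helper `load_pairs`, and the sample sentences are all assumptions made for illustration; the actual file layout should be checked against the dataset itself.

```python
import json

# Made-up placeholder records; the real key names and content are assumptions.
SAMPLE = """[
  {"input": "Once upon a time, there was a little cat.", "target": "The cat liked to play."},
  {"input": "Tom had a red ball.", "target": "He threw it high."}
]"""


def load_pairs(text, input_key="input", target_key="target"):
    """Parse a JSON array of objects into a list of (input, target) tuples."""
    records = json.loads(text)
    pairs = []
    for i, rec in enumerate(records):
        # Fail early if a record lacks either field of the pair.
        if input_key not in rec or target_key not in rec:
            raise KeyError(f"record {i} is missing {input_key!r} or {target_key!r}")
        pairs.append((rec[input_key], rec[target_key]))
    return pairs


pairs = load_pairs(SAMPLE)
print(len(pairs), "pairs loaded")
```

If the real file uses different field names, they can be passed through `input_key` and `target_key` without changing the rest of the function.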
Provider:
teleprint-me



