HuggingFaceTB/smol-smoltalk
Source: Hugging Face. Updated 2024-11-21; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/HuggingFaceTB/smol-smoltalk
Description:
Smol-SmolTalk is a subset of the SmolTalk dataset adapted for small ("smol") models with fewer than 1B parameters. It was used to build the SmolLM2-360M-Instruct and SmolLM2-135M-Instruct models, which were trained with SFT on this dataset and then with DPO on UltraFeedback. Compared to SmolTalk, the conversations in this dataset are shorter, it includes less task-specific data, and it does not contain any advanced math datasets.
Provider:
HuggingFaceTB



