five

HuggingFaceTB/smol-smoltalk

收藏
Hugging Face2024-11-21 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/HuggingFaceTB/smol-smoltalk
下载链接
链接失效反馈
官方服务:
资源简介:
Smol-SmalTalk数据集是SmolTalk数据集的一个子集,专门为参数少于1B的小型模型设计。它用于训练SmolLM2-360M-Instruct和SmolLM2-135M-Instruct模型,并进行了SFT和DPO训练。与原始SmolTalk数据集相比,这个子集的对话更短,任务特定数据更少,且不包含高级数学数据集。

Smol-SmalTalk is a subset of the SmolTalk dataset adapted for smol models with less than 1B parameters. It was used to build SmolLM2-360M-Instruct and SmolLM2-135M-Instruct models, undergoing SFT and then DPO on UltraFeedback. Compared to SmolTalk, the conversations in this dataset are shorter, it includes less task-specific data, and it does not contain any advanced math datasets.
提供机构:
HuggingFaceTB
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作