david-thrower/smol-smoltalk-plus-reasoning-synthetic-data
Source: Hugging Face. Updated: 2025-01-31. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/david-thrower/smol-smoltalk-plus-reasoning-synthetic-data
Description:
The Smol-Smoltalk Plus Reasoning dataset is a specialized dataset for training small language models with reasoning capabilities. It is based on the SmolTalk dataset but tailored to models with fewer than 1 billion parameters. It incorporates synthetic reasoning data to help such models emulate the behavior of larger models like DeepSeek-R1 while retaining the efficiency and adaptability of a small architecture.
Provider:
david-thrower



