
fenyo/Multilingual-Thinking-Test

Source: Hugging Face. Updated: 2025-12-15. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/fenyo/Multilingual-Thinking-Test
Description:
Multilingual-Thinking is a reasoning dataset in which the chain-of-thought has been translated from English into one of four languages: Spanish, French, Italian, or German. It was created by sampling 1,000 training samples from the SystemChat subset of SmolTalk2 and translating the reasoning traces with another language model. The dataset was used in the OpenAI Cookbook to fine-tune the OpenAI gpt-oss models. It covers five languages in total: English, German, French, Spanish, and Italian. The message format includes developer (custom instructions for the model), user (the input to the model), analysis (the model's chain-of-thought), final (the message intended to be shown to the end user), and messages (the list of messages that combines the content of the above into a full conversation).
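
As a rough illustration of how the messages list relates to the other four fields, the sketch below combines developer, user, analysis, and final into one ordered conversation. The `build_messages` helper, the `role`/`content` keys, and the sample record are all assumptions made for this example; the dataset's actual `messages` schema may use different role names or keys, so check a real record before relying on it.

```python
from typing import Dict, List

# Field order assumed for illustration: instructions, input, reasoning, answer.
FIELD_ORDER = ["developer", "user", "analysis", "final"]


def build_messages(record: Dict[str, str]) -> List[Dict[str, str]]:
    """Combine the per-record fields into one ordered conversation.

    Hypothetical helper: the 'role' and 'content' keys are an assumption,
    not the dataset's documented schema. Missing or empty fields are skipped.
    """
    return [
        {"role": name, "content": record[name]}
        for name in FIELD_ORDER
        if record.get(name)
    ]


# Made-up record (not taken from the dataset); the reasoning is in Spanish
# to mirror the translated chain-of-thought the description mentions.
example = {
    "developer": "You are a concise assistant.",
    "user": "What is 2 + 2?",
    "analysis": "El usuario pide una suma simple: 2 + 2 = 4.",
    "final": "4",
}

messages = build_messages(example)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Keeping the reasoning (analysis) as a separate entry from the user-facing answer (final) mirrors the split the description draws between the two.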
Provider:
fenyo