five

PJMixers-Dev/O1-OPEN_OpenO1-SFT-English-CustomShareGPT

收藏
Hugging Face2025-01-27 更新2025-04-12 收录
下载链接:
https://hf-mirror.com/datasets/PJMixers-Dev/O1-OPEN_OpenO1-SFT-English-CustomShareGPT
下载链接
链接失效反馈
官方服务:
资源简介:
该数据集经过预处理,移除了一些样本。具体来说,因为思维模式失败而丢弃了9779个样本,因为解决方案模式失败而丢弃了221个样本,因为发现中文字符而丢弃了6703个样本。代码包含检查中文字符和从数据中提取思维和解决方案字符串的函数。然后遍历数据集,应用这些函数,并过滤掉不符合特定条件的样本,比如包含中文字符或空思维/解决方案字符串。过滤后的数据以特定格式添加到新的列表new_data中,包括指令、思考、输出以及对话结构。

The dataset has undergone preprocessing, during which some samples were removed. Specifically, 9779 samples were dropped due to thought pattern failure, 221 samples were dropped due to solution pattern failure, and 6703 samples were dropped because Chinese characters were found. The code includes functions to check for Chinese characters and to extract thought and solution strings from the data. It then iterates over the dataset, applying these functions and filtering out samples that do not meet certain criteria, such as containing Chinese characters or empty thought/solution strings. The filtered data is appended to a new list new_data with a specific format including the instruction, thinking, output, and a conversation structure.
提供机构:
PJMixers-Dev
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作