five

Chan-Y/Opus-4.6-Reasoning-3000x-filtered-tr

收藏
Hugging Face2026-03-22 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/Chan-Y/Opus-4.6-Reasoning-3000x-filtered-tr
下载链接
链接失效反馈
官方服务:
资源简介:
--- dataset_info: features: - name: problem dtype: string - name: thinking dtype: string - name: solution dtype: string - name: difficulty dtype: string - name: category dtype: string splits: - name: train num_bytes: 5935468 num_examples: 2326 download_size: 3322553 dataset_size: 5935468 configs: - config_name: default data_files: - split: train path: data/train-* language: - tr --- # 🇹🇷 Turkish Translation of Opus-4.6 Reasoning Dataset ## Dataset Description * **Name:** `Opus-4.6-Reasoning-3000x-filtered-tr` * **Original Dataset:** `nohurry/Opus-4.6-Reasoning-3000x-filtered` * **Language:** Turkish (translated from English) * **Size:** ~3000 samples * **Task:** Reasoning / Instruction Following / Chain-of-Thought This dataset is a Turkish translation of the original *Opus-4.6 Reasoning 3000x filtered* dataset. The goal is to provide high-quality reasoning data for Turkish LLM training, evaluation, and alignment tasks. --- ## Motivation High-quality reasoning datasets in Turkish are limited. This dataset aims to: * Improve Turkish LLM reasoning capabilities * Enable better instruction-following in Turkish * Support fine-tuning and evaluation for Turkish AI systems --- ## Data Source * Original dataset: [`nohurry/Opus-4.6-Reasoning-3000x-filtered`](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) * The dataset consists of curated reasoning samples with structured prompts and responses. --- ## Translation Details * **Model used:** [`tencent/HY-MT1.5-1.8B`](https://huggingface.co/tencent/HY-MT1.5-1.8B) * **Method:** Fully automatic translation

数据集信息: 特征字段: - 字段名:problem,数据类型:字符串 - 字段名:thinking,数据类型:字符串 - 字段名:solution,数据类型:字符串 - 字段名:difficulty,数据类型:字符串 - 字段名:category,数据类型:字符串 数据划分: - 划分名称:train,数据字节数:5935468,样本数量:2326 下载体积:3322553 字节,数据集总大小:5935468 字节 配置项: - 配置名称:default,数据文件: - 对应划分:train,文件路径:data/train-* 支持语言: - tr # 🇹🇷 Opus-4.6 推理数据集土耳其语翻译版 ## 数据集说明 * **数据集名称**:`Opus-4.6-Reasoning-3000x-filtered-tr` * **原始数据集**:`nohurry/Opus-4.6-Reasoning-3000x-filtered` * **语言属性**:土耳其语(由英语翻译生成) * **样本规模**:约3000条样本 * **任务范畴**:推理/指令遵循/思维链(Chain-of-Thought) 本数据集为原始*Opus-4.6 推理3000x过滤版*数据集的土耳其语翻译版本,旨在为土耳其语大语言模型(Large Language Model,LLM)的训练、评估与对齐任务提供高质量的推理数据支撑。 ## 项目动机 当前面向土耳其语的高质量推理数据集较为匮乏,本数据集旨在实现以下目标: * 提升土耳其语大语言模型的推理能力 * 优化土耳其语场景下的指令遵循效果 * 为土耳其语AI系统的微调与评估提供支撑 ## 数据来源 * 原始数据集:[`nohurry/Opus-4.6-Reasoning-3000x-filtered`](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) * 本数据集包含经过精心筛选的推理样本,均带有结构化的提示词与回复内容。 ## 翻译细节 * **所用翻译模型**:[`tencent/HY-MT1.5-1.8B`](https://huggingface.co/tencent/HY-MT1.5-1.8B) * **翻译方法**:全自动机器翻译
提供机构:
Chan-Y
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作