Ba2han/merged_sft_mix
Source: Hugging Face. Updated 2025-12-21; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Ba2han/merged_sft_mix
Description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 9100078548
    num_examples: 2928070
  - name: test
    num_bytes: 138580191
    num_examples: 44590
  download_size: 2372931905
  dataset_size: 9238658739
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
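
The size fields in the metadata above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the card; the variable names are illustrative and not part of any API.

```python
# Values copied from the dataset_info block of the card.
train_bytes, train_examples = 9100078548, 2928070
test_bytes, test_examples = 138580191, 44590
download_size = 2372931905
dataset_size = 9238658739

# dataset_size should equal the sum of the per-split byte counts.
splits_total = train_bytes + test_bytes
sizes_consistent = splits_total == dataset_size

# Mean size of one example in each split, in bytes.
train_mean_bytes = train_bytes / train_examples
test_mean_bytes = test_bytes / test_examples

# Ratio of the compressed download to the in-memory dataset size.
compression_ratio = download_size / dataset_size

print(f"sizes consistent: {sizes_consistent}")
print(f"mean bytes per example: train={train_mean_bytes:.1f}, test={test_mean_bytes:.1f}")
print(f"download / dataset size: {compression_ratio:.3f}")
```

The two splits have nearly the same mean example size, which is what one would expect if the test split was sampled from the same filtered pool as the train split. The card does not state how the split was made, so that is only an inference.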
("Ba2han/fineweb_augmented-instruct-TR", "train"),
("Ba2han/huge-sft-2", "train"),
("nvidia/Nemotron-Instruction-Following-Chat-v1", "chat_if"),
"tahsinsoyak/agriculture-qa-turkish-translated",
"turkish-nlp-suite/InstrucTurca",
"erythropygia/Instruct-Python-Code-Turkish",
"avometre/turkish-wikipedia-qa",
"Ba2han/Multiple_Choice-Turkish",
- Dropped the longest 25% of examples
- Dropped examples with char_len < 200
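
The two length filters listed above could be sketched roughly as follows. This is a minimal illustration, not the author's actual script. It assumes length is measured in characters of the `text` field, that the short-example filter runs first, and that the "longest 25%" is taken over what remains. `filter_by_length` is a hypothetical helper name.

```python
def filter_by_length(texts, min_chars=200, drop_longest_fraction=0.25):
    """Hypothetical helper: drop short texts, then drop the longest fraction.

    The order of the two steps is an assumption; the card does not state it.
    """
    # Step 1: drop texts shorter than min_chars characters.
    kept = [t for t in texts if len(t) >= min_chars]

    # Step 2: drop the longest `drop_longest_fraction` of the remaining texts.
    n_drop = int(len(kept) * drop_longest_fraction)
    n_keep = len(kept) - n_drop

    # Rank indices by character length (stable sort, so ties keep input order),
    # keep the n_keep shortest, then restore the original ordering.
    by_length = sorted(range(len(kept)), key=lambda i: len(kept[i]))
    keep_indices = sorted(by_length[:n_keep])
    return [kept[i] for i in keep_indices]


samples = ["a" * 100, "b" * 200, "c" * 300, "d" * 400, "e" * 500]
filtered = filter_by_length(samples)
print([len(t) for t in filtered])
```

Applying the filters in the opposite order would change which examples count as the longest 25%, so the result depends on this assumed ordering.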
Provider:
Ba2han



