
wisenut-nlp-team/data_recipe

Source: Hugging Face | Updated: 2024-09-10 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/wisenut-nlp-team/data_recipe
Description:
---
configs:
- config_name: v1_1k
  data_files:
  - split: train
    path: v1/1k/data_recipe_v1_1k.json
- config_name: v1_5k
  data_files:
  - split: train
    path: v1/5k/data_recipe_v1_5k.json
- config_name: v1_10k
  data_files:
  - split: train
    path: v1/10k/data_recipe_v1_10k.json
- config_name: v2_10k
  data_files:
  - split: train
    path: v2/10k/data_recipe_v2_10k.json
- config_name: v3_10k
  data_files:
  - split: train
    path: v3/10k/data_recipe_v3_10k.json
---

**V1**
- English (8k)
  - [ShareGPT](https://huggingface.co/datasets/shibing624/sharegpt_gpt4) : 3.24k
  - [Claude 3 Opus](https://huggingface.co/datasets/QuietImpostor/Sao10K-Claude-3-Opus-Instruct-15K-ShareGPT) : 1.76k
  - [Lima](https://huggingface.co/datasets/llamafactory/lima) : 1k
  - [Slimorca](https://huggingface.co/datasets/Open-Orca/slimorca-deduped-cleaned-corrected) : 2k
- Korean (2k)
  - [Alpaca-GPT4](https://huggingface.co/datasets/FreedomIntelligence/alpaca-gpt4-korean) : 1k
  - [SMR](https://huggingface.co/datasets/wisenut-nlp-team/llama_ko_smr) : 0.5k
  - [MMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU) : 0.5k

**V2**
- English (4k)
  - [ShareGPT](https://huggingface.co/datasets/shibing624/sharegpt_gpt4) : 1k
  - [Lima](https://huggingface.co/datasets/llamafactory/lima) : 1k
  - [Slimorca](https://huggingface.co/datasets/Open-Orca/slimorca-deduped-cleaned-corrected) : 1k
  - [Math](https://huggingface.co/datasets/EleutherAI/hendrycks_math) : 1k
- Korean (6k)
  - [ShareGPT](https://huggingface.co/datasets/shibing624/sharegpt_gpt4) : 1k
  - [Multi-turn](https://huggingface.co/datasets/maywell/koVast) : 1k
  - [Alpaca-GPT4](https://huggingface.co/datasets/FreedomIntelligence/alpaca-gpt4-korean) : 2k
  - [SMR](https://huggingface.co/datasets/wisenut-nlp-team/llama_ko_smr) : 1k
  - [MMLU](https://huggingface.co/datasets/ccw7463/Ko_MMLU_ver0.3) : 1k

**V3**
- English (7k)
  - [ShareGPT](https://huggingface.co/datasets/shibing624/sharegpt_gpt4) : 3k
  - [Claude 3 Opus](https://huggingface.co/datasets/QuietImpostor/Sao10K-Claude-3-Opus-Instruct-15K-ShareGPT) : 1k
  - [Lima](https://huggingface.co/datasets/llamafactory/lima) : 1k
  - [Slimorca](https://huggingface.co/datasets/Open-Orca/slimorca-deduped-cleaned-corrected) : 2k
- Korean (3k)
  - [ShareGPT](https://huggingface.co/datasets/shibing624/sharegpt_gpt4) : 1k
  - [Multi-turn](https://huggingface.co/datasets/maywell/koVast) : 1k
  - [MMLU](https://huggingface.co/datasets/ccw7463/Ko_MMLU_ver0.3) : 1k
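
As a sanity check on the recipe, the per-source counts can be summed per language and per version. The sketch below is illustrative only: the `RECIPE` dictionary, `language_totals`, and `load_recipe` are hypothetical names, and the integer counts are an approximate reading of the listed figures (for example, "3.24k" taken as 3240). `load_recipe` assumes the Hugging Face `datasets` library is installed and that the config names from the card's YAML header are available on the Hub; it is defined but not called here.

```python
# Approximate per-source sample counts, read off the recipe in the card.
# Assumption: "3.24k" means 3240 examples, "0.5k" means 500, and so on.
RECIPE = {
    "v1": {
        "English": {"ShareGPT": 3240, "Claude 3 Opus": 1760, "Lima": 1000, "Slimorca": 2000},
        "Korean": {"Alpaca-GPT4": 1000, "SMR": 500, "MMLU": 500},
    },
    "v2": {
        "English": {"ShareGPT": 1000, "Lima": 1000, "Slimorca": 1000, "Math": 1000},
        "Korean": {
            "ShareGPT": 1000,
            "Multi-turn": 1000,
            "Alpaca-GPT4": 2000,
            "SMR": 1000,
            "MMLU": 1000,
        },
    },
    "v3": {
        "English": {"ShareGPT": 3000, "Claude 3 Opus": 1000, "Lima": 1000, "Slimorca": 2000},
        "Korean": {"ShareGPT": 1000, "Multi-turn": 1000, "MMLU": 1000},
    },
}


def language_totals(version):
    """Sum the per-source counts for each language in one recipe version."""
    return {lang: sum(sources.values()) for lang, sources in RECIPE[version].items()}


def load_recipe(config_name="v3_10k"):
    """Hypothetical helper: load one config's train split from the Hub.

    Requires network access and the `datasets` package, so the import is
    deferred until the function is actually called.
    """
    from datasets import load_dataset

    return load_dataset("wisenut-nlp-team/data_recipe", config_name, split="train")


if __name__ == "__main__":
    for version in RECIPE:
        totals = language_totals(version)
        print(version, totals, "total:", sum(totals.values()))
```

Under that reading, the listed subtotals (8k/2k, 4k/6k, 7k/3k) agree with the per-source figures, and each version adds up to 10k. Whether the published `*_10k` files contain exactly these counts is not stated in the card.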
Provider:
wisenut-nlp-team