
InstructionWild-v2

Source: OpenXLab (indexed 2026-04-18)
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/InstructionWild-v2
Description:
Instruction tuning is a key component of ChatGPT. OpenAI used its own user-based instruction dataset, but unfortunately that dataset is not open-sourced. Self-Instruct released a small instruction dataset containing 175 instructions written by human annotators. The Stanford Alpaca team generated 52K instructions with the text-davinci-003 model, based on the 175 seed instructions above. This project aims at a larger and more diverse instruction dataset. To this end, we collected instructions from shared ChatGPT usage (110K in the v2 dataset, 429 in the v1 dataset) and released both English and Chinese versions. We found these instructions to be very diverse. We follow Alpaca to generate 52K instructions and their responses. All data can be found in the data and data v2 directories. Note: This is an ongoing project. We are still collecting and improving our data. We are releasing this dataset as early as possible to speed up our LLM research. We will also release a white paper soon.
Provider:
OpenDataLab
Created:
2024-04-30
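
The description says the project follows Alpaca in pairing instructions with generated responses. As a rough illustration of how such pairs are commonly consumed, here is a minimal Python sketch that turns records into (prompt, target) pairs for instruction tuning. The field names "instruction" and "output", the JSON Lines layout, and the sample record are all assumptions for illustration and are not confirmed by this listing. The prompt wording is modeled on the template used by Stanford Alpaca for records without an input field.

```python
import json

# Prompt wording modeled on the Stanford Alpaca template for records
# that have no separate input field.
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_example(record):
    """Turn one record into a (prompt, target) pair.

    Assumes hypothetical Alpaca-style keys "instruction" and "output";
    the real files in this dataset may use different keys.
    """
    instruction = record["instruction"].strip()
    if not instruction:
        raise ValueError("empty instruction")
    prompt = PROMPT_NO_INPUT.format(instruction=instruction)
    target = record.get("output", "").strip()
    return prompt, target


def load_examples(lines):
    """Parse an iterable of JSON Lines strings, skipping blank lines."""
    return [build_example(json.loads(line)) for line in lines if line.strip()]


# Made-up sample record, for illustration only (not taken from the dataset).
sample = [
    '{"instruction": "Name three primary colors.", "output": "Red, blue, and yellow."}'
]
examples = load_examples(sample)
```

To use this on a real file, pass an open file handle to load_examples after checking the actual keys and file layout of the downloaded data.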