Replete-AI/Everything_Instruct
Source: Hugging Face. Updated 2024-07-05; indexed 2024-07-06.
Download link:
https://hf-mirror.com/datasets/Replete-AI/Everything_Instruct
Overview:
Everything Instruct is a large-scale dataset in Alpaca instruction format that covers a wide range of topics and is intended to push open-source large language models (LLMs) to the next level. It contains 5,685,816 rows, and the longest row is 78,451 tokens. Its subsets span science, social media, general knowledge, cooking, writing, medicine, history, law, role-play, news, coding, math, function calling, and general instruction following. The dataset is uncensored, meaning a model trained on it will not refuse requests unless it is tuned further.
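As an illustration of what "Alpaca instruction format" usually means, here is a minimal sketch that turns one record into a training prompt. It assumes the conventional Alpaca field names (`instruction`, `input`, `output`) and the widely used Alpaca prompt wording; this listing does not confirm the dataset's exact column names or template, and `format_alpaca` is a hypothetical helper, so check the dataset itself before relying on it.

```python
# Assumed Alpaca-style template text; not confirmed by this listing.
HEADER_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)
HEADER_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request."
)


def format_alpaca(record: dict) -> str:
    """Render one Alpaca-style record as a single prompt-plus-response string.

    The keys "instruction", "input", and "output" are assumptions based on
    the common Alpaca convention.
    """
    instruction = (record.get("instruction") or "").strip()
    context = (record.get("input") or "").strip()
    response = (record.get("output") or "").strip()

    if context:
        # Records with extra context get an "### Input:" section.
        return (
            f"{HEADER_WITH_INPUT}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            f"### Response:\n{response}"
        )
    # Records without context omit the "### Input:" section entirely.
    return (
        f"{HEADER_NO_INPUT}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )


if __name__ == "__main__":
    example = {
        "instruction": "Convert the temperature to Celsius.",
        "input": "212 F",
        "output": "100 C",
    }
    print(format_alpaca(example))
```

Because the longest row is reported as 78,451 tokens, a real pipeline would probably also truncate or filter rows that exceed the target model's context window; that step is left out here.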
Provided by:
Replete-AI



