sequelbox/Celestia2
Source: Hugging Face. Updated 2024-12-04. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/sequelbox/Celestia2
Description:
Celestia 2 is a multi-turn agent-instruct dataset focused on science. It contains 176k rows of synthetic multi-turn science-instruct data in the style of Microsoft's AgentInstruct, with all prompts and responses generated by Llama 3.1 405B Instruct. Primary subjects are physics, chemistry, biology, and computer science; secondary subjects are Earth science, astronomy, and information theory. The dataset is entirely synthetic and has not been manually reviewed.
Provider:
sequelbox



