summykai/chemistry-sft-ultra
收藏Hugging Face2025-12-18 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/summykai/chemistry-sft-ultra
下载链接
链接失效反馈官方服务:
资源简介:
该数据集名为Chemistry SFT Ultra,是一个由多个上游化学相关数据集合并而成的语料库,专为文本生成任务设计。数据集包含1,370,322行数据,每行包含messages和metadata两个主要列。messages列是一个聊天风格的消息列表,包含role和content字段;metadata列则提供了关于数据来源、任务、语言、许可证以及化学特定字段的详细信息。数据集以英语为主,适用于指令微调和基于聊天的文本生成。README还提供了使用示例、使用注意事项(包括许可证详情)以及引用信息。
The dataset named Chemistry SFT Ultra is a merged corpus built from multiple upstream sources related to chemistry, designed for text-generation tasks. It contains 1,370,322 rows, each with messages and metadata columns. The messages column is a list of chat-style messages with role and content fields, while the metadata column provides detailed information about the source, task, language, license, and chemistry-specific fields. The dataset is English-focused and intended for instruction-tuning and chat-based text generation. The README also provides usage examples, considerations for use (including licensing details), and citation information.
提供机构:
summykai



