LLMcompe-Team-Watanabe/ChemData700K_preprocess
Source: Hugging Face | Updated: 2025-08-08 | Indexed: 2025-08-09
Download link:
https://hf-mirror.com/datasets/LLMcompe-Team-Watanabe/ChemData700K_preprocess
Description:
ChemData700K Preprocessed is a preprocessed version of the AI4Chem/ChemData700K chemistry dataset. The preprocessing consists of four steps: filtering out samples that are part of a conversation or that have a top-level instruction, formatting the output column, renaming the input and output columns, and removing all other columns. The resulting dataset has two columns, question and answer, and provides only a training split containing 373,928 entries.
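The four preprocessing steps described above can be sketched in plain Python. This is only an illustration, not the team's actual script: the source column names (`instruction`, `input`, `output`, `history`) are assumptions about the upstream schema, and "formatting the output column" is assumed here to mean trimming surrounding whitespace, since the listing does not say what the formatting does.

```python
from typing import Any

# Hypothetical source column names; the real upstream schema may differ.
HISTORY_COL = "history"
INSTRUCTION_COL = "instruction"
INPUT_COL = "input"
OUTPUT_COL = "output"


def preprocess(rows: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Apply the four described steps to a list of row dicts."""
    result = []
    for row in rows:
        # Step 1: skip rows that belong to a conversation
        # or that carry a top-level instruction.
        if row.get(HISTORY_COL) or row.get(INSTRUCTION_COL):
            continue
        # Step 2: format the output (assumed: trim whitespace).
        answer = str(row.get(OUTPUT_COL, "")).strip()
        # Steps 3 and 4: rename input/output to question/answer
        # and drop every other column.
        result.append({"question": str(row.get(INPUT_COL, "")), "answer": answer})
    return result


# Made-up rows for illustration only; they are not taken from the dataset.
sample = [
    {"instruction": "", "input": "What is the formula of water?", "output": " H2O \n", "history": []},
    {"instruction": "Answer briefly.", "input": "Define pH.", "output": "placeholder", "history": []},
    {"instruction": "", "input": "A follow-up question", "output": "placeholder", "history": [["q", "a"]]},
]
cleaned = preprocess(sample)
print(cleaned)
```

Only the first sample row survives: the second has a top-level instruction and the third has conversation history, so both are filtered out.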
Provider:
LLMcompe-Team-Watanabe



