Supplementary data for the manuscript: Image2SMILES: Transformer-based Molecular Optical Recognition Engine
Zenodo | Updated 2021-07-05 | Indexed 2026-06-04
Download link:
https://zenodo.org/record/5069805
Resource description:
This is the supplementary data for the manuscript "Image2SMILES: Transformer-based Molecular Optical Recognition Engine". It contains image-string pairs generated from 1M SMILES strings, which were randomly chosen from the PubChem database.

It was prepared using the code published at https://github.com/syntelly/img2smiles_generator/

To unpack, run:
tar xvf subset_1M.tar.xz && tar xvf subset_1M_dump.tar.gz && rm subset_1M_dump.tar.gz

You will get the following data:
subset_1M.smi - list of the 1M source SMILES
subset_1M_dump - directory with images
subset_1M_result.csv - list of FGSMILES - pathcode pairs; the first 3 characters of each pathcode are the corresponding subdirectories in subset_1M_dump
subset_1M_fails.csv - list of molecules from subset_1M.smi that failed
subset_1M_grpcounter.lst - list of counted groups used in this generation

You can generate your own data using https://github.com/syntelly/img2smiles_generator/
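The description above says that the first 3 characters of a pathcode give the subdirectories inside subset_1M_dump. The following is a minimal sketch of that lookup, not the authors' code. The function name `image_dir` and the `nested` switch are hypothetical, and the exact layout is an assumption: the description does not say whether each character is its own directory level (a/b/c) or whether the three characters form one directory name (abc), nor what the image file names or extensions are, so the sketch returns only the directory.

```python
from pathlib import Path


def image_dir(dump_root, pathcode, nested=True):
    """Return the directory under dump_root expected to hold the image.

    ASSUMPTION (not confirmed by the dataset description):
      nested=True  -> each of the first 3 characters is one directory level
      nested=False -> the first 3 characters form a single directory name
    Check against the unpacked archive before relying on either layout.
    """
    prefix = pathcode[:3]
    if len(prefix) < 3:
        raise ValueError("pathcode must have at least 3 characters")
    root = Path(dump_root)
    # joinpath(*prefix) appends one path component per character
    return root.joinpath(*prefix) if nested else root / prefix
```

For example, `image_dir("subset_1M_dump", "abcdef")` yields the path `subset_1M_dump/a/b/c` under the nested assumption; listing that directory after unpacking shows which layout the archive actually uses.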
Provider:
Zenodo
Created:
2021-07-05



