ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/ChemGrapher_Optical_Graph_Recognition_of_Chemical_Compounds_by_Deep_Learning/13004572
Description:
In drug discovery, knowledge of the graph structure of chemical
compounds is essential. Many thousands of scientific articles and
patents in chemistry and pharmaceutical sciences have investigated
chemical compounds, but in many cases, the details of the structure
of these chemical compounds are published only as an image. A tool
to analyze these images automatically and convert them into a chemical
graph structure would be useful for many applications, such as drug
discovery. A few such tools are available and they are mostly derived
from optical character recognition. However, our evaluation of the
performance of these tools reveals that they often make mistakes in
recognizing the correct bond multiplicity and stereochemical information.
In addition, errors sometimes even lead to missing atoms in the resulting
graph. In our work, we address these issues by developing a compound
recognition method based on machine learning. More specifically, we
develop a deep neural network model for optical compound recognition.
The deep learning solution presented here consists of a segmentation
model, followed by three classification models that predict atom locations,
bonds, and charges. Furthermore, this model not only predicts the
graph structure of the molecule but also provides all information
necessary to relate each component of the resulting graph to the source
image. This solution is scalable and can rapidly process thousands
of images. Finally, we empirically compare the proposed method with
the well-established tool OSRA and observe significant
error reduction.
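
The description above mentions a predicted graph whose components can each be related back to the source image. As a minimal sketch of what such an output could look like, the Python below stores pixel coordinates on every atom so that atoms and bonds can be traced back to image positions. All class, field, and method names here (Atom, Bond, MoleculeGraph, bond_midpoint, and so on) are hypothetical and chosen for illustration; this is not ChemGrapher's actual output format, and the coordinates are made up.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Atom:
    element: str   # chemical symbol, e.g. "C" or "O"
    charge: int    # formal charge of the atom
    x: int         # pixel column of the atom in the source image (illustrative)
    y: int         # pixel row of the atom in the source image (illustrative)


@dataclass
class Bond:
    begin: int            # index of the first atom
    end: int              # index of the second atom
    order: int = 1        # bond multiplicity: 1, 2, or 3
    stereo: str = "none"  # hypothetical stereo label, e.g. "wedge" or "dash"


@dataclass
class MoleculeGraph:
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Bond] = field(default_factory=list)

    def add_atom(self, element: str, x: int, y: int, charge: int = 0) -> int:
        """Append an atom and return its index."""
        self.atoms.append(Atom(element, charge, x, y))
        return len(self.atoms) - 1

    def add_bond(self, begin: int, end: int, order: int = 1,
                 stereo: str = "none") -> None:
        """Append a bond between two distinct, existing atoms."""
        valid = all(0 <= i < len(self.atoms) for i in (begin, end))
        if begin == end or not valid:
            raise ValueError("bond must join two distinct existing atoms")
        self.bonds.append(Bond(begin, end, order, stereo))

    def neighbors(self, index: int) -> List[int]:
        """Return the sorted indices of atoms bonded to the given atom."""
        found = []
        for bond in self.bonds:
            if bond.begin == index:
                found.append(bond.end)
            elif bond.end == index:
                found.append(bond.begin)
        return sorted(found)

    def net_charge(self) -> int:
        """Sum of the formal charges of all atoms."""
        return sum(atom.charge for atom in self.atoms)

    def bond_midpoint(self, bond_index: int) -> Tuple[float, float]:
        """Image position halfway between the two atoms of a bond."""
        bond = self.bonds[bond_index]
        a, b = self.atoms[bond.begin], self.atoms[bond.end]
        return ((a.x + b.x) / 2, (a.y + b.y) / 2)


# Acetate anion drawn without hydrogens, with invented pixel coordinates.
graph = MoleculeGraph()
methyl = graph.add_atom("C", x=40, y=60)
carboxyl = graph.add_atom("C", x=80, y=60)
oxo = graph.add_atom("O", x=110, y=30)
oxide = graph.add_atom("O", x=110, y=90, charge=-1)
graph.add_bond(methyl, carboxyl)
graph.add_bond(carboxyl, oxo, order=2)
graph.add_bond(carboxyl, oxide)
```

Keeping the coordinates on the atoms rather than in a separate lookup means any node or edge of the graph can be highlighted in the original image without extra bookkeeping.
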
Created:
2020-09-14



