ReactionDataExtractor: A Tool for Automated Extraction of Information from Chemical Reaction Schemes
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/ReactionDataExtractor_A_Tool_for_Automated_Extraction_of_Information_from_Chemical_Reaction_Schemes/16624656
Description:
Chemical reaction schemes are commonly used for visual encapsulation
of chemical information. Figures of reaction schemes contain chemical
transformations, the chemical species involved, as well as reaction
conditions. From a data-mining point of view, they constitute rich
sources, densely packed with knowledge. Yet, the challenge of automatically
extracting data from them has remained largely untackled. This work
presents ReactionDataExtractor, a software tool that can be used for
the automatic extraction of information from multistep reaction schemes.
Its capabilities include segmentation of reaction steps, regions containing
reaction conditions, and chemical diagrams, as well as optical character
and structure recognition. A combination of rules and unsupervised
machine-learning approaches is used, with bespoke detection algorithms
that identify arrows, structures, labels, and conditions.
It can be used as a low-maintenance tool for database
generation capable of extracting data from large quantities of images
supplied by the user. On assessment using a self-generated evaluation
set, the tool achieved precision and recall metrics of between 67%
and 91% in the six core areas of data extraction. The ReactionDataExtractor
tool is released under the MIT license and is available to download
from http://www.reactiondataextractor.org.
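
For readers unfamiliar with the evaluation metrics quoted above, the following is a minimal sketch of how precision and recall are conventionally computed from detection counts. It is not taken from the ReactionDataExtractor code base; the function name `precision_recall` and the example counts are hypothetical and serve only to illustrate the standard definitions.

```python
from typing import Tuple


def precision_recall(true_positives: int,
                     false_positives: int,
                     false_negatives: int) -> Tuple[float, float]:
    """Return (precision, recall) using the standard definitions.

    precision = TP / (TP + FP): share of extracted items that are correct.
    recall    = TP / (TP + FN): share of ground-truth items that were found.
    Both default to 0.0 when their denominator is zero.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for illustration only, not results from the paper.
p, r = precision_recall(true_positives=80, false_positives=20, false_negatives=10)
print(f"precision={p:.2f}, recall={r:.2f}")
```

In an evaluation of this kind, one such pair of values would typically be reported per extraction task (for example arrows, diagrams, or labels), which is consistent with the range quoted for the six core areas.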
Created:
2021-09-15



