ReactionDataExtractor: A Tool for Automated Extraction of Information from Chemical Reaction Schemes

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/ReactionDataExtractor_A_Tool_for_Automated_Extraction_of_Information_from_Chemical_Reaction_Schemes/16624656
Description:
Chemical reaction schemes are commonly used for visual encapsulation of chemical information. Figures of reaction schemes contain chemical transformations, the chemical species involved, and the reaction conditions. From a data-mining point of view, they are rich sources, densely packed with knowledge. Yet the challenge of automatically extracting data from them has remained largely untackled. This work presents ReactionDataExtractor, a software tool for the automatic extraction of information from multistep reaction schemes. Its capabilities include segmentation of reaction steps, of regions containing reaction conditions, and of chemical diagrams, as well as optical character and structure recognition. The tool combines rules with unsupervised machine-learning approaches, using bespoke algorithms to detect arrows, structures, labels, and conditions. It can serve as a low-maintenance tool for database generation, capable of extracting data from large quantities of images supplied by the user. When assessed on a self-generated evaluation set, the tool achieved precision and recall of between 67% and 91% in the six core areas of data extraction. ReactionDataExtractor is released under the MIT license and is available to download from http://www.reactiondataextractor.org.
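For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how precision and recall are conventionally computed from detection counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the ReactionDataExtractor code base or its evaluation set.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts.

    precision = TP / (TP + FP): share of extracted items that are correct.
    recall    = TP / (TP + FN): share of true items that were extracted.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for illustration only, not figures from the paper.
p, r = precision_recall(true_positives=80, false_positives=20, false_negatives=40)
print(f"precision={p:.2f} recall={r:.2f}")
```

Reporting the two values separately for each extraction area shows whether a tool tends to over-extract (lower precision) or to miss items (lower recall).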
Created:
2021-09-15