
Rethinking the Masking Strategy for Pretraining Molecular Graphs from a Data-Centric View

Source: Figshare (indexed 2026-04-28)
Download link:
https://figshare.com/articles/dataset/Rethinking_the_Masking_Strategy_for_Pretraining_Molecular_Graphs_from_a_Data-Centric_View/25745165
Description:
Node-level self-supervised learning is widely used to pretrain models on molecular graphs. Attribute Masking (AttrMask) is a pioneering method in this field, and later improvements focus on increasing the capacity of the backbone model by adding modules. However, these methods overlook the imbalanced distribution of atom types because they select atoms to mask only at random during pretraining. Based on the properties of molecules, we propose a weighted masking strategy that improves pretrained models by making more effective use of molecular information during pretraining. Our experiments show that AttrMask combined with the weighted masking strategy outperforms AttrMask with random masking, and even surpasses the model-centric improvements without adding parameters. The weighted masking strategy can also be extended to other pretraining methods to improve their performance.
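To make the contrast between random and weighted masking concrete, here is a minimal sketch of weighted atom selection. The description does not state how the weights are defined, so the inverse-frequency weighting below is an assumption made only for illustration and is not the authors' method. The function names, the toy atom counts, and the 15% default mask rate are likewise hypothetical.

```python
import math
import random
from collections import Counter


def inverse_frequency_weights(atom_types, corpus_counts):
    """Assumed weighting (not from the source): rarer atom types get larger weights."""
    return [1.0 / corpus_counts[a] for a in atom_types]


def sample_mask_indices(atom_types, corpus_counts, mask_rate=0.15, rng=None):
    """Pick atom indices to mask, without replacement, biased toward rare atom types."""
    rng = rng or random.Random()
    n_mask = max(1, round(mask_rate * len(atom_types)))
    weights = inverse_frequency_weights(atom_types, corpus_counts)
    # Efraimidis-Spirakis weighted sampling without replacement, in log space:
    # draw u in (0, 1], use key = log(u) / w, and keep the n_mask largest keys.
    keys = []
    for i, w in enumerate(weights):
        u = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u) is defined
        keys.append((math.log(u) / w, i))
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:n_mask])


# Toy molecule given as a list of element symbols (one entry per atom).
atoms = ["C", "C", "C", "C", "C", "C", "N", "O"]
# Toy corpus-level atom counts; carbon dominates, as the description suggests.
corpus_counts = Counter({"C": 700, "N": 100, "O": 150})

masked = sample_mask_indices(atoms, corpus_counts, mask_rate=0.25,
                             rng=random.Random(0))
print(masked)
```

With uniform weights this reduces to the random masking strategy. Under the assumed inverse-frequency weights, the nitrogen and oxygen atoms are selected far more often than any single carbon atom, which is one way a masking scheme could counteract an imbalanced atom distribution.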