Document-level Relation Extraction Method Based on Data Augmentation and Dynamic Threshold
Source: China Scientific Data (中国科学数据). Updated 2026-04-13; indexed 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.19678/j.issn.1000-3428.0070117
Resource description:
Relation extraction (RE) tasks in the biomedical field often face data scarcity, class imbalance, and multi-label instances. To address these issues, a method that combines data augmentation with a dynamic threshold strategy is proposed. First, a GPT model is fine-tuned with a custom loss function, and new data is generated from feature templates obtained with the Word2Vec model. Second, a BERT classifier screens the generated data, and the high-quality samples are combined with the original dataset to form a richer training set. Finally, a learnable dynamic threshold strategy adjusts the classification threshold based on document length and the difference between the model output and the true labels, enabling the model to handle multi-label documents flexibly. Experimental results on two publicly available medical datasets show that the method achieves F1 scores of 84.1% and 69.3%, which are 1.6 and 1.1 percentage points higher, respectively, than those of the ATLOP method, verifying its effectiveness.
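
The description does not give the formula for the dynamic threshold, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a threshold of the form sigmoid(bias + w_len * log(1 + doc_len)) and a soft, differentiable decision so that the gap between predictions and true labels can drive a gradient update. The function names, the parameters `w_len` and `bias`, the sharpness constant `k`, and the learning rate are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dynamic_threshold(doc_len, w_len, bias):
    """Assumed form: a length-dependent threshold squashed into (0, 1)."""
    return sigmoid(bias + w_len * math.log1p(doc_len))


def predict_labels(probs, doc_len, w_len, bias):
    """Return indices of relation labels whose probability exceeds the threshold."""
    t = dynamic_threshold(doc_len, w_len, bias)
    return [i for i, p in enumerate(probs) if p > t]


def update_threshold_params(probs, gold, doc_len, w_len, bias, lr=0.01, k=10.0):
    """One gradient-descent step on a soft binary cross-entropy loss.

    The hard decision p > t is relaxed to s = sigmoid(k * (p - t)), so the
    difference between the soft prediction s and the gold label y yields a
    gradient with respect to the threshold parameters (an assumption of
    this sketch, not a detail taken from the source).
    """
    t = dynamic_threshold(doc_len, w_len, bias)
    soft = [sigmoid(k * (p - t)) for p in probs]
    # dLoss/dt for the summed cross-entropy of the soft decisions
    grad_t = k * sum(y - s for y, s in zip(gold, soft))
    # Chain rule through the outer sigmoid that produces t
    grad_z = grad_t * t * (1.0 - t)
    new_bias = bias - lr * grad_z
    new_w_len = w_len - lr * grad_z * math.log1p(doc_len)
    return new_w_len, new_bias


if __name__ == "__main__":
    probs = [0.9, 0.4, 0.1]   # illustrative per-label probabilities
    gold = [1, 1, 0]          # illustrative multi-label ground truth
    w_len, bias = 0.0, 0.0
    print(predict_labels(probs, 200, w_len, bias))
    w_len, bias = update_threshold_params(probs, gold, 200, w_len, bias)
    print(dynamic_threshold(200, w_len, bias))
```

In this toy setup the second gold label sits below the initial threshold, so the update pushes the threshold down for documents of that length. A real system would learn these parameters jointly with the classifier over a full training set.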
Created:
2026-04-13



