CHEN-AND: A Labeled Dataset for Chinese and English Joint Author Name Disambiguation

Source: Zenodo | Updated: 2024-10-05 | Indexed: 2026-04-07
Download link:
https://zenodo.org/doi/10.5281/zenodo.7654430
Resource description:
Abstract: Author name disambiguation (AND) is an important problem in literature databases and is even more prominent in the cross-language (cross-database) context. The academic community has conducted extensive research to eliminate such ambiguity. However, existing research focuses mainly on monolingual literature and pays less attention to author disambiguation in cross-language environments. This study therefore focuses on a typical cross-language author disambiguation task: Chinese and English joint author disambiguation. We first propose an automated method for constructing a joint Chinese-English AND dataset from open online resources. Using this method, we create a dataset named CHEN-AND for joint Chinese and English author disambiguation. We then propose a disambiguation framework based on merging-first-then-disambiguation (MFTD) and evaluate several variants of this method on the test dataset. For details on building this dataset, please refer to this repository on my GitHub page.
Provider:
Wuhan University
Created:
2024-10-05