CHEN-AND: A Labeled Dataset for Chinese and English Joint Author Name Disambiguation
Source: Zenodo | Updated: 2024-10-05 | Indexed: 2026-04-07
Download link:
https://zenodo.org/doi/10.5281/zenodo.7654430
Description:
Abstract: Author name disambiguation (AND) is an important problem in literature databases, and it is even more prominent in the cross-language (cross-database) context. The academic community has studied the problem extensively, but existing research focuses mainly on monolingual literature and pays less attention to author disambiguation in cross-language environments. This study therefore addresses a typical cross-language task: joint Chinese and English author disambiguation. We first propose an automated method for constructing a Chinese-English joint AND dataset from open online resources, and we use it to create a dataset named CHEN-AND. We then propose a merging-first-then-disambiguation (MFTD) framework and evaluate several variants of it on the test dataset. For details on building this dataset, please refer to this repository on my GitHub page.
Provider:
Wuhan University
Created:
2024-10-05



