CHEN-AND: A Labeled Dataset for Chinese and English Joint Author Name Disambiguation

Name: CHEN-AND: A Labeled Dataset for Chinese and English Joint Author Name Disambiguation
Creator: Wuhan University
Published: 2024-10-05 00:00:00
License: No description available

Source: Zenodo. Updated: 2024-10-05. Indexed: 2026-04-07.

Download link:

https://zenodo.org/doi/10.5281/zenodo.7654430


Description:

Abstract: Author name disambiguation (AND) is an important problem in literature databases, and it is even more prominent in the cross-language (database) context. Extensive research has been conducted to eliminate such ambiguity, but existing work focuses mainly on monolingual literature and pays less attention to author disambiguation in cross-language environments. This study therefore addresses a typical cross-language author disambiguation task: joint Chinese and English author disambiguation. We first propose an automated method for constructing a joint Chinese-English AND dataset from open online resources, and we use this method to create a dataset named CHEN-AND. We then propose a merging-first-then-disambiguation (MFTD) framework and evaluate several variants of it on the test dataset. For details on building this dataset, please refer to this repository on my GitHub page.

Provider:

Wuhan University

Created:

2024-10-05