Harvard Inventors Benchmark
DataCite Commons. Updated 2025-06-01. Indexed 2024-07-25.
Download link:
https://figshare.com/articles/dataset/Harvard_Inventors_Benchmark/3502754/1
Description:
This dataset is the manual Harvard disambiguation used as one of the benchmarks in the paper. The file has 6 columns and 587 rows and is "|" delimited. In every column, if more than one ID is found, the elements are comma separated. The columns are:
--Pub_number: the publication number of the patent
--ManualIDs: the manually disambiguated IDs of each inventor on the patent
--OurIDs: the output of our algorithm on this patent
--LiLow: the "low" disambiguation of Li et al. on this patent
--LiHigh: the "high" disambiguation of Li et al. on this patent
--RawNames: the undisambiguated names on the patent, with case and punctuation dropped
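The layout described above ("|" between columns, commas between multiple values within a column) can be read with a short Python sketch. This is only an illustration under stated assumptions: the column order is taken from the listing above, the function name `parse_benchmark` and the `has_header` flag are hypothetical, and the listing does not say whether the file starts with a header row. The sample row is an invented placeholder, not data from the benchmark.

```python
import csv
import io

# Column order as given in the dataset description (assumed to match the file).
COLUMNS = ["Pub_number", "ManualIDs", "OurIDs", "LiLow", "LiHigh", "RawNames"]


def parse_benchmark(text, has_header=False):
    """Parse "|"-delimited benchmark text into a list of dicts.

    Pub_number is kept as a string; every other column is split on
    commas into a list, since a patent can carry several inventors.
    `has_header` is a hypothetical switch: set it to True if the first
    line of the file turns out to be a header row.
    """
    # QUOTE_NONE: treat quote characters as ordinary text.
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    rows = []
    for index, fields in enumerate(reader):
        if not fields:
            continue  # skip blank lines
        if has_header and index == 0:
            continue  # skip the header row
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {index + 1}: expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        record = {"Pub_number": fields[0].strip()}
        for name, value in zip(COLUMNS[1:], fields[1:]):
            record[name] = [item.strip() for item in value.split(",") if item.strip()]
        rows.append(record)
    return rows


if __name__ == "__main__":
    # Invented placeholder row, for illustration only.
    sample = "P0000001|m1,m2|a1,a2|l1,l2|h1,h2|john smith,jane doe\n"
    for row in parse_benchmark(sample):
        print(row["Pub_number"], row["ManualIDs"], row["RawNames"])
```

Splitting each multi-valued column into a list keeps the per-patent values together, which makes it straightforward to compare, for example, how many distinct IDs each method assigns on the same patent.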
Provider:
figshare
Created:
2017-04-04



