Patent Descriptor File
Source: Mendeley Data. Updated: 2024-06-29. Indexed: 2024-06-27.
Download link:
https://figshare.com/articles/dataset/Patent_Descriptor_File/3502706/1
Description:
This is a list of every patent with a disambiguated assignee or inventor in our disambiguation. The file has 7 columns and ~9.3M lines and is "|" delimited. The columns are:
--pat: the patent publication number. Each can be linked exactly to either the OECD or USPTO databases for further processing if needed.
--invs: the inventors on the patent. If multiple inventors are found, they are comma separated.
--localInvs: the local IDs of the inventors. If no mobile inventor is found, this is identical to the invs column. If multiple inventors are found, they are comma separated.
--apps: the assignees (applicants) on the patent. If multiple assignees are found, they are comma separated.
--yr: the application year of the patent.
--classes: the WIPO industrial fields of the patent. If there is more than one field, they are comma separated. These are integers between 1 and 36, the meaning of which is given in the file wipo_class_ID.txt.
--wasComplete: a flag indicating whether too many or too few inventors or assignees were found. A "1" indicates that no problem was detected with this patent; a "0" indicates a count difference between disambiguated IDs and names found on the patent. There are 134K such errors in the data.

For example, the line
EP0000013|HI478595,HM167287,HM246163|HI1085135,HI478595,HI587028|HA45|1978|3,4|1
means that patent EP0000013 had the three indicated inventors (two of whom are mobile), one indicated assignee, application year 1978, and the two indicated WIPO classes (identified in the wipo_class_ID.txt file), and that it appears to be a complete disambiguation of all listed entities (no error flag).

In the paper, we describe dividing disambiguation into high- and low-quality, based on whether geolocation and name cleaning were performed for that ID.
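As a minimal sketch of how one line of the file might be read, the following Python splits a record on "|" and then splits the multi-valued columns on commas. The names PatentRecord and parse_line are hypothetical and not part of the dataset; the sketch also assumes every line has exactly seven fields and a non-empty year, which has not been checked against the full file.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatentRecord:
    """One line of the descriptor file (hypothetical container, not from the dataset)."""
    pat: str
    invs: List[str]
    local_invs: List[str]
    apps: List[str]
    yr: int
    classes: List[int]
    was_complete: bool


def _split_ids(field: str) -> List[str]:
    # Multi-valued columns are comma separated; drop empty pieces.
    return [item for item in field.split(",") if item]


def parse_line(line: str) -> PatentRecord:
    # Assumption: exactly 7 "|"-delimited fields per line, as described above.
    parts = line.rstrip("\n").split("|")
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    pat, invs, local_invs, apps, yr, classes, flag = parts
    return PatentRecord(
        pat=pat,
        invs=_split_ids(invs),
        local_invs=_split_ids(local_invs),
        apps=_split_ids(apps),
        yr=int(yr),  # assumption: the year is always present
        classes=[int(c) for c in _split_ids(classes)],
        was_complete=(flag.strip() == "1"),
    )


# The example line quoted in the description.
example = "EP0000013|HI478595,HM167287,HM246163|HI1085135,HI478595,HI587028|HA45|1978|3,4|1"
record = parse_line(example)
# Mobile inventors are those whose ID has "M" as its second character.
mobile = [i for i in record.invs if i[1] == "M"]
```

For the full ~9.3M-line file, the same function could be applied line by line while iterating over the open file, so that the whole file never has to be held in memory.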
The quality level is indicated by the first two characters of each ID.

High quality IDs:
--HA: High-resolution Assignee (linked in step 1)
--HI: High-resolution Inventor (linked in step 1)
--HX: High-resolution cross-linked inventor (linked in step 5)
--HM: High-resolution mobile inventor (linked in step 5)
--LX: Low-resolution cross-linked inventor (linked in step 5)
--LM: Low-resolution mobile inventor (linked in step 5)

Low quality IDs:
--LA: Low-resolution Assignee
--HS: High-resolution Split Inventor
--LI: Low-resolution inventor
--LS: Low-resolution Split Inventor
--UI: Unlocated inventor
--UX: Unlocated cross-linked inventor
--US: Unlocated split inventor

Each of these two-letter codes is followed by a number:
--For assignees (where the second character is "A"), the number indicates that assignee's rank by total number of patents. In the case of a tie, the rank is alphabetical. For example, HA1 is the high-resolution assignee with the most patents worldwide, and LA4 is the low-resolution assignee with the fourth most patents worldwide.
--For local inventor IDs (where the second character is NOT "M" or "A"), the number indicates the rank of that local ID by total number of patents. In the case of a tie, the rank is alphabetical. For example, HI1 is the most prolific high-resolution inventor in a single 20km region, while LI2 is the second-most prolific low-resolution inventor in a single 20km region.
--For mobile inventor IDs (where the second character is "M"), the number indicates the rank, among other mobile IDs, of that inventor's most prolific local ID. For example, LM1 is the mobile inventor with the greatest number of patents in one 20km region, and all geolocations for that inventor were low resolution (indicated by the "L"). HM2 is the mobile inventor with the second greatest number of patents in one 20km region, and at least one geolocation was high resolution (indicated by the "H").
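The ID scheme above can be decoded mechanically. The following Python is a sketch only: decode_id, DecodedId and the label strings are hypothetical names chosen for illustration, and the code assumes every ID is exactly two uppercase letters followed by digits, as in the examples in this description.

```python
import re
from typing import NamedTuple

# Prefix groups exactly as listed in the description.
HIGH_QUALITY = {"HA", "HI", "HX", "HM", "LX", "LM"}
LOW_QUALITY = {"LA", "HS", "LI", "LS", "UI", "UX", "US"}

# Illustrative labels for the first and second character of a prefix.
RESOLUTION = {"H": "high-resolution", "L": "low-resolution", "U": "unlocated"}
ENTITY = {
    "A": "assignee",
    "I": "inventor",
    "X": "cross-linked inventor",
    "M": "mobile inventor",
    "S": "split inventor",
}


class DecodedId(NamedTuple):
    prefix: str
    rank: int
    resolution: str
    entity: str
    high_quality: bool


# Assumption: IDs are two uppercase letters followed by one or more digits.
_ID_PATTERN = re.compile(r"([A-Z]{2})(\d+)")


def decode_id(raw: str) -> DecodedId:
    match = _ID_PATTERN.fullmatch(raw)
    if match is None:
        raise ValueError(f"unrecognised ID format: {raw!r}")
    prefix, rank = match.group(1), int(match.group(2))
    if prefix not in HIGH_QUALITY | LOW_QUALITY:
        raise ValueError(f"unknown ID prefix: {prefix!r}")
    return DecodedId(
        prefix=prefix,
        rank=rank,
        resolution=RESOLUTION[prefix[0]],
        entity=ENTITY[prefix[1]],
        high_quality=prefix in HIGH_QUALITY,
    )


assignee = decode_id("HA45")
mobile_low = decode_id("LM1")
split_high = decode_id("HS7")
```

Note that quality and resolution are separate attributes in this sketch: LM and LX are low-resolution prefixes listed under high quality IDs, and HS is a high-resolution prefix listed under low quality IDs, so the quality flag is looked up from the two groups rather than inferred from the first letter.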
Created:
2023-06-28



