
Researchers' first name variations, based on ORCID

Source: DataCite Commons. Updated: 2025-07-11. Indexed: 2025-09-08.
Download link:
https://figshare.com/articles/dataset/Researchers_first_name_variations_based_on_ORCID/29544386
Description:
First-name variations based on ORCID, as present in Dimensions' open data platform.

Query used:

    WITH name_variant AS (
      SELECT
        TRIM(LOWER(pname.given_names)) AS first_name,
        TRIM(LOWER(variant.content)) AS variant_name,
        COUNT(DISTINCT s.orcid_identifier.path) AS researcher_count
      FROM
        ds-open-datasets.orcid.summaries_2024 AS s,
        UNNEST(s.person.other_names.names) AS variant
      JOIN
        UNNEST([s.person.name]) AS pname
      WHERE
        pname.given_names IS NOT NULL
        AND variant.content IS NOT NULL
        AND TRIM(LOWER(pname.given_names)) != TRIM(LOWER(variant.content))
        AND NOT REGEXP_CONTAINS(variant.content, r"\s") -- remove variants with whitespace
        AND NOT REGEXP_CONTAINS(variant.content, r"[\?\.,]") -- remove variants with punctuation
        AND NOT REGEXP_CONTAINS(LOWER(variant.content), r"^(dr|professor|phd|doctor|n/a|reviewer|sociologist|lecturer|lecture|researcher|architect|everyone|physician)\b")
        AND NOT REGEXP_CONTAINS(LOWER(variant.content), r"\b(dr|professor|mr|ms|mrs|phd|doctor|n/a|reviewer|sociologist|lecturer|lecture|researcher|architect|everyone|physician)\b")
      GROUP BY
        first_name, variant_name
      ORDER BY
        researcher_count DESC
    )
    SELECT *
    FROM name_variant
    WHERE researcher_count > 1;
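To make the filtering logic of the query easier to follow, here is a minimal Python sketch of the same steps applied to an in-memory list. It is an illustration under stated assumptions, not the code used to build the dataset: the function names `keep_variant` and `count_pairs`, the record layout `(orcid_id, given_names, variants)`, and the sample records are all hypothetical, and Python's `re` module stands in for BigQuery's `REGEXP_CONTAINS`.

```python
import re
from collections import defaultdict

# Patterns copied from the query; Python's `re` stands in for REGEXP_CONTAINS.
SPACE_RE = re.compile(r"\s")
PUNCT_RE = re.compile(r"[\?\.,]")
LEADING_TITLE_RE = re.compile(
    r"^(dr|professor|phd|doctor|n/a|reviewer|sociologist|lecturer|lecture"
    r"|researcher|architect|everyone|physician)\b"
)
ANY_TITLE_RE = re.compile(
    r"\b(dr|professor|mr|ms|mrs|phd|doctor|n/a|reviewer|sociologist|lecturer"
    r"|lecture|researcher|architect|everyone|physician)\b"
)


def keep_variant(given_names, variant):
    """Return True if the (given name, variant) pair passes the WHERE filters."""
    if given_names is None or variant is None:
        return False
    # The variant must differ from the given name after trimming and lowercasing.
    if given_names.strip().lower() == variant.strip().lower():
        return False
    # Drop variants containing whitespace or the listed punctuation.
    if SPACE_RE.search(variant) or PUNCT_RE.search(variant):
        return False
    # Drop variants that look like titles or job descriptions.
    lowered = variant.lower()
    if LEADING_TITLE_RE.search(lowered) or ANY_TITLE_RE.search(lowered):
        return False
    return True


def count_pairs(records):
    """Count distinct researchers per (first_name, variant_name) pair.

    `records` is an iterable of (orcid_id, given_names, variants) tuples,
    a hypothetical flattening of the ORCID summaries table.
    Only pairs shared by more than one researcher are returned,
    sorted by count in descending order.
    """
    researchers = defaultdict(set)
    for orcid_id, given_names, variants in records:
        for variant in variants:
            if keep_variant(given_names, variant):
                pair = (given_names.strip().lower(), variant.strip().lower())
                researchers[pair].add(orcid_id)
    counts = {pair: len(ids) for pair, ids in researchers.items() if len(ids) > 1}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Made-up sample records, for illustration only.
sample = [
    ("id-1", "Robert", ["Bob", "Dr Robert"]),
    ("id-2", "robert ", ["bob"]),
    ("id-3", "Anna", ["Ann"]),
]
result = count_pairs(sample)
```

With the sample above, only the pair `("robert", "bob")` survives: "Dr Robert" is dropped by the whitespace filter, and `("anna", "ann")` is used by a single researcher, so the final `researcher_count > 1` condition excludes it.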
Provider:
figshare
Created:
2025-07-11