
Career promotions, research publications, Open Access dataset

DataCite Commons | Updated 2022-02-28 | Indexed 2025-04-16
Download link:
https://ordo.open.ac.uk/articles/dataset/Career_promotions_research_publications_Open_Access_dataset/19228785
Description:
This dataset is a compilation of processed data on citations and references for research papers, including author, institution and open access information, for a selected sample of academics, analysed using Microsoft Academic Graph (MAG) data and CORE. The data was collected between December 2019 and January 2020.

Six countries (Austria, Brazil, Germany, India, Portugal, United Kingdom and United States) were the focus of the six questions which make up this dataset. There is one CSV file per country and per question (36 files in total).

More details about the creation of this dataset are available in the public ON-MERRIT D3.1 deliverable report.

The dataset combines two data sources: one part was created by analysing promotion policies across the target countries, and the second part is a set of data points for understanding publishing behaviour. To facilitate analysis, the dataset is organised in the following seven folders:

PRT

The file "PRT_policies.csv" contains the information extracted from promotion, review and tenure (PRT) policies.

Q1: What % of papers coming from a university are Open Access?

- Dataset name format: oa_status__countryname__papers.csv
- Dataset contents: Open Access (OA) status of all papers of all the universities listed in Times Higher Education World University Rankings (THEWUR) for the given country. A paper is marked OA if at least one OA link is available. OA links are collected using the CORE Discovery API.
- Important considerations about this dataset:
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - The service used to recognise whether a paper is OA, CORE Discovery, does not contain entries for all paperids in MAG. This implies that some of the records in the extracted dataset will have neither a true nor a false value for the is_OA field.
  - Only those records marked as true for the is_OA field can be said to be OA. Records with false or no value for the is_OA field have unknown status (i.e. not necessarily closed access).

Q2: How are papers, published by the selected universities, distributed across the three scientific disciplines of our choice?

- Dataset name format: fsid__countryname__papers.csv
- Dataset contents: For the given country, all papers for all the universities listed in THEWUR, with the fieldofstudy they belong to.
- Important considerations about this dataset:
  - MAG can associate a paper with multiple fieldofstudyid values. If a paper belongs to more than one of our fieldofstudyid values, separate records were created for the paper with each of those fieldofstudyid values.
  - MAG assigns a fieldofstudyid to every paper with a score. We preserve only those records whose score is more than 0.5 for any fieldofstudyid it belongs to.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Q3: What is the gender distribution in authorship of papers published by the universities?

- Dataset name format: author_gender__countryname__papers.csv
- Dataset contents: All papers with their author names for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, only the records for collaborators from within the selected universities are preserved.
  - An external script was executed to determine the gender of the authors. The script is available here.

Q4: Distribution of staff seniority (= number of years from their first publication until the last publication) in the given university.

- Dataset name format: author_ids__countryname__papers.csv
- Dataset contents: For a given country, all papers for authors with their publication year for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, only the records for collaborators from within the selected universities are preserved.
  - Staff seniority can be calculated in various ways. The most straightforward option is to calculate it as academic_age = MAX(year) - MIN(year) for each authorid.

Q5: Citation counts (incoming) for OA vs Non-OA papers published by the university.

- Dataset name format: cc_oa__countryname__papers.csv
- Dataset contents: OA status and OA links for all papers of all the universities listed in THEWUR and, for each of those papers, the count of incoming citations available in MAG.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - Only those records marked as true for the is_OA field can be said to be OA. Records with false or no value for the is_OA field have unknown status (i.e. not necessarily closed access).

Q6: Count of OA vs Non-OA references (outgoing) for all papers published by universities.

- Dataset name format: rc_oa__countryname_-papers.csv
- Dataset contents: Counts of all OA and unknown papers referenced by all papers published by all the universities listed in THEWUR.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of the papers being referenced.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Additional files:

- _fieldsofstudy_mag_.csv: this file contains a dump of the fieldsofstudy table of MAG, mapping each of the ids to its actual field of study name.
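The seniority formula given under Q4, academic_age = MAX(year) - MIN(year) for each authorid, can be sketched in a few lines of Python. This is only an illustration: the column names paperid, authorid and year and the inline sample rows are assumptions based on the field names mentioned in the description, not the confirmed header of the author_ids files.

```python
import csv
import io
from collections import defaultdict


def academic_age(rows):
    """Return {authorid: MAX(year) - MIN(year)} for an iterable of dict rows.

    Assumes each row has 'authorid' and 'year' keys (hypothetical column names).
    Rows with an empty year are skipped.
    """
    years = defaultdict(list)
    for row in rows:
        year = (row.get("year") or "").strip()
        if not year:
            continue
        years[row["authorid"]].append(int(year))
    return {author: max(ys) - min(ys) for author, ys in years.items()}


# Made-up sample standing in for an author_ids__countryname__papers.csv file.
sample = io.StringIO(
    "paperid,authorid,year\n"
    "p1,a1,2005\n"
    "p2,a1,2019\n"
    "p3,a2,2012\n"
)
ages = academic_age(csv.DictReader(sample))
print(ages)
```

An author with a single recorded paper gets an academic age of 0 under this definition, which may be worth filtering out before plotting a seniority distribution.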
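Because only is_OA = true is conclusive and both false and missing values mean "unknown", any OA percentage computed from the Q1 or Q5 files is best read as a lower bound. The sketch below counts records that way. The is_OA column name comes from the description; the textual encoding of the boolean ("true"/"false") and the sample rows are assumptions.

```python
import csv
import io


def oa_summary(rows):
    """Count confirmed-OA vs unknown-status records.

    A record counts as OA only when is_OA is the string 'true' (case-insensitive,
    an assumed encoding). 'false' and empty values are both treated as unknown,
    as the dataset description requires.
    """
    counts = {"oa": 0, "unknown": 0}
    for row in rows:
        value = (row.get("is_OA") or "").strip().lower()
        if value == "true":
            counts["oa"] += 1
        else:
            counts["unknown"] += 1
    total = counts["oa"] + counts["unknown"]
    # Share of records confirmed as OA: a lower bound on the true OA share.
    share = counts["oa"] / total if total else 0.0
    return counts, share


# Made-up sample standing in for an oa_status__countryname__papers.csv file.
sample = io.StringIO(
    "paperid,is_OA\n"
    "p1,true\n"
    "p2,false\n"
    "p3,\n"
    "p4,True\n"
)
counts, share = oa_summary(csv.DictReader(sample))
print(counts, share)
```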
Provider:
The Open University
Created:
2022-02-25