
ORCID Public Data File 2019 | Academic Research Dataset | Researcher Information Management Dataset

Source: DataCite Commons. Updated: 2020-08-26. Indexed: 2024-07-13.
Academic research
Researcher information management
Download link:
https://orcid.figshare.com/articles/ORCID_Public_Data_File_2019/9988322/2
Description:
These files contain a snapshot of all public data in the ORCID Registry associated with an ORCID record that was created or claimed by an individual as of October 1st, 2019. ORCID publishes this file once per year under a Creative Commons CC0 1.0 Universal public domain dedication. This means that, to the extent possible under law, ORCID has waived all copyright and related or neighbouring rights to the Public Data File. For more information on the file, see https://orcid.org/content/orcid-public-data-file-use-policy

The file contains the public information associated with each user's ORCID record. The data is available in XML format and is divided into separate files for easier management. One file contains the full record summary for each record. The rest of the data is divided into 11 files, which contain the activities for each record, including full work data.

Below is a more complete description of how the data is structured.

Summaries file

Name: ORCID_2019_summaries.tar.gz
Description: Contains all the existing summaries. When extracted, it generates the following file structure: summaries/[3 digits checksum]/[iD].xml
Example: If you are looking for the summary of iD '0000-0002-7869-831X', decompress the file and you will find the summary under 'summaries/31X/0000-0002-7869-831X.xml'.

Activities files

Named:
 - ORCID_2019_activites_0.tar.gz
 - ORCID_2019_activites_1.tar.gz
 - ORCID_2019_activites_2.tar.gz
 - ORCID_2019_activites_3.tar.gz
 - ORCID_2019_activites_4.tar.gz
 - ORCID_2019_activites_5.tar.gz
 - ORCID_2019_activites_6.tar.gz
 - ORCID_2019_activites_7.tar.gz
 - ORCID_2019_activites_8.tar.gz
 - ORCID_2019_activites_9.tar.gz
 - ORCID_2019_activites_X.tar.gz
Description: Consists of 11 .tar.gz files. Each file contains the public activities that belong to iDs with a given checksum.
The file hierarchy is as follows: [checksum]/[3 digits checksum]/[iD]/[activity type]/[iD]_[activity_type]_[putcode].xml

Examples:

If you are looking for the public activities that belong to '0000-0002-7869-831X':
Decompress the file 'ORCID_2019_activites_X.tar.gz'. You will find all the public activities under 'X/31X/0000-0002-7869-831X/', which is subdivided into folders for each activity type.

If you are looking for all the employments that belong to '0000-0002-7869-831X':
Decompress the file 'ORCID_2019_activites_X.tar.gz' and navigate to 'X/31X/0000-0002-7869-831X/employments'.

If you are looking for the employment with put-code '7923980' that belongs to '0000-0002-7869-831X':
Decompress the file 'ORCID_2019_activites_X.tar.gz'. You will find that employment under 'X/31X/0000-0002-7869-831X/employments/0000-0002-7869-831X_employments_7923980.xml'.

Companion Resources:

ORCID 3.0 XSD: https://github.com/ORCID/orcid-model/tree/master/src/main/resources/record_3.0#orcid-api-v30-guide
2018 File: https://doi.org/10.23640/07243.7234028.v1
2017 File: https://doi.org/10.6084/m9.figshare.5479792.v1
2016 File: https://doi.org/10.6084/m9.figshare.4134027
2015 File: https://dx.doi.org/10.6084/m9.figshare.1582705
2014 File: http://dx.doi.org/10.14454/07243.2014.001
2013 File: http://dx.doi.org/10.14454/07243.2013.001
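
The path rules described above can be sketched in Python. This is a minimal sketch, not ORCID tooling: it assumes, as the examples suggest, that [checksum] is the final character of the iD and [3 digits checksum] is its last three characters. The helper names (`summary_path`, `activities_archive`, `activities_dir`, `activity_path`) are hypothetical.

```python
from typing import Optional


def summary_path(orcid_id: str) -> str:
    """Path of a record summary inside ORCID_2019_summaries.tar.gz."""
    # Assumption: [3 digits checksum] is the last three characters of the iD.
    return f"summaries/{orcid_id[-3:]}/{orcid_id}.xml"


def activities_archive(orcid_id: str) -> str:
    """Name of the activities archive that holds this iD."""
    # Assumption: [checksum] is the final character of the iD (0-9 or X).
    # The spelling "activites" follows the file names listed in the description.
    return f"ORCID_2019_activites_{orcid_id[-1]}.tar.gz"


def activities_dir(orcid_id: str, activity_type: Optional[str] = None) -> str:
    """Directory holding all activities of an iD, or one activity type."""
    base = f"{orcid_id[-1]}/{orcid_id[-3:]}/{orcid_id}"
    return f"{base}/{activity_type}" if activity_type else f"{base}/"


def activity_path(orcid_id: str, activity_type: str, put_code: str) -> str:
    """Path of a single activity file inside its activities archive."""
    directory = activities_dir(orcid_id, activity_type)
    return f"{directory}/{orcid_id}_{activity_type}_{put_code}.xml"


if __name__ == "__main__":
    orcid_id = "0000-0002-7869-831X"
    print(summary_path(orcid_id))
    print(activities_archive(orcid_id))
    print(activity_path(orcid_id, "employments", "7923980"))
```

Run on the iD used in the examples, these helpers reproduce the archive name and the paths quoted in the description.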
Provider:
ORCID
Created:
2019-10-17