
A biodiversity dataset graph: GBIF, iDigBio, BioCASe hash://sha256/450deb8ed9092ac9b2f0f31d3dcf4e2b9be003c460df63dd6463d252bff37b55 hash://md5/898a9c02bedccaea5434ee4c6d64b7a2

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/1472393
Description:
A biodiversity dataset graph: GBIF, iDigBio, BioCASe hash://sha256/450deb8ed9092ac9b2f0f31d3dcf4e2b9be003c460df63dd6463d252bff37b55 hash://md5/898a9c02bedccaea5434ee4c6d64b7a2

The intended use of this archive is to facilitate meta-analysis of the Global Biodiversity Information Facility, Integrated Digitized Biocollections, and Biological Collection Access Service (GBIF, iDigBio, BioCASe). GBIF, iDigBio and BioCASe help provide access to biological data collections.

This dataset provides versioned provenance logs of snapshots of the GBIF, iDigBio, BioCASe network as tracked by Preston [2] between 2018-09-03 and 2023-02-02 using "preston update -u https://gbif.org,https://idigbio.org,http://biocase.org".

This publication contains two types of files: index files and provenance logs. Associated data files are hosted elsewhere for pragmatic reasons. Index files link provenance files in time to establish a versioning mechanism. Provenance logs describe how, when, what and where the GBIF, iDigBio, BioCASe content was retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543 .

To retrieve and verify the downloaded GBIF, iDigBio, BioCASe biodiversity dataset graph, use the preston [2] command-line tool to "clone" this dataset using:

$ java -jar preston.jar ls --remote https://zenodo.org/record/7651831/files > /dev/null

Optionally, you can retrieve all associated data files (>500GB) using:

$ java -jar preston.jar clone https://zenodo.org/record/7651831/files --remote https://zenodo.org/record/7651831/files,https://linker.bio,https://archive.org/download/biodiversity-dataset-archives/data.zip/data/

Please note that https://archive.org/download/biodiversity-dataset-archives/data.zip/data/ and https://linker.bio are Preston remotes that provided access to GBIF, iDigBio, BioCASe data files at the time of writing (17 Feb 2023). These remotes can be replaced with any other Preston remote(s) if needed.
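
Because the remotes are interchangeable, the clone command can be assembled from a list of remotes. The following is a minimal Python sketch of that idea, not part of Preston itself: build_clone_command is a hypothetical helper, and the only facts it relies on are the URLs and the comma-separated "--remote" value copied from the commands above. It builds the argument list and prints it; it does not start a download.

```python
import shlex

# Values copied verbatim from the clone command in the description.
ARCHIVE_URL = "https://zenodo.org/record/7651831/files"
DEFAULT_REMOTES = [
    "https://zenodo.org/record/7651831/files",
    "https://linker.bio",
    "https://archive.org/download/biodiversity-dataset-archives/data.zip/data/",
]


def build_clone_command(remotes, archive_url=ARCHIVE_URL, jar="preston.jar"):
    """Return the argument list for the documented clone command.

    Hypothetical helper: it only assembles the command line shown in the
    description, with the remotes swapped for the ones given.
    """
    if not remotes:
        raise ValueError("at least one remote is required")
    # The description passes all remotes as one comma-separated value.
    return ["java", "-jar", jar, "clone", archive_url, "--remote", ",".join(remotes)]


if __name__ == "__main__":
    # Print the command as a copy-pasteable shell line.
    print(shlex.join(build_clone_command(DEFAULT_REMOTES)))
```

Passing a different list, for example one containing a local or institutional mirror, yields the same command with only the "--remote" value changed. Running the printed command starts the full retrieval.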
This may take a while depending on network speed and hardware constraints. See also https://archive.org/details/biodiversity-dataset-archives .

After that, verify the index of the archive by reproducing the following provenance log history:

$ java -jar preston.jar history
. . .

If you retrieved data files, you can check the integrity of the extracted archive by confirming that each line produced by the command "preston verify" looks like the lines shown below, with each line including "CONTENT_PRESENT_VALID_HASH". Depending on hardware capacity, this may take a while.

$ java -jar preston.jar verify
hash://sha256/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362    file:/home/preston/preston-archive/data/3e/ff/3eff98d4b66368fd8d1f8fa1af6a057774d8a407a4771490beeb9e7add76f362    OK    CONTENT_PRESENT_VALID_HASH    89931
hash://sha256/184886cc6ae4490a49a70b6fd9a3e1dfafce433fc8e3d022c89e0b75ea3cda0b    file:/home/preston/preston-archive/data/18/48/184886cc6ae4490a49a70b6fd9a3e1dfafce433fc8e3d022c89e0b75ea3cda0b    OK    CONTENT_PRESENT_VALID_HASH    210344
hash://sha256/1846abf2b9623697cf9b2212e019bc1f6dc4a20da51b3b5629bfb964dc808c02    file:/home/preston/preston-archive/data/18/46/1846abf2b9623697cf9b2212e019bc1f6dc4a20da51b3b5629bfb964dc808c02    OK    CONTENT_PRESENT_VALID_HASH    210344
hash://sha256/554fdab07f2372bf363a1d7ef30fcf4c32e1da98b95a6342780c5eb35e0e7b38    file:/home/preston/preston-archive/data/55/4f/554fdab07f2372bf363a1d7ef30fcf4c32e1da98b95a6342780c5eb35e0e7b38    OK    CONTENT_PRESENT_VALID_HASH    202701

Note that a copy of the Java program "preston", preston.jar, is included in this publication. The program runs on a Java 8+ virtual machine using "java -jar preston.jar", or in short "preston".
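
Checking the "preston verify" output by eye is impractical for a large archive, so the check can be scripted. The following is a minimal Python sketch under stated assumptions, not an official Preston utility. The field names (content_id, location, status, state, size) are labels inferred from the example lines above, the fields are assumed to be whitespace-separated with no spaces inside file paths, and the data/xx/yy/<hash> layout is read off the example paths rather than taken from a specification.

```python
from typing import NamedTuple

# State string that the description says every line should include.
EXPECTED_STATE = "CONTENT_PRESENT_VALID_HASH"


class VerifyLine(NamedTuple):
    # Field names are assumptions inferred from the example output.
    content_id: str
    location: str
    status: str
    state: str
    size: int


def parse_verify_line(line):
    """Split one line of verify output into its five fields."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    content_id, location, status, state, size = parts
    return VerifyLine(content_id, location, status, state, int(size))


def failed_lines(lines):
    """Return parsed entries that are not OK / CONTENT_PRESENT_VALID_HASH."""
    entries = (parse_verify_line(line) for line in lines if line.strip())
    return [e for e in entries if e.status != "OK" or e.state != EXPECTED_STATE]


def expected_relative_path(content_id):
    """Relative path implied by the example: data/<hex[0:2]>/<hex[2:4]>/<hex>."""
    hex_digest = content_id.rsplit("/", 1)[-1]
    return f"data/{hex_digest[:2]}/{hex_digest[2:4]}/{hex_digest}"
```

One way to use it is to redirect the verify output to a file, pass the file's lines to failed_lines, and treat an empty result as a successful check. expected_relative_path can help locate the file behind a reported content identifier.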
Files in this data publication:

--- start of file descriptions ---
-- description of archive and its contents (this file) --
README
-- executable java jar containing preston[2] v0.5.4 --
preston.jar
--- end of file descriptions ---

References

[1] Global Biodiversity Information Facility, Integrated Digitized Biocollections, Biological Collection Access Service (GBIF, iDigBio, BioCASe, https://gbif.org,https://idigbio.org,http://biocase.org) accessed from 2018-09-03 to 2023-02-02 with provenance hash://sha256/450deb8ed9092ac9b2f0f31d3dcf4e2b9be003c460df63dd6463d252bff37b55 hash://md5/898a9c02bedccaea5434ee4c6d64b7a2 .

[2] https://preston.guoda.bio, https://doi.org/10.5281/zenodo.1410543 .

This work is funded in part by grant NSF OAC 1839201 from the National Science Foundation.
Created:
2023-06-02