
A biodiversity dataset graph: Biological Associations in TaxonWorks hash://sha256/e4a47c067d6c125da60c9a1b92b5eecdea539cb8666cd3aed99db347ae5b8ed0 hash://md5/686007de79cc2a49ab23fd3debe56e3f

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/8252843
Description:
The intended use of this archive is to facilitate (meta-)analysis of Biological Associations captured in TaxonWorks [1]. TaxonWorks is an integrated web-based workbench for taxonomists and biodiversity scientists. It allows you to capture, organize, and enrich your data; share it with collaborators; and package it for analysis and publication.

This dataset provides versioned snapshots of the TaxonWorks network as tracked by Preston [2,3,4] on 2024-05-07 using:

    preston track -u https://sfg.taxonworks.org

In addition, this dataset provides a processed version of the biological associations, produced with the "preston tw-stream" command by the following bash script:

    #!/bin/bash
    #
    # Generates GloBI interaction JSON Lines from provided provenance log as generated by preston tw-stream.
    #
    /usr/local/bin/preston cat hash://sha256/c1b081afa6ea0f60570c24cca85c4d9acd91eeefe36b9cacd1fe53b6893ea154\
     | /usr/local/bin/preston tw-stream

The script itself was executed using:

    cat transform.sh | preston bash

The execution of this transform.sh script (with content id hash://sha256/6dfe3c4ebf877bed73aebbe88c7d388bf894c569578ed7b28ca68e57a6afe43b), as well as its results, is also captured within this dataset. An RDF/quads-formatted, machine-readable version of the workflow execution description can be found via:

    preston cat hash://sha256/e4a47c067d6c125da60c9a1b92b5eecdea539cb8666cd3aed99db347ae5b8ed0

The resulting JSON Lines file has content id (or signature) hash://sha256/4c2b8642251ced5985660d63c565efa6e5a9bf3d12b3b0c0d9ac577905f5e897 and is also included as interactions.json to facilitate access.

The first JSON record can be generated using:

    preston cat hash://sha256/4c2b8642251ced5985660d63c565efa6e5a9bf3d12b3b0c0d9ac577905f5e897\
     | head -n1\
     | jq .

or, provided that interactions.json has a content id starting with hash://sha256/4c2b86... :

    cat interactions.json\
     | head -n1\
     | jq .

This produces the following (formatted) JSON object:

    {
      "http://www.w3.org/ns/prov#wasDerivedFrom": "hash://sha256/fdbf13dc5f3d9c5afbc03db62699e2ce2724c499b7d91d8b0bf31e39409b153a",
      "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "application/vnd.taxonworks+json",
      "referenceId": "https://sfg.taxonworks.org/api/v1/sources/213218",
      "interactionId": "https://sfg.taxonworks.org/api/v1/biological_associations/227664",
      "taxonRootsResolved": 2,
      "referenceResolved": true,
      "referenceCitation": "@article{213218,\n  author = {Monzen, Kota},\n  journal = {Annual Report of the Gakugei Faculty of the Iwate University},\n  pages = {24-38},\n  title = {Revision of the Japanese gall wasps with the descriptions of new genus, subgenus, species and subspecies (II). Cynipidae (Cynipinae) Hymenoptera.},\n  volume = {6},\n  year = {1954}\n}\n",
      "interactionTypeId": "gid://taxon-works/BiologicalRelationship/69",
      "interactionTypeName": "gall",
      "sourceTaxonName": "Neuroterus hakonensis",
      "sourceTaxonId": "gid://taxon-works/TaxonName/1174121",
      "sourceTaxonRank": "species",
      "sourceTaxonAuthorship": "Ashmead, 1904",
      "sourceTaxonPath": "Root | Cynipidae | Neuroterus | Neuroterus hakonensis",
      "sourceTaxonPathIds": "gid://taxon-works/TaxonName/623170 | gid://taxon-works/TaxonName/1170060 | gid://taxon-works/TaxonName/1170097 | gid://taxon-works/TaxonName/1174121",
      "sourceTaxonPathNames": "nomenclatural rank | family | genus | species",
      "targetTaxonName": "Quercus",
      "targetTaxonId": "gid://taxon-works/TaxonName/1173543",
      "targetTaxonRank": "genus",
      "targetTaxonAuthorship": "",
      "targetTaxonPath": "Root | Fagaceae | Quercus",
      "targetTaxonPathIds": "gid://taxon-works/TaxonName/623170 | gid://taxon-works/TaxonName/1173542 | gid://taxon-works/TaxonName/1173543",
      "targetTaxonPathNames": "nomenclatural rank | family | genus"
    }

In this example, a claim is made that, according to https://sfg.taxonworks.org/api/v1/sources/213218 [6], Neuroterus hakonensis (a gall wasp) has a primary host in the genus Quercus (oak tree).

In total, 237,068 such claims can be found in the generated resource with alias interactions.json and content id starting with hash://sha256/4c2b86... .

In addition, the archive preston.tar.gz is included to allow for batch download. The archive contains three types of files: index files, provenance logs and data files. The index files have also been included individually in this dataset publication to facilitate remote access. Index files link provenance files in time to establish a versioning mechanism. Provenance files describe how, when, what and where the TaxonWorks content was retrieved. For more information, please visit https://preston.guoda.bio or https://doi.org/10.5281/zenodo.1410543 .

To retrieve and verify the downloaded TaxonWorks biodiversity dataset graph, download preston.tar.gz. Then, extract the archive into a "data" folder. Alternatively, you can use the preston [2] command-line tool to "clone" this dataset using:

    java -jar preston.jar clone --remote https://zenodo.org/record/11151783/files

After that, verify the index of the archive by reproducing the following provenance log history:

    java -jar preston.jar history --log tsv

to be:

    hash://sha256/e4a47c067d6c125da60c9a1b92b5eecdea539cb8666cd3aed99db347ae5b8ed0    http://www.w3.org/ns/prov#wasDerivedFrom    hash://sha256/c1b081afa6ea0f60570c24cca85c4d9acd91eeefe36b9cacd1fe53b6893ea154
    hash://sha256/c1b081afa6ea0f60570c24cca85c4d9acd91eeefe36b9cacd1fe53b6893ea154    http://www.w3.org/ns/prov#wasDerivedFrom    hash://sha256/a4d651aac5220487835e6178511886e98b845b2d98cb7c5447fb2b042e0654d2
    hash://sha256/a4d651aac5220487835e6178511886e98b845b2d98cb7c5447fb2b042e0654d2    http://www.w3.org/ns/prov#wasDerivedFrom    hash://sha256/ab7550368905e7c919e70a306efbb97719a1edbba2cfe4c4515f635ebc0be4bb
    hash://sha256/ab7550368905e7c919e70a306efbb97719a1edbba2cfe4c4515f635ebc0be4bb    http://www.w3.org/ns/prov#wasDerivedFrom    hash://sha256/ff5e709305e593c87711e897b6341b94e775e2f312aa6d4ae5ed6120babd6f5e
    urn:uuid:0659a54f-b713-4f86-a917-5be166a14110    http://purl.org/pav/hasVersion    hash://sha256/ff5e709305e593c87711e897b6341b94e775e2f312aa6d4ae5ed6120babd6f5e

To check the integrity of the extracted archive, run the command below and confirm that each line it produces includes "CONTENT_PRESENT_VALID_HASH". Depending on hardware capacity, this may take a while.

    java -jar preston.jar verify

Note that a copy of the Java program "preston", preston.jar, is included in this publication. The program runs on a Java 8+ virtual machine using "java -jar preston.jar", or in short "preston".

Files in this data publication:

--- start of file descriptions ---

-- description of archive and its contents (a rendition of this file) --
README

-- biological associations indexed from TaxonWorks expressed in a GloBI [5] compatible JSON Lines file --
interactions.json

-- first 10 biological associations indexed from TaxonWorks expressed in a GloBI [5] compatible JSON Lines file --
interactions-10.json

-- executable java jar containing preston [2,3,4] v0.8.5-SNAPSHOT. --
preston.jar

-- preston archive containing TaxonWorks data files, associated provenance logs and a provenance index --
preston.tar.gz

-- individual provenance index files --
1fed32bf78298d7ecc3d9f36d106f1d7d7773a8b9a5e47af6632f36c1f82adb5
29306c5c144c3d7fd21be344d8b6b554b6f6efa3b8f8f5c0b27cdf0e88785652
2a5de79372318317a382ea9a2cef069780b852b01210ef59e06b640a3539cb5a
d31ff1ef1dea88c5952181a4f30e7ea7862873aa5f66430451275aa6d08d329e
deb84d69224af488da585186f88cafc58e978db5f9897de624cc9b02c0c83742
e9c34683f1e826f68f841f3419bd5ee9c0fa18be04713a6fd3364f226c7c5f2f
f98d36a9dc7bd833c93b3b61130865628f7bc2f7bb0920e95afcd16fba3dc6a8
ffb41d48979ceb964fbfbeb68cb60b584b759950087fdcc012521b866249bc39

--- end of file descriptions ---

This work is funded in part by grants NSF OAC 1839201, NSF DBI 1901932, NSF DBI 1901926, and NSF DBI 2102006 from the National Science Foundation.
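The description identifies interactions.json by a content id starting with hash://sha256/4c2b86... and states that it holds 237,068 claims. Without preston at hand, both statements could be spot-checked with standard tools. The following is a minimal sketch, not part of preston: it assumes that a hash://sha256/ content id is the SHA-256 digest of the file's bytes (as the notation suggests), that GNU coreutils' sha256sum is installed, and that each non-empty line of the JSON Lines file holds one claim. The function names content_id, has_content_prefix and count_records are hypothetical helpers introduced here for illustration.

```shell
#!/bin/bash
# Sketch only: these helper names are hypothetical and not part of preston.

# Print the content id of file $1 in hash://sha256/... notation.
# Assumes the content id is the SHA-256 digest of the file's bytes.
content_id() {
  # sha256sum prints "<hex digest>  <file name>"; keep only the digest.
  printf 'hash://sha256/%s\n' "$(sha256sum "$1" | cut -d' ' -f1)"
}

# Succeed if the content id of file $1 starts with the prefix $2.
has_content_prefix() {
  case "$(content_id "$1")" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Count records in JSON Lines file $1, assuming one record per non-empty line.
count_records() {
  awk 'NF { n++ } END { print n + 0 }' "$1"
}
```

For example, "has_content_prefix interactions.json hash://sha256/4c2b86 && count_records interactions.json" would print a record count only if the local copy matches the expected content id prefix; under the assumptions above, that count can then be compared against the 237,068 claims stated in the description.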
Created:
2024-05-08