Determinants of microbiome composition: Insights from free-ranging hybrid zebras (Equus quagga × grevyi)
Source: Mendeley Data. Updated 2024-04-13; indexed 2024-06-29.
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.0rxwdbs7c
Resource description:
This 'README_file_Hybrid_zebra_microbiomes.txt' file was generated on 2024-01-22 by JOEL O. ABRAHAM

## GENERAL INFORMATION

1. Title of Dataset: Determinants of microbiome composition: insights from free-ranging hybrid zebras (Equus quagga × grevyi)
2. Author Information
   - A. Principal Investigator Contact Information
     - Name: Joel O. Abraham
     - Institution: Princeton University
     - Email:
   - B. Associate or Co-investigator Contact Information
     - Name: Daniel I. Rubenstein
     - Institution: Princeton University
     - Email:
3. Date of data collection (approximate): 2020-01-08 to 2020-01-17
4. Geographic location of data collection: Laikipia County, Kenya
5. Information about funding sources that supported the collection of the data: NSF DEB-2225088 and the High Meadows Environmental Institute at Princeton University. Joel O. Abraham is supported by the NSF Graduate Research Fellowship (Fellow ID: 2019256075).

## SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: None.
2. Links to publications that cite or use the data: TBD
3. Links to other publicly accessible locations of the data: TBD
4. Links/relationships to ancillary data sets: NA
5. Was data derived from another source? NO
6. Recommended citation for this dataset: Abraham, J.O., Lin, B., Miller, A.E., Henry, L.P., Demmel, M.Y., Warungu, R., Mwangi, M., Lobura, P.M., Pallares, L.F., Ayroles, J.F., Pringle, R.M., Rubenstein, D.I. (2024). Determinants of microbiome composition: insights from free-ranging hybrid zebras (Equus quagga × grevyi). Dryad, Dataset,

## DATA & FILE OVERVIEW

1. File List:
   - README_file_Hybird_zebra_microbiomes.txt: README file explaining how the dataset was generated and the data contained in the dataset
   - Sample_metadata.text: File containing the sample metadata
   - Hybrid_zebra_diet_forward_reads.fastq: Raw sequence data (forward reads for diet data)
   - Hybrid_zebra_diet_reverse_reads.fastq: Raw sequence data (reverse reads for diet data)
   - Hybrid_zebra_microbiome_forward_reads.fastq: Raw sequence data (forward reads for microbiome data)
   - Hybrid_zebra_microbiome_reverse_reads.fastq: Raw sequence data (reverse reads for microbiome data)
   - Diet_composition_data.txt: Relative read abundances of plant mOTUs in zebra fecal samples (post-filtering)
   - Microbiome_composition_data.csv: Rarefied abundances of bacterial ASVs in zebra fecal samples (post-filtering)
   - Bacterial_taxonomy.csv: Taxonomic information for all bacterial ASVs in zebra microbiomes
   - Bacterial_phylogeny.nwk: Phylogenetic tree of bacterial ASVs in zebra microbiomes
   - Hybrid zebra microbiome analyses.R: R script for analyzing data and generating visualizations

### METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data: Fecal samples from free-ranging zebra were collected in Laikipia County in January 2020. DNA was extracted from the fecal samples, and the plant and bacterial components were sequenced to characterize diet and microbiome composition, respectively.
2. Methods for processing the data: Diet sequence data were curated using the OBITOOLS v2 package, while microbiome sequence data were processed using the DADA2 v1.18 big data pipeline, implemented in R v4.0.2.
3. Instrument- or software-specific information needed to interpret the data: R is necessary to run the R script file. The R script was written in R v4.0.2.
4. Standards and calibration information, if appropriate: NA
5. Environmental/experimental conditions: NA
6. Describe any quality-assurance procedures performed on the data: NA
7. People involved with sample collection, processing, analysis and/or submission: Joel O. Abraham, Bing Lin, Audrey E. Miller, Lucas P. Henry, Margaret Y. Demmel, Rosemary Warungu, Margaret Mwangi, Patrick M. Lobura, Luisa F. Pallares, Julien F. Ayroles, Robert M. Pringle, Daniel I. Rubenstein

   AEM and BL conceived of the project and designed the sampling approach, with input from DIR and JFA. AEM and BL led fecal sampling in the field, with help from RW, MM, and PML to find and identify hybrids. JOA, AEM, and BL performed lab work, with technical advice from LFP, LPH, and MYD. Lab work was conducted at the Mpala genomics facility in Kenya and at RMP’s laboratory at Princeton University. JOA led bioinformatics and data analyses, with input from LPH and MYD.

#### DATA-SPECIFIC INFORMATION FOR: Sample\_metadata.text

1. Number of variables: 12
2. Number of cases/rows: 91
3. Variable List:
   - sample_number: A unique number assigned to each sample for indexing purposes
   - sample_ID: A unique ID assigned to each sample; corresponds to column headers in 'Diet_composition_data.txt' and 'Microbiome_composition_data.csv'
   - location: Reserve on which a given sample was collected; 'OLP' corresponds to Ol Pejeta (main area of the park); 'OLP-R' corresponds to Ol Pejeta (fenced-off reserve); 'MPA' corresponds to Mpala; 'OLJ' corresponds to Ol Jogi
   - latitude: Latitude at which the sample was collected
   - longitude: Longitude at which the sample was collected
   - status: Whether the sample was collected from the same reserve where hybrid zebra occur ('SYM' for sympatric) or a different reserve ('ALL' for allopatric)
   - species: The zebra species from which the sample was collected ('PL' is plains zebra, 'GR' is Grevy's zebra, and 'HY' is hybrid zebra)
   - pop: The subpopulation from which the sample was collected ('sym_hyb' is Ol Pejeta hybrids, 'sym_dad' is Ol Pejeta Grevy's, 'sym_mom' is Ol Pejeta plains, 'all_dad' is Mpala/Ol Jogi Grevy's, 'all_mom' is Mpala/Ol Jogi plains)
   - sex: The sex of the zebra from which the sample was collected ('M' is male, 'F' is female)
   - age: The age category of the zebra from which the sample was collected ('AD' is adult, 'JU' is juvenile)
   - group_size: Size of the group in which the sampled zebra was observed
   - date: The time each sample was collected
4. Missing data codes: 'NA'
5. Specialized formats or other abbreviations used: NA

#### DATA-SPECIFIC INFORMATION FOR: Diet\_composition\_data.txt

1. Number of variables: 19
2. Number of cases/rows: 144
3. Variable List:
   - id: Unique ID assigned to each plant mOTU
   - best_identity_MRC: Percent sequence match with best match in local reference library
   - best_identity_GDB: Percent sequence match with best match in global reference library
   - best_match_MRC: Barcode reference ID in the local reference library that best matches the mOTU sequence
   - best_match_GDB: Barcode reference ID in the global reference library that best matches the mOTU sequence
   - kingdom_name_ok: The kingdom assigned to the plant mOTU by comparing to the two reference libraries (always Viridiplantae)
   - phylum_name_ok: The phylum assigned to the plant mOTU by comparing to the two reference libraries (always Streptophyta)
   - class_name_ok: The class assigned to the plant mOTU by comparing to the two reference libraries
   - order_name_ok: The order assigned to the plant mOTU by comparing to the two reference libraries (if available)
   - family_name_ok: The family assigned to the plant mOTU by comparing to the two reference libraries (if available)
   - genus_name_ok: The genus assigned to the plant mOTU by comparing to the two reference libraries (if available)
   - species_name_ok: The species name assigned to the plant mOTU by comparing to the two reference libraries (if available)
   - scientific_name rank: The finest taxonomic rank available for each plant mOTU
   - species_list_MRC: The list of species matched to each plant mOTU from the local reference library
   - species_list_GDB: The list of species matched to each plant mOTU from the global reference library
   - sci_name_ok: The finest resolution taxonomic categorization of each plant mOTU
   - sci_name_plot: The taxonomic categorization of each plant mOTU used for visualization purposes
   - sequence: The DNA sequence corresponding to each plant mOTU
   - Occurence: The number of samples each plant mOTU occurred in (note that samples were sequenced in triplicate, so occurrence values are inflated by a factor of three)

   Note: columns 20-105 are the relative read abundances of each plant mOTU in each sample
4. Missing data codes: 'NA'
5. Specialized formats or other abbreviations used: NA

#### DATA-SPECIFIC INFORMATION FOR: Microbiome\_composition\_data.csv

1. Number of variables: 2892
2. Number of cases/rows: 84
3. Variable List: Each variable contains the read count (averaged across all three replicates) of a particular bacterial ASV in a given sample (post-rarefaction); each row is a sample, each column is a bacterial ASV. The taxonomic information for each bacterial ASV can be found in the file 'Bacterial_taxonomy.csv'
4. Missing data codes: 'NA'
5. Specialized formats or other abbreviations used: NA

#### DATA-SPECIFIC INFORMATION FOR: Bacterial\_taxonomy.csv

1. Number of variables: 19
2. Number of cases/rows: 2892
3. Variable List:
   - Kingdom: The kingdom assigned to the bacterial ASV (always Bacteria)
   - Phylum: The phylum assigned to the bacterial ASV (if available)
   - Class: The class assigned to the bacterial ASV (if available)
   - Order: The order assigned to the bacterial ASV (if available)
   - Family: The family assigned to the bacterial ASV (if available)
   - Genus: The genus assigned to the bacterial ASV (if available)
   - Species: The species name assigned to the bacterial ASV (if available)
   - sequence: The DNA sequence corresponding to each bacterial ASV
4. Missing data codes: 'NA'
5. Specialized formats or other abbreviations used: NA
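The coded values described for Sample_metadata.text, and the triplicate note on the Occurence column of Diet_composition_data.txt, can be applied roughly as in the sketch below. This is not part of the dataset or of the authors' R script. The function names (`decode_row`, `samples_from_occurrence`), the tab delimiter, and the two example rows are assumptions made for illustration; the example rows are invented and do not come from the data.

```python
import csv
import io

# Code tables transcribed from the variable descriptions in this README.
CODED_COLUMNS = {
    "location": {
        "OLP": "Ol Pejeta (main area of the park)",
        "OLP-R": "Ol Pejeta (fenced-off reserve)",
        "MPA": "Mpala",
        "OLJ": "Ol Jogi",
    },
    "status": {"SYM": "sympatric", "ALL": "allopatric"},
    "species": {"PL": "plains zebra", "GR": "Grevy's zebra", "HY": "hybrid zebra"},
    "sex": {"M": "male", "F": "female"},
    "age": {"AD": "adult", "JU": "juvenile"},
}


def decode_row(row):
    """Return a copy of one metadata row with coded values spelled out.

    'NA' (the README's missing-data code) becomes None; values without a
    known code, and columns without a code table, are passed through as-is.
    """
    decoded = {}
    for column, value in row.items():
        if value == "NA":
            decoded[column] = None
        else:
            decoded[column] = CODED_COLUMNS.get(column, {}).get(value, value)
    return decoded


def samples_from_occurrence(occurrence, replicates=3):
    """Approximate the number of samples behind an 'Occurence' value.

    The README states that triplicate sequencing inflates occurrence by a
    factor of three, so this simply divides by the replicate count.
    """
    return occurrence / replicates


# Hypothetical rows for illustration only; NOT taken from the dataset.
# The tab delimiter is an assumption about the real file's layout.
EXAMPLE = (
    "sample_ID\tlocation\tstatus\tspecies\tsex\tage\n"
    "demo_1\tOLP\tSYM\tHY\tF\tAD\n"
    "demo_2\tMPA\tALL\tGR\tNA\tJU\n"
)

rows = [decode_row(r) for r in csv.DictReader(io.StringIO(EXAMPLE), delimiter="\t")]
for r in rows:
    print(r)
```

To use this on the real file, replace `io.StringIO(EXAMPLE)` with an open file handle for Sample_metadata.text and adjust the delimiter if the file is not tab-separated.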
Creation date:
2024-02-11



