These are all the scripts, FastQC reports, filtered datasets and metadata used for analysis. Please read the readme file for a detailed explanation of what each file contains.

Creator: Salloum, Priscila
Published: 2023-05-17 00:00:00
License: No description available

Source: Figshare. Updated: 2023-05-17. Indexed: 2026-04-08.

Download link:

https://figshare.com/articles/dataset/These_are_all_the_scripts_FastQC_reports_filtered_datasets_and_metadata_used_for_analysis_Please_read_the_readme_file_for_a_detailed_explanation_of_what_each_file_contains_/22881482/1

Resource description:

FastQC reports

- fastqc_files.zip: Zip folder containing the HTML FastQC report for each sample sequenced. The library names match those in the GenBank BioProject PRJNA972185 (BioSamples SAMN35067136 to SAMN35067307).

*.qza files (output from Qiime2)

- clean-filtered-2-500-observed-taxonomy-silva.qza: Qiime2 output file (qza) containing the taxonomy of all the features observed in the filtered dataset.
- clean-filtered-table-2-500-aligned-rep-seqs-silva.qza: Qiime2 output file (qza) containing the aligned (not masked) representative sequences of all the features observed in the filtered dataset.
- clean-filtered-table-2-500-masked-aligned-rep-seqs-silva.qza: Qiime2 output file (qza) containing the masked alignment of the representative sequences of all the features observed in the filtered dataset.
- clean-filtered-table-2-500-rep-seqs-silva.qza: Qiime2 output file (qza) containing the representative sequences of all the features observed in the filtered dataset.
- clean-filtered-table-2-500-rooted-tree-silva.qza: Qiime2 output file (qza) containing the rooted phylogenetic tree based on the taxonomy assigned to all the features observed in the filtered dataset.
- clean-filtered-table-2-500-silva.qza: Qiime2 output file (qza) containing the filtered feature table.
- clean-filtered-table-2-500-unrooted-tree-silva.qza: Qiime2 output file (qza) containing the unrooted phylogenetic tree based on the taxonomy assigned to all the features observed in the filtered dataset.

Scripts

Bioinformatic analyses were undertaken using the New Zealand eScience Infrastructure (NeSI, Linux system) and a local computer (all R analyses, Windows system). The scripts provided are the following:

- FastQCandQiime2.md, FastQCandQiime2.pdf and FastQCandQiime2.html: Three file formats (md for markdown, pdf and html) of the annotated script used on a Linux system (NeSI) to undertake the FastQC analyses and run Qiime2 up to the generation of the filtered qza files that were used as input to all R analyses.
- Alpha_diversity_rarefied.R: Annotated R script used to load the relevant qza files (filtered output from Qiime2), subset and rarefy the data, and estimate alpha diversity comparing environment, parasites and snails, as well as just the different parasite species and other combinations of samples.
- barPlots_heatmaps_vennD.R: Annotated R script used to load the relevant qza files (filtered output from Qiime2), subset the data, and draw the bar plots of relative abundance, heatmaps and Venn diagrams (non-rarefied data).
- beta_diversity_rarefied.R: Annotated R script used to load the relevant qza files (filtered output from Qiime2), subset and rarefy the data, and estimate beta diversity comparing environment, parasites and snails, as well as just the different parasite species and other combinations of samples.
- differential_abundance_tests.R: Annotated R script used to load the relevant qza files (filtered output from Qiime2), subset the data (all samples, only parasites, parasite-snail host pairs, or only snails), and run the tests of differential abundance (ALDEx2, corncob and metastat methods).
- indicspecies.R: Annotated R script used to load the relevant qza files (filtered output from Qiime2), subset the data, and run the indicator taxa tests (indicspecies R package).
- phylogeneticVsMicrobiome_distances.R: Annotated R script used to calculate the genetic distances between the four trematode species, calculate the microbiome distances between these four trematode species (based on beta diversity metrics of rarefied data), and test for association between genetic distance and microbiome distance (phylosymbiosis). It includes tests of normality (Shapiro-Wilk) and of correlation, as well as Mantel tests.

Other files

- run_insight.csv: Output statistics of the sequencing run.
- metadata_complete.csv: All the metadata used in the different scripts. Columns are described in detail in the readme file.
- partialCOI_trematodes_short.fasta: Partial COI gene sequence of the four trematode species (downloaded from GenBank), used to estimate genetic distance between species.
- partialCOI_trematodes_short_aligned.phy: Aligned sequences (partial COI gene) of the four trematode species (based on partialCOI_trematodes_short.fasta and aligned with mafft_command.txt).
- 28S_parasite_families.txt: Partial 28S gene sequence (in fasta format, but using the extension .txt) for the four trematode families and an outgroup (downloaded from GenBank), used to estimate genetic distance between families.
- 28S_parasites_aligned2.phy: Aligned sequences (partial 28S gene) of the four trematode species (based on 28S_parasite_families.txt and aligned with mafft_command.txt).
- mafft_command.txt: Command used in Linux (NeSI system) to run MAFFT and align the COI and 28S sequences (partialCOI_trematodes_short.fasta and 28S_parasite_families.txt).
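
The .qza files listed in this description are Qiime2 artifacts, which are ordinary zip archives whose contents sit under a single top-level directory named after the artifact's UUID. For readers who want to check what an artifact holds before running the R scripts, the following is a minimal Python sketch using only the standard library. It is not part of the deposited scripts, and the function name describe_qza is a hypothetical helper introduced here for illustration.

```python
import sys
import zipfile


def describe_qza(path):
    """Return (metadata_text, data_files) for a Qiime2 artifact.

    Hypothetical helper, not part of the deposited scripts. It relies on
    the artifact layout <uuid>/metadata.yaml and <uuid>/data/...
    """
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        # The artifact-level metadata.yaml sits directly under the UUID
        # directory; provenance copies are nested deeper and are skipped.
        meta_name = next(
            n for n in names
            if n.count("/") == 1 and n.endswith("/metadata.yaml")
        )
        metadata = archive.read(meta_name).decode("utf-8")
        root = meta_name.split("/")[0]
        # Payload files (table, sequences, tree, ...) live under data/.
        data_files = [
            n[len(root) + 1:]
            for n in names
            if n.startswith(root + "/data/") and not n.endswith("/")
        ]
    return metadata, data_files


if __name__ == "__main__":
    # Usage: python describe_qza.py some-artifact.qza
    for qza_path in sys.argv[1:]:
        meta, files = describe_qza(qza_path)
        print(qza_path)
        print(meta.strip())
        for name in files:
            print("  " + name)
```

The metadata text reports the artifact's semantic type and format, which is a quick way to confirm that a file is, for example, a feature table rather than a tree. The sketch only reads the archive and does not modify it.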

Provider:

Salloum, Priscila

Created:

2023-05-17