Data for: Detection of oomycete pathogens in UK peat-free growing media and implications for plant health
Download link:
https://zenodo.org/record/13376043
Description:
This dataset on Zenodo accompanies the manuscript Frederickson-Matika et al. (2024), Detection of oomycete pathogens in UK peat-free growing media and implications for plant health.
There are two files:
metadata.tsv - plain text table as tab-separated variables
raw_data.tar.gz - compressed archive of 43 paired raw FASTQ files
These files are a subset of two complete Illumina MiSeq plates (in two dated folders) run at the James Hutton Institute. The plates also contained other environmental samples processed with the same protocol. Only the synthetic controls and peat-free samples are provided here.
To repeat the analysis described in the paper, first install THAPBI PICT. See https://github.com/peterjc/thapbi-pict/ for instructions. At the time of the paper, v1.0.14 was the current release.
Next, unpack the archive to get the dated plate folders of paired gzipped FASTQ files. There is no need to decompress the FASTQ files themselves:
$ tar -zxvf raw_data.tar.gz
$ ls -1 plate_20220505/ plate_20230608/
If you wish, verify the checksums to confirm the data integrity:
$ cd plate_20220505/
$ md5sum -c MD5SUM.txt
$ cd ../plate_20230608/
$ md5sum -c MD5SUM.txt
$ cd ..
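The per-folder checks above can also be wrapped in a small loop. This is only a sketch: `verify_plates` is a hypothetical helper name, not part of the dataset or of THAPBI PICT, and it assumes GNU `md5sum` and that each plate folder contains an `MD5SUM.txt` manifest as shown above.

```shell
# Hypothetical helper (not part of the dataset): check the MD5SUM.txt
# manifest inside each folder given as an argument.
verify_plates() {
    status=0
    for dir in "$@"; do
        if [ ! -f "$dir/MD5SUM.txt" ]; then
            echo "missing manifest: $dir/MD5SUM.txt" >&2
            status=1
            continue
        fi
        # Run in a subshell so the caller's working directory is unchanged.
        # --quiet suppresses the per-file "OK" lines (GNU md5sum).
        if ! (cd "$dir" && md5sum --quiet -c MD5SUM.txt); then
            echo "checksum failure in $dir" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `verify_plates plate_20220505 plate_20230608` returns a non-zero status if either folder has a missing manifest or a file that fails its checksum.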
Set up the output directories:
$ mkdir -p intermediate/ summary/
Run the THAPBI PICT pipeline:
$ thapbi_pict pipeline -m 1s3g \
    -i plate_*/ -o summary/peat-free \
    -y plate_*/GBL*.fastq.gz \
    -n plate_*/GBL*.fastq.gz \
    -s intermediate/ \
    -t metadata.tsv -u \
    -x 9 -c 1,2,3,4,5,6,7,8
The options here are as follows:
-i - two input directories of paired raw FASTQ files.
-n - negative controls used to increase the absolute abundance threshold
-y - synthetic controls used to increase the fractional abundance threshold
-s - optional location to store intermediate files
-o - output stem for reports
-t - filename for tab-separated-variable metadata
-u - show unsequenced samples defined in the metadata
-x - which metadata column contains Illumina FASTQ filename stems
-c - which metadata columns to include in the report.
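Because -x and -c refer to metadata columns by number, it can help to list the column headers with their positions before running the pipeline. The sketch below assumes that the first line of metadata.tsv is a header row and that column numbers count from one. `list_columns` is a hypothetical helper name, not a THAPBI PICT command.

```shell
# Hypothetical helper: print each header field of a tab-separated file
# together with its 1-based column number.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n' | awk '{ printf "%d\t%s\n", NR, $0 }'
}
```

Running `list_columns metadata.tsv` should then show which header sits in column 9 (used for the FASTQ filename stems) and which headers columns 1 to 8 contribute to the report.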
This assumes the following key default settings:
-a 100 (default absolute abundance threshold)
-f 0.001 (default fractional abundance threshold)
-d - (default provided ITS1 database).
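As a rough illustration of how the two abundance thresholds interact, the sketch below assumes that the fractional threshold is taken as a fraction of a sample's total read count and that the larger of the two thresholds is the one that applies. This is a simplification for illustration, not THAPBI PICT's own code, and `effective_threshold` is a hypothetical name.

```shell
# Illustrative only: not part of THAPBI PICT.
# $1 = total reads in the sample
# $2 = absolute threshold (defaults to 100, as with -a)
# $3 = fractional threshold (defaults to 0.001, as with -f)
effective_threshold() {
    awk -v total="$1" -v abs_t="${2:-100}" -v frac="${3:-0.001}" 'BEGIN {
        frac_t = total * frac                      # reads implied by the fraction
        t = (frac_t > abs_t) ? frac_t : abs_t      # the larger threshold applies
        printf "%g\n", t
    }'
}
```

Under these assumptions, `effective_threshold 250000` prints 250 (the fractional threshold dominates), while `effective_threshold 50000` prints 100 (the absolute threshold dominates).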
With these settings, only synthetic sequences were found in the controls, and therefore the thresholds were not automatically increased any further.
Opening the output file summary/peat-free.ITS1.samples.1s3g.xlsx in Excel or similar should show a table resembling Table 1 in the paper, but with one row per sequenced sample and additional columns giving per-sample, per-species read counts and similar details.
Created:
2024-10-10



