
18S V4 rDNA sequences organized at the OTU level for the SOMLIT-Astan time-series (2009-2016)

Source: NIAID Data Ecosystem, indexed 2026-03-13
Download link:
https://zenodo.org/record/5032450
Resource description:
This file contains metadata for each 18S V4 rDNA OTU from the SOMLIT-Astan time series (2009-2016), with the following fields:
amplicon = identifier of the representative (most abundant) sequence
total = total number of reads
spread = number of samples in which the OTU was found
cloud = number of unique sequences constituting the OTU
sequence = nucleic acid sequence of the representative sequence
length = length of the representative sequence
quality = minimum expected error observed for the representative sequence, divided by sequence length
taxonomy = taxonomic path assigned to the representative sequence
identity = percentage identity of the representative sequence to the closest reference sequence from PR2
references = best-hit reference sequence(s)
RA090107_02:RA161222_3 = read counts for the 375 samples collected from January 2009 to December 2016 (sampling twice a month for 8 years), one column per sample
In each sample name, the six digits after "RA" give the sampling date: the first two digits are the year, followed by the month and the day. The value after "_" indicates the pore size of the filter used for filtration: 02 for 0.2 µm and 3 for 3 µm.
The 18S V4 rDNA Operational Taxonomic Units (OTUs) were generated from the raw sequencing reads and assembled into an OTU table with the following pipeline: https://doi.org/10.5281/zenodo.5791089. The V4 region was extracted from the 18S rDNA reference sequences of PR2 v4.12 (Guillou et al., 2013) with Cutadapt. The representative sequence of each OTU was compared to these V4 reference sequences by pairwise global alignment (VSEARCH's usearch_global command). Each OTU inherits the taxonomy of its best hit, or of the last common ancestor of the best hits in case of ties. OTUs whose best hit had less than 80% similarity were considered unassigned (Mahé et al., 2017; Stoeck et al., 2010). The final dataset (filtered OTU table) contains 375 samples with a total of ~30 million sequence reads and 21,418 OTUs.
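The sample naming scheme and the taxonomy rule described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the "|" separator for taxonomic paths, the "unassigned" label, and the (identity, path) hit format are all assumptions made for illustration; only the "RA" + YYMMDD + "_" + filter code layout, the two filter codes, and the 80% threshold come from the description.

```python
import re
from datetime import date

# Filter codes as described for the sample columns: "02" -> 0.2 µm, "3" -> 3 µm.
FILTER_SIZES_UM = {"02": 0.2, "3": 3.0}

# "RA" + two-digit year, month, day + "_" + filter code, e.g. "RA090107_02".
SAMPLE_ID = re.compile(r"^RA(\d{2})(\d{2})(\d{2})_(02|3)$")


def parse_sample_id(sample_id):
    """Return (sampling date, filter pore size in µm) for a sample column name."""
    match = SAMPLE_ID.match(sample_id)
    if match is None:
        raise ValueError(f"unrecognised sample name: {sample_id!r}")
    yy, mm, dd, code = match.groups()
    # All samples fall in 2009-2016, so the century is assumed to be 2000.
    return date(2000 + int(yy), int(mm), int(dd)), FILTER_SIZES_UM[code]


def last_common_ancestor(paths, sep="|"):
    """Longest shared prefix of several taxonomic paths (separator is an assumption)."""
    common = []
    for ranks in zip(*(path.split(sep) for path in paths)):
        if len(set(ranks)) != 1:
            break
        common.append(ranks[0])
    return sep.join(common)


def assign_taxonomy(hits, min_identity=80.0):
    """Pick a taxonomy from hypothetical (identity_percent, taxonomic_path) hits.

    The best hit wins; tied best hits are reduced to their last common
    ancestor; a best hit below min_identity yields "unassigned".
    """
    if not hits:
        return "unassigned"
    best = max(identity for identity, _ in hits)
    if best < min_identity:
        return "unassigned"
    tied = [path for identity, path in hits if identity == best]
    return last_common_ancestor(tied)
```

For example, parse_sample_id("RA161222_3") yields 22 December 2016 with a 3 µm filter, and two hits tied at the same identity with paths "Eukaryota|Alveolata|Dinoflagellata" and "Eukaryota|Alveolata|Ciliophora" resolve to "Eukaryota|Alveolata".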
Created:
2022-05-05