
Alignments of O. edulis in silico sequences

Source: DataONE. Updated 2014-01-16; indexed 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
For the in silico SNPs, we investigated O. edulis transcriptome sequence data from eight individuals from the natural range (Cahais et al. 2012, Gayral et al. 2011). For the present study, the 454 and Illumina reads were assembled using a multi-kmer strategy (k-mers of 37, 41, 45, 49, 53, 57 and 61, assembled with Velvet version 1.1.03). Contigs longer than 100 bp from every assembly were then meta-assembled with TGICL (http://compbio.dfci.harvard.edu/tgi/software/). The Illumina reads were remapped onto the contigs using BWA (0.5.9-r16), and a compressed alignment file was produced using SAMtools view (version 0.1.11). The alignment file was then used to call SNPs with SAMtools pileup and varFilter (version 0.1.11). In this database, we looked for SNPs located on different contigs, with a read depth of 20 to 500 at the position and no other SNPs in the surrounding 120 bp. The SNP quality threshold was initially set at 20 but, because of the high number of SNPs available, we ultimately used only SNPs with the highest score of 227.
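The selection criteria in the description can be illustrated with a minimal Python sketch. It is not the authors' script: the `Snp` record and the `select_snps` function are hypothetical names, and the sketch assumes that "different contigs" means at most one SNP per contig and that "the surrounding 120 bp" means 120 bp on either side, measured against every SNP called on the same contig.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical record for one called SNP (not a SAMtools data type)."""
    contig: str
    pos: int    # 1-based position on the contig
    depth: int  # read depth at the position
    qual: int   # SNP quality score


def select_snps(snps, min_depth=20, max_depth=500, flank=120, min_qual=227):
    """Apply the depth, isolation, and quality criteria from the description.

    Assumption: when several SNPs on one contig pass, only the left-most is kept.
    """
    by_contig = defaultdict(list)
    for snp in snps:
        by_contig[snp.contig].append(snp)

    selected = []
    for contig in sorted(by_contig):
        calls = by_contig[contig]
        candidates = [
            s for s in calls
            if min_depth <= s.depth <= max_depth
            and s.qual >= min_qual
            # Reject SNPs that have any other call within `flank` bp.
            and not any(o is not s and abs(o.pos - s.pos) <= flank for o in calls)
        ]
        if candidates:
            selected.append(min(candidates, key=lambda s: s.pos))
    return selected


example = [
    Snp("c1", 200, 50, 227),   # isolated, good depth and score
    Snp("c2", 100, 50, 227),   # neighbour 50 bp away
    Snp("c2", 150, 30, 227),   # neighbour 50 bp away
    Snp("c3", 300, 10, 227),   # depth below 20
    Snp("c4", 300, 100, 60),   # score below 227
    Snp("c5", 100, 20, 227),   # passes, left-most on its contig
    Snp("c5", 400, 500, 227),  # passes, but the contig is already represented
]
kept = select_snps(example)
```

In this toy input, only the SNPs on `c1` and at position 100 of `c5` are retained. The real pipeline may have resolved ties within a contig differently.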

Created: 2014-01-16