
Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm

Source: NIAID Data Ecosystem (indexed 2026-03-06)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.157
Description:
The oil palm is a tropical oil-bearing tree. EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (Expressed Sequence Tag) databases. The development of high-throughput methods for detecting SNPs (Single Nucleotide Polymorphisms) and small indels (insertions / deletions) has led to a revolution in their use as molecular markers. The 5,452 available oil palm EST sequences were mined from dbEST at NCBI. The CAP3 program was used to assemble the EST sequences into contigs. Candidate SNP and indel polymorphisms were detected with the Perl script auto_snip version 1.0, which used 576 ESTs to detect SNP and indel sites. We found 1,180 SNP sites and 137 indel polymorphisms, at a frequency of 1.36 SNPs per 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had the highest frequency, 2.91 SNPs and indels per 100 bp, whereas zygotic embryos had the lowest, 0.15 per 100 bp. We also used the Shannon index to analyze the proportions of the ten possible types of SNP/indel. ESTs from normal apex tissue showed the highest Shannon index (0.60), whereas those from abnormal apex tissue showed the lowest (0.02). This report describes the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries, and confirms that SNPs occur in oil palm at a frequency that supports their use as markers for genetic studies.
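The two summary statistics in the description, polymorphic sites per 100 bp and the Shannon index over SNP/indel types, can be sketched in a few lines of Python. This is a minimal illustration, not the study's own code: the function names, the example type labels, and all counts are hypothetical. The description does not state which logarithm base was used, so the natural logarithm is an assumption here.

```python
import math


def per_100_bp(n_sites, n_bases):
    """Polymorphic sites per 100 bp of sequence examined."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return 100.0 * n_sites / n_bases


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over types with nonzero counts.

    Assumption: natural logarithm; the source does not state the base.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical counts per polymorphism type for one EST library.
# These labels and numbers are illustrative only, not data from the study.
example_counts = {"A/G": 40, "C/T": 35, "A/C": 10, "G/T": 8, "indel": 7}

if __name__ == "__main__":
    print(round(per_100_bp(25, 2000), 2))  # 25 sites in 2,000 bp
    print(round(shannon_index(example_counts.values()), 3))
```

A library dominated by a single polymorphism type gives an index near 0, while a library whose types are evenly represented gives a higher value, which is how the index separates libraries in the comparison described above.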

Created:
2008-06-16