
Table_2_Population-Scale Polymorphic Short Tandem Repeat Provides an Alternative Strategy for Allele Mining in Cotton.XLSX

Indexed from NIAID Data Ecosystem on 2026-03-13
Download link:
https://figshare.com/articles/dataset/Table_2_Population-Scale_Polymorphic_Short_Tandem_Repeat_Provides_an_Alternative_Strategy_for_Allele_Mining_in_Cotton_XLSX/19721017
Description:
Short tandem repeats (STRs), which vary in size because they contain variable numbers of repeat units, are present throughout most eukaryotic genomes. To date, few population-scale studies identifying STRs have been reported for crops. Here, we constructed a high-density polymorphic STR map by investigating polymorphic STRs from 911 Gossypium hirsutum accessions. In total, we identified 556,426 polymorphic STRs with an average length of 21.1 bp, of which 69.08% were biallelic. Moreover, 7,718 (1.39%) were identified in the exons of 6,021 genes, which were significantly enriched in transcription, ribosome biogenesis, and signal transduction. Only 5.88% of those exonic STRs altered open reading frames, of which 97.16% were trinucleotide. An alternative strategy, STR-GWAS analysis, revealed that 824 STRs were significantly associated with agronomic traits, including 491 novel alleles that were undetectable by previous SNP-GWAS methods. For instance, a novel polymorphic STR consisting of GAACCA repeats was identified in GH_D06G1697, with its (GAACCA)5 allele increasing fiber length by 1.96–4.83% relative to the (GAACCA)4 allele. The database CottonSTRDB was further developed to facilitate the use of STR datasets in breeding programs. Our study demonstrates functional roles for STRs in influencing complex traits, provides an alternative strategy, STR-GWAS, for allele mining, and offers a database that serves the cotton community as a valuable resource.
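As a small illustration of what an STR allele difference means at the sequence level, the sketch below counts consecutive copies of the GAACCA motif in two toy alleles and compares their lengths. It is not the pipeline used in the study. The flanking sequences and the helper name `longest_repeat_run` are hypothetical, invented only for this example; only the GAACCA motif and the 4-copy and 5-copy alleles come from the description above.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical flanking sequence, invented for illustration only.
LEFT, RIGHT = "ATGCCT", "TTGACG"
allele_4 = LEFT + "GAACCA" * 4 + RIGHT  # stands in for the (GAACCA)4 allele
allele_5 = LEFT + "GAACCA" * 5 + RIGHT  # stands in for the (GAACCA)5 allele

units_4 = longest_repeat_run(allele_4, "GAACCA")
units_5 = longest_repeat_run(allele_5, "GAACCA")

# One extra hexanucleotide unit changes the allele length by 6 bp.
length_difference = len(allele_5) - len(allele_4)

# A length change divisible by 3 does not shift the codon phase downstream.
preserves_frame = length_difference % 3 == 0

print(units_4, units_5, length_difference, preserves_frame)
```

The divisibility check only says that the length change is a whole number of codons. Whether a particular STR lies in a coding region, and what effect it has on the protein, has to come from gene annotation.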

Created:
2022-05-06