Data from: SNPdryad: predicting deleterious non-synonymous human SNPs using only orthologous protein sequences

DataONE, updated 2014-01-31, indexed 2024-06-27

Download link:

https://search.dataone.org/view/null


Description:

Recent advances in genome sequencing have revealed an abundance of non-synonymous polymorphisms among human individuals; consequently, it is of great interest and importance to predict whether such substitutions are functionally neutral or have deleterious effects. The accuracy of such prediction algorithms depends on the quality of the multiple-sequence alignment, which is used to infer how well an amino acid substitution is tolerated at a given position. Because orthologous protein sequences were scarce in the past, existing prediction algorithms all include sequences of protein paralogs in the alignment, which can dilute the conservation signal and affect prediction accuracy. However, we believe that, with the sequencing of a large number of mammalian genomes, it is now feasible to include only protein orthologs in the alignment and thereby improve prediction performance.


Created:

2014-01-31
