Assembly and annotation for great gerbil
DataCite Commons | Updated 2020-08-28 | Indexed 2024-07-28
Download link:
https://figshare.com/articles/Assembly_and_annotation_for_great_gerbil/7098347/1
Description:
Approximately 86x coverage of Illumina paired-end reads was assembled with ALLPATHS-LG, resulting in a highly contiguous scaffold assembly.
An iterative automatic annotation with MAKER2, using a mouse cDNA set (Ensembl) and proteins from UniProt/SwissProt, resulted in 70,974 predicted gene models. InterProScan was run on the predicted proteins, and gene names were assigned based on matches to proteins in UniProt/SwissProt. We filtered this set on Annotation Edit Distance (AED) using the default filtering parameter, creating a file set containing all genes with AED less than 1 (AED<1). This resulted in 22,393 gene models.
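The AED filter described above can be illustrated with a minimal sketch. It assumes the annotation is available as GFF3 text in which each mRNA feature carries an `_AED` key in its attribute column, which is how MAKER output is commonly laid out; the function names and the example records are hypothetical and are not taken from the dataset.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def transcripts_passing_aed(gff_lines, max_aed=1.0):
    """Return IDs of mRNA features whose AED is strictly below max_aed.

    Assumption: the AED is stored under the attribute key "_AED".
    """
    kept = []
    for line in gff_lines:
        # Skip blank lines and GFF3 directives/comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Only well-formed mRNA rows are considered.
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        if "_AED" not in attrs:
            continue
        if float(attrs["_AED"]) < max_aed:
            kept.append(attrs.get("ID", ""))
    return kept


# Hypothetical records for illustration only.
example = [
    "##gff-version 3",
    "scaffold_1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=g1;_AED=0.12",
    "scaffold_1\tmaker\tmRNA\t1500\t2400\t.\t-\t.\tID=tx2;Parent=g2;_AED=1.00",
]
print(transcripts_passing_aed(example))
```

Because the comparison is strict, a model with AED exactly 1 (no supporting evidence) is dropped, which matches the AED<1 criterion stated in the description.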
Provider:
figshare
Created:
2020-01-30



