Pseudomolecules and annotation of the third version of the reference genome sequence assembly of barley cv. Morex [Morex V3]

Source: doi.ipk-gatersleben.de:443 (indexed 2025-01-21)
Download link:
https://doi.ipk-gatersleben.de:443/DOI/b2f47dfb-47ff-4114-89ae-bad8dcc515a1/7eb2707b-d447-425c-be7a-fe3f1fae67cb/2
Resource description:
DNA sequence file in FASTA format for the chromosomal pseudomolecules of barley (Hordeum vulgare) cv. Morex. This is the third release (Morex V3) of the Morex genome sequence assembly. The primary contig assembly from PacBio HiFi reads was performed with HiCanu [doi:10.1101/gr.263566.120]. Contigs were scaffolded with Bionano data and arranged into chromosomal pseudomolecules with Hi-C data using the TRITEX pipeline [doi:10.1186/s13059-019-1899-5]. An AGP file specifying the placement of sequence scaffolds in the pseudomolecules is provided. The folder 'gene_annotation' holds the structural gene annotation of the Morex V3 assembly: gene models in GFF3 format, their functional descriptions, and the coding and protein sequences of high- and low-confidence genes. The folder 'repeat annotation' contains GFF files specifying the positions of transposable elements and tandem repeats. A table with the approximate centromere positions is in the folder 'centromere_positions'. Annotation and data management were supported by the de.NBI grant (www.denbi.de) of the German Federal Ministry of Education and Research (031A536).
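The description mentions an AGP file that maps scaffolds onto the pseudomolecules. As a minimal sketch of how such a file could be read, the Python below parses the standard nine tab-separated AGP columns and sums the placed (non-gap) bases per pseudomolecule. It assumes the file follows the generic AGP layout; the names `AgpRecord`, `parse_agp`, and `placed_bases_per_object` are hypothetical helpers, and the sample rows (including `chr1H`, `scaffold_1`, `scaffold_2`) are invented for illustration, not taken from the Morex V3 data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional

# In the AGP format, component types "N" and "U" denote gaps;
# other types (e.g. "W") denote placed sequence components.
GAP_TYPES = {"N", "U"}


@dataclass
class AgpRecord:
    object_id: str              # pseudomolecule (column 1)
    object_beg: int             # 1-based start on the object (column 2)
    object_end: int             # 1-based inclusive end on the object (column 3)
    part_number: int            # running part number (column 4)
    component_type: str         # e.g. "W", "N", "U" (column 5)
    component_id: Optional[str]  # scaffold/contig name, None for gaps
    gap_length: Optional[int]    # gap size, None for sequence components
    orientation: Optional[str]   # "+", "-", ... ; None for gaps

    @property
    def is_gap(self) -> bool:
        return self.component_type in GAP_TYPES


def parse_agp(lines: Iterable[str]) -> Iterator[AgpRecord]:
    """Yield one AgpRecord per data line, skipping comments and blank lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            raise ValueError(f"expected at least 8 columns, got {len(fields)}: {line!r}")
        obj, beg, end, part, ctype = fields[0], int(fields[1]), int(fields[2]), int(fields[3]), fields[4]
        if ctype in GAP_TYPES:
            # Gap line: column 6 holds the gap length.
            yield AgpRecord(obj, beg, end, part, ctype, None, int(fields[5]), None)
        else:
            # Component line: column 6 is the component ID, column 9 the orientation.
            orientation = fields[8] if len(fields) > 8 else None
            yield AgpRecord(obj, beg, end, part, ctype, fields[5], None, orientation)


def placed_bases_per_object(records: Iterable[AgpRecord]) -> Dict[str, int]:
    """Sum the object-coordinate span of all non-gap components per pseudomolecule."""
    totals: Dict[str, int] = defaultdict(int)
    for rec in records:
        if not rec.is_gap:
            totals[rec.object_id] += rec.object_end - rec.object_beg + 1
    return dict(totals)


# Hypothetical example rows, NOT taken from the Morex V3 AGP file.
SAMPLE_AGP = "\n".join([
    "# illustrative AGP rows",
    "\t".join(["chr1H", "1", "1000", "1", "W", "scaffold_1", "1", "1000", "+"]),
    "\t".join(["chr1H", "1001", "1100", "2", "N", "100", "scaffold", "yes", "map"]),
    "\t".join(["chr1H", "1101", "1600", "3", "W", "scaffold_2", "1", "500", "-"]),
])

if __name__ == "__main__":
    print(placed_bases_per_object(parse_agp(SAMPLE_AGP.splitlines())))
```

To apply the sketch to the real file, pass an open file handle to `parse_agp` instead of the sample rows; the actual file name inside the download is not stated here, so it is left to the reader.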

Provider:
doi.ipk-gatersleben.de:443
Dataset introduction
Background overview
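This dataset provides the pseudomolecules and detailed annotation of the third version of the reference genome sequence assembly of barley cv. Morex, including gene models, transposable elements, and centromere positions. The data are in FASTA and GFF3 formats, with a total size of 4.9 GB.
The summary above was collected and generated by 遇见数据集.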