Genome assembly and annotation of the terrestrial microalga Coccomyxa elongata SAG 216-3b. Coccomyxa elongata reference genome
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB79308
Resource description:
Unicellular green algae of the genus Coccomyxa are recognized for their worldwide distribution and ecological versatility. Coccomyxa elongata is a freshwater species of the Coccomyxa simplex clade, which also includes lichen symbionts. To facilitate future molecular and phylogenomic studies of this versatile clade of algae, we generated a high-quality genome assembly for Coccomyxa elongata Chodat & Jaag SAG 216-3b. Combining long-read PacBio HiFi and Oxford Nanopore Technologies sequencing with chromatin conformation capture (Hi-C) sequencing yielded an assembly of 21 scaffolds with a total length of 51.4 Mb and an N50 of 2.8 Mb. Nineteen of the scaffolds represent full-length nuclear chromosomes delimited by telomeric repeats, while the remaining two represent the mitochondrial and plastid genomes. Transcriptome-guided gene annotation identified 14,811 protein-coding genes, of which 61% have annotated PFAM domains and 841 are predicted to be secreted. BUSCO analysis identified 1,494 (98.4%) complete gene models, suggesting a highly complete genome annotation.
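The description reports an N50 of 2.8 Mb for the 21 scaffolds. For readers unfamiliar with the metric, the following is a minimal sketch of how N50 is commonly computed from a list of scaffold lengths: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions; they are not the authors' pipeline or the real scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Scaffolds are visited from longest to shortest; the N50 is the
    length of the scaffold at which the running total first reaches
    at least half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical, not taken from PRJEB79308).
toy_lengths = [10, 8, 6, 4, 2]
result = n50(toy_lengths)
```

To apply this to the deposited assembly, the scaffold lengths would first have to be read from the downloaded FASTA file, a step this sketch leaves out.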
Created:
2024-08-26



