Data from: De novo genome assembly of Camptotheca acuminata, a natural source of the anti-cancer compound camptothecin
Source: DataCite Commons | Updated: 2025-05-01 | Indexed: 2025-05-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.nc8qr
Description:
Camptotheca acuminata is one of a limited number of species that produce
camptothecin, a pentacyclic quinoline alkaloid with anti-cancer activity
due to its ability to inhibit DNA topoisomerase. While transcriptome
studies have been performed previously with various camptothecin-producing
species, no genome sequence for a camptothecin-producing species is
available to date. We generated a high-quality de novo genome assembly for
C. acuminata representing 403 174 860 bp on 1394 scaffolds with an N50
scaffold size of 1752 kbp. Quality assessments of the assembly revealed
robust representation of the genome sequence including genic regions.
Using a novel genome annotation method, we annotated 31 825 genes encoding
40 332 gene models. Based on sequence identity and orthology with
validated genes from Catharanthus roseus as well as Pfam searches, we
identified candidate orthologs for genes potentially involved in
camptothecin biosynthesis. Gene duplication, including tandem
duplication, was widespread in the C. acuminata genome, with 2571 genes
belonging to 997 tandem duplicated gene clusters. To our knowledge, this
is the first genome sequence for a camptothecin-producing species, and
access to the C. acuminata genome will permit not only discovery of genes
encoding the camptothecin biosynthetic pathway but also reagents that can
be used for heterologous expression of camptothecin and camptothecin
analogs with novel pharmaceutical applications.
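
The N50 scaffold size quoted above is a standard contiguity metric: the length L such that scaffolds of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Assumption: N50 is the length of the scaffold at which the running
    sum, taken from longest to shortest, first reaches half of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Toy example with hypothetical scaffold lengths (not the real assembly).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))  # 8
```

Applied to the scaffold lengths of the released assembly, a calculation of this kind should give a value comparable to the reported 1752 kbp, assuming the same definition of N50 was used.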
Provider:
Dryad
Created:
2017-07-18



