Whole genome assembly and annotation of the King Angelfish (Holacanthus passer) gives insight into the evolution of marine fishes of the Tropical Eastern Pacific
Source: Mendeley Data. Updated: 2024-01-31. Indexed: 2024-06-28.
Download link:
https://datadryad.org/stash/dataset/doi:10.7291/D1X10B
Description:
To annotate our genome, we used the homology-based gene prediction pipeline GeMoMa (v1.6.4). GeMoMa uses protein-coding gene models and intron position conservation from reference genomes to predict possible protein-coding genes in a target genome (Keilwagen et al., 2018). Here, we ran the GeMoMa pipeline using annotations from three fish species: Amphiprion ocellaris, Oreochromis niloticus, and Electrophorus electricus (downloaded from NCBI, see Table S3). These species were selected so that the high-quality reference annotations span close to distant relatives of the target and therefore contribute a variety of genes. In our case, the pipeline performed four main steps: 1) Extractor, or external search, which runs the search algorithm tblastn with CDS parts from the reference genomes as queries; 2) Gene Model Mapper (GeMoMa), which builds gene models from the Extractor results; 3) GeMoMa Annotation Filter (GAF), which filters and combines common gene predictions; and 4) AnnotationFinalizer, which predicts UTRs for annotated coding sequences and generates gene and transcript names (Keilwagen et al., 2018). Additionally, we ran RepeatMasker (open-4.0.6, Smit et al. 2013–2015) with the Teleostei database to identify repetitive elements in the genome and soft-mask the assembly. The RepeatMasker output (RepeatMasker.out) was converted to GFF with the RepeatMasker script `rmOutToGFF3.pl`.
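To illustrate the last conversion step, the following is a minimal Python sketch of how rows of a RepeatMasker `.out` table map onto GFF3 feature lines. It is not the `rmOutToGFF3.pl` script used for this dataset, and its output may differ from that script's. The function name `rm_out_to_gff3`, the feature type `dispersed_repeat`, and the `Name`/`class` attribute layout are assumptions made for illustration only.

```python
def rm_out_to_gff3(lines):
    """Convert RepeatMasker .out rows to simplified GFF3 lines (illustrative only)."""
    gff = ["##gff-version 3"]
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header rows, whose first field is not
        # a numeric Smith-Waterman score.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        score = fields[0]          # Smith-Waterman score
        seqid = fields[4]          # query (assembly) sequence name
        start, end = fields[5], fields[6]  # 1-based coordinates in the query
        # RepeatMasker marks reverse-complement matches with "C".
        strand = "-" if fields[8] == "C" else "+"
        repeat_name, repeat_class = fields[9], fields[10]
        # Assumed attribute layout; the official script may use a different one.
        attributes = f"Name={repeat_name};class={repeat_class}"
        gff.append("\t".join([
            seqid, "RepeatMasker", "dispersed_repeat",
            start, end, score, strand, ".", attributes,
        ]))
    return gff


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python rm2gff.py RepeatMasker.out > repeats.gff3
    with open(sys.argv[1]) as handle:
        print("\n".join(rm_out_to_gff3(handle)))
```

Because the `.out` table already reports 1-based inclusive coordinates, the start and end columns can be copied into GFF3 unchanged; only the strand code needs translating.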
Created:
2024-01-31



