Supporting Data for: The genome of the pygmy right whale illuminates the evolution of rorquals
Source: DataONE. Updated: 2023-09-26. Indexed: 2025-08-02.
Download link:
https://search.dataone.org/view/sha256:fd215a27d050c5c4c99ce4b2e6114ddc8e1a6fad9753cf5c4d020bd48905f87c
Resource description:
Background
Baleen whales are a clade of gigantic and highly specialized marine mammals. Their genomes have been used to investigate their complex evolutionary history and to decipher the molecular mechanisms that allowed them to reach these dimensions. However, many unanswered questions remain, especially about the early radiation of rorquals and how cancer resistance interplays with their huge number of cells. The pygmy right whale is the smallest and most elusive among the baleen whales. It reaches only a fraction of the body length compared to its relatives and it is the only living member of an otherwise extinct family. This placement makes the pygmy right whale genome an interesting target to update the complex phylogenetic past of baleen whales, because it splits up an otherwise long branch that leads to the radiation of rorquals. Apart from that, genomic data of this species might help to investigate cancer resistance in large whales, since these mechanisms are not as important ...,
Author for correspondence: Magnus Wolf (Magnus.Wolf@senckenberg.de)
The data deposited here are the result of a whole-genome sequencing project of the pygmy right whale (Caperea marginata, Gray 1846). Apart from the genome construction, this project includes a phylogenomic revision of the rorqual clade and a positive selection analysis to find genes related to body size and hence cancer resistance in baleen whales. This deposition is composed of:
Code to create phylogenomic trees:
1.) A zip file including the main script written in UNIX bash as well as the necessary subscripts and an extensive README file containing necessary instructions. (filename: GEMOMA-to-Phylogeny.zip)
Genome data (Cmar):
1.) A raw whole genome assembly without changes made by NCBI in fasta format. (filename: Cmar_C18_SBIK-F_TBG_v1.fasta.gz)
2.) A homology-based genome annotation of the newly constructed genome, including a gff table and an amino acid fasta file. (filename gff: Cmar_C18_SBIK-F_TBG_v1_annotation.gff....,
General Usage:
Many files containing sequence data are compressed with gzip. Use "gunzip" to decompress them. Directories containing many sub-files are packed into a tarball. Use "tar -xzvf" to unpack the directory first.
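For users who prefer to unpack the files from a script rather than on the command line, the two steps above can be sketched in Python with the standard library only. This is a minimal sketch, not part of the deposited code: the function names gunzip_file and extract_tarball are hypothetical, and the tarball is assumed to be gzip-compressed (as the "z" flag suggests) and to come from a trusted source.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def gunzip_file(src, dest=None):
    """Decompress a .gz file (keeping the original) and return the output path."""
    src = Path(src)
    # Default output name: the input name without its trailing ".gz".
    dest = Path(dest) if dest else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream in chunks; genomes are large
    return dest


def extract_tarball(src, out_dir="."):
    """Unpack a gzip-compressed tarball into out_dir and return that directory."""
    # Assumption: the archive is trusted; extractall writes whatever it contains.
    with tarfile.open(src, "r:gz") as tar:
        tar.extractall(out_dir)
    return Path(out_dir)
```

For example, gunzip_file("Cmar_C18_SBIK-F_TBG_v1.fasta.gz") would write Cmar_C18_SBIK-F_TBG_v1.fasta next to the compressed file.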
Usage Annotation Data:
The assembly as well as the cds and amino acid sequences are in standard fasta format and can be viewed in any text editor. The gene IDs in all these files are named after the best hit within one of the reference annotations used for the homology-based annotation.
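Because the assembly is large, a text editor may be impractical for anything beyond a quick look. The following is a minimal sketch of a fasta reader in plain Python that works on both compressed and uncompressed files. The function name read_fasta is hypothetical, and the headers are returned unparsed, since the exact layout of the gene IDs is not assumed here.

```python
import gzip
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) tuples from a plain or gzip-compressed fasta file."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts; emit the previous one first.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)
```

A typical use would be to tabulate sequence lengths, for instance {name: len(seq) for name, seq in read_fasta("Cmar_C18_SBIK-F_TBG_v1.fasta.gz")}.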
Usage Phylogenomics Data:
All alignments, including the WGA, WGA fragments and SCOSs, are in fasta alignment format and can again be opened with any text editor. To better judge their quality, however, we recommend alignment viewing software such as AliView (http://genocat.tools/tools/aliview.html). SCOS raw sequences are in regular fasta format and can be opened with any text editor. Within WGA sequences, headers represent a short 6-character species identifier made from...,
Created:
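As a scripted complement to visual inspection, a quick sanity check of an alignment file might look like the sketch below. It is an illustration only, not part of the deposited pipeline: the names read_alignment and alignment_summary are hypothetical, gaps are assumed to be written as "-", headers are treated as opaque labels, and the file is assumed to be uncompressed with unique headers.

```python
def read_alignment(path):
    """Read a fasta alignment into a dict mapping header to aligned sequence."""
    parts = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                parts[name] = []  # assumption: headers are unique
            elif line and name is not None:
                parts[name].append(line)
    return {n: "".join(p) for n, p in parts.items()}


def alignment_summary(records):
    """Return (number of columns, gap fraction per sequence).

    Raises ValueError if the sequences differ in length, which means the
    file is not a valid alignment.
    """
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    n_cols = lengths.pop() if lengths else 0
    gaps = {
        name: (seq.count("-") / n_cols if n_cols else 0.0)
        for name, seq in records.items()
    }
    return n_cols, gaps
```

Sequences with a very high gap fraction are natural candidates for a closer look in an alignment viewer.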
2025-07-21



