Chrysiptera cyanea genome annotation, transcriptome, proteome, and script for manuscript figure production
Source: DataCite Commons. Updated: 2024-12-25. Indexed: 2024-11-05.
Download link:
https://figshare.com/articles/dataset/Chrysiptera_cyanea_genome_annotation_transcriptome_proteome/27143571/2
Description:
Chrysiptera cyanea is among the most common coral reef fish in shallow coastal zones around the main island of Okinawa. Our team is working to develop this fish as a model organism for environmental monitoring. Generating a genome allowed us to investigate biomolecular aspects (notably using RNAseq) of the impact of environmental factors on Chrysiptera cyanea.

On NCBI:
- Raw genomic reads can be found under BioProject PRJNA1167451: m64150e_221129_182410.hifi_reads.fastq.gz and m64150e_221201_052041.hifi_reads.fastq.gz
- The BioSample is SAMN44005798: Sesoko_Male1_2022_Liver_for_DNA.
- The assembly presented in the manuscript is C.cyanea_contigs.fasta.

Genome annotation was notably performed using organ-specific transcriptomic reads from the same individual (Sesoko Male 1):
- Eye1_S45_R1_001.fastq.gz and Eye1_S45_R2_001.fastq.gz (Accession number: SRR30849507)
- Gill1_S47_R1_001.fastq.gz and Gill1_S47_R2_001.fastq.gz (Accession number: SRR30849506)
- Liver1_S44_R1_001.fastq.gz and Liver1_S44_R2_001.fastq.gz (Accession number: SRR30849505)
- Muscle1_S46_R1_001.fastq.gz and Muscle1_S46_R2_001.fastq.gz (Accession number: SRR30849504)
- Sesoko-B1_S13_R1_001.fastq.gz and Sesoko-B1_S13_R2_001.fastq.gz (Accession number: SRR30849503). This brain sample comes from a different fish (Sesoko Male B1), collected at the same site on the same day, and was also used for the annotation.

On FigShare:
- C.cyanea_genomeannotation.gff3: genome annotation
- C.cyanea_Final_Codingseq.fasta: coding sequence
- C.cyanea_proteome.fasta: proteome
- C.cyanea_blastannotation.csv: functional annotation of the coding sequence
- All BUSCO files for assemblies performed with various options of Flye and the Improved Phased Assembler (short summary, full table, and missing BUSCO list for 9 different assembly options). The assembly used for the final genome is the one from the Improved Phased Assembler, no phase, parental alleles (nophase_p).
- Delta files obtained from nucmer by aligning the genome assembly to reference Pomacentrid genomes (Amphiprion clarkii, A. ocellaris, A. percula, Acanthochromis polyacanthus, Dascyllus trimaculatus), as well as the number of bases aligning between the C. cyanea contigs and the reference genomes (n_bases_Ccy_to_X_nucmer.csv), computed after filtering to keep only alignments longer than 10,000 bases
- The R (v4.3.3) script that uses the aforementioned files to produce Figures 2 and 3 of the manuscript: dotplot_and_nb_bases_calculation_Rscript.txt
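The length filter described for the nucmer output can be illustrated with a short sketch. The authors' own calculation is in the R script listed above; the Python below is not that script, only a minimal illustration of the idea. It assumes the standard MUMmer delta layout (alignment header lines with seven integers, starting with reference start and end), and it assumes that alignment length is measured on the reference side, which the description does not specify. The function name `aligned_bases` and the toy records are hypothetical.

```python
from typing import Iterable


def aligned_bases(delta_lines: Iterable[str], min_len: int = 10_000) -> int:
    """Sum reference-side lengths of alignments strictly longer than min_len.

    Assumption: lengths are measured on the reference coordinates.
    Overlapping alignments are each counted in full.
    """
    total = 0
    for line in delta_lines:
        fields = line.split()
        # Alignment header lines carry exactly seven integers:
        # ref_start ref_end qry_start qry_end errors sim_errors stop_codons
        if len(fields) != 7:
            continue
        try:
            values = [int(f) for f in fields]
        except ValueError:
            continue  # not an alignment header (e.g. a path line)
        ref_start, ref_end = values[0], values[1]
        length = abs(ref_end - ref_start) + 1
        if length > min_len:
            total += length
    return total


# Toy delta content (hypothetical paths, sequence names, and coordinates).
example = """/path/ref.fasta /path/qry.fasta
NUCMER
>ref1 ctg1 50000 40000
1 12000 1 12000 5 5 0
0
20000 25000 13000 18000 2 2 0
0
""".splitlines()

print(aligned_bases(example))  # only the 12,000-base alignment passes the filter
```

To apply the same sketch to one of the delta files, pass an open file handle in place of `example`; the result may differ from the values in n_bases_Ccy_to_X_nucmer.csv if the authors measured length differently or handled overlaps.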
Provider:
figshare
Created:
2024-11-04



