
Chrysiptera cyanea genome annotation, transcriptome, proteome, and script for manuscript figure production

DataCite Commons | Updated 2024-12-25 | Indexed 2024-11-05
Download link:
https://figshare.com/articles/dataset/Chrysiptera_cyanea_genome_annotation_transcriptome_proteome/27143571/2
Description:
Chrysiptera cyanea is among the most common coral reef fish in shallow coastal zones around the main island of Okinawa. Our team is working to develop this fish as a model organism for environmental monitoring. Generating a genome assembly allowed us to investigate biomolecular aspects (notably using RNA-seq) of the impact of environmental factors on Chrysiptera cyanea.

On NCBI:
- Raw genomic reads can be found under BioProject PRJNA1167451: m64150e_221129_182410.hifi_reads.fastq.gz and m64150e_221201_052041.hifi_reads.fastq.gz
- The BioSample is SAMN44005798: Sesoko_Male1_2022_Liver_for_DNA
- The assembly presented in the manuscript is C.cyanea_contigs.fasta

Genome annotation was performed notably using organ-specific transcriptomic reads from the same individual (Sesoko Male 1):
- Eye1_S45_R1_001.fastq.gz and Eye1_S45_R2_001.fastq.gz (Accession number: SRR30849507)
- Gill1_S47_R1_001.fastq.gz and Gill1_S47_R2_001.fastq.gz (Accession number: SRR30849506)
- Liver1_S44_R1_001.fastq.gz and Liver1_S44_R2_001.fastq.gz (Accession number: SRR30849505)
- Muscle1_S46_R1_001.fastq.gz and Muscle1_S46_R2_001.fastq.gz (Accession number: SRR30849504)
- Sesoko-B1_S13_R1_001.fastq.gz and Sesoko-B1_S13_R2_001.fastq.gz (Accession number: SRR30849503). This brain sample comes from a different fish (Sesoko Male B1), collected at the same site on the same day, and was also used for the annotation.

On FigShare:
- C.cyanea_genomeannotation.gff3: genome annotation
- C.cyanea_Final_Codingseq.fasta: coding sequence
- C.cyanea_proteome.fasta: proteome
- C.cyanea_blastannotation.csv: functional annotation of the coding sequence
- All BUSCO files for assemblies performed with various options of Flye and the Improved Phased Assembler (short summary, full table, and missing BUSCO list for 9 different assembly options). The assembly used for the final genome is the Improved Phased Assembler, no phase, parental alleles (nophase_p).
- Delta files obtained from nucmer by aligning the genome assembly to reference Pomacentrid genomes (Amphiprion clarkii, A. ocellaris, A. percula, Acanthochromis polyacanthus, Dascyllus trimaculatus), as well as the number of bases aligning between the C. cyanea contigs and the reference genomes (n_bases_Ccy_to_X_nucmer.csv), obtained by filtering to keep only alignments longer than 10,000 bases
- The R (v4.3.3) script that uses the aforementioned files to produce Figures 2 and 3 of the manuscript: dotplot_and_nb_bases_calculation_Rscript.txt
Provider:
figshare
Created:
2024-11-04