Sequences and annotations of a provisional genome draft of a Senegalese sole female (Sosen1) and a male (Sse05_10M)
Source: DataCite Commons. Updated: 2022-02-02. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/dataset/Sequences_and_annotations_of_a_provisional_genome_draft_of_a_Senegalese_sole_female/12472100/3
Resource description:
Information as of 2018 on a female Senegalese sole genome (Sosen1) after Nanopore sequencing.

1) Sosen1_genome_draft.zip. Unzip the archive to find:
• Sosen1_genome_scaffolds.fasta, containing every contig and scaffold identifier and sequence in FASTA format.
• Sosen1_genome_annotation.gff3, a provisional annotation of the genome contigs and scaffolds in Sosen1_genome_scaffolds.fasta, produced with MAKER2 and the transcript sequences in SOLSEv5.0.
• Sosen1_maker.transcripts.fasta, containing the transcripts deduced from the gff3 annotation file.
• Sosen1_maker.proteins.fasta, containing the deduced amino acid sequence for every transcript in Sosen1_maker.transcripts.fasta.
• Sosen1_maker.proteins_annotation.tsv, containing a complete annotation of the transcripts and proteins above, performed with our software Full-LengtherNext. It includes transcript and protein lengths, best UniProtKB orthologue with identity % and E-value, structural status, open reading frame location in the transcript, description, GOs, KEGG codes, InterPro IDs, Pfam, EC and Unipathway, as tab-separated values (tsv format).

The Sosen1 (or SENf1A) female genome was reannotated in 2020.

2) Sosen1_female_reannotation_2020.zip. Once unzipped, it provides the following files:
• SENf1A.gff3.gz --> gff3 file with the protein-coding annotation
• SSENf1A.stats.txt.gz --> statistics of the protein-coding annotation
• SSENf1A.transcripts.fa.gz --> multifasta file with the protein-coding annotated transcripts
• SSENf1A.pep.fa.gz --> amino acid sequence of the annotated proteins
• SSENf1A.cds.fa.gz --> nucleotide (coding) sequence of the annotated proteins
• SSENf1A.longestpeptide.fa.gz --> amino acid sequence of the longest protein annotated for each gene
• SSENf1ncA.gff3.gz --> gff3 file with the non-coding annotation
• SSENf1ncA.transcripts.fa.gz --> multifasta file with the non-coding transcripts

Information as of 2020 on a male Senegalese sole genome, Sse05_10M (or Sosen2 or SSENm1B), after hybrid sequencing and assembly.

3) Sosen2_male_genome_scaffolds.fasta contains the genome scaffolds.

4) Sosen2_annotations.zip contains the male genome integrated with genetic markers to provide linkage groups as chromosome surrogates, as well as gene annotations, in the following files:
• Male_LA_Total.fasta.gz --> male genome assembly
• SSENm1B.gff3.fz --> gff3 file with the protein-coding annotation
• SSENm1B.stats.txt.gz --> statistics of the protein-coding annotation
• SSENm1B.transcripts.fa.gz --> multifasta file with the protein-coding annotated transcripts
• SSENm1B.pep.fa.gz --> amino acid sequence of the annotated proteins
• SSENm1B.cds.fa.gz --> nucleotide (coding) sequence of the annotated proteins
• SSENm1B.longestpeptide.fa.gz --> amino acid sequence of the longest protein annotated for each gene
• SSENm1ncB.gff3.gz --> gff3 file with the non-coding annotation
• SSENm1ncB.transcripts.fa.gz --> multifasta file with the non-coding transcripts
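
The FASTA and gff3 files listed in this description can be inspected with a few lines of standard-library Python. The following is a minimal sketch, not part of the dataset: the helper names (open_text, read_fasta, count_gff3_features) are hypothetical, and it assumes only that the files follow the usual FASTA and nine-column tab-separated GFF3 layouts, optionally gzip-compressed.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed (.gz) text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA text stream.

    The identifier is the first whitespace-delimited token after '>'.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def count_gff3_features(handle):
    """Count GFF3 records by feature type (third column).

    Comment lines are skipped, and parsing stops at an embedded
    '##FASTA' section, which some annotation tools append.
    """
    counts = Counter()
    for line in handle:
        if line.startswith("##FASTA"):
            break
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:
            counts[fields[2]] += 1
    return counts
```

For example, passing the handle returned by open_text("SSENf1A.pep.fa.gz") to read_fasta would iterate over the annotated proteins, and passing a gff3 handle to count_gff3_features would summarise how many gene, mRNA, exon and similar records an annotation contains. The feature types actually present depend on the files themselves.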
Provider:
figshare
Created:
2021-07-05



