
Gene annotation of Blastobotrys mokoenaii, Blastobotrys illinoisensis, and Blastobotrys malaysiensis

Source: Figshare. Updated 2025-03-21; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Gene_annotation_of_i_Blastobotrys_mokoenaii_i_i_Blastobotrys_illinoisensis_i_and_i_Blastobotrys_malaysiensis_i_/28606814
Description:
This dataset contains the gene annotation data for three species of Blastobotrys yeasts: B. mokoenaii, B. illinoisensis, and B. malaysiensis.

The genome assemblies for B. mokoenaii (NRRL Y-27120) and B. malaysiensis (NRRL Y-6417) were publicly available on the National Center for Biotechnology Information (NCBI) under accessions GCA_003705765.3 and GCA_030558815.1, respectively.

The genome assembly for B. illinoisensis (NRRL YB-1343) was generated by SciLifeLab's National Genomics Infrastructure (NGI) using PacBio long-read data and deposited in the European Nucleotide Archive (ENA) under accession GCA_965113335.1.

File description

bmokoenaii_annotation.gff
This file contains the gene models predicted for B. mokoenaii (GCA_003705765.3).

billinoisensis_annotation.gff
This file contains the gene models predicted for B. illinoisensis (GCA_965113335.1).

bmalaysiensis_annotation.gff
This file contains the gene models predicted for B. malaysiensis (GCA_030558815.1).

Gene annotation methods

Repeat Masking

Prior to annotation, a repeat library was built for each species using RepeatModeler2 v2.0.2, and the genomes were soft-masked using RepeatMasker v4.1.5.

$ RepeatModeler -database ${DB} -engine ncbi -pa 16
$ RepeatMasker -dir . -gff -u -no_is -xsmall -e ncbi -lib ${LIBRARY} -pa 16 genome.fasta

Structural Annotation

Structural annotation was performed on the soft-masked genomes using BRAKER3 v3.0.3, incorporating external evidence in the form of all fungal proteins from OrthoDB v11 (available at https://bioinf.uni-greifswald.de/bioinf/partitioned_odb11).

$ braker.pl --genome="$genome" \
    --prot_seq=${protein} --workingdir=${PWD} \
    --gff3 --threads=16 --verbosity=3 \
    --nocleanup --species=${i}

Functional Annotation

The predicted genes were functionally annotated using the National Bioinformatics Infrastructure Sweden (NBIS) functional_annotation Nextflow pipeline v2.0.0 (https://github.com/NBISweden/pipelines-nextflow). Briefly, this pipeline runs similarity searches between the annotated proteins and the UniProtKB/Swiss-Prot database (downloaded 2023-12) using the Basic Local Alignment Search Tool (BLAST). It then uses InterProScan to query the proteins against the InterPro v59-91 databases and merges the results using AGAT v1.2.0.

tRNAs and rRNAs

Transfer RNA (tRNA) and ribosomal RNA (rRNA) genes were annotated using tRNAscan-SE v2.0.12 and barrnap v0.9, respectively. Other ncRNAs, such as SRP RNA, RNase P RNA, and spliceosomal ncRNAs, were not predicted.

$ tRNAscan-SE -E --gff ${output}_trnas.gff --thread 16 ${genome}.fasta
$ barrnap --kingdom euk --threads 6 ${genome}.fasta > ${output}_rrna.gff

Annotation integration

Finally, the functionally annotated protein-coding genes, tRNAs, and rRNAs were combined into a single GFF file using AGAT v1.2.0.

$ agat_sp_complement_annotations.pl --ref ${protein_coding} --add ${trna} --add ${rrna} --out full_annotation.gff
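
Because the final GFF files combine protein-coding genes, tRNAs, and rRNAs, one quick sanity check after download could be to tally the features by type. The following is a minimal sketch, not part of the dataset or its pipeline, and it assumes the files follow the standard nine-column, tab-separated GFF3 layout. The function name count_gff_features and the four demo records (including their source and ID values) are hypothetical and exist only to make the example self-contained.

```shell
#!/bin/sh
set -eu

# Count GFF3 features by type (column 3).
# Comment/directive lines (starting with '#') and malformed rows are skipped.
count_gff_features() {
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ }
                END { for (t in n) print t "\t" n[t] }' "$1" | LC_ALL=C sort
}

# Hypothetical miniature GFF3 file, used only to demonstrate the function.
demo_gff="$(mktemp)"
trap 'rm -f "$demo_gff"' EXIT

{
    printf '##gff-version 3\n'
    printf 'contig1\tdemo\tgene\t100\t900\t.\t+\t.\tID=g1\n'
    printf 'contig1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n'
    printf 'contig1\tdemo\ttRNA\t1500\t1572\t.\t-\t.\tID=trna1\n'
    printf 'contig2\tdemo\trRNA\t10\t1800\t.\t+\t.\tID=rrna1\n'
} > "$demo_gff"

# Print one "type<TAB>count" line per feature type.
count_gff_features "$demo_gff"
```

To run the same tally on a downloaded file, call the function on it directly, for example count_gff_features bmokoenaii_annotation.gff. The feature types actually present depend on how AGAT wrote the merged file, so the counts are best treated as a rough check, not a validation.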
Created:
2025-03-21