
Data from: Aquatic adaptation and depleted diversity: a deep dive into the genomes of the sea otter and giant otter

Mendeley Data. Updated 2024-06-25. Indexed 2024-06-30.
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.f8g6mg8
Description:
Southern sea otter genome annotation
This GFF file is the annotation of the southern sea otter (Enhydra lutris nereis) genome, generated using MAKER2 based on evidence from RNA-Seq of sea otter whole blood and protein data from domestic ferret, domestic cat and domestic dog. Gene models were predicted by AUGUSTUS after three rounds of training. Protein and domain information from the Swiss-Prot/UniProt database (accessed in 2016) and InterProScan 5.19-58 is included in the annotations. See details in SI Methods of Beichman et al. (2019).
File: final_sea_otter_23May2016_bS9RH.round5.AED1.0.blast1e-06.renamed.rmDupScaffs.20171118.gff.gz

Southern sea otter genome annotation protein sequences
This FASTA file contains the sequences of proteins from the southern sea otter (Enhydra lutris nereis) genome, generated using MAKER2 and AUGUSTUS. See details in SI Methods of Beichman et al. (2019).
File: final_sea_otter_23May2016_bS9RH_proteins.round5.AED1.0.blast1e-06.renamed.rmDupScaffs.20171118.fasta.gz

Southern sea otter genome annotation transcript sequences (without UTRs)
This FASTA file contains the sequences of transcripts from the southern sea otter (Enhydra lutris nereis) genome, generated using MAKER2 and AUGUSTUS. UTR regions have been removed. See details in SI Methods of Beichman et al. (2019).
File: sea_otter_23May2016_bS9RH.all.maker.transcripts.round5.AED1.0.renamed.NOUTRS.rmDupScaffs.20171118.fasta.gz

Southern sea otter genotypes
This directory contains the filtered genotypes from southern sea otter (Enhydra lutris nereis) sequencing reads (Project number: PRJNA472597, Acc. SRR8597300) mapped to the domestic ferret reference genome (GCF_000215625.1). The files are in VCF format, and the genome was split into 323 chunks to reduce file sizes. Each of .vcf files 1-224 contains a single scaffold, and each of the remaining .vcf files (225-323) contains groups of 76 smaller scaffolds. The directory ferret_genome_interval_bedfiles.tar.gz contains the BED files that give the scaffold coordinates of each .vcf file (e.g. interval_244.bed gives the coordinates of the scaffolds contained in the VCF file labeled 244: *.raw_variants.244.*.HQsites.Only.rmDotGenotypes.rmBadVars.vcf.gz). The reads were mapped, and genotypes were called and filtered, as described in the README and SI Methods in Beichman et al. (2019).
File: southern_sea_otter.vcfFiles.tar.gz

Northern sea otter genotypes
These VCF files contain filtered genotypes from northern sea otter (Enhydra lutris kenyoni) sequencing reads (SRA: SRX2967283, from Jones et al. (2018)) mapped to the domestic ferret reference genome (GCF_000215625.1). When gVCF files were generated, the genome was split into 323 chunks to parallelize processing and reduce file size. Each of .vcf files 1-224 contains a single scaffold, and each of the remaining .vcf files (225-323) contains groups of 76 smaller scaffolds. The directory ferret_genome_interval_bedfiles.tar.gz contains the BED files that give the scaffold coordinates of each .vcf file (e.g. interval_244.bed gives the coordinates of the scaffolds contained in the VCF file labeled 244: *.raw_variants.244.*.HQsites.Only.rmDotGenotypes.rmBadVars.vcf.gz).
File: northern_sea_otter.vcfFiles.tar.gz

Ferret genome intervals used to split vcf files
The VCF files of each otter were split into intervals to reduce file size. This directory (ferret_genome_interval_bedfiles.tar.gz) contains BED files that give the scaffold coordinates of each interval. For example, interval_244.bed gives the coordinates of the scaffolds contained in the VCF file labeled 244: *.raw_variants.244.*.HQsites.Only.rmDotGenotypes.rmBadVars.vcf.gz.
File: ferret_genome_interval_bedfiles.tar.gz

Giant otter genotypes
These VCF files contain filtered genotypes from giant otter (Pteronura brasiliensis) sequencing reads (Project number: PRJNA399365) mapped to the domestic ferret reference genome (GCF_000215625.1). When gVCF files were generated, the genome was split into 323 chunks to parallelize processing and reduce file size. Each of .vcf files 1-224 contains a single scaffold, and each of the remaining .vcf files (225-323) contains groups of 76 smaller scaffolds. The directory ferret_genome_interval_bedfiles.tar.gz contains the BED files that give the scaffold coordinates of each .vcf file (e.g. interval_244.bed gives the coordinates of the scaffolds contained in the VCF file labeled 244: *.raw_variants.244.*.HQsites.Only.rmDotGenotypes.rmBadVars.vcf.gz).
File: giant_otter.vcfFiles.tar.gz
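
The chunk-to-BED naming convention described for the genotype archives can be sketched in a few lines of Python. This is only an illustration, not part of the dataset or its README: the function names are hypothetical, and the regular expression assumes that every VCF file name follows the single example pattern given in the description (*.raw_variants.244.*). The constants 323 and 224 are taken from the description.

```python
import re

# Assumption: the chunk number always sits between ".raw_variants." and the
# next ".", as in the example "*.raw_variants.244.*...vcf.gz".
CHUNK_RE = re.compile(r"\.raw_variants\.(\d+)\.")

N_CHUNKS = 323      # total number of chunks, per the description
LAST_SINGLE = 224   # chunks 1-224 each hold a single scaffold


def chunk_index(vcf_name: str) -> int:
    """Extract the chunk number (1-323) from a VCF file name."""
    match = CHUNK_RE.search(vcf_name)
    if match is None:
        raise ValueError(f"no chunk number found in {vcf_name!r}")
    idx = int(match.group(1))
    if not 1 <= idx <= N_CHUNKS:
        raise ValueError(f"chunk {idx} is outside 1-{N_CHUNKS}")
    return idx


def bed_name(vcf_name: str) -> str:
    """Name of the BED file that lists the scaffolds in this VCF chunk."""
    return f"interval_{chunk_index(vcf_name)}.bed"


def is_single_scaffold(vcf_name: str) -> bool:
    """True for chunks 1-224, which contain exactly one scaffold."""
    return chunk_index(vcf_name) <= LAST_SINGLE


if __name__ == "__main__":
    # Hypothetical file name; only the ".raw_variants.244." part and the
    # suffix come from the description.
    name = "sample.raw_variants.244.x.HQsites.Only.rmDotGenotypes.rmBadVars.vcf.gz"
    print(bed_name(name), is_single_scaffold(name))
```

The name-based lookup keeps the sketch independent of the archive layout; checking the extracted index against the 1-323 range catches file names that do not belong to the chunked set.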
Created:
2023-06-28