
GVCFs of 3,039 natural isolates of Saccharomyces cerevisiae

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/12571279
Resource description:
This archive contains the GVCF files of 3,039 natural isolates of Saccharomyces cerevisiae, which were used to generate a comprehensive catalog of small genetic variants in the population. These files make it possible to regenerate the variant matrix without starting from the sequencing reads.

Data description

GVCF_3039samples.tar.gz: Archive containing the 3,039 GVCFs with their tabix indexes.
Sace_S288c_reference_FullMatrixID.fasta: Reference file used for the short-read mapping. It is the Saccharomyces cerevisiae S288c R64 genome assembly. Chromosomes are named chromosome1, chromosome2, chromosome3, etc.
Sace_S288c_reference_FullMatrixID.bed: BED file of the reference genome.
ProtocolJointGenotyping.pdf: Protocol to follow for joint genotyping the 3,034 GVCFs, or for adding new isolates to the collection.

Method

Publicly available sequencing reads were gathered for 3,039 isolates of S. cerevisiae, keeping only whole-genome Illumina sequencing with more than 20X sequencing depth. Reads were mapped to the reference genome using bwa-mem2 v2.2.1, and BAM files were sorted with samtools v1.15.1. GVCF files were generated using gatk v4.2.3.0 with the command `gatk HaplotypeCaller -R reference.fasta -I input.bam -O output.g.vcf.gz --emit-ref-confidence GVCF`.
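The per-sample steps in the Method paragraph (mapping, sorting, GVCF calling) can be sketched as a small Python helper that only assembles the command lines, without running them. This is an illustration under stated assumptions, not the authors' actual pipeline: the function name `build_gvcf_commands`, the paired-end FASTQ inputs, the read-group string, the thread count, the output file names, and the `samtools index` step are all hypothetical additions. Only the `gatk HaplotypeCaller` arguments are taken from the description above.

```python
import shlex


def build_gvcf_commands(sample, reads_1, reads_2,
                        reference="Sace_S288c_reference_FullMatrixID.fasta",
                        threads=4):
    """Return shell command strings for one isolate (hypothetical helper).

    Assumptions not stated in the dataset description: paired-end reads,
    the read-group fields, the thread count, and the output file names.
    """
    bam = f"{sample}.sorted.bam"
    gvcf = f"{sample}.g.vcf.gz"

    # HaplotypeCaller needs a sample name (SM) in the BAM read group,
    # so one is attached at mapping time (assumed layout).
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    map_cmd = ["bwa-mem2", "mem", "-t", str(threads), "-R", read_group,
               reference, reads_1, reads_2]
    # "-" makes samtools sort read the alignments from standard input.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", bam, "-"]
    index_cmd = ["samtools", "index", bam]
    # Same arguments as the command quoted in the Method paragraph.
    call_cmd = ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam,
                "-O", gvcf, "--emit-ref-confidence", "GVCF"]

    return [
        f"{shlex.join(map_cmd)} | {shlex.join(sort_cmd)}",
        shlex.join(index_cmd),
        shlex.join(call_cmd),
    ]


if __name__ == "__main__":
    # Print the commands for a made-up sample instead of executing them.
    for command in build_gvcf_commands("S1", "S1_R1.fastq.gz", "S1_R2.fastq.gz"):
        print(command)
```

Printing the commands rather than executing them keeps the sketch usable on a machine without bwa-mem2, samtools, or GATK installed. Before running them for real, the reference would also need its bwa-mem2 index, a `.fai` index, and a sequence dictionary, which are not covered here.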
Created:
2024-10-15