
SNP and indel discovery and genotyping in next-generation sequencing data

Source: Mendeley Data. Updated 2024-06-25. Indexed 2024-06-28.
Download link:
https://zenodo.org/record/159272
Resource description:
Code, logs and data for the discovery and genotyping of SNPs and indels in the D. melanogaster genome using GATK HaplotypeCaller. The code is in the zipped folder code.zip, and the run logs for this code are in the zipped folder logs.zip. The unfiltered VCF genotypes file is lhm_rg_HC_2015-09-15.vcf.gz, and the filtered VCF genotypes file is f1.lhm_rg_HC_raw.vcf.gz. The VCF submitted to NCBI dbSNP (filtered, with indels >50 bp and variants with null alternate alleles removed) is dbSNP.lhm_rg_HC_raw.vcf.gz. The folder local_reference.zip contains the reference assembly files against which genotypes were called, and includes the code used to format the data prior to use. Also included are genotype data from the two in-house reference line samples sequenced (BDGP6+ISO1 mito/dm6, Bloomington Drosophila Stock Center no. 2057).

Samples are 220 Sussex-LHM hemiclones and 2 RG. The first run did not include chromosome 4 and the mitochondrial genome, so these were genotyped separately and then added to the rest of the results.

The NCBI dbSNP record is currently at https://www.ncbi.nlm.nih.gov/projects/SNP/snp_viewBatch.cgi?sbid=1062461 and the submitter handle is MORROW_EBE_SUSSEX. At the time of writing, the NCBI D. melanogaster build is still being updated, so ss identifiers, but not rs identifiers, are available.

The pre-print manuscript for this data is available on bioRxiv: "Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample" http://biorxiv.org/content/early/2016/10/17/081554 doi: http://dx.doi.org/10.1101/081554
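For readers who want to see what the dbSNP-style filtering described above might look like in practice, here is a minimal Python sketch. It is not the code shipped in code.zip, which may work quite differently. It assumes that a "null alternate allele" means an ALT value of `.` or `*`, and that indel length is the difference between the REF and ALT lengths. The function names `keep_record` and `filter_vcf` are hypothetical.

```python
import gzip

# Threshold taken from the description: indels longer than 50 bp are removed.
MAX_INDEL_LEN = 50


def keep_record(line: str) -> bool:
    """Return True if a VCF data line passes both filters."""
    fields = line.rstrip("\n").split("\t")
    ref = fields[3]
    alts = fields[4].split(",")
    # Assumption: a null alternate allele is '.' or the spanning-deletion '*'.
    if any(alt in (".", "*") for alt in alts):
        return False
    # Assumption: indel length is the REF/ALT length difference.
    if any(abs(len(alt) - len(ref)) > MAX_INDEL_LEN for alt in alts):
        return False
    return True


def filter_vcf(in_path: str, out_path: str) -> int:
    """Copy header lines and passing records; return the number of records kept."""
    kept = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # header lines are always preserved
            elif keep_record(line):
                dst.write(line)
                kept += 1
    return kept
```

A call such as `filter_vcf("f1.lhm_rg_HC_raw.vcf.gz", "filtered_sketch.vcf.gz")` would read the filtered genotypes file named above and write a reduced copy; the output filename here is only a placeholder.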
Created:
2023-06-28