MASPOT panel GBS genotype data - v6.1 reference
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13741630
Description:
Genotyping-by-sequencing (GBS) data for the MASPOT panel clones (762 clones in total, 755 with phenotypes), generated by Illumina sequencing of leaf tissue (cf. Sverrisdottir et al., 2017). Biallelic variants were called relative to the v6.1 Phureja doubled monoploid (DM) reference genome. The SNPs were filtered to a root mean square mapping quality > 30, MAF > 1 %, missing data < 50 %, and a minimum read depth of 5x. This leaves 175435 variants.
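The filters listed above can be sketched as a single per-site predicate. This is a minimal illustration, not the pipeline used to produce the dataset: the `Site` structure, the function name, the tetraploid dosage coding (0 to 4 alternate alleles), and the choice to treat calls below the depth threshold as missing are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """One biallelic variant site (hypothetical structure, for illustration only)."""
    rms_mq: float                 # root mean square mapping quality at the site
    dosages: List[Optional[int]]  # alternate-allele dosage per clone, None = missing
    depths: List[int]             # read depth per clone


def passes_filters(site: Site, ploidy: int = 4, min_mq: float = 30.0,
                   min_maf: float = 0.01, max_missing: float = 0.5,
                   min_depth: int = 5) -> bool:
    """Return True if the site passes the four thresholds described in the text."""
    # Assumption: a call supported by fewer than min_depth reads counts as missing.
    called = [d for d, dp in zip(site.dosages, site.depths)
              if d is not None and dp >= min_depth]
    n_clones = len(site.dosages)
    if n_clones == 0 or not called:
        return False
    missing_rate = 1.0 - len(called) / n_clones
    # Alternate-allele frequency over all called chromosomes copies.
    alt_freq = sum(called) / (ploidy * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return (site.rms_mq > min_mq
            and maf > min_maf
            and missing_rate < max_missing)
```

All comparisons are strict, matching the "> 30", "> 1 %", and "< 50 %" wording of the description.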
F1 sample names are in HEADER.SAMPLES; SNP identifiers and coordinates are in FILT3.KEY (both are described in the new readme.txt). The genotypes are in the SNP_V1.0_DMv6.vcf.FILT3_FINAL.DISC.gz file. The snp_counts.txt file lists the SNP counts for the full set of filtrations we have computed. For our purposes, we use only the discrete genotypes and the FILT3 filtration in version 1, which contains the 175435 variants.
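As a quick sanity check after download, the number of variant records in the genotype file can be compared with the stated count. The sketch below assumes the file is gzip-compressed text with one variant per line and that any header lines start with "#"; the actual layout is documented in readme.txt and may differ. The function name `count_variants` is hypothetical.

```python
import gzip


def count_variants(path: str) -> int:
    """Count non-empty, non-header lines in a gzip-compressed text file."""
    n = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Assumption: header lines, if present, begin with '#'.
            if line.strip() and not line.startswith("#"):
                n += 1
    return n
```

If these assumptions hold, `count_variants("SNP_V1.0_DMv6.vcf.FILT3_FINAL.DISC.gz")` would be expected to match the 175435 variants reported for the FILT3 set.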
The phased MASPOT parent genotypes are in the MASPOT_phased_genome_FILT3_modified_filtered.vcf.gz file. These are the genotype calls for the 18 MASPOT parents, phased using the long-read sequencing files with WhatsHap. They are filtered to 1) SNPs passing the FILT3 quality filters, 2) SNPs that are also called for the F1 panel, and 3) SNPs where all parents are phased. The last criterion allows imputation of all offspring. This leaves a subset of 112720 SNPs.
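The three selection criteria for the parent SNPs amount to a set intersection plus a phasing check. The following is an illustrative sketch only: the function names, the (chromosome, position) key, and the in-memory dictionaries are assumptions, and it relies on the VCF convention that phased genotypes separate alleles with "|" while unphased ones use "/".

```python
from typing import Dict, List, Set, Tuple

SnpKey = Tuple[str, int]  # (chromosome, position); hypothetical key for the example


def all_phased(genotypes: List[str]) -> bool:
    """True if every genotype is a fully phased call with no missing alleles."""
    return all("|" in gt and "/" not in gt and "." not in gt for gt in genotypes)


def select_parent_snps(parent_calls: Dict[SnpKey, List[str]],
                       f1_snps: Set[SnpKey],
                       filt3_snps: Set[SnpKey]) -> List[SnpKey]:
    """Keep SNPs that 1) pass FILT3, 2) are called in the F1 panel,
    and 3) are phased in every parent."""
    return sorted(
        key for key, gts in parent_calls.items()
        if key in filt3_snps and key in f1_snps and all_phased(gts)
    )
```

Requiring phase in every parent is the strictest of the three conditions, which is consistent with the reduction from 175435 to 112720 SNPs described above.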
Please see the readme.txt file for an explanation of the contents of each file.
For our analysis, we are using the FILT3 filtration settings (outlined above) and discrete scale genotypes.
Created:
2024-11-08



