Dataset for genome-wide association study of maize phosphorus efficiency
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.9cnp5hqs0
Description:
This dataset comprises genotypic data for 398 maize (Zea mays) genotypes: 100 Dent (maize_Dent), 100 Flint (maize_Flint), and 198 doubled haploid landraces (maize_LR) sourced from six European lines. The data are provided in Variant Call Format (VCF). Each population has its own file and was analyzed separately. Each VCF file contains meta-information lines, a header line, and data lines, each describing one genomic position. Meta-information lines begin with the string ##. The header line names eight fixed, tab-delimited fields per record: chromosome (CHROM), position (POS), identifier (ID), reference base(s) (REF), alternate base(s) (ALT), quality (QUAL), filter status (FILTER), and additional information (INFO). Missing values are denoted by a dot ('.'). A FORMAT field follows the fixed fields and specifies the data types and their order; it is followed by one field per sample. Genotypes are encoded as allele values separated by either '/' or '|'. The dataset was used in a genome-wide association study (GWAS) of phosphorus use efficiency in this diversity panel.
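The file layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not a full VCF parser: the function name `parse_vcf`, the sample names, and every value in `EXAMPLE_VCF` are invented for illustration and do not come from the dataset. It assumes each record carries a GT key in its FORMAT field.

```python
import re


def parse_vcf(text):
    """Split VCF text into meta-information lines, sample names, and records."""
    meta, samples, records = [], [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("##"):
            # Meta-information lines start with '##'.
            meta.append(line[2:])
        elif line.startswith("#"):
            # Header line: 8 fixed fields, then FORMAT, then one column per sample.
            samples = line[1:].split("\t")[9:]
        else:
            fields = line.split("\t")
            # FORMAT gives the order of the per-sample sub-fields; locate GT.
            gt_index = fields[8].split(":").index("GT")
            genotypes = {}
            for name, sample_field in zip(samples, fields[9:]):
                gt = sample_field.split(":")[gt_index]
                # Alleles are separated by '/' or '|'; '.' marks a missing allele.
                genotypes[name] = [
                    None if allele == "." else int(allele)
                    for allele in re.split(r"[/|]", gt)
                ]
            records.append({
                "chrom": fields[0],
                "pos": int(fields[1]),
                "id": fields[2],
                "ref": fields[3],
                "alt": fields[4].split(","),
                "genotypes": genotypes,
            })
    return meta, samples, records


# Invented two-sample example; none of these values come from the dataset.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "S1", "S2"]),
    "\t".join(["1", "1000", "snp_a", "A", "G", ".", "PASS", ".", "GT", "0|0", "1|1"]),
    "\t".join(["1", "2000", "snp_b", "C", "T", ".", "PASS", ".", "GT", "0/1", "./."]),
])

meta, samples, records = parse_vcf(EXAMPLE_VCF)
```

For the real files, an established VCF library would likely be a better choice than hand-rolled parsing; the sketch only shows how the meta lines, the header, and the genotype separators fit together.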
Methods
Marker data for all genotypes in the study were generated with the Illumina® MaizeSNP50 BeadChip. To ensure data integrity, quality control procedures were implemented. Markers exhibiting more than 5% heterozygotes or 50% missing values per marker were excluded, along with markers having an additional 20% missing values per genotype in the entire dataset. Quality control was then applied to each subpopulation individually: heterozygous markers were set to NA, and a minor allele frequency (MAF) filter of 3% was imposed. Imputation was conducted separately for each subpopulation using BEAGLE 5.0. After imputation, only markers with a MAF greater than 5% were retained. Physical positions of single nucleotide polymorphisms (SNPs) refer to the B73 genome version 4.
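The per-marker thresholds above can be illustrated with a small sketch. This is not the authors' pipeline: the function names `marker_stats` and `passes_qc` are hypothetical, markers are assumed to be biallelic with alleles coded 0 and 1, and heterozygosity and MAF are computed over called genotypes only, which is an assumption the description does not spell out. The per-genotype 20% criterion, the step that sets heterozygous markers to NA, and the BEAGLE imputation are not reproduced.

```python
def marker_stats(genotypes):
    """Summarize one marker.

    genotypes: list of (allele1, allele2) tuples, one per sample, with
    alleles coded 0/1 and None for a missing allele.
    """
    n = len(genotypes)
    called = [g for g in genotypes if None not in g]
    missing = 1 - len(called) / n if n else 1.0
    if not called:
        return {"missing": missing, "het": 0.0, "maf": 0.0}
    # Share of called samples carrying two different alleles.
    het = sum(1 for a, b in called if a != b) / len(called)
    # Frequency of allele 1 among called alleles; MAF is the rarer allele.
    alt_freq = sum(a + b for a, b in called) / (2 * len(called))
    return {"missing": missing, "het": het, "maf": min(alt_freq, 1 - alt_freq)}


def passes_qc(genotypes, max_het=0.05, max_missing=0.50, min_maf=0.03):
    """Apply the marker-level thresholds quoted in the text (assumed inclusive)."""
    stats = marker_stats(genotypes)
    return (
        stats["het"] <= max_het
        and stats["missing"] <= max_missing
        and stats["maf"] >= min_maf
    )


# Invented markers for illustration only.
good = [(0, 0)] * 6 + [(1, 1)] * 4                       # MAF 0.4, no gaps
rare = [(0, 0)] * 99 + [(1, 1)]                          # MAF 0.01
gappy = [(0, 0)] * 2 + [(1, 1)] * 2 + [(None, None)] * 6  # 60% missing
het_heavy = [(0, 1)] * 2 + [(0, 0)] * 4 + [(1, 1)] * 4    # 20% heterozygous

results = {
    "good": passes_qc(good),
    "rare": passes_qc(rare),
    "gappy": passes_qc(gappy),
    "het_heavy": passes_qc(het_heavy),
}
```

The post-imputation filter could be approximated by calling `passes_qc` with `min_maf=0.05`; how values exactly at a threshold are treated is an assumption of this sketch.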
Created:
2025-01-17



