In Silico Sorghum GxExM Dataset For Prediction Algorithm Comparisons
Source: Mendeley Data; updated 2024-01-31; indexed 2024-06-29
Download link:
https://figshare.com/articles/dataset/In_Silico_Sorghum_GxExM_Dataset_For_Prediction_Algorithm_Comparisons/23789685/5
Description:
# In Silico Sorghum GxExM Dataset For Prediction Algorithm Comparisons

## Background

A simulated Multi-Environment Trial (MET) dataset intended to stimulate the development and comparison of predictive algorithms that deconvolute genotype-by-environment-by-management (GxExM) interactions in plant breeding. It contains genomic (SNP and QTL) and phenotypic data for multiple traits.

### Data

##### Map

*map.csv*

Positional information on the causal (Quantitative Trait Loci, QTL) and non-causal (Single Nucleotide Polymorphisms, SNP) genomic sites within and across chromosomes.

- Column 1: Chromosome Number
- Column 2: Marker ID
- Column 3: Position (Morgans)

##### Genotype

*qtl_effects.csv*

Information on causal genomic sites (QTL) and their effect sizes on the component traits.

- Column 1: QTL Effect Sizes For Propensity To Tiller (ptt)
- Column 2: QTL Effect Sizes For Canopy (ams)
- Column 3: QTL Effect Sizes For Maturity (mtu)

*QtlIndex.csv*

Information on QTL position for each component trait. Each position number corresponds to the column order in qtl.csv.

- Column 1: Per Component Trait QTL Index
- Column 2: QTL Positions For Propensity To Tiller (ptt)
- Column 3: QTL Positions For Canopy (ams)
- Column 4: QTL Positions For Maturity (mtu)

*qtl.csv*

Number of gene copies (allele counts) at each QTL for each individual.

- Column 1: Individual ID
- Column 2 onwards: QTL Allele Counts

*markers.csv*

Number of gene copies (allele counts) at each SNP (non-causal genomic site) for each individual.

- Column 1: Individual ID
- Column 2 onwards: SNP Allele Counts

##### Trait Records

*trait_data.csv*

Trait records for each individual.

- Column 1: Genotype ID
- Column 2: Phenotype (without error) for Propensity To Tiller (ptt)
- Column 3: Phenotype (without error) for Canopy (ams)
- Column 4: Phenotype (without error) for Maturity (mtu)
- Column 5: Plant Population/Plot Density
- Column 6: Environment/Site where Crop was grown
- Column 7: Phenotype (without error) for Biomass
- Column 8: Phenotype (without error) for Grain Yield

*trait_data_H2_0.3.csv*

Trait records for each individual.

- Columns 1-8: same as trait_data.csv
- Column 9: Replicate ID
- Column 10: Simulated Error for Grain Yield Observations in Column 11
- Column 11: Phenotype (with error) for Grain Yield (Broad Sense Heritability = 0.3)

*trait_data_H2_0.5.csv*

Trait records for each individual.

- Columns 1-8: same as trait_data.csv
- Column 9: Replicate ID
- Column 10: Simulated Error for Grain Yield Observations in Column 11
- Column 11: Phenotype (with error) for Grain Yield (Broad Sense Heritability = 0.5)

*trait_data_H2_0.8.csv*

Trait records for each individual.

- Columns 1-8: same as trait_data.csv
- Column 9: Replicate ID
- Column 10: Simulated Error for Grain Yield Observations in Column 11
- Column 11: Phenotype (with error) for Grain Yield (Broad Sense Heritability = 0.8)

*trait_data_H2_0.99.csv*

Trait records for each individual.

- Columns 1-8: same as trait_data.csv
- Column 9: Replicate ID
- Column 10: Simulated Error for Grain Yield Observations in Column 11
- Column 11: Phenotype (with error) for Grain Yield (Broad Sense Heritability = 0.99)
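
The three genotype files (QtlIndex.csv, qtl.csv, qtl_effects.csv) are linked through the QTL position numbers. The sketch below shows one plausible way to join them. It rests on several assumptions that the description does not confirm: each file has a header row, position numbers are 1-based and count only the allele-count columns of qtl.csv (the ID column excluded), row *k* of qtl_effects.csv belongs to row *k* of QtlIndex.csv, and effects combine additively. The column names and toy values are invented for illustration, and the resulting score is not claimed to be how the authors generated the component traits.

```python
import csv
import io

# Toy stand-ins for the real files; headers and values are hypothetical.
QTL_INDEX_CSV = "index,ptt,ams,mtu\n1,1,2,4\n2,3,5,6\n"
QTL_EFFECTS_CSV = "ptt,ams,mtu\n0.5,-1.0,2.0\n0.25,0.5,-0.5\n"
QTL_CSV = "id,q1,q2,q3,q4,q5,q6\ng1,0,2,1,0,2,1\ng2,2,0,0,1,1,2\n"


def additive_scores(trait, index_text, effects_text, qtl_text):
    """Sum effect * allele count over the QTL assigned to one component trait."""
    # QTL positions for this trait (assumed 1-based column order in qtl.csv).
    positions = [int(r[trait]) for r in csv.DictReader(io.StringIO(index_text))]
    # Effect sizes for this trait, assumed to be in the same row order.
    effects = [float(r[trait]) for r in csv.DictReader(io.StringIO(effects_text))]

    rows = list(csv.reader(io.StringIO(qtl_text)))
    scores = {}
    for row in rows[1:]:  # skip the assumed header row
        individual_id, counts = row[0], row[1:]
        # Position p maps to counts[p - 1] because the ID column is excluded.
        scores[individual_id] = sum(
            effect * int(counts[pos - 1]) for pos, effect in zip(positions, effects)
        )
    return scores


ptt_scores = additive_scores("ptt", QTL_INDEX_CSV, QTL_EFFECTS_CSV, QTL_CSV)
mtu_scores = additive_scores("mtu", QTL_INDEX_CSV, QTL_EFFECTS_CSV, QTL_CSV)
print(ptt_scores, mtu_scores)
```

With the real files, the same function would take the file contents instead of the toy strings; checking the first rows of each file against these assumptions (header presence, 0- versus 1-based positions) is advisable before trusting the join.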
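
The trait_data_H2_*.csv files carry the error-free grain yield (Column 8), a simulated error (Column 10) and the grain yield with error (Column 11). A natural reading, which the description does not state outright, is that Column 11 equals Column 8 plus Column 10, and that the heritability label reflects the ratio of genetic to total variance, $H^2 = \sigma^2_G / (\sigma^2_G + \sigma^2_E)$. The sketch below checks both on toy rows. The header names and values are invented, a header row is assumed, and the variance ratio is computed over whatever rows are passed in, which may differ from how the authors defined heritability across environments and managements.

```python
import csv
import io
import statistics

# Toy rows laid out like trait_data_H2_*.csv; header names and values are hypothetical.
TRAIT_H2_CSV = (
    "genotype,ptt,ams,mtu,density,site,biomass,yield_true,rep,yield_error,yield_obs\n"
    "g1,1.0,2.0,3.0,5,E1,10.0,4.0,1,0.5,4.5\n"
    "g1,1.0,2.0,3.0,5,E1,10.0,4.0,2,-0.5,3.5\n"
    "g2,1.5,2.5,3.5,5,E1,12.0,6.0,1,1.0,7.0\n"
    "g2,1.5,2.5,3.5,5,E1,12.0,6.0,2,-1.0,5.0\n"
)


def load_yield_columns(text):
    """Return (error-free yield, error, observed yield) lists from columns 8, 10, 11."""
    rows = list(csv.reader(io.StringIO(text)))[1:]  # skip the assumed header row
    true_yield = [float(r[7]) for r in rows]   # Column 8
    error = [float(r[9]) for r in rows]        # Column 10
    observed = [float(r[10]) for r in rows]    # Column 11
    return true_yield, error, observed


def max_additive_residual(true_yield, error, observed):
    """Largest |Column 11 - (Column 8 + Column 10)|; near 0 if the assumption holds."""
    return max(abs(o - (t + e)) for t, e, o in zip(true_yield, error, observed))


def realized_h2(true_yield, error):
    """Variance of error-free yield divided by that variance plus error variance."""
    var_g = statistics.pvariance(true_yield)
    var_e = statistics.pvariance(error)
    return var_g / (var_g + var_e)


true_yield, error, observed = load_yield_columns(TRAIT_H2_CSV)
residual = max_additive_residual(true_yield, error, observed)
h2 = realized_h2(true_yield, error)
print(residual, h2)
```

On the real data, running this per environment and plot density would show whether the realized ratio sits near the value in the file name; a large residual would indicate that Column 11 is not a simple sum of Columns 8 and 10.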
Created: 2024-01-31



