Additional file 1 of On the impact of relatedness on SNP association analysis
Source: DataCite Commons. Updated: 2020-08-31. Indexed: 2024-07-25.
Download link:
https://springernature.figshare.com/articles/Additional_file_1_of_On_the_impact_of_relatedness_on_SNP_association_analysis/5678431/1
Description:
R script for simulation. This R script simulates synthetic genotypes for a family study; alternatively, genotypes can be loaded from a CSV file. Allele frequencies are calculated, monomorphic SNPs are filtered out, and pairwise relatedness is estimated. Given SNP genotypes and a value for the heritability, the variance inflation λ is calculated. Additionally, the expected λ′ is estimated. Finally, the script simulates phenotypes under the null and alternative hypotheses and reports results for the T statistic. The R library “mvtnorm” is required for sampling multivariate normally distributed phenotypes. Parameters can be modified to simulate different scenarios. However, the number of samples, the number of SNPs and the number of phenotype realisations per SNP should be limited to reduce the computational burden. For example, running the script on an Intel Xeon X5560 CPU (2.80 GHz) for synthetic family study 3 (SFS3) with parameter set f=111, m=2, c=3 (n=999), 100000 SNPs, 1000 phenotype realisations per SNP and 1000 SNPs required 8.3 GB RAM and took
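The allele-frequency and monomorphic-SNP filtering steps mentioned in the description can be illustrated with a minimal sketch. This is not the R script itself: it is a standard-library Python illustration that assumes genotypes are coded as minor-allele counts (0, 1 or 2) with no missing values, and the function names `allele_frequencies` and `filter_monomorphic` are hypothetical.

```python
# Illustrative sketch only; NOT the original R script.
# Assumption: genotypes are allele counts (0, 1, 2), samples in rows,
# SNPs in columns, and there are no missing values.

def allele_frequencies(genotypes):
    """Return the frequency of the counted allele for each SNP."""
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    # Each diploid sample carries two alleles per SNP.
    return [
        sum(row[j] for row in genotypes) / (2 * n_samples)
        for j in range(n_snps)
    ]

def filter_monomorphic(genotypes):
    """Drop SNPs whose allele frequency is exactly 0 or 1."""
    freqs = allele_frequencies(genotypes)
    keep = [j for j, p in enumerate(freqs) if 0.0 < p < 1.0]
    filtered = [[row[j] for j in keep] for row in genotypes]
    return filtered, [freqs[j] for j in keep]

# Toy data: 4 samples x 4 SNPs; SNP 0 and SNP 2 are monomorphic.
genotypes = [
    [0, 1, 2, 0],
    [0, 2, 2, 1],
    [0, 1, 2, 1],
    [0, 0, 2, 2],
]
filtered, freqs = filter_monomorphic(genotypes)
print(freqs)
```

In this toy input the first and third SNPs are dropped because every sample carries the same genotype, so only the two polymorphic columns remain for any downstream relatedness estimate.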
Provider:
figshare
Created:
2017-12-07



