Identification of patterns related to linkage groups or disequilibrium by factor analysis
Source: figshare.com | Updated: 2023-05-31 | Indexed: 2025-03-22
Download link:
https://figshare.com/articles/dataset/Identification_of_patterns_related_to_linkage_groups_or_disequilibrium_by_factor_analysis/14305438/1
Description:
ABSTRACT: Empirical patterns of linkage disequilibrium (LD) can be used to increase the statistical power of genetic mapping. The objective of this study was to verify the efficacy of factor analysis (FA) applied to SNP molecular marker data sets for identifying linkage groups and haplotype blocks. The SNP data set was obtained by simulating an F2 population and contains 2000 markers genotyped in 500 individuals. The factor loadings were estimated in two ways: from the matrix of distances between markers (A) and from the correlation matrix (R). The number of factors (k) was chosen by two criteria: the scree plot and the proportion of total variance explained. Matrices A and R led to similar results. Based on the scree plot, k was set to 10, and the factors were interpreted as representing linkage groups. The second criterion led to 50 factors, which were interpreted as representing haplotype blocks. This shows the potential of the technique, which yields results applicable to any type of population and can support or corroborate the interpretation of genomic studies. The study demonstrated that FA was able to identify patterns of association between markers, picking out subgroups of markers that reflect both linkage groups and linkage disequilibrium groups.
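The workflow described in the abstract (correlation matrix R, choice of k, interpretation of loadings) can be illustrated with a minimal Python sketch. It is not the authors' code or data. The toy simulation, the group sizes, the thresholds, the principal-component extraction of loadings, and the largest-ratio heuristic standing in for a visual reading of the scree plot are all assumptions made for illustration. The distance matrix A is not covered.

```python
import numpy as np


def simulate_genotypes(n_individuals=500, group_sizes=(40, 25, 10), seed=0):
    """Toy 0/1/2 genotypes (assumption: NOT the study's F2 simulation).

    Markers in the same group share one latent signal, so they are correlated
    with each other and roughly uncorrelated with markers in other groups.
    """
    rng = np.random.default_rng(seed)
    blocks = []
    for size in group_sizes:
        latent = rng.normal(size=(n_individuals, 1))
        noise = rng.normal(size=(n_individuals, size))
        liability = latent + 0.5 * noise
        # Two cut points turn the continuous liability into codes 0, 1, 2.
        blocks.append(np.digitize(liability, [-0.7, 0.7]))
    return np.hstack(blocks).astype(float)


def factor_loadings(genotypes, k):
    """Loadings from the marker correlation matrix R (principal-component method)."""
    R = np.corrcoef(genotypes, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return loadings, eigvals


def k_by_variance(eigvals, threshold):
    """Smallest k whose factors explain at least `threshold` of total variance."""
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))


def k_by_scree_gap(eigvals):
    """Crude scree stand-in: cut at the largest ratio of consecutive eigenvalues."""
    ratios = eigvals[:-1] / eigvals[1:]
    return int(np.argmax(ratios)) + 1


def assign_markers(loadings):
    """Label each marker with the factor on which it loads most strongly."""
    return np.argmax(np.abs(loadings), axis=1)


if __name__ == "__main__":
    G = simulate_genotypes()
    _, eigvals = factor_loadings(G, k=1)
    k = k_by_scree_gap(eigvals)
    loadings, _ = factor_loadings(G, k)
    print("k from scree gap:", k)
    print("k for 80% variance:", k_by_variance(eigvals, 0.80))
    print("marker labels:", assign_markers(loadings))
```

The two functions for choosing k mirror the two criteria in the abstract: a scree-type cut tends to give a small k (coarse groups), while a variance-proportion threshold usually gives a larger k (finer blocks). The unequal group sizes in the toy data keep the leading eigenvalues well separated, so unrotated loadings are enough here; real marker data would likely need a rotation step before interpretation.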
Provided by:
SciELO journals



