Genome-wide identification of candidate regions associated with birth weight in Lori-Bakhtiari sheep using Random Forest algorithm
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Genome-wide_identification_of_candidate_regions_associated_with_birth_weight_in_Lori-Bakhtiari_sheep_using_Random_Forest_algorithm/30400351
Description:
Birth weight is a critical quantitative trait with significant economic implications in livestock production. This study was conducted to identify genetic loci associated with birth weight in a meat-type sheep breed by applying a Random Forest (RF) algorithm to genomic data. A total of 132 Lori-Bakhtiari sheep were selected based on estimated breeding values (EBVs) for body weight. The animals were genotyped using the Illumina Ovine SNP50 BeadChip. After quality control, 44,796 single-nucleotide polymorphisms (SNPs) and 122 sheep were retained for downstream analyses. RF analyses were performed in R (version 4.4.0), and the 50 top-ranked SNPs were identified with the RF method. The results showed that RF is an efficient method for identifying a subset of SNPs with putative functional roles in birth weight. Among the 41 candidate genes identified, DOCK4 and LHCGR, which were linked with the highest-ranked SNPs, are known to influence body weight traits. Gene ontology analysis revealed additional candidate genes, including members of the SEMA family (SEMA3D, SEMA3A), FLT1, MYCBP2, RHOBTB2 and PHIP. Gene network analysis showed functional interconnections among these genes, further supporting their potential role as candidate genes influencing birth weight. Notably, most of these genes were not detected in previous genome-wide association studies (GWAS), highlighting the RF algorithm's utility in uncovering novel genetic markers. Overall, these findings provide new insights into the genetic architecture of birth weight in sheep and suggest potential targets for genetic improvement programs in meat-type sheep.
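The description above says that SNPs were ranked with a Random Forest and the top 50 were kept, but it does not state which R package, hyperparameters, or importance measure were used. The following is only a minimal Python sketch of the general idea, not the authors' pipeline. It assumes genotypes coded as 0/1/2 allele counts, a regression forest of shallow trees, and an impurity-based importance (the summed reduction in squared error per SNP). The names `simulate` and `rank_snps`, the toy data, and all default values are hypothetical.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y, rows, features):
    """Return (gain, feature, threshold) for the best split of `rows`, or None."""
    parent = sse([y[i] for i in rows])
    best = None
    for f in features:
        for t in (0, 1):  # genotypes are 0/1/2, so the only cuts are <=0 and <=1
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def grow(X, y, rows, mtry, depth, importance, rng):
    """Recursively split `rows`, crediting each split's gain to the SNP used."""
    if depth == 0 or len(rows) < 4:
        return
    features = rng.sample(range(len(X[0])), mtry)  # random SNP subset per node
    split = best_split(X, y, rows, features)
    if split is None or split[0] <= 0:
        return
    gain, f, t = split
    importance[f] += gain
    grow(X, y, [i for i in rows if X[i][f] <= t], mtry, depth - 1, importance, rng)
    grow(X, y, [i for i in rows if X[i][f] > t], mtry, depth - 1, importance, rng)

def rank_snps(X, y, n_trees=100, mtry=None, max_depth=3, seed=0):
    """Return SNP column indices sorted from most to least important."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = mtry or max(1, p // 3)  # assumed default: one third of the SNPs
    importance = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of animals
        grow(X, y, rows, mtry, max_depth, importance, rng)
    return sorted(range(p), key=lambda j: importance[j], reverse=True)

def simulate(n_animals=120, n_snps=30, causal=7, seed=1):
    """Toy data (not from the study): one SNP with an additive effect on weight."""
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n_animals)]
    y = [4.0 + 1.0 * row[causal] + rng.gauss(0, 0.3) for row in X]
    return X, y

X, y = simulate()
ranking = rank_snps(X, y)
print("Top 5 SNP indices:", ranking[:5])
```

On this simulated data the SNP with the planted effect (index 7) should appear at or near the top of the ranking. With a real 50K panel, taking `ranking[:50]` would mirror the top-50 selection described above, although a production analysis would normally rely on an established Random Forest library rather than this hand-rolled sketch.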
Created:
2025-10-20



