Ontology based text mining of gene-phenotype associations: application to candidate gene prediction

NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://zenodo.org/record/2532613
Resource description:
Gene-phenotype associations play an important role in understanding disease mechanisms, which is a prerequisite for treatment development. Some gene-phenotype associations are established experimentally and made publicly available through standard resources such as MGI. However, a vast number of gene-phenotype associations remain buried in the biomedical literature. Given the volume of that literature, automated text mining tools are needed to ease the burden of manually curating gene-phenotype associations and to build comprehensive resources. We developed an ontology-based approach, combined with statistical methods, to text mine gene-phenotype associations from the literature. Our method achieved AUC values of 0.90 and 0.75 in recovering known gene-phenotype associations from HPO and MGI, respectively. We posit that candidate genes and their relevant diseases should be described with similar phenotypes in publications. We therefore demonstrate the utility of our approach by predicting disease candidate genes based on the semantic similarity of the phenotypes associated with genes and diseases. We evaluated our disease candidate prediction model on the gene-disease associations from MGI. Our model achieved AUC values of 0.90 and 0.87 on the OMIM (human) and MGI (mouse) datasets of gene-disease associations, respectively. Our manual analysis of the text-mined data revealed that our method can accurately extract gene-phenotype associations that are not currently covered by existing public gene-phenotype resources. Overall, the results indicate that our method can precisely extract known as well as new gene-phenotype associations from the literature. The dataset released on Zenodo contains the gene-phenotype associations we extracted from the literature. All the methods used to extract the data are available at https://github.com/bio-ontology-research-group/genepheno.
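The candidate-gene idea described above (rank genes by how similar their phenotypes are to a disease's phenotypes, then score the ranking with AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the description does not name the semantic similarity measure, so plain Jaccard overlap between phenotype sets is used here as a stand-in, and all gene names and phenotype identifiers (`geneA`, `P1`, and so on) are hypothetical toy values, not entries from the released dataset.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap, used here as an ASSUMED stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_genes(
    disease_phenotypes: Set[str],
    gene_phenotypes: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Score every gene against one disease and sort from most to least similar."""
    scored = [
        (gene, jaccard(phenotypes, disease_phenotypes))
        for gene, phenotypes in gene_phenotypes.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return correct / (len(positives) * len(negatives))


# Hypothetical toy data: phenotype IDs and gene names are placeholders.
disease = {"P1", "P2", "P3"}
genes = {
    "geneA": {"P1", "P2", "P3"},
    "geneB": {"P1", "P4"},
    "geneC": {"P5"},
}
known_disease_genes = {"geneA"}  # assumed ground truth for the toy example

ranking = rank_genes(disease, genes)
scores = [score for _, score in ranking]
labels = [1 if gene in known_disease_genes else 0 for gene, _ in ranking]
auc = roc_auc(scores, labels)
```

In this toy case the only known disease gene shares every phenotype with the disease, so it ranks first and the AUC is 1.0. A real evaluation would substitute an ontology-aware similarity measure and the gene-disease associations from OMIM or MGI.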
Created:
2020-01-24