five

Ontolomics‑P: Advancing Proteomics Data Interpretation through GPT-4o Reannotated Topic Ontology and Data-Driven Analysis

收藏
Figshare2026-04-28 收录
下载链接:
https://figshare.com/articles/dataset/_i_Ontolomics_P_i_Advancing_Proteomics_Data_Interpretation_through_GPT-4o_Reannotated_Topic_Ontology_and_Data-Driven_Analysis/28937511
下载链接
链接失效反馈
官方服务:
资源简介:
The interpretation of proteomics data often relies on functional enrichment analysis, such as Gene Ontology (GO) enrichment, to uncover the biological functions of proteins, as well as the examination of protein expression patterns across data sets like the Clinical Proteomic Tumor Analysis Consortium (CPTAC) database. However, conventional approaches to functional enrichment frequently produce extensive and redundant term lists, complicating interpretation and synthesis. Moreover, the absence of specialized tools tailored to proteomics researchers limits the efficient exploration of protein expression within specific biological contexts. To address these challenges, we developed Ontolomics-P, a user-friendly web-based tool designed to advance proteomics data interpretation. Ontolomics-P integrates topic modeling using latent Dirichlet allocation (LDA) with GO semantic similarity analysis, enabling the consolidation of redundant terms into coherent topics. These topics are further refined and reannotated using the GPT-4o language model, creating a novel topics database that provides precise and interpretable insights into shared biological functions. Additionally, Ontolomics-P incorporates quantitative proteomic data from 10 diverse cancer types archived in the CPTAC database, allowing for a comprehensive exploration of protein expression profiles from a data-driven perspective. Through detailed case studies, we demonstrate the tool’s capacity to streamline workflows, simplify interpretation, and provide actionable biological insights. Ontolomics-P represents a significant advancement in proteomics data analysis, offering innovative solutions for functional annotation, quantitative exploration, and visualization, ultimately empowering researchers to accelerate discoveries in systems biology and beyond.
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作