PeptideForest: Semi-supervised machine learning integrating multiple search engines for peptide identification

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://www.omicsdi.org/dataset/pride/PXD056915
Description:
We introduce PeptideForest, a semi-supervised machine learning approach that trains a random forest classifier on the peptide-to-spectrum assignments produced by multiple algorithms, thereby combining the results of different search engines. On samples containing mixed HEK and Escherichia coli proteomes, PeptideForest increases the number of peptide-spectrum matches (PSMs) with a q-value below 1% by 25.2 ± 1.6% compared to MS-GF+ results. However, an increase in quantity does not necessarily reflect an increase in quality, so we devised a novel approach to assess the quality of the assigned spectra through TMT quantification of samples with known ground truth. With it, we show that the increase in PSMs below 1% q-value does not come with a decrease in quantification quality; PeptideForest therefore offers a way to gain deeper insights into bottom-up proteomics. PeptideForest has been integrated into our pipeline framework Ursgal and can therefore be combined with a wide array of algorithms.
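The 1% q-value cutoff in the description can be illustrated with a minimal target-decoy sketch. This is not PeptideForest or Ursgal code, and the description does not state how its q-values are estimated. The function names, the (score, is_decoy) tuple layout, and the simple decoys/targets FDR estimate are all assumptions made for illustration.

```python
def q_values(psms):
    """Assumed input: list of (score, is_decoy) tuples, higher score = better.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each score threshold: decoys / targets accepted so far.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at this threshold or any more permissive one.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]


def count_accepted(psms, threshold=0.01):
    """Count target PSMs whose q-value is below the threshold."""
    return sum(
        1 for _s, is_decoy, qv in q_values(psms)
        if not is_decoy and qv < threshold
    )


# Hypothetical toy data, not taken from the dataset.
example = [(10, False), (9, False), (8, False),
           (7, True), (6, False), (5, True)]
print(count_accepted(example))
```

Comparing `count_accepted` for a single engine's scores against the same count for a combined classifier's scores is one way to arrive at a relative increase such as the figure reported above.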
Created:
2025-06-02