
Flagging False Positives Following Untargeted LC–MS Characterization of Histone Post-Translational Modification Combinations

Source: acs.figshare.com | Updated: 2023-06-02 | Indexed: 2025-03-27
Download link:
https://acs.figshare.com/articles/dataset/Flagging_False_Positives_Following_Untargeted_LC_MS_Characterization_of_Histone_Post-Translational_Modification_Combinations/4291292/1
Description:
Epigenetic changes can be studied through untargeted characterization of histone post-translational modifications (PTMs) by liquid chromatography–mass spectrometry (LC–MS). While prior information exists for more than 20 types of histone PTMs, little is known about histone PTM combinations (PTMCs). Because of the combinatorial explosion, it is intrinsically impossible to consider all potential PTMCs in a database search. Consequently, high-scoring false positives can occur when the correct PTMC is an isobaric alternative that was not considered in the search. Current quality controls can neither estimate the number of unconsidered alternatives nor flag potential false positives. Here, we propose a conceptual workflow that provides both options. In this workflow, all candidate isoforms carrying known-to-exist PTMs are modeled in silico. The most frequently occurring PTM sets among these candidate isoforms are determined and used in several database searches. This suppresses the combinatorial explosion while considering as many candidate isoforms as possible. Finally, annotations can be classified as unique or ambiguous, the latter implying potential false positives. The workflow was evaluated on an LC–MS data set containing 44 histone extracts. We were able to consider 60% of all candidate isoforms. Importantly, 40% of all annotations were classified as ambiguous. This highlights the need for a more thorough evaluation of modified peptide annotations.
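
The workflow in the description can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the four-PTM selection, the one-PTM-per-site model, and the mass-tolerance criterion for "ambiguous" are all hypothetical simplifications. It enumerates candidate isoforms for a peptide with a given number of modifiable sites, counts how often each PTM set occurs, and flags an annotation as ambiguous when a different PTM composition has the same mass shift within the tolerance.

```python
from collections import Counter
from itertools import product

# Standard monoisotopic mass shifts in Da; the choice of PTMs is illustrative.
PTM_MASS = {"me1": 14.015650, "me2": 28.031300, "me3": 42.046950, "ac": 42.010565}

def candidate_isoforms(n_sites, ptms=PTM_MASS):
    """Yield every assignment of one PTM (or None) per site, skipping the unmodified form."""
    options = [None, *ptms]
    for assignment in product(options, repeat=n_sites):
        if any(assignment):
            yield assignment

def total_shift(assignment, ptms=PTM_MASS):
    """Sum the mass shifts of all PTMs in an assignment."""
    return sum(ptms[p] for p in assignment if p is not None)

def composition(assignment):
    """Position-independent PTM content of an isoform (a sorted multiset)."""
    return tuple(sorted(p for p in assignment if p is not None))

def ptm_set_frequencies(isoforms):
    """Count how many candidate isoforms use each distinct set of PTM types."""
    return Counter(frozenset(composition(iso)) for iso in isoforms)

def classify(annotation, isoforms, tol=1e-3):
    """Assumed criterion: 'ambiguous' if a different composition lies within tol Da."""
    target = total_shift(annotation)
    for iso in isoforms:
        same_mass = abs(total_shift(iso) - target) <= tol
        if same_mass and composition(iso) != composition(annotation):
            return "ambiguous"
    return "unique"

isoforms = list(candidate_isoforms(2))
print(ptm_set_frequencies(isoforms).most_common(3))
print(classify(("me2", None), isoforms))           # me1 + me1 has the same mass shift
print(classify(("ac", None), isoforms, tol=0.05))  # me3 lies about 0.036 Da away
```

In this toy model a dimethylation is indistinguishable by mass from two monomethylations, and an acetylation becomes ambiguous with a trimethylation once the tolerance exceeds their roughly 0.036 Da difference. The number of assignments grows exponentially with the number of sites, which is the combinatorial explosion that restricting searches to the most frequent PTM sets is meant to contain.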

Provided by:
acs.figshare.com