Table_6_An Information Entropy-Based Approach for Computationally Identifying Histone Lysine Butyrylation.xls
NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://figshare.com/articles/dataset/Table_6_An_Information_Entropy-Based_Approach_for_Computationally_Identifying_Histone_Lysine_Butyrylation_xls/11853246
Description:
Butyrylation plays a crucial role in cellular processes. Owing to technical limitations, identifying histone butyrylation sites on a large scale remains challenging. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 on the training set under 3-fold cross-validation and 0.80 on an independent test set. Feature analysis suggests that amino acid residues upstream and downstream of butyrylation sites exhibit specific sequence motifs to a certain extent. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis, and transcriptional misregulation in cancer), is involved in binding to other molecules and in biosynthesis, assembly, arrangement, or disassembly processes, and is located in complexes consisting of DNA, RNA, protein, etc. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and of the functions of butyrylated histones.
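The description names information entropy as the basis of the method but does not spell out the encoding. As a rough illustration only, the sketch below computes the Shannon entropy of the residue distribution at each position of fixed-length peptide windows centered on a lysine. The function name `positional_entropy`, the window length, and the example peptides are all hypothetical; this is not the authors' feature construction, which may differ substantially.

```python
import math
from collections import Counter


def positional_entropy(peptides):
    """Return the Shannon entropy (in bits) of residues at each window position.

    `peptides` is a list of equal-length strings, each a hypothetical
    sequence window centered on a candidate lysine (K).
    """
    if not peptides:
        raise ValueError("at least one peptide window is required")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptide windows must have the same length")

    entropies = []
    for i in range(length):
        # Count how often each residue occurs at position i.
        counts = Counter(p[i] for p in peptides)
        total = sum(counts.values())
        # H = sum over residues of -p * log2(p); 0 for an invariant position.
        h = sum(-(c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical 5-residue windows; the central residue is always lysine.
windows = ["AKKGA", "GRKSA", "AKKTA", "GRKQA"]
profile = positional_entropy(windows)
print(profile)
```

In this toy input the central position is invariant, so its entropy is zero, while more variable flanking positions score higher. A per-position profile of this kind is one plausible way to quantify the sequence-motif signal mentioned in the feature analysis.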
Created:
2020-02-14



