MehNet Source Code and IDs
Source: Mendeley Data. Updated: 2024-01-31. Indexed: 2024-06-30.
Download link:
https://data.mendeley.com/datasets/24x9hdckx5
Description:
This dataset accompanies the article "MehNet: A vigesimal-based model by amino acid melting points generates unique ID numbers for protein sequences". The MehNet study assigns a constant value to each amino acid so that protein sequences can be distinguished from one another numerically. The datasets used in the study were obtained from the UniProt Knowledgebase and then preprocessed, with identical sequences grouped under the same headings. Each of the 20 amino acids was ranked by its melting point and assigned a vigesimal (base-20) digit according to that rank. The resulting vigesimal digits were then converted to decimal values. The centerpiece of the method is the melting point hashing table, which was named "MehNet". Using this table, each protein sequence was assigned a unique identification number, which gives every sequence a numeric representation.
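The steps described above (grouping identical sequences, mapping each residue to a base-20 digit, converting to decimal) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the dataset: the ranking string `HYPOTHETICAL_RANKING` is a placeholder in alphabetical order and is not the melting-point order used by MehNet, and the function names `group_identical` and `sequence_to_id` are invented for this example.

```python
from collections import defaultdict

# ASSUMPTION: placeholder ranking (alphabetical one-letter codes).
# It is NOT the melting-point ranking from the MehNet article; substitute
# the 20 amino acids ordered by melting point to reproduce the real table.
HYPOTHETICAL_RANKING = "ACDEFGHIKLMNPQRSTVWY"

# Map each amino acid to a vigesimal digit (0..19) given by its rank.
DIGIT_OF = {aa: rank for rank, aa in enumerate(HYPOTHETICAL_RANKING)}


def group_identical(records):
    """Group headers of identical sequences under a single sequence key.

    `records` is an iterable of (header, sequence) pairs.
    """
    groups = defaultdict(list)
    for header, sequence in records:
        groups[sequence.upper()].append(header)
    return dict(groups)


def sequence_to_id(sequence):
    """Read the sequence as a base-20 number and return its decimal value."""
    value = 0
    for residue in sequence.upper():
        if residue not in DIGIT_OF:
            raise ValueError(f"unsupported residue: {residue!r}")
        # Standard positional conversion: shift left one base-20 place,
        # then add the digit for this residue.
        value = value * 20 + DIGIT_OF[residue]
    return value


if __name__ == "__main__":
    records = [("entry_1", "MK"), ("entry_2", "MK"), ("entry_3", "CA")]
    for seq, headers in group_identical(records).items():
        print(seq, sequence_to_id(seq), headers)
```

Python integers have arbitrary precision, so long sequences do not overflow. Note that in this plain positional sketch a leading residue with digit 0 does not change the value, so the sketch alone does not guarantee uniqueness; how the article's table handles that case is not stated in this description.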
Created:
2024-01-31



