The raw data set for enzyme kinetic parameter prediction
Source: DataCite Commons | Updated: 2024-12-30 | Indexed: 2025-04-16
Download link:
https://ieee-dataport.org/documents/raw-data-set-enzyme-kinetic-parameter-prediction
Description:
The data distribution is shown above. Existing models for predicting enzyme kinetic parameters tend to overfit, and as a result either generalize poorly or lack accuracy. To evaluate model performance fairly, we used an unbiased dataset created through sequence clustering with the CD-HIT program. The held-out test set and the out-of-distribution test set are shown as a pie chart; the two sets are approximately equal in size (about 1:1). Evaluations on both the held-out and out-of-distribution test sets demonstrate that our model performs competitively with existing methods while also delivering reliable uncertainty estimates.
Provider:
IEEE DataPort
Created:
2024-12-30
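
The description says the test sets were built through sequence clustering with CD-HIT but does not give the procedure. As a rough illustration only, the sketch below shows one common way such a split can be made: parse the `.clstr` file that CD-HIT writes and assign whole clusters to either the training or the test side, so that similar sequences never appear on both. The function names `parse_clstr` and `split_by_cluster`, the `test_fraction` and `seed` parameters, and the split rule itself are assumptions made for this example, not the dataset authors' actual pipeline.

```python
import random
import re

# Matches the sequence identifier in a CD-HIT .clstr member line,
# e.g. "0\t250aa, >P12345... *"
_MEMBER = re.compile(r">(.+?)\.\.\.")


def parse_clstr(text):
    """Parse CD-HIT .clstr text into {cluster_id: [sequence_id, ...]}."""
    clusters = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            # Header line of the form ">Cluster 12" starts a new cluster.
            current = int(line.split()[1])
            clusters[current] = []
        else:
            match = _MEMBER.search(line)
            if match and current is not None:
                clusters[current].append(match.group(1))
    return clusters


def split_by_cluster(clusters, test_fraction=0.2, seed=0):
    """Assign whole clusters to train or test so no cluster spans both.

    Hypothetical helper: test_fraction is a fraction of clusters,
    not of individual sequences.
    """
    ids = sorted(clusters)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    test_clusters = set(ids[:n_test])
    train, test = [], []
    for cid, members in clusters.items():
        (test if cid in test_clusters else train).extend(members)
    return train, test
```

Splitting at the cluster level rather than the sequence level is what keeps near-duplicate enzymes from leaking between training and evaluation data, which is the overfitting concern the description raises.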



