
The raw data set for enzyme kinetic parameter prediction

DataCite Commons | Updated 2024-12-30 | Indexed 2025-04-16
Download link:
https://ieee-dataport.org/documents/raw-data-set-enzyme-kinetic-parameter-prediction
Description:
The data distribution is shown above. Because of overfitting, existing models for predicting enzyme kinetic parameters suffer from either poor generalization or limited accuracy. To ensure a fair evaluation of the model's performance, we used an unbiased dataset created through sequence clustering with the CD-HIT program. The held-out and out-of-distribution test sets are shown as a pie chart; the two sets are approximately equal in size (about a 1:1 ratio). Evaluations on both the held-out and out-of-distribution test sets demonstrate that our model performs competitively with existing methods while also delivering reliable uncertainty estimates.
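
The description does not give the CD-HIT settings or the exact split procedure, so the following is only a minimal sketch of how a cluster-aware split can be built from CD-HIT's `.clstr` output. The function names, the sample cluster text, the `ood_fraction` value, and the choice to hold out whole clusters are all illustrative assumptions, not the dataset authors' method.

```python
import random
import re
from collections import defaultdict

# Hypothetical sample in the CD-HIT .clstr layout (sequence names are made up).
EXAMPLE_CLSTR = """>Cluster 0
0\t350aa, >seqA... *
1\t348aa, >seqB... at 92.00%
>Cluster 1
0\t410aa, >seqC... *
>Cluster 2
0\t295aa, >seqD... *
"""


def parse_clstr(text):
    """Map each sequence id to its cluster index from CD-HIT .clstr text."""
    seq_to_cluster = {}
    cluster = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            cluster = int(line.split()[1])
        else:
            # Member lines look like: "0<TAB>350aa, >seqA... *"
            match = re.search(r">(.+?)\.\.\.", line)
            if match and cluster is not None:
                seq_to_cluster[match.group(1)] = cluster
    return seq_to_cluster


def cluster_split(seq_to_cluster, ood_fraction=0.2, seed=0):
    """Hold out whole clusters so no cluster appears on both sides (assumed scheme)."""
    clusters = defaultdict(list)
    for seq_id, cluster in seq_to_cluster.items():
        clusters[cluster].append(seq_id)
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)
    n_ood = max(1, round(len(cluster_ids) * ood_fraction))
    ood = [s for c in cluster_ids[:n_ood] for s in clusters[c]]
    train = [s for c in cluster_ids[n_ood:] for s in clusters[c]]
    return train, ood


if __name__ == "__main__":
    mapping = parse_clstr(EXAMPLE_CLSTR)
    train_ids, ood_ids = cluster_split(mapping, ood_fraction=0.34)
    print("train:", train_ids)
    print("out-of-distribution:", ood_ids)
```

Splitting by cluster rather than by individual sequence keeps close homologs from appearing in both the training and out-of-distribution sets, which is the usual motivation for clustering before evaluation.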

Provider:
IEEE DataPort
Created:
2024-12-30