KSFinder 2.0 Data Repository
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14075069
Description:
This repository contains data and models developed or generated as part of the KSFinder 2.0 project. The files and a brief description of their contents are listed below.
kg_data.zip - (KSFinder 2.0's knowledge graph) This archive contains three files - kg_train_data.csv, kg_val_data.csv, kg_train_val.csv. a) kg_train_val.csv - This file contains all the triples constituting the knowledge graph (KG); b) kg_train_data.csv - This file contains the triples that were used to train the KGE models and determine optimal hyperparameters; c) kg_val_data.csv - This file contains the validation triples that were used to assess the performance of the KGE models and determine the optimal training epoch.
embeddings.zip - This archive contains embedding data extracted from all the models, including the four KGE models trained as part of KSFinder 2.0 and embeddings extracted from external models - ProtT5, ESM2, ESM3, ProstT5, Phosformer and a random embedding model.
kg_ks.zip - This archive contains two files: a) kinases.csv, a file containing all KG kinases; b) substrates_motif.csv, a file containing all KG substrate_motifs along with their site position, 9-mer and 15-mer motifs.
assessments_data.zip - This archive contains the classification datasets used for assessment1, assessment2 and assessment3. It contains subfolders with testing datasets at two different positives:negatives ratios: a) a 1:1 ratio and b) the same distribution as the training dataset.
ksf2_predictions.zip - This file contains prediction probabilities generated by KSFinder 2.0 for kinase, substrate, motif data.
kge_models_assess1.zip - This file contains the classifier models trained using embeddings from the four KGE models.
models_assess2.zip - This file contains classifier models trained using embeddings from the external models - ProtT5, ESM2, and ESM3.
models_assess3.zip - This file contains classifier models trained to assess the influence of additional features - kinase domain sequences, 15-mer motifs, protein structure based embeddings.
other_model_predictions.zip - This archive contains predictions collected or generated from other kinase-substrate prediction tools: LinkPhinder, PredKinKG, KSFinder and Phosformer-ST.
classifier_datasets.zip - This archive contains the classifier dataset used to train the KSFinder 2.0 model. It contains subfolders with testing datasets at two different positives:negatives ratios: a) a 1:1 ratio and b) the same distribution as the training dataset.
model_ksfinder.zip - This file contains the KSFinder 2.0 model.
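
A minimal sketch of how the knowledge-graph files in kg_data.zip might be sanity-checked after extraction. It reads each CSV as a set of rows and reports whether the training and validation triples overlap and whether any of them are missing from the combined file. The column layout, the presence or absence of a header row, and the function names read_triples and check_split are assumptions made for illustration; they are not taken from the repository.

```python
import csv


def read_triples(path, skip_header=False):
    """Read a CSV file into a set of row tuples.

    Assumption: each non-empty row holds one triple. The actual column
    layout of the KSFinder 2.0 files is not documented here.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if skip_header:
            next(reader, None)  # discard the first row if it is a header
        return {tuple(row) for row in reader if row}


def check_split(train_path, val_path, full_path, skip_header=False):
    """Compare the training and validation triples against the full graph."""
    train = read_triples(train_path, skip_header)
    val = read_triples(val_path, skip_header)
    full = read_triples(full_path, skip_header)
    return {
        "train": len(train),
        "val": len(val),
        "full": len(full),
        # triples that appear in both the training and validation files
        "overlap": len(train & val),
        # triples in either split that are absent from the combined file
        "missing_from_full": len((train | val) - full),
    }
```

For example, check_split("kg_train_data.csv", "kg_val_data.csv", "kg_train_val.csv") could be called from the directory where the archive was extracted, passing skip_header=True if the files begin with a header row.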
Created:
2025-01-21



