Datasets used in the INTREPPPID manuscript

Source: NIAID Data Ecosystem (indexed 2026-05-01)

Download link:

https://zenodo.org/record/10594149

Resource description:

INTREPPPID Manuscript Datasets

The enclosed archive holds all the datasets used in the INTREPPPID manuscript. See the INTREPPPID documentation for details on the format of the HDF5 files.

Files are organised as follows:

    [FORMAT]/seed_[SEED]/[TAXON]/[DATASET_NAME].h5

Where:

    FORMAT indicates whether the HDF5 file is in the RAPPPID or INTREPPPID format.
    SEED is the random seed used to generate the dataset. The seeds are all phone numbers found in songs.
    TAXON is the NCBI Taxon ID of the organism from which the dataset was generated.
    DATASET_NAME is the name of the dataset.

"Why are there only Human (9606) datasets in the INTREPPPID format?"

In the manuscript, we use the INTREPPPID format to train the model on Human data, and then test the model using datasets in the RAPPPID format. INTREPPPID can only be trained on datasets with orthology data, but it can be tested on datasets without orthology data, since the orthologous locality loss is only used during training.
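The directory layout described above can be split back into its four fields with a few lines of Python. This is a minimal sketch using only the standard library, not code from the INTREPPPID project: the names `DatasetPath` and `parse_dataset_path` are hypothetical, and the example path (including the spelling of the format directory, the seed value, and the dataset name) is an illustrative assumption rather than a file known to be in the archive.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class DatasetPath(NamedTuple):
    """Fields of [FORMAT]/seed_[SEED]/[TAXON]/[DATASET_NAME].h5 (hypothetical helper)."""
    fmt: str           # format directory name, e.g. RAPPPID or INTREPPPID
    seed: str          # kept as a string so no digits are altered
    taxon: int         # NCBI Taxon ID, e.g. 9606 for Human
    dataset_name: str  # file name without the .h5 extension


def parse_dataset_path(path: str) -> DatasetPath:
    """Split a path that follows the archive layout into its components."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 path components, got {path!r}")

    # Only the last four components matter; any leading directories are ignored.
    fmt, seed_dir, taxon, filename = parts[-4:]

    if not seed_dir.startswith("seed_"):
        raise ValueError(f"seed directory must start with 'seed_': {seed_dir!r}")
    if not filename.endswith(".h5"):
        raise ValueError(f"dataset file must end with '.h5': {filename!r}")

    return DatasetPath(
        fmt=fmt,
        seed=seed_dir[len("seed_"):],
        taxon=int(taxon),
        dataset_name=filename[: -len(".h5")],
    )


# Illustrative path only; every value here is an assumption, not a real archive entry.
example = parse_dataset_path("INTREPPPID/seed_8675309/9606/example_dataset.h5")
print(example)
```

Keeping the seed as a string avoids assuming anything about how the seeds are written in the directory names, while the taxon is converted to an integer because NCBI Taxon IDs are numeric.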

Created:

2024-02-09
