
Datasets used in the INTREPPPID manuscript

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/10594149
Description:
INTREPPPID Manuscript Datasets

The enclosed archive holds all the datasets used in the INTREPPPID manuscript. See the INTREPPPID documentation for details on the format of the HDF5 files.

Files are organised as follows:

[FORMAT]/seed_[SEED]/[TAXON]/[DATASET_NAME].h5

Where:

FORMAT indicates whether the HDF5 file is in the RAPPPID or INTREPPPID format.
SEED is the random seed used to generate the dataset. The seeds are all phone numbers found in songs.
TAXON is the NCBI Taxon ID of the organism from which the dataset was generated.
DATASET_NAME is the name of the dataset.

"Why are there only Human (9606) datasets in the INTREPPPID format?"

In the manuscript, we use the INTREPPPID format to train the model on Human data, and then test the model using datasets in the RAPPPID format. INTREPPPID can only be trained on datasets with orthology data, but it can be tested on datasets without it, since the orthologous locality loss is only used during training.
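The directory layout described above can be split into its four components with a few lines of Python. The following is only a minimal sketch: the function name `parse_dataset_path`, the `DatasetPath` type, and the example path (including its format directory name, seed, and dataset name) are hypothetical and are not taken from the archive or from the INTREPPPID code. The seed is kept as a string because the exact way the seeds are written in the directory names is not stated here.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class DatasetPath(NamedTuple):
    """Components of a [FORMAT]/seed_[SEED]/[TAXON]/[DATASET_NAME].h5 path."""
    fmt: str    # FORMAT directory name, as found on disk
    seed: str   # SEED, kept as text (assumption: exact formatting unknown)
    taxon: int  # NCBI Taxon ID, e.g. 9606 for Human
    name: str   # DATASET_NAME without the .h5 extension


def parse_dataset_path(path: str) -> DatasetPath:
    """Parse the last four components of a dataset path (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 path components: {path!r}")
    fmt, seed_dir, taxon, filename = parts[-4:]
    if not seed_dir.startswith("seed_"):
        raise ValueError(f"seed directory must start with 'seed_': {seed_dir!r}")
    if not filename.endswith(".h5"):
        raise ValueError(f"dataset file must end with '.h5': {filename!r}")
    return DatasetPath(
        fmt=fmt,
        seed=seed_dir[len("seed_"):],
        taxon=int(taxon),
        name=filename[: -len(".h5")],
    )


# Hypothetical path used only for illustration; real names may differ.
example = parse_dataset_path("intrepppid/seed_8675309/9606/example.h5")
```

A parsed path can then be filtered by `taxon` or `fmt`, for instance to select only the Human (9606) datasets for training.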
Created:
2024-02-09