miRBench datasets
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11657637
Description:
Changelog
PhyloP and PhastCons conservation scores for the target gene sequence have been added to the test/train/leftout datasets as two additional columns, 'gene_phyloP' and 'gene_phastCons'.
Each new column contains a list of conservation scores rounded to 3 decimal places, one score per nucleotide in the gene sequence.
PhyloP and PhastCons scores were obtained from:
https://hgdownload.cse.ucsc.edu/goldenPath/hg38/phyloP100way/hg38.phyloP100way.bw
https://hgdownload.cse.ucsc.edu/goldenpath/hg38/phastCons100way/hg38.phastCons100way.bw
Downloaded on 15 September 2024.
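As a minimal sketch of how the two new columns might be read, the snippet below parses 'gene_phyloP' and 'gene_phastCons' into lists of floats. It rests on assumptions not stated in this description: that each list is serialized as a bracketed, comma-separated string (for example `[0.123, -1.5]`), and that the target sequence sits in a column named `gene`. The helpers `parse_scores`, `read_rows`, and `open_dataset` are illustrative names, and the demo row is made-up data, not taken from the datasets.

```python
import csv
import gzip
import io

SCORE_COLUMNS = ("gene_phyloP", "gene_phastCons")


def parse_scores(cell):
    """Parse a serialized list such as '[0.123, -1.5]' into floats.

    Assumption: scores are stored as a bracketed, comma-separated string.
    """
    body = cell.strip().strip("[]")
    return [float(tok) for tok in body.split(",") if tok.strip()]


def read_rows(handle):
    """Yield one dict per TSV row, with both score columns parsed to lists."""
    for row in csv.DictReader(handle, delimiter="\t"):
        for col in SCORE_COLUMNS:
            row[col] = parse_scores(row[col])
        yield row


def open_dataset(path):
    """Open one of the gzip-compressed TSV files in text mode."""
    return gzip.open(path, "rt", newline="")


# Made-up in-memory row standing in for a real file (hypothetical values).
demo = (
    "gene\tgene_phyloP\tgene_phastCons\n"
    "ACGT\t[0.123, -1.5, 2.0, 0.0]\t[0.9, 0.1, 1.0, 0.0]\n"
)
rows = list(read_rows(io.StringIO(demo)))

# One score per nucleotide, so list length should equal sequence length.
lengths_match = all(
    len(r[col]) == len(r["gene"]) for r in rows for col in SCORE_COLUMNS
)
```

For a downloaded file, the same reader could be used as `with open_dataset("AGO2_CLASH_Hejret2023_test.tsv.gz") as fh: rows = list(read_rows(fh))`, provided the serialization assumption holds.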
Dataset Summary
The datasets listed below were recreated via a series of post-processing pipelines (available here) to eliminate a bias between the positive and negative classes (miRNA family imbalance) discovered in previous versions of the datasets. All have a 1:1 positive-to-negative class ratio.
AGO2_eCLIP_Manakov2022_leftout.tsv.gz
AGO2_eCLIP_Manakov2022_test.tsv.gz
AGO2_eCLIP_Manakov2022_train.tsv.gz
AGO2_eCLIP_Klimentova2022_test.tsv.gz
AGO2_CLASH_Hejret2023_test.tsv.gz
AGO2_CLASH_Hejret2023_train.tsv.gz
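The stated 1:1 class ratio can be checked after download with a sketch like the one below. The column name `label` and the values `1` (positive) and `0` (negative) are assumptions, since this description does not name the class column; `class_counts` is an illustrative helper and the demo rows are made-up data.

```python
import csv
import io
from collections import Counter


def class_counts(handle, label_col="label"):
    """Count rows per class label in an open TSV text handle.

    Assumption: the class is stored in a column called 'label'.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[label_col] for row in reader)


# Made-up rows standing in for a real file (hypothetical values).
demo = "gene\tlabel\nACGT\t1\nTTGA\t0\n"
counts = class_counts(io.StringIO(demo))

# A 1:1 ratio means both classes have the same number of rows.
balanced = counts["1"] == counts["0"]
```

The real files are gzip-compressed, so they would be opened with `gzip.open(path, "rt", newline="")` before being passed to `class_counts`.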
The following dataset is the concatenated HybriDetector output of all selected samples from the available Manakov sample files. It therefore contains only a raw version of the positive class of the Manakov dataset, and it is the input to the post-processing pipelines for the Manakov dataset.
AGO2_eCLIP_Manakov2022_full_dataset.tsv.gz
The inputs to the post-processing pipelines for the Hejret and Klimentova datasets are found at the following links.
Hejret dataset
Klimentova dataset
Created:
2025-01-29



