Assessing NGS-based computational methods for predicting transcriptional regulators with query gene sets
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/records/11391962
Resource description:
The datasets and scripts used for the project "Assessing NGS-based computational methods for predicting transcriptional regulators with query gene sets."
Description of files in the folder:
knockTF.zip - Gene sets derived from knockTF TR perturbation experiments, including the original data from the knockTF database and the top 200/600/1000 gene sets used for benchmarking.
Specifically, the KnockTF folder contains the original limma-derived gene sets. The column "TF" gives the perturbed TR, and the column "Gene" gives the symbol of the affected gene. The remaining columns hold results generated with the limma method. For more details, please refer to the knockTF database.
'''
Sample_ID,TF,Gene,Mean Expr. of Treat,Mean Expr. of Control,FC,Log2FC,Rank,P_value,up_down
DataSet_01_01,ESR1,KRT4,2753.89913,90.77417,30.33795,4.92305,1,1.8795e-09,1
DataSet_01_01,ESR1,UPK1A,1551.08281,174.91983,8.86741,3.14851,2,1.87194e-09,1
DataSet_01_01,ESR1,GABBR2,576.56122,67.74582,8.51065,3.08927,3,4.51591e-08,1
DataSet_01_01,ESR1,GPNMB,3130.93968,369.18497,8.48062,3.08417,4,3.42503e-09,1
'''
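As a rough illustration of how top-N gene sets could be derived from a table in this format, the sketch below groups rows by experiment and keeps the best-ranked genes. It assumes the "Rank" column is the selection criterion, which the description does not state; the function name `top_genes` is hypothetical and is not taken from the project's scripts.

```python
import csv
import io
from collections import defaultdict

# The four sample rows shown above, embedded so the sketch is self-contained.
SAMPLE = """Sample_ID,TF,Gene,Mean Expr. of Treat,Mean Expr. of Control,FC,Log2FC,Rank,P_value,up_down
DataSet_01_01,ESR1,KRT4,2753.89913,90.77417,30.33795,4.92305,1,1.8795e-09,1
DataSet_01_01,ESR1,UPK1A,1551.08281,174.91983,8.86741,3.14851,2,1.87194e-09,1
DataSet_01_01,ESR1,GABBR2,576.56122,67.74582,8.51065,3.08927,3,4.51591e-08,1
DataSet_01_01,ESR1,GPNMB,3130.93968,369.18497,8.48062,3.08417,4,3.42503e-09,1
"""


def top_genes(handle, n):
    """Map (Sample_ID, TF) to the n genes with the smallest Rank value.

    Assumption: "Rank" orders the genes within one experiment (1 = strongest).
    """
    rows_by_experiment = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row["Sample_ID"], row["TF"])
        rows_by_experiment[key].append((int(row["Rank"]), row["Gene"]))

    result = {}
    for key, ranked in rows_by_experiment.items():
        ranked.sort()  # ascending by rank
        result[key] = [gene for _, gene in ranked[:n]]
    return result


gene_sets = top_genes(io.StringIO(SAMPLE), n=2)
print(gene_sets)
```

For the real data, `n` would be 200, 600, or 1000, and the handle would come from opening one of the CSV files in the KnockTF folder.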
Benchmark_Ranking_Lists.zip - Ranking lists that were collected, cleaned, and sorted after running each method on the collected TR perturbation gene sets.
Each column name gives the perturbed TR followed by a unique ID that distinguishes experiments on the same TR. Each column contains the ranking list produced by the method for a specific number of input genes. For example, in the file "BART_RANK_200.csv," the first column contains the ranking list when knocking out the "AGO1" TR.
AGO1_443,AGO1_566,AGO2_410,AHR_212,AR_112,AR_206
ESR1,POLR3D,AR,CEBPB,CREB1,AR
ASCL1,ZBTB48,ESR1,NFE2L1,MAX,NR3C1
AR,BMI1,ASCL1,MEF2B,NFYA,FOXA1
POLR3D,SIRT1,PLRG1,ZNF384,MYC,PIAS1
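To show how a ranking-list file of this shape might be scored, the sketch below looks up the position of the perturbed TR within each column. It assumes the excerpt above is a six-column CSV read row by row, and that the TR name is everything before the last underscore in the column name; the helper names `perturbed_tr` and `true_tr_ranks` are hypothetical and this is not the project's own evaluation code.

```python
import csv
import io

# The excerpt above, read as six columns by four rows (an assumption about its layout).
RANKING_CSV = """AGO1_443,AGO1_566,AGO2_410,AHR_212,AR_112,AR_206
ESR1,POLR3D,AR,CEBPB,CREB1,AR
ASCL1,ZBTB48,ESR1,NFE2L1,MAX,NR3C1
AR,BMI1,ASCL1,MEF2B,NFYA,FOXA1
POLR3D,SIRT1,PLRG1,ZNF384,MYC,PIAS1
"""


def perturbed_tr(column_name):
    """Strip the experiment ID: "AR_206" -> "AR"."""
    return column_name.rsplit("_", 1)[0]


def true_tr_ranks(handle):
    """Return {column name: 1-based rank of the perturbed TR, or None if absent}."""
    reader = csv.reader(handle)
    header = next(reader)
    ranks = {name: None for name in header}
    for position, row in enumerate(reader, start=1):
        for name, predicted in zip(header, row):
            # Keep only the first (best) position at which the TR appears.
            if ranks[name] is None and predicted == perturbed_tr(name):
                ranks[name] = position
    return ranks


ranks = true_tr_ranks(io.StringIO(RANKING_CSV))
for name, rank in ranks.items():
    print(name, rank)
```

In this four-row excerpt only the AR_206 column contains its own TR; in the full files the lists are much longer, so most columns would be expected to yield a rank.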
Commands and scripts
The commands and scripts were run on Ubuntu 22.04 on the SMU M3 high-performance computing (HPC) cluster.
Commands:
Please make sure the requirements of each tool are satisfied before running the commands or scripts.
INPUT_PATH="/path/to/input/destination/gene_set.txt"
OUTPUT_PATH="/path/to/output/destination/gene_set.txt"
#BART:
bart2 geneset -i $INPUT_PATH -s hg38 --outdir $OUTPUT_PATH
#HOMER:
findMotifs.pl $INPUT_PATH human $OUTPUT_PATH
#Lisa:
lisa oneshot hg38 $INPUT_PATH --save_metadata > $OUTPUT_PATH
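When many gene sets have to be processed, the three commands above could be assembled programmatically. The following is a minimal dry-run sketch that only builds and prints the argument lists; it uses exactly the subcommands and flags listed above, while the function name `build_commands` is hypothetical. Nothing is executed, so the tools do not need to be installed to try it.

```python
import shlex


def build_commands(input_path, output_path):
    """Return one argument list per method, mirroring the commands listed above."""
    return {
        "BART": ["bart2", "geneset", "-i", input_path, "-s", "hg38",
                 "--outdir", output_path],
        "HOMER": ["findMotifs.pl", input_path, "human", output_path],
        # Lisa prints to stdout; the original command redirects it to the
        # output path, so the redirect is handled by the caller, not here.
        "Lisa": ["lisa", "oneshot", "hg38", input_path, "--save_metadata"],
    }


commands = build_commands(
    "/path/to/input/destination/gene_set.txt",
    "/path/to/output/destination/gene_set.txt",
)
for method, argv in commands.items():
    print(method + ":", shlex.join(argv))
```

To actually run a command, each list could be passed to `subprocess.run`, with the Lisa call given a `stdout` file handle in place of the shell redirect.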
Code Scripts:
For the rest of the methods, please refer to the files in /Scripts/.
To generate the figures, please refer to the analysis script on GitHub: https://github.com/ZeyuL01/Benchmark_NGSmethods/blob/main/Scripts/Analysis.R
Created:
2024-06-05



