
Assessing NGS-based computational methods for predicting transcriptional regulators with query gene sets

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/records/11391962
Description:
The datasets and code scripts used for the project "Assessing NGS-based computational methods for predicting transcriptional regulators with query gene sets."

Description of files in the folder:

knockTF.zip - Gene sets derived from knockTF TR perturbation experiments, including the original data sourced from the knockTF database and the top 200/600/1000 genes used for benchmarking. Specifically, the KnockTF folder contains the original limma-derived gene sets. The column "TF" indicates the perturbed TR, and the column "Gene" lists the affected gene symbol. The remaining columns were generated using the limma method. For more details, please refer to the knockTF database.

'''
Sample_ID,TF,Gene,Mean Expr. of Treat,Mean Expr. of Control,FC,Log2FC,Rank,P_value,up_down
DataSet_01_01,ESR1,KRT4,2753.89913,90.77417,30.33795,4.92305,1,1.8795e-09,1
DataSet_01_01,ESR1,UPK1A,1551.08281,174.91983,8.86741,3.14851,2,1.87194e-09,1
DataSet_01_01,ESR1,GABBR2,576.56122,67.74582,8.51065,3.08927,3,4.51591e-08,1
DataSet_01_01,ESR1,GPNMB,3130.93968,369.18497,8.48062,3.08417,4,3.42503e-09,1
'''

Benchmark_Ranking_Lists.zip - Ranking lists collected, cleaned, and sorted by running each method over the collected TR perturbation gene sets. Each column name gives the perturbed TR and a unique ID that distinguishes experiments on the same perturbed TR. Each column contains the ranking list derived by the method using a specific number of input genes. For example, in the file "BART_RANK_200.csv," the first column contains the ranking list when knocking out the "AGO1" TR.

AGO1_443  AGO1_566  AGO2_410  AHR_212  AR_112  AR_206
ESR1      POLR3D    AR        CEBPB    CREB1   AR
ASCL1     ZBTB48    ESR1      NFE2L1   MAX     NR3C1
AR        BMI1      ASCL1     MEF2B    NFYA    FOXA1
POLR3D    SIRT1     PLRG1     ZNF384   MYC     PIAS1

Commands and scripts

Commands and scripts were run on Ubuntu 22.04 on the SMU M3 high-performance computing (HPC) cluster.

Commands:
Please make sure the requirements are satisfied before running the commands/scripts.

INPUT_PATH="/path/to/input/destination/gene_set.txt"
OUTPUT_PATH="/path/to/output/destination/gene_set.txt"

#BART:
bart2 geneset -i $INPUT_PATH -s hg38 --outdir $OUTPUT_PATH

#HOMER:
findMotifs.pl $INPUT_PATH human $OUTPUT_PATH

#Lisa:
lisa oneshot hg38 $INPUT_PATH --save_metadata > $OUTPUT_PATH

Code Scripts:
For the rest of the methods, please refer to the files in /Scripts/
For generating the figures, please refer to GitHub: https://github.com/ZeyuL01/Benchmark_NGSmethods/blob/main/Scripts/Analysis.R
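The two file layouts described above can be explored with a few lines of Python. The following is only a minimal sketch, not the project's own pipeline: the function names `top_n_genes` and `rank_of_perturbed_tr` are hypothetical, selecting the top genes by ascending "Rank" is an assumption about how the 200/600/1000 gene sets were cut, and the ranking excerpt is assumed to be comma-separated with one row per rank position (read row by row from the sample shown in the description).

```python
import csv
import io

# Excerpt of a knockTF gene-set file, copied from the dataset description.
KNOCKTF_SAMPLE = """Sample_ID,TF,Gene,Mean Expr. of Treat,Mean Expr. of Control,FC,Log2FC,Rank,P_value,up_down
DataSet_01_01,ESR1,KRT4,2753.89913,90.77417,30.33795,4.92305,1,1.8795e-09,1
DataSet_01_01,ESR1,UPK1A,1551.08281,174.91983,8.86741,3.14851,2,1.87194e-09,1
DataSet_01_01,ESR1,GABBR2,576.56122,67.74582,8.51065,3.08927,3,4.51591e-08,1
DataSet_01_01,ESR1,GPNMB,3130.93968,369.18497,8.48062,3.08417,4,3.42503e-09,1
"""

# Excerpt of a ranking-list file (assumed layout: one column per experiment,
# one row per rank position, best-ranked TR first).
RANKING_SAMPLE = """AGO1_443,AGO1_566,AGO2_410,AHR_212,AR_112,AR_206
ESR1,POLR3D,AR,CEBPB,CREB1,AR
ASCL1,ZBTB48,ESR1,NFE2L1,MAX,NR3C1
AR,BMI1,ASCL1,MEF2B,NFYA,FOXA1
POLR3D,SIRT1,PLRG1,ZNF384,MYC,PIAS1
"""


def top_n_genes(csv_text, sample_id, n):
    """Return the n best-ranked gene symbols for one sample.

    Assumption: "best-ranked" means the smallest values in the "Rank" column.
    """
    rows = [
        row
        for row in csv.DictReader(io.StringIO(csv_text))
        if row["Sample_ID"] == sample_id
    ]
    rows.sort(key=lambda row: int(row["Rank"]))
    return [row["Gene"] for row in rows[:n]]


def rank_of_perturbed_tr(csv_text, column):
    """Return the 1-based position of the perturbed TR in one column.

    The TR name is taken as the column name without its trailing "_<ID>"
    suffix. Returns None if the TR does not appear in the column.
    """
    perturbed_tr = column.rsplit("_", 1)[0]
    reader = csv.DictReader(io.StringIO(csv_text))
    for position, row in enumerate(reader, start=1):
        if row[column] == perturbed_tr:
            return position
    return None


if __name__ == "__main__":
    print(top_n_genes(KNOCKTF_SAMPLE, "DataSet_01_01", 2))
    print(rank_of_perturbed_tr(RANKING_SAMPLE, "AR_206"))
    print(rank_of_perturbed_tr(RANKING_SAMPLE, "AR_112"))
```

For real files, the same functions could be fed the text of a file from knockTF.zip or Benchmark_Ranking_Lists.zip; a gene list returned by `top_n_genes` could then be written one symbol per line to the path used as INPUT_PATH in the commands above.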
Created:
2024-06-05