Reproducibility Scripts for TCMLncDB: TCM and lncRNA Regulation in Cancer

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Reproducibility_Scripts_for_TCMLncDB_TCM_and_lncRNA_Regulation_in_Cancer/30731603
Description:
This code repository contains all custom scripts and computational pipelines used to process raw data, perform analysis, and assemble the TCMLncDB integrated dataset. The code performs the following steps:

Data Processing and QC: prepare_tpm_expression_data.R prepares the pan-cancer TPM expression data, and tumor_purity_estimation.R estimates tumor purity with the ESTIMATE algorithm. The purity estimates are required by the downstream correlation analysis.

Core Correlation Analysis: lncRNA_PCG_correlation_analysis.R calculates partial correlation coefficients between lncRNAs and protein-coding genes (PCGs) across tumors, adjusted for tumor purity.

TCM-lncRNA Prediction: The key script, TCM_lncRNA_enrichment.R, calculates the lncRES score, which quantifies the functional association between TCM and lncRNAs.

Reproducibility Notes and File Structure: This repository shares the computational steps used to create the final database. Much of the code will not run as downloaded, because the TCGA and GEO raw data files are too large to be shared here.

GEO Data: To reproduce the full pipeline, users must first download the raw GEO data files using the accession numbers listed in the input file Supplementary_Table_S1_GEO_Datasets_List.xlsx.

Workflow Guide: The file structure guides the user through the process, with code organized into processing, filtering, and core analysis steps. All code used to assemble the original integrated dataset is shared, so the methodology can be inspected and validated.
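The purity-adjusted correlation step described above can be illustrated with a minimal sketch. It is written in Python rather than the repository's R, and it assumes a first-order Pearson partial correlation; the actual lncRNA_PCG_correlation_analysis.R script may use a different estimator or package. The function names and the synthetic data (`simulate`, the coefficients 5 and 3, the noise level) are hypothetical and exist only for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z.

    Here x and y stand for lncRNA and PCG expression across tumors,
    and z stands for the per-tumor purity estimate.
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=500, seed=0):
    """Synthetic (made-up) data in which purity alone drives both genes."""
    rng = random.Random(seed)
    purity = [rng.uniform(0.2, 1.0) for _ in range(n)]
    lncrna = [5 * p + rng.gauss(0, 0.5) for p in purity]
    pcg = [3 * p + rng.gauss(0, 0.5) for p in purity]
    return lncrna, pcg, purity


lncrna, pcg, purity = simulate()
print("raw correlation:", round(pearson(lncrna, pcg), 3))
print("purity-adjusted:", round(partial_correlation(lncrna, pcg, purity), 3))
```

In this synthetic setup the two expression vectors are related only through purity, so the raw correlation is large while the purity-adjusted value falls close to zero. That gap is the confounding effect the adjustment is meant to remove.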
Created:
2025-11-27