
Lineage-specific lncRNAs critically determine cross-species differences in tumors

Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19964974
Description:
Overview

This repository contains all data and code supporting the analyses presented in the manuscript. The study develops a lineage-specific lncRNA (LS lncRNA)-centered comparative pan-cancer framework integrating 9,058 RNA-seq samples from 13 human tumors and their matched mouse counterparts to systematically investigate how LS lncRNAs drive transcriptional divergence, reshape cancer hallmark landscapes, and influence the tumor immune microenvironment (TIME) across species. The 13 tumor types analyzed include bladder, breast, kidney, liver, lung, pancreatic, prostate, ovarian, skin (melanoma), glioblastoma, glioma, leukemia, and lymphoma.

Repository Structure

.
├── TranscriptionalAnalysis_code/
├── TIME_analysis_code/
├── eGRAM-Code/
├── panData-geneExp_NX/
├── panData-geneExp_logTPM/
├── eGRAM-inputs/
├── eGRAM-results/
├── UMAP-plots/
└── compare-results/

Contents Description

1. TranscriptionalAnalysis_code
Scripts for transcriptomic data processing, normalization, cross-species/differential expression analysis, and module construction.

2. TIME_analysis_code
Scripts for cross-species immune infiltration comparison, and immune divergence module identification.

3. eGRAM-Code
Source code for the *eGRAM* (expression-based Gene Regulatory Analysis of Modules) program, which integrates lncRNA-DNA binding site (DBS) data with expression correlation to identify transcriptional regulatory modules.

4. panData-geneExp_NX
Pan-cancer normalized expression (NX) datasets for all 13 human and 13 mouse tumor types. NX values are z-scores computed from ComBat-corrected log2TPM matrices using scikit-learn `preprocessing.StandardScaler`, normalized jointly across all 13 cancer types and 11 normal tissue types within each species. These matrices serve as the primary input for cross-species differential expression analysis, TDG/TCG classification, and ANOSIM/t-SNE quality control.

5. panData-geneExp_logTPM
Pan-cancer log2(TPM + 1) expression datasets (post-TMM normalization and ComBat batch correction) for all 13 human and 13 mouse tumor types. These matrices are used as input for eGRAM module identification, LS lncRNA expression quantification, and TIME deconvolution. Genes with TPM < 0.1 in > 80% of samples have been filtered.

6. eGRAM-inputs
Input files for the eGRAM program.

7. eGRAM-results
Output files from eGRAM analysis across all 13 human and 13 mouse tumors under three conditions (Normal, Cancer, Preserved).

8. UMAP-plots
UMAP projection outputs and visualization data for hallmark landscape and target-gene landscape analyses.

9. compare-results
Results from cross-species comparative analyses.
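
The description of the two expression matrices (low-expression filter, log2(TPM + 1), and per-gene z-scores) can be illustrated with a small sketch. This is not the repository's code: the function names, the toy matrix, the samples-as-rows and genes-as-columns layout, and the order of the steps are all assumptions made for illustration, and the TMM normalization and ComBat batch correction steps are omitted entirely. The z-score is written in plain NumPy using the population standard deviation (ddof=0), which is the convention scikit-learn's `StandardScaler` uses by default.

```python
import numpy as np


def filter_low_expression(tpm, min_tpm=0.1, max_low_fraction=0.8):
    """Drop genes (columns) with TPM < min_tpm in more than
    max_low_fraction of samples (rows). Hypothetical helper."""
    low_fraction = (tpm < min_tpm).mean(axis=0)
    keep = low_fraction <= max_low_fraction
    return tpm[:, keep], keep


def log2_tpm(tpm):
    """log2(TPM + 1) transform."""
    return np.log2(tpm + 1.0)


def zscore_per_gene(log_tpm):
    """Per-gene z-score across all samples (population std, ddof=0).
    Zero-variance genes are left at 0 rather than divided by 0."""
    mean = log_tpm.mean(axis=0)
    std = log_tpm.std(axis=0)
    std[std == 0] = 1.0
    return (log_tpm - mean) / std


# Toy TPM matrix: 5 samples (rows) x 3 genes (columns). Values are made up.
toy_tpm = np.array([
    [0.00, 10.0, 3.0],
    [0.05, 12.0, 7.0],
    [0.00, 8.0, 1.0],
    [0.00, 15.0, 15.0],
    [0.02, 9.0, 0.0],
])

filtered, keep = filter_low_expression(toy_tpm)  # gene 0 is below 0.1 everywhere
log_matrix = log2_tpm(filtered)                  # analogous to the logTPM matrices
nx = zscore_per_gene(log_matrix)                 # analogous to the NX matrices

print(keep)
print(nx.round(3))
```

In the real data the z-scores are described as being computed jointly over all cancer and normal samples within a species, so the rows of the input matrix would span all of those samples rather than a single tumor type.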
Provider:
Zenodo
Created:
2026-05-02