ecDNA machine learning modeling
Source: Mendeley Data. Updated 2024-06-29; indexed 2024-06-30.
Download link:
https://zenodo.org10389804
Description:
1. ecDNA_cargo_gene_modeling_data.csv.gz
   Features from 386 TCGA (The Cancer Genome Atlas) tumors for modeling extrachromosomal DNA (ecDNA) cargo gene prediction. The file was converted from R data format with the code below. NOTE: the columns 'sample' and 'gene_id' are not used for the actual modeling; they serve only for identification and sampling.

    library(data.table)
    # Load the original R data object
    data <- readRDS("~/../Downloads/ecDNA_cargo_gene_modeling_data.rds")
    # Rename the third column to "total_cn"
    colnames(data)[3] <- "total_cn"
    # Write the table as a comma-separated file
    data.table::fwrite(data, file = "~/../Downloads/ecDNA_cargo_gene_modeling_data.csv.gz", sep = ",")

2. gcap_pcawg_WGS_result.tar.gz
   GCAP analysis results for PCAWG (Pan-Cancer Analysis of Whole Genomes) allele-specific copy number profiles derived from whole-genome sequencing (WGS).

3. gcap_tcga_snp6_result.tar.gz
   GCAP analysis results for TCGA allele-specific copy number profiles derived from the SNP6 array.

4. gcap_Changkang_WES_result.tar.gz
   GCAP analysis results for SYSUCC (Sun Yat-sen University Cancer Center) Changkang allele-specific copy number profiles derived from tumor-normal paired whole-exome sequencing (WES).

5. tcga_overlap_gene_wgs.rds, tcga_overlap_gene_snp.rds and tcga_overlap_gene_wes.rds
   TCGA gene-level copy number results, in R data format, for the samples that overlap across the datasets above. Sources: WGS from PCAWG, SNP array, and WES from the GDC (Genomic Data Commons) portal.

6. cellline-batch1.zip & cellline-batch2.zip
   GCAP results for cell line batch 1 and batch 2.

7. AA_cellline_wgs.zip
   AA software results for cell line batch 1.

8. Batch2_AA_summary.xlsx
   AA software results for cell line batch 2.

9. FISH-for-supp-file.zip
   Extended raw fluorescence in situ hybridization (FISH) images from 12 colorectal cancer (CRC) samples.

10. SNU216.zip
    Extended AA and GCAP analysis of SNU216.

11. aa_ffpe.zip and AA_summary_table_of_6_erbb2_ffpe_samples.xlsx
    Extended AA running files (all results) and result summary data for 6 clinical samples that GCAP predicted to carry ERBB2 amplification. FFPE stands for formalin-fixed paraffin-embedded.

12. Source data for Fig. 4.

13. Source data for the subplots of Supplementary Fig. 2.

14. Source data for Supplementary Fig. 15.

15. GCAP result data objects for three immune checkpoint blockade (ICB) cohorts. Both gene-level and sample-level data are included.
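Since item 1 is distributed as a gzip-compressed CSV, it can also be read outside R. The sketch below is one possible way to load such a file with the Python standard library and set the two identifier columns ('sample' and 'gene_id') aside from the modeling features. It is an illustration under assumptions, not the authors' pipeline: the helper names `load_modeling_table` and `write_demo_file`, the column `feature_x`, and all demo values are hypothetical, and the demo runs on a tiny synthetic file rather than on the real dataset.

```python
import csv
import gzip
import os
import tempfile

# Named in the description as identifier-only columns (not modeling features).
ID_COLUMNS = ("sample", "gene_id")


def load_modeling_table(path):
    """Read a gzipped CSV and split each row into identifiers and features."""
    ids, features = [], []
    with gzip.open(path, mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            ids.append({name: row[name] for name in ID_COLUMNS})
            features.append({k: v for k, v in row.items() if k not in ID_COLUMNS})
    return ids, features


def write_demo_file(path):
    """Write a tiny synthetic stand-in file; every value here is made up."""
    rows = [
        {"sample": "S1", "gene_id": "G1", "total_cn": "2", "feature_x": "0.1"},
        {"sample": "S1", "gene_id": "G2", "total_cn": "8", "feature_x": "0.7"},
    ]
    with gzip.open(path, mode="wt", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


# Demo on the synthetic file (hypothetical file name).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_modeling_data.csv.gz")
write_demo_file(demo_path)
ids, features = load_modeling_table(demo_path)
print(ids[0])
print(features[0])
```

To try this on the real file, pass the path of the downloaded ecDNA_cargo_gene_modeling_data.csv.gz to `load_modeling_table`. Values are returned as strings, so numeric columns would still need converting before any modeling.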
Created:
2023-12-18



