
Data Files - Discovery of putative tumor suppressors from CRISPR screens reveals rewired lipid metabolism in acute myeloid leukemia cells

DataCite Commons | Updated 2025-04-01 | Indexed 2024-08-25
Download link:
https://figshare.com/articles/dataset/Data_Files_-_Discovery_of_putative_tumor_suppressors_from_CRISPR_screens_reveals_rewired_lipid_metabolism_in_acute_myeloid_leukemia_cells/16746040/1
Description:
The files here are the data files used for analysis and figure generation. The code notebooks in the code folder point to these specific data files. Not all data files used are uploaded to this repository, to avoid redistributing other published work (specifically HumanNet files, CCLE/DepMap CERES, clinical files from TCGA/OHSU/TARGET, and the Cancer Gene Census from COSMIC).

Descriptions of data files contained in folder:

AML_age.txt - Curated AML cell line data and the age of the patient each line was derived from.
Avana_Corrected_FC_2020_Q4.txt - CRISPRcleanR-corrected fold-change data for the Avana 2020q4 release.
Avana_NORM_MIXEM_FC_2020_Q4.txt - mu and sigma estimates from a mixture model (k=2) for each screen's null distribution in Avana 2020q4.
avana_output_update_2020_Q4 - Primary data file used to complete the figure analysis. It contains the DepMap cell line ID, Entrez ID, gene name, mean log2FC, CCLE expression, binary classification of mutation status, mixed z-score of the gene, binary classification of COSMIC TSG status, binary classification of non-essential gene status, mean log2FC ranking, and hit_mix, which represents the PSG classification for each gene-cell line pair from the Avana 2020q4 distribution.
bf_avana_2020q4_CRISPRcleanR_corrected.noNA - CRISPRcleanR-corrected BAGEL scores for the Avana 2020q4 distribution.
data_not_redistributed.xlsx - Description and sources of data not uploaded to figshare, to avoid redistributing other published data.
dPCC-AML-qualFilt-varFilt.txt - Filtered dPCC correlations related to figure 3.
fisher_edges_mix_hits_tsg.txt - Text file of all PSG gene pairs, with the Fisher's test p-value and the total count of observations of each gene as a hit (count not used for analysis).
fisher_net_mix_Z_fdr_0.001.txt - Network of all PSG gene pairs filtered at FDR < 0.001, with the Fisher's test p-value and the total count of observations of each gene as a hit (count not used for analysis). Main network used for analyses.
genes-significant-dPCC-with-chp1-cluster-zSTD-filter.txt - Genes filtered and selected for the dPCC heatmap analysis of figure 3e.
Human_net_cutoff_results_updated.txt - HumanNet comparisons and cutoffs used for supplemental figure 4b.
Hunet_comparison_update.Rdata - HumanNet comparisons and cutoffs used for supplemental figure 4a.
JACKS_result_gene_JACKS_results.txt - CRISPRcleanR-corrected JACKS scores for the Avana 2020q4 distribution.
log_normalmixEM.txt - Log file of the mixture model iterations for Avana 2020q4.
matrix-GMMZ-qualFilter-varFilter-9055genes-659cells-17aml.txt - Matrix used to select appropriate AML cells for the dPCC analysis in figure 3e.
metabolite_error.txt - Metabolite variance measurements used to determine which metabolites were viable for analysis. Metabolites with measurements below error were not used.
Mix_Z_pr_values_updated.txt - Precision-recall measurements and the associated mixed z-scores at each PR cutoff; used to determine FDR cutoffs.
NEGv1.txt - Non-essential genes from BAGEL.
PTEN_CN.txt - PTEN copy number values from CCLE.
Sanger_Corrected_FC.txt - CRISPRcleanR-corrected fold-change data for the Sanger 2019 release.
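
The Fisher's test p-values attached to PSG gene pairs can be illustrated with a small sketch. This is not the authors' code: the contingency-table layout (cell lines in which each gene is or is not a hit), the one-sided alternative, and the names `cooccurrence_table` and `fisher_exact_greater` are all assumptions made for illustration, and the cell line labels are hypothetical.

```python
from math import comb


def cooccurrence_table(hits_a: set, hits_b: set, all_lines: set) -> tuple:
    """Build a 2x2 table of cell lines for a gene pair (assumed layout).

    a: both genes are hits, b: only gene A, c: only gene B, d: neither.
    """
    a = len(hits_a & hits_b)
    b = len(hits_a - hits_b)
    c = len(hits_b - hits_a)
    d = len(all_lines - hits_a - hits_b)
    return a, b, c, d


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the chance of seeing at least this much co-occurrence.
    """
    row1 = a + b        # cell lines where gene A is a hit
    col1 = a + c        # cell lines where gene B is a hit
    n = a + b + c + d   # all cell lines considered
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical example: two genes that are hits in the same 3 of 6 cell lines.
lines = {"line1", "line2", "line3", "line4", "line5", "line6"}
gene_a_hits = {"line1", "line2", "line3"}
gene_b_hits = {"line1", "line2", "line3"}
p_value = fisher_exact_greater(*cooccurrence_table(gene_a_hits, gene_b_hits, lines))
print(f"p = {p_value:.4f}")
```

In practice a library routine such as `scipy.stats.fisher_exact` would be the usual choice; the standard-library version here only makes the calculation explicit. Whether the deposited p-values are one-sided or two-sided is not stated in the description.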

Provider:
figshare
Created:
2021-10-15