Cell Maps for Artificial Intelligence - Data Release
Source: doi.org | Updated: 2024-05-13 | Indexed: 2025-01-15
Download link:
https://doi.org/10.18130/V3/DXWOS5
Description:
This collection is the 0.5 alpha data release of the Cell Maps for Artificial Intelligence (CM4AI) Functional Genomics Data Generation Project, a component of the U.S. National Institutes of Health (NIH) Bridge2AI program. CM4AI's objective is to deliver machine-readable hierarchical maps of cell architecture as AI-Ready data. The maps are produced from multimodal interrogation of 100 chromatin modifiers and 100 metabolic enzymes involved in cancer, neuropsychiatric, and cardiac disorders, in disease-relevant cell lines under perturbed and unperturbed conditions, using state-of-the-art mass spectrometry-based proteomics, spatial proteomics / cell imaging, and CRISPR genetic perturbations. CM4AI input data streams are generated using immunofluorescence (IF) subcellular microscopy for spatial proteomics data; affinity purification mass spectrometry (AP-MS) and size exclusion chromatography mass spectrometry (SEC-MS) for protein-protein interaction (PPI) data; and single-cell CRISPR-Cas perturbation screens by cell type. Input data streams are integrated via the multi-scale integrated cell (MuSIC) software pipeline, which employs deep learning models and community detection algorithms. Output cell maps are packaged with provenance graphs and rich metadata as AI-Ready datasets in RO-Crate format using an extended, client-server version of the FAIRSCAPE framework.

This data is copyright (c) 2024 The Regents of the University of California except where otherwise noted. Spatial proteomics raw image data is copyright (c) 2024 The Board of Trustees of the Leland Stanford Junior University. The data is licensed for reuse under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/). Attribution to the copyright holders and the authors is required. Any publication referencing this data or derived products should cite the related article as well as directly citing this data collection.
Provided by:
University of Virginia Dataverse
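
Because the description states that the cell maps are packaged in RO-Crate format, the following is a minimal Python sketch of how a generic RO-Crate metadata document could be inspected. It assumes the standard RO-Crate 1.1 layout (a JSON-LD file named `ro-crate-metadata.json` with an `@graph` array). The embedded example crate, its file name `embedding.tsv`, and the helper functions are hypothetical and are not taken from the CM4AI release; the actual crates may use different identifiers and carry additional properties.

```python
import json

# Hypothetical stand-in for the contents of an ro-crate-metadata.json file.
# All entity names and values here are illustrative, not CM4AI data.
EXAMPLE_METADATA = """
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
     "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset", "name": "Example cell map crate",
     "license": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
     "hasPart": [{"@id": "embedding.tsv"}]},
    {"@id": "embedding.tsv", "@type": "File",
     "name": "Example embedding table"}
  ]
}
"""


def index_entities(metadata: dict) -> dict:
    """Map each entity's @id to the entity itself."""
    return {entity["@id"]: entity for entity in metadata.get("@graph", [])}


def root_dataset(metadata: dict) -> dict:
    """Locate the root dataset through the metadata descriptor's 'about' link."""
    entities = index_entities(metadata)
    descriptor = entities["ro-crate-metadata.json"]
    return entities[descriptor["about"]["@id"]]


def list_parts(metadata: dict) -> list:
    """Return the names of entities listed under the root dataset's hasPart."""
    entities = index_entities(metadata)
    parts = root_dataset(metadata).get("hasPart", [])
    return [
        entities[part["@id"]].get("name", part["@id"])
        for part in parts
        if part["@id"] in entities
    ]


metadata = json.loads(EXAMPLE_METADATA)
root = root_dataset(metadata)
print(root["name"])
print(root["license"])
print(list_parts(metadata))
```

For a downloaded crate, the same functions could be applied to the result of `json.load` on the crate's metadata file in place of the embedded example string.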



