Consensus protein localization encodings for all OpenCell targets

Name: Consensus protein localization encodings for all OpenCell targets
Creator: figshare
Published: 2021-10-06 19:29:46
License: No description available

DataCite Commons | Updated: 2021-10-06 | Indexed: 2024-07-28

Download link:

https://figshare.com/articles/dataset/Consensus_protein_localization_encodings_for_all_OpenCell_targets/16754965

Description:

This dataset consists of the raw consensus encodings (latent-space representations) of the protein localization patterns for all targets in the OpenCell library. The encodings were generated by cytoself, a self-supervised machine learning model trained on the OpenCell imaging dataset. Briefly, they correspond to the flattened VQ2 layer of the VQ-VAE2-like component of the cytoself model, averaged over all images of each target. The resulting matrix has dimensions 1294 × 9216: one row for each of the 1294 OpenCell targets and one column for each of the 9216 dimensions of the flattened VQ2 layer. The matrix is provided, together with target metadata, as an `anndata` object. Please refer to our GitHub repo (2021-opencell-figures) for usage examples.
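
The "mean over all images of each target" step can be illustrated with a minimal sketch. This is not the cytoself or OpenCell code: the function name `consensus_encodings`, the target names, and the 4-dimensional toy vectors (standing in for the 9216-dimensional flattened VQ2 encodings) are all hypothetical, chosen only to show how per-image encodings collapse into one consensus row per target.

```python
from collections import defaultdict
from statistics import fmean


def consensus_encodings(image_encodings):
    """Average flattened per-image encodings into one vector per target.

    image_encodings: iterable of (target_name, vector) pairs, where each
    vector is a flat sequence of floats of the same length.
    Returns a dict mapping target_name -> list of per-dimension means.
    """
    grouped = defaultdict(list)
    for target, vector in image_encodings:
        grouped[target].append(vector)

    consensus = {}
    for target, vectors in grouped.items():
        if len({len(v) for v in vectors}) != 1:
            raise ValueError(f"inconsistent encoding lengths for {target}")
        # zip(*vectors) walks the images dimension by dimension
        consensus[target] = [fmean(dim) for dim in zip(*vectors)]
    return consensus


# Toy data (assumption): hypothetical target names, 4 dimensions instead of 9216
toy = [
    ("TARGET_A", [0.0, 1.0, 2.0, 3.0]),
    ("TARGET_A", [2.0, 3.0, 4.0, 5.0]),
    ("TARGET_B", [1.0, 1.0, 1.0, 1.0]),
]
result = consensus_encodings(toy)

# Stack the consensus vectors into a targets-by-dimensions matrix
matrix = [result[name] for name in sorted(result)]
print(len(matrix), "x", len(matrix[0]))
```

In the published dataset the analogous matrix is 1294 × 9216. The listing does not state the file name or format; if the file is an `.h5ad` file (an assumption), it could likely be read with `anndata.read_h5ad`, with the encodings in `.X` and the target metadata in `.obs`.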

Provider: figshare

Created: 2021-10-06
