Datasets used in Transitive prediction of small-molecule function through alignment of high-content screening resources

Name: Datasets used in Transitive prediction of small-molecule function through alignment of high-content screening resources
Creator: figshare
Published: 2025-05-14 04:21:21
License: 暂无描述

DataCite Commons2025-05-14 更新2026-04-25 收录

下载链接：

https://figshare.com/articles/dataset/Datasets_used_in_Transitive_prediction_of_small-molecule_function_through_alignment_of_high-content_screening_resources/29061038/1

下载链接

链接失效反馈

官方服务：

资源简介：

This dataset supports the development of CLIPn, a contrastive-learning framework designed to align heterogeneous high-content screening (HCS) profile datasets. GitHub link: https://github.com/AltschulerWu-Lab/CLIPn Directory Structure Foldersraw_profilesHCS13/Contains raw data from 13 high-content screening (HCS) datasets. Each dataset includes meta and feature files. L1000/CDRP_feature_exp.csv: Raw L1000 expression data from the CDRP dataset.CDRP_meta_exp.csv: Metadata associated with the CDRP expression data.LINCS_feature_exp.csv: Raw L1000 expression data from the LINCS dataset.LINCS_meta_exp.csv: Metadata associated with the LINCS expression data. RxRx3/RxRx3_feature_final.csv: Profile data from the RxRx3 dataset.RxRx3_meta_final.csv: Metadata from the RxRx3 dataset. Uncharacterized_compounds/NCI_cpnData.csv: Feature data for uncharacterized compounds from the NCI dataset.NCI_cpnInfo.csv: Information about uncharacterized compounds in the NCI dataset.Prestwick_UTSW_cpnData.csv: Feature data for uncharacterized compounds from the Prestwick UTSW dataset.Prestwick_UTSW_cpnInfo.csv: Information about uncharacterized compounds from the Prestwick UTSW dataset. Data ReferenceFor raw datasets from 13 HCS database, data and analysis pipeline for dataset 1 was obtained from https://www.science.org/doi/suppl/10.1126/science.1100709/suppl_file/perlman.som.zip; for datasets 2-3, data were shared by authors; For datasets 4-5, analysis code was downloaded from https://static-content.springer.com/esm/art:10.1038/nbt.3419/MediaObjects/41587_2016_BFnbt3419_MOESM21_ESM.zip and data were shared by authors; For datasets 6-7, processed dataset was downloaded from AWS following instructions from https://github.com/carpenter-singh-lab/2022_Haghighi_NatureMethods, and replicate_level_cp_normalized.csv.gz features were used. For project datasets 8-13, datasets and analysis results were downloaded from https://zenodo.org/records/7352487. For RxRx3, dataset was obtained from https://www.rxrx.ai/rxrx3. L1000 transcript datasets were downloaded using the same link as datasets 6-7 and the processed transcript data files (named “replicate_level_l1k.csv”) were used.

提供机构：

figshare

创建时间：

2025-05-14

5,000+

优质数据集

54 个

任务类型

进入经典数据集