Datasets used in Transitive prediction of small-molecule function through alignment of high-content screening resources
Source: DataCite Commons. Updated: 2025-05-14. Indexed: 2026-04-25.
Download link:
https://figshare.com/articles/dataset/Datasets_used_in_Transitive_prediction_of_small-molecule_function_through_alignment_of_high-content_screening_resources/29061038/1
Description:
This dataset supports the development of CLIPⁿ, a contrastive-learning framework designed to align heterogeneous high-content screening (HCS) profile datasets.
GitHub link: https://github.com/AltschulerWu-Lab/CLIPn

Directory Structure

Folders
raw_profiles

HCS13/
    Contains raw data from 13 high-content screening (HCS) datasets. Each dataset includes meta and feature files.

L1000/
    CDRP_feature_exp.csv: Raw L1000 expression data from the CDRP dataset.
    CDRP_meta_exp.csv: Metadata associated with the CDRP expression data.
    LINCS_feature_exp.csv: Raw L1000 expression data from the LINCS dataset.
    LINCS_meta_exp.csv: Metadata associated with the LINCS expression data.

RxRx3/
    RxRx3_feature_final.csv: Profile data from the RxRx3 dataset.
    RxRx3_meta_final.csv: Metadata from the RxRx3 dataset.

Uncharacterized_compounds/
    NCI_cpnData.csv: Feature data for uncharacterized compounds from the NCI dataset.
    NCI_cpnInfo.csv: Information about uncharacterized compounds in the NCI dataset.
    Prestwick_UTSW_cpnData.csv: Feature data for uncharacterized compounds from the Prestwick UTSW dataset.
    Prestwick_UTSW_cpnInfo.csv: Information about uncharacterized compounds from the Prestwick UTSW dataset.

Data Reference

The raw data for the 13 HCS datasets were obtained as follows:
    Dataset 1: data and analysis pipeline obtained from https://www.science.org/doi/suppl/10.1126/science.1100709/suppl_file/perlman.som.zip
    Datasets 2-3: data shared by the authors
    Datasets 4-5: analysis code downloaded from https://static-content.springer.com/esm/art:10.1038/nbt.3419/MediaObjects/41587_2016_BFnbt3419_MOESM21_ESM.zip and data shared by the authors
    Datasets 6-7: processed dataset downloaded from AWS following the instructions at https://github.com/carpenter-singh-lab/2022_Haghighi_NatureMethods, using the replicate_level_cp_normalized.csv.gz features
    Datasets 8-13: datasets and analysis results downloaded from https://zenodo.org/records/7352487

The RxRx3 dataset was obtained from https://www.rxrx.ai/rxrx3. The L1000 transcript datasets were downloaded using the same link as datasets 6-7, and the processed transcript data files (named "replicate_level_l1k.csv") were used.
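
Most files in the listing above come in pairs: a feature (or cpnData) file and a matching meta (or cpnInfo) file. The following is a minimal sketch of how such a pair might be loaded together, using only the Python standard library. The file names are taken from the listing; everything else is an assumption for illustration, in particular that each pair sits in one folder, that both files have a header row, and that their rows describe the same profiles in the same order. The helper names `read_rows` and `load_profile_pair` are hypothetical and are not part of the CLIPn repository.

```python
import csv
from pathlib import Path

# File-name pairs copied from the directory listing above.
# Assumption: each feature file sits in the same folder as its metadata file.
FEATURE_TO_META = {
    "CDRP_feature_exp.csv": "CDRP_meta_exp.csv",
    "LINCS_feature_exp.csv": "LINCS_meta_exp.csv",
    "RxRx3_feature_final.csv": "RxRx3_meta_final.csv",
    "NCI_cpnData.csv": "NCI_cpnInfo.csv",
    "Prestwick_UTSW_cpnData.csv": "Prestwick_UTSW_cpnInfo.csv",
}


def read_rows(path):
    """Read a CSV file with a header row into a list of dicts (all values as strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_profile_pair(folder, feature_name):
    """Load a feature file and its metadata file from `folder`.

    Assumption (not stated in the dataset description): both files have a
    header row and list the same profiles in the same row order, so a
    row-count mismatch is treated as an error.
    """
    folder = Path(folder)
    features = read_rows(folder / feature_name)
    meta = read_rows(folder / FEATURE_TO_META[feature_name])
    if len(features) != len(meta):
        raise ValueError(
            f"{feature_name}: {len(features)} feature rows "
            f"vs {len(meta)} metadata rows"
        )
    return features, meta
```

A call such as `load_profile_pair("L1000", "CDRP_feature_exp.csv")` would return two row lists of equal length, where the folder path is whatever location the archive was extracted to. If the actual files are not row-aligned, they would need to be joined on a shared identifier column instead, which the description does not name.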
Provider:
figshare
Created:
2025-05-14



