Transitive prediction of small molecule function through alignment of high-content screening resources
Source: DataCite Commons. Updated: 2025-07-12. Indexed: 2025-09-07.
Download link:
https://springernature.figshare.com/articles/dataset/Transitive_prediction_of_small_molecule_function_through_alignment_of_high-content_screening_resources/27756177
Resource description:
# Datasets used in Transitive prediction of small molecule function through alignment of high-content screening resources
This dataset supports the development of CLIP<sup>n</sup>, a contrastive-learning framework designed to align heterogeneous high-content screening (HCS) profile datasets.
***GitHub link:*** https://github.com/AltschulerWu-Lab/CLIPn
## Directory Structure
### Data Files
- **HCS_datasets.pkl**: Contains 13 high-content screening (HCS) datasets from multiple studies spanning 20 years.
- **Hypoxia.pkl**: Contains 8 profile datasets acquired with different assays under varying durations of hypoxia treatment.
- **Expression.pkl**: Contains 2 transcriptional profile datasets and 6 image profile datasets for multimodal analysis.
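
The three `.pkl` files can be opened with Python's standard `pickle` module. The sketch below is a minimal, structure-agnostic way to inspect them. It assumes the files are ordinary Python pickles sitting in the working directory; their internal layout is not documented in this description, so the helper names `load_pickle` and `describe` are illustrative and the summary only reports the top-level type and keys. Unpickling may also require the libraries used to create the objects (for example NumPy or pandas) to be installed, and pickles should only be loaded from a trusted source.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Unpickle a file and return the stored object unchanged."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def describe(obj, max_items=20):
    """Return a short summary of an unpickled object without assuming its layout."""
    summary = {"type": type(obj).__name__}
    if isinstance(obj, dict):
        # Show only the first few top-level keys.
        summary["keys"] = list(obj)[:max_items]
    elif isinstance(obj, (list, tuple)):
        summary["length"] = len(obj)
    return summary


if __name__ == "__main__":
    # File names come from the listing above; files that are absent are skipped.
    for name in ("HCS_datasets.pkl", "Hypoxia.pkl", "Expression.pkl"):
        if Path(name).exists():
            print(name, describe(load_pickle(name)))
```

Inspecting the top-level structure first is a cautious way to find out how the individual datasets are keyed before writing any analysis code against them.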
### Folders
#### raw_profiles
##### HCS13/
- Contains raw data from 13 high-content screening (HCS) datasets. Each dataset includes a metadata file and a feature file.
##### L1000/
- **CDRP_feature_exp.csv**: Raw L1000 expression data from the CDRP dataset.
- **CDRP_meta_exp.csv**: Metadata associated with the CDRP expression data.
- **LINCS_feature_exp.csv**: Raw L1000 expression data from the LINCS dataset.
- **LINCS_meta_exp.csv**: Metadata associated with the LINCS expression data.
##### RxRx3/
- **RxRx3_feature_final.csv**: Profile data from the RxRx3 dataset.
- **RxRx3_meta_final.csv**: Metadata from the RxRx3 dataset.
##### Uncharacterized_compounds/
- **NCI_cpnData.csv**: Feature data for uncharacterized compounds from the NCI dataset.
- **NCI_cpnInfo.csv**: Information about uncharacterized compounds in the NCI dataset.
- **Prestwick_UTSW_cpnData.csv**: Feature data for uncharacterized compounds from the Prestwick UTSW dataset.
- **Prestwick_UTSW_cpnInfo.csv**: Information about uncharacterized compounds from the Prestwick UTSW dataset.
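
The CSV files listed above come in feature/metadata pairs that differ only in one token of the file name (`_feature_` vs. `_meta_`, and `_cpnData` vs. `_cpnInfo`). The sketch below pairs them by name alone. The function `pair_profile_files` and the `PAIR_TOKENS` table are illustrative helpers, not part of the dataset or of CLIPn, and the file names inside `HCS13/` are not listed here, so that folder is not covered. How rows in a feature file correspond to rows in its metadata file is also not stated in this description and should be checked before joining them.

```python
# Naming patterns taken from the file listing: (feature token, metadata token).
PAIR_TOKENS = [("_feature_", "_meta_"), ("_cpnData", "_cpnInfo")]


def pair_profile_files(filenames):
    """Map each feature file to its metadata file using the file name alone."""
    available = set(filenames)
    pairs = {}
    for name in filenames:
        for feature_token, meta_token in PAIR_TOKENS:
            if feature_token in name:
                candidate = name.replace(feature_token, meta_token)
                # Keep the pair only if the metadata file is actually present.
                if candidate in available:
                    pairs[name] = candidate
    return pairs


listed = [
    "CDRP_feature_exp.csv", "CDRP_meta_exp.csv",
    "LINCS_feature_exp.csv", "LINCS_meta_exp.csv",
    "RxRx3_feature_final.csv", "RxRx3_meta_final.csv",
    "NCI_cpnData.csv", "NCI_cpnInfo.csv",
    "Prestwick_UTSW_cpnData.csv", "Prestwick_UTSW_cpnInfo.csv",
]

for feature_file, meta_file in pair_profile_files(listed).items():
    print(feature_file, "->", meta_file)
```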
Provider:
figshare
Created:
2024-11-15



