Investigation of the bioactivity and ADMET properties of compounds targeting the breast cancer receptor ERα
Source: Figshare. Updated: 2025-10-27. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Investigation_of_the_bioactivity_and_ADMET_properties_of_compounds_targeting_the_breast_cancer_receptor_ER_/30459479/2
Description:
This study is based on a curated dataset of compounds targeting the breast cancer receptor ERα, constructed to support quantitative structure–activity relationship (QSAR) modeling and machine-learning-based predictive analysis. The dataset, derived from the Compound_Activity, Compound_ADMET, and Molecular_Descriptor files, contains SMILES representations of 2,024 small molecules. Of these, 1,974 compounds have experimentally determined IC50 values and the corresponding pIC50 transformations, which serve as the dependent variables for quantitative modeling of bioactivity. For the same subset, ADMET annotations are provided in binary format for five pharmacokinetic and toxicity-related endpoints (Caco-2, CYP3A4, hERG, HOB, and MN), enabling a multidimensional characterization of drug-likeness and safety profiles. The remaining 50 compounds, which lack experimental bioactivity and ADMET measurements, are included as an external validation set for evaluating how well predictive models generalize. In addition, 729 two-dimensional molecular descriptors are computed for all compounds to capture their structural topology, constitutional properties, and physicochemical characteristics; these serve as independent variables in model training, feature selection, and interpretability analysis. The accompanying Descriptor_Definition file provides a structured taxonomy and definitions of all descriptors, clarifying their chemical relevance and functional significance. Together, these files provide a foundation for reproducible and interpretable QSAR modeling and a benchmark resource for data-driven predictive modeling in computational drug discovery.
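As an illustration of the IC50 to pIC50 transformation mentioned in the description, the following is a minimal sketch, not the authors' code. It assumes IC50 is reported in nanomolar (the description does not state the unit), so that pIC50 = -log10(IC50 in mol/L) = 9 - log10(IC50 in nM). The record layout, the field names `smiles` and `ic50_nm`, and the toy molecules are hypothetical and are not taken from the dataset files.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in mol/L) = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def split_labelled(records):
    """Separate records with a measured IC50 from those without one.

    Records lacking a measurement play the role of the unlabelled
    external set described in the text.
    """
    labelled = [r for r in records if r.get("ic50_nm") is not None]
    unlabelled = [r for r in records if r.get("ic50_nm") is None]
    return labelled, unlabelled


# Hypothetical toy records; placeholder molecules, not dataset entries.
records = [
    {"smiles": "CCO", "ic50_nm": 1000.0},
    {"smiles": "c1ccccc1", "ic50_nm": 10.0},
    {"smiles": "CC(=O)O", "ic50_nm": None},
]

labelled, unlabelled = split_labelled(records)
pic50_values = [pic50_from_ic50_nm(r["ic50_nm"]) for r in labelled]
```

Under the nanomolar assumption, a lower IC50 (a more potent compound) maps to a higher pIC50, which is why pIC50 is the more convenient regression target.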
Provided by:
Liu, Yanli; Cao, Jinhui
Created:
2025-10-27



