
GDB Databases

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/5172017
Resource description:
About

GDB-11 enumerates small organic molecules of up to 11 atoms of C, N, O and F, following simple chemical stability and synthetic feasibility rules. GDB-13 enumerates small organic molecules of up to 13 atoms of C, N, O, S and Cl, following simple chemical stability and synthetic feasibility rules. With 977 468 314 structures, GDB-13 is the largest publicly available small organic molecule database to date.

How to cite

To cite GDB-11, please reference:

Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physico-chemical properties, compound classes and drug discovery. Fink, T.; Reymond, J.-L. J. Chem. Inf. Model. 2007, 47, 342-353.

Virtual Exploration of the Small Molecule Chemical Universe below 160 Daltons. Fink, T.; Bruggesser, H.; Reymond, J.-L. Angew. Chem. Int. Ed. 2005, 44, 1504-1508.

To cite GDB-13, please reference:

970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13. Blum, L. C.; Reymond, J.-L. J. Am. Chem. Soc. 2009, 131, 8732-8733.

To cite GDB-17, please reference:

Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J.-L. J. Chem. Inf. Model. 2012, 52, 2864-2875.

Download

You can download the databases and subsets of them using the links provided. All molecules are stored as dearomatized, canonicalized SMILES and compressed as tar/gz archives (Windows users: download 7-Zip to open the archives).
GDB-17

GDB-17-Set (50 million)     GDB17.50000000.smi.gz     314 MB
Lead-like Set (100-350 MW & 1-3 clogP) (11 million)     GDB17.50000000LL.smi.gz     75 MB
Lead-like Set (100-350 MW & 1-3 clogP) without small rings (3-4 ring atoms) (0.8 million)     GDB17.50000000LLnoSR.smi.gz     55 MB

GDB-13

Entire GDB-13 (including all C/N/O/Cl/S molecules)     gdb13.tgz     2.6 GB

GDB-13 Subsets (the sum of all the subsets below corresponds to the entire GDB-13 above)

Graph subset (saturated hydrocarbons)     gdb13.g.tgz     1.1 MB
Skeleton subset (unsaturated hydrocarbons)     gdb13.sk.tgz     14 MB
Only carbon & nitrogen containing molecules     gdb13.cn.tgz     443 MB
Only carbon & oxygen containing molecules     gdb13.co.tgz     299 MB
Only carbon & nitrogen & oxygen containing molecules     gdb13.cno.tgz     1.8 GB
Chlorine & sulphur containing molecules     gdb13.cls.tgz     189 MB

GDB-13 Subsets (for details, please refer to Table 2 in J. Comput. Aided Mol. Des. 2011, 25, 637-647)

GDB-13 Subset AB (~635 million)     AB.smi.gz     2.4 GB
GDB-13 Subset ABC (~441 million)     ABC.smi.gz     1.7 GB
GDB-13 Subset ABCD (~277 million)     ABCD.smi.gz     1.1 GB
GDB-13 Subset ABCDE (~140 million)     ABCDE.smi.gz     565 MB
GDB-13 Subset ABCDEF (~43 million)     ABCDEF.smi.gz     171 MB
GDB-13 Subset ABCDEFG (~13 million)     ABCDEFG.smi.gz     50 MB
GDB-13 Subset ABCDEFGH (~1.4 million)     ABCDEFGH.smi.gz     6.2 MB

GDB-13 Random Sample, annotated with frequency and log-likelihood (please refer to "Exploring the GDB-13 chemical space using deep generative models")

GDB-13 Random Sample (1 million)     gdb13.1M.freq.ll.smi.gz     14.8 MB

FDB-17

FDB-17     FDB-17-fragmentset.smi.gz     62.2 MB

GDB4c

GDB4c (SMILES)     GDB4c.smi.gz     6.2 MB
GDB4c3D (SMILES)     GDB4c3D.smi.gz     161 MB
GDB4c3D (SDF)     GDB4c3D.sdf.tar.gz     2 GB

Other

GDBMedChem (SMILES)     GDBMedChem.smi     276 MB
GDBChEMBL (SMILES)     GDBChEMBL.smi     353.6 MB
GDB-13 random selection (1 million)     gdb13.rand1M.smi.gz     7.2 MB
Fragment-like subset (Rule of three)     gdb13.frl.tgz     1.2 GB
Dark matter universe up to 9 heavy atoms     dmu9.tgz     87 MB

GDB-11

Entire GDB-11 (including all C/N/O/F molecules)     gdb11.tgz     122 MB

Fragrance-like subsets (for details, please refer to Ruddigkeit et al. Journal of Cheminformatics 2014, 6:27)

FragranceDB (SuperScent + Flavornet)     FragranceDB.smi     56 KB
TasteDB (SuperSweet + BitterDB)     TasteDB.smi     44 KB
FragranceDB.FL (Fragrance-like subset of FragranceDB)     FragranceDB.FL.smi     32 KB
ChEMBL.FL (Fragrance-like subset of ChEMBL)     ChEMBL.FL.smi     452 KB
PubChem.FL (Fragrance-like subset of PubChem)     PubChem.FL.smi     20 MB
ZINC.FL (Fragrance-like subset of ZINC)     ZINC.FL.smi     1.3 MB
GDB-13.FL (Fragrance-like subset of GDB-13)     GDB-13.FL.smi.gz     165 MB

Terms and conditions

The GDB databases may be downloaded free of charge. In published research involving GDB, cite the appropriate references mentioned above. GDB must not be used as part of or in patents. GDB and large portions thereof must not be redistributed without the express written permission of Jean-Louis Reymond.
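Because the downloads are distributed as SMILES text in .smi, .smi.gz and .tgz form, and some of them are several gigabytes in size, it can help to stream them instead of unpacking everything first. The following is a minimal sketch using only the Python standard library. It rests on assumptions that the description above does not confirm: that each line holds one record whose first whitespace-separated field is the SMILES string (any further fields, such as the frequency and log-likelihood annotations, are ignored), and that every regular file inside a .tgz archive is such a text file. The function names iter_smiles and head are hypothetical helpers, not part of any GDB tooling, and the sketch is not meant for the SDF archive.

```python
import gzip
import io
import tarfile
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List


def _smiles_from_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the first whitespace-separated field of each non-empty line.

    Assumption: that first field is the SMILES string.
    """
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]


def iter_smiles(path) -> Iterator[str]:
    """Stream SMILES strings from a .smi, .smi.gz, .tgz or .tar.gz file."""
    name = Path(path).name
    if name.endswith((".tgz", ".tar.gz")):
        # Assumption: every regular member is a plain-text SMILES file.
        with tarfile.open(path, "r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                raw = archive.extractfile(member)
                with io.TextIOWrapper(raw, encoding="utf-8") as text:
                    yield from _smiles_from_lines(text)
    elif name.endswith(".gz"):
        # Plain gzip-compressed text, decompressed on the fly.
        with gzip.open(path, "rt", encoding="utf-8") as text:
            yield from _smiles_from_lines(text)
    else:
        # Uncompressed .smi file.
        with open(path, "rt", encoding="utf-8") as text:
            yield from _smiles_from_lines(text)


def head(path, n: int = 5) -> List[str]:
    """Return the first n SMILES strings without reading the whole file."""
    return list(islice(iter_smiles(path), n))
```

For example, head("gdb13.rand1M.smi.gz", 3) would return the first three entries of the 1 million random selection once that file has been downloaded. Since iter_smiles is a generator, memory use stays roughly constant regardless of archive size.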
Created:
2022-09-01