
Data and code supplementary files for “A genomic medicine approach to identifying novel drugs” PhD thesis

Source: Figshare. Updated: 2025-09-20. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Data_and_code_supplementary_files_for_A_genomic_medicine_approach_to_identifying_novel_drugs_PhD_thesis/27073753
Resource description:
This study was conducted entirely in-silico. The majority of the analysis code used for this study was developed as a set of Jupyter notebooks, stored as plain Python files with a ‘.py’ extension that can be opened and run in a Jupyter environment according to the README.md file found in the unzipped Supplementary File 5: SF5-code_archive.zip, which includes all chapters of this thesis. With the relevant data in place (Supplementary File 6: SF6-appendix-B_archive.zip and Supplementary File 7: SF7-results-data_archive.tar.gz) and software dependencies installed, the analyses can be repeated and this thesis generated as a pdf using a script (generate_thesis.sh). The vast majority of graphs, charts and tables appearing in the thesis can be reproduced, investigated and tested, allowing the review and identification of bugs and issues with the aim of providing a more robust and accurate analysis.

Contents:

Supplementary File 3: SF3-DUGGIE_DGI_DB.zip - The DUGGIE (DrUG-Gene IntEractions) drug-gene interaction database, consisting of a list of 1,323 approved drugs identified by ATC code, each with a gene target list of 5 or more targets. The data set contains 5,600 unique gene targets in 64,312 unique interactions with drugs, collated from the freely available online datasets STITCH, T3DB, GtoPdb, DrugBank, DSigDB, TTD and DGIdb.

Supplementary File 4: SF4-STITCH_DGI_DB.zip - The STITCH drug-gene interaction database, the largest contributing database to DUGGIE, formatted and quality controlled in an identical manner to DUGGIE for comparison purposes.

Supplementary File 5: SF5-code_archive.zip - Archive of bash scripts and Python 3 Jupyter notebook code used to conduct this project.

Supplementary File 6: SF6-appendix-B_archive.zip - Archive of scripts and results supporting the mini analysis in Appendix B of the thesis, investigating permutation issues encountered with the MAGMA gene set analysis tool.

Supplementary File 7: SF7-results-data_archive.tar.gz - Archive of all result data, sufficient to recreate the thesis document using the original thesis Jupyter notebooks found in Supplementary File 5.

Licences:

Documentation and Thesis © Copyright 2023 Mark Einon, Licensed under the Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0) license. See documentation-license.txt in Supplementary File 5.

Software © Copyright 2023 Cardiff University, Licensed under the GNU AFFERO GENERAL PUBLIC LICENSE (AGPL). See software-license.txt in Supplementary File 5.

DUGGIE contributing data licences:

STITCH: https://creativecommons.org/licenses/by-nc-sa/4.0/

T3DB: "T3DB is offered to the public as a freely available resource. Use and re-distribution of the data, in whole or in part, for commercial purposes requires explicit permission of the authors and explicit acknowledgment of the source material (T3DB) and the original publication (see below). We ask that users who download significant portions of the database cite the T3DB paper in any resulting publications." http://www.t3db.ca/downloads

GtoPdb: Contents https://creativecommons.org/licenses/by-sa/4.0/

DrugBank: https://creativecommons.org/licenses/by-nc/4.0/

DSigDB: "DSigDB is freely accessible: http://tanlab.ucdenver.edu/DSigDB." - User manual

TTD: Unclear. Website states "All Rights Reserved" but resource structure and description in 2002 publication indicate "open-access".

DGIdb: "The data used in DGIdb is all open access and where possible made available as raw data dumps in the downloads section." (https://www.dgidb.org/browse/sources)
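The description asks for the code and data archives to be "in place" before the analyses are re-run. A minimal Python sketch of that unpacking step follows. The archive file names are taken from the description; the function name `unpack_supplementary_files`, the `downloads` and `thesis_work` folders, and the one-folder-per-archive layout are assumptions made for illustration only. The README.md inside SF5-code_archive.zip remains the authority on where each archive actually needs to go.

```python
import shutil
from pathlib import Path

# Archive names as given in the dataset description.
ARCHIVES = [
    "SF5-code_archive.zip",
    "SF6-appendix-B_archive.zip",
    "SF7-results-data_archive.tar.gz",
]


def unpack_supplementary_files(download_dir, work_dir):
    """Unpack each known archive found in download_dir under work_dir.

    Returns (unpacked, missing): two lists of archive file names.
    """
    download_dir, work_dir = Path(download_dir), Path(work_dir)
    unpacked, missing = [], []
    for name in ARCHIVES:
        archive = download_dir / name
        if not archive.is_file():
            missing.append(name)
            continue
        # Assumed layout: one sub-folder per archive, named after its
        # "SFn" prefix. This is not taken from the thesis README.
        target = work_dir / name.split("-", 1)[0]
        target.mkdir(parents=True, exist_ok=True)
        # shutil picks the unpacker (.zip or .tar.gz) from the extension.
        shutil.unpack_archive(str(archive), str(target))
        unpacked.append(name)
    return unpacked, missing


if __name__ == "__main__":
    # Hypothetical folder names; adjust to wherever the files were saved.
    done, absent = unpack_supplementary_files("downloads", "thesis_work")
    print("Unpacked:", done)
    print("Missing:", absent)
```

Reporting missing archives instead of raising an error lets a partial download be checked before attempting the longer step of running generate_thesis.sh.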
Created:
2025-09-20