
HUBRIS - protein-protein interaction (PPI) networks

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14604607
Resource description:
This repository contains code and files related to HUBRIS, a protein-protein interaction network compiled from 8 different public databases. It also includes data about the newly identified interactors of HHIP in two lung cell lines (IMR90 and 16HBE) and RNA-Seq data from the two cell lines.

The associated manuscript is entitled: HHIP protein interactions in lung cells provide insight into COPD pathogenesis

Authors: Dávid Deritei1,*, Hiroyuki Inuzuka2,*, Peter J. Castaldi1, Jeong Hyun Yun1, Zhonghui Xu1, Wardatul Jannat Anamika1, John M. Asara3, Feng Guo4, Xiaobo Zhou1, Kimberly Glass1,†, Wenyi Wei2,†, Edwin K. Silverman1,†,‡

1 Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02215, USA
2 Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
3 Division of Signal Transduction, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
4 Jiangsu Key Laboratory of Immunity and Metabolism, Jiangsu International Laboratory of Immunity and Metabolism, Department of Pathogen Biology and Immunology, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China

* co-first authors
† co-senior authors
‡ corresponding author; Channing Division of Network Medicine, Department of Medicine, 181 Longwood Avenue, Boston, MA 02115-5804
Phone: 617-525-0856 / Fax: 617-731-1541
Email: ed.silverman@channing.harvard.edu

Abstract

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide. The primary causes of COPD are environmental, including cigarette smoking; however, genetic susceptibility also contributes to COPD risk. Genome-Wide Association Studies (GWASes) have revealed more than 80 genetic loci associated with COPD, leading to the identification of multiple COPD GWAS genes. However, the biological relationships between the identified COPD susceptibility genes are largely unknown.
Genes associated with a complex disease are often in close network proximity, i.e. their protein products often interact directly with each other and/or similar proteins. In this study, we use affinity purification mass spectrometry (AP-MS) to identify protein interactions with HHIP, a well-established COPD GWAS gene which is part of the sonic hedgehog pathway, in two disease-relevant lung cell lines (IMR90 and 16HBE). To better understand the network neighborhood of HHIP, its proximity to the protein products of other COPD GWAS genes, and its functional role in COPD pathogenesis, we create HUBRIS, a protein-protein interaction network compiled from 8 publicly available databases. We identified both common and cell type-specific protein-protein interactors of HHIP. We find that our newly identified interactions shorten the network distance between HHIP and the protein products of several COPD GWAS genes, including DSP, MFAP2, TET2, and FBLN5. These new shorter paths include proteins that are encoded by genes involved in extracellular matrix and tissue organization. We found and validated interactions to proteins that provide new insights into COPD pathobiology, including CAVIN1 (IMR90) and TP53 (16HBE). The newly discovered HHIP interactions with CAVIN1 and TP53 implicate HHIP in response to oxidative stress. 
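The idea that newly identified interactions can shorten the network distance between two proteins can be illustrated with a small breadth-first search. This is only a sketch under stated assumptions: the node names (bait, x1, x2, x3, gwas_protein) and every edge below are invented for illustration and are not taken from HUBRIS or from the AP-MS results, and the function shortest_path_length is a hypothetical helper, not part of the repository code.

```python
from collections import deque


def shortest_path_length(edges, source, target):
    """Return the number of edges on a shortest undirected path, or None if unreachable."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    if source not in adjacency or target not in adjacency:
        return None
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return None


# Toy reference network (hypothetical edges, for illustration only).
reference_edges = [
    ("bait", "x1"),
    ("x1", "x2"),
    ("x2", "x3"),
    ("x3", "gwas_protein"),
]
# A hypothetical newly observed interaction of the bait protein.
new_edges = [("bait", "x3")]

before = shortest_path_length(reference_edges, "bait", "gwas_protein")
after = shortest_path_length(reference_edges + new_edges, "bait", "gwas_protein")
print(f"distance before new edges: {before}, after: {after}")
```

In this toy graph, the single added edge reduces the distance from 4 to 2. On the real networks the same comparison could presumably be run with a graph library on the .gml files, with and without the new experimental edges.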
Additional information:
Manuscript available on bioRxiv: https://www.biorxiv.org/content/10.1101/2024.04.01.586839v1
Associated GitHub repository: https://github.com/deriteidavid/hubris
RNA-Seq data (IMR90 and 16HBE cells): GSE285360
AP-MS data: ftp://massive.ucsd.edu/v07/MSV000096594/
(For more details, see the documentation on the GitHub page and the Methods section of our manuscript.)

Description of the files:
biorosetta_data/ - local files of the biorosetta gene ID mapping tool, for consistent remapping
db_local_files/ - local saves of the 8 constituent databases of HUBRIS (date of download 01/24/2024)
PPI_networks_for_analysis/ - different PPI networks derived from HUBRIS, new experimental edges, and RNA-Seq data
RNASeq_lists/ - two text files (one per cell line) containing lists of genes that are considered expressed based on the RNA-Seq data. We use these lists to filter HUBRIS into cell type-specific versions. The files are generated by RNASeq_based_filtering_hg38.py.
RNA_Seq_hg38/ - RNA-Seq data from IMR90 and 16HBE cells (TPM). For more details, see the Methods section of our manuscript.
all_significant_links_HHIP.xlsx - table containing the list of newly identified, significant protein-protein interactors of HHIP in 16HBE and IMR90 cells, based on analysis with SAINTExpress, with and without the CRAPome database.
create_HUBRIS.py - main script to merge the already downloaded PPI databases (with the option to download the newest versions), translate between gene IDs, and filter the network based on the representation of edges in the different databases. See the GitHub README or the Methods section of our manuscript for more details.
databases.xlsx - configuration file for processing the database files (read by create_HUBRIS.py). For the files included in db_local_files/, no re-configuration is needed; however, if the newest versions of the databases are downloaded, it may be necessary to adapt the file manually.
generate_PPIs_for_analysis.py - generates 2x2x2x3 = 24 different PPI network variants from the parameters new_edges (True, False); cell_line (16HBE, IMR90, union); filter_crapome (True, False); filter_networks_based_on_expression (True, False). The script saves the network variants as .gml files in PPI_networks_for_analysis/.
G_merged_raw.gpickle - gpickle file of the raw HUBRIS network (no filtering)
G_hubris.gpickle - gpickle file of filtered HUBRIS (every edge must be represented in at least 2 of the 8 databases. This parameter can be changed in create_HUBRIS.py)
G_hubris.gml - gml file of the filtered HUBRIS
G_hubris.txt - edgelist of the filtered HUBRIS
G_hubris_lcc.gml - gml file of the largest connected component of HUBRIS (filtered)
G_hubris_lcc.txt - edgelist of the largest connected component of HUBRIS (filtered)
hgnc_mapping.tsv - gene ID mapping from https://www.genenames.org/download/statistics-and-files/ (downloaded 05/10/2022).
hubris_functions.py - helper functions for create_HUBRIS.py
RNASeq_based_filtering.py - script generating the lists of expressed genes (RNASeq_lists/) based on RNA-Seq data (rnaseq_PPI_GWAS/)
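Two of the steps described in the file list can be sketched in a few lines of Python: keeping only edges that appear in a minimum number of databases, and enumerating the 2x2x2x3 = 24 parameter combinations. This is an illustrative sketch, not the code in create_HUBRIS.py or generate_PPIs_for_analysis.py. The function names, the database labels (db_a, db_b, db_c), and the protein labels (p1 to p4) are all hypothetical; only the four parameter names and their values come from the description above.

```python
from collections import defaultdict
from itertools import product


def filter_edges_by_support(edges_by_database, min_databases=2):
    """Keep undirected edges reported by at least `min_databases` databases."""
    support = defaultdict(set)
    for database, edges in edges_by_database.items():
        for u, v in edges:
            # Sort the endpoints so (u, v) and (v, u) count as the same edge.
            support[tuple(sorted((u, v)))].add(database)
    return sorted(edge for edge, dbs in support.items() if len(dbs) >= min_databases)


def enumerate_variants():
    """List every combination of the four parameters named in the file description."""
    grid = {
        "new_edges": (True, False),
        "cell_line": ("16HBE", "IMR90", "union"),
        "filter_crapome": (True, False),
        "filter_networks_based_on_expression": (True, False),
    }
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


# Toy input: hypothetical databases and proteins, for illustration only.
toy_databases = {
    "db_a": [("p1", "p2"), ("p2", "p3")],
    "db_b": [("p2", "p1"), ("p3", "p4")],
    "db_c": [("p3", "p4"), ("p1", "p2")],
}

kept = filter_edges_by_support(toy_databases, min_databases=2)
variants = enumerate_variants()
print(kept)
print(len(variants))
```

Here the edge p2-p3 is dropped because only one toy database reports it, and the parameter grid yields 24 combinations, matching the count stated in the description. How the actual scripts represent edges and name their output files is documented in the GitHub README.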
Created:
2025-01-06