
Inflammatory Bowel Disease (IBD) Interactome: text database and analyzed data of experimental research in humans between 1990-2020.

Source: DataCite Commons. Updated 2025-05-01; indexed 2024-07-29.
Download link:
https://figshare.com/articles/dataset/Inflammatory_Bowel_Disease_IBD_Interactome_text_database_and_analyzed_data_of_experimental_research_in_humans_between_1990-2020_/16906534/1
Description:
We present the corpus of abstracts (database), the raw and analyzed data tables, and the custom MATLAB™ code associated with our paper "Network analysis of inflammatory bowel disease research: towards the interactome". In this work we use text mining of scientific abstracts and network analysis to quantify how the representation of components from the different functional levels (i.e. immune-endocrine cells, molecules, genes, and biological processes and modulating factors) involved in the human Inflammatory Bowel Disease (IBD) interactome, and the relations among them, has changed over time. We sought to understand how knowledge in this field has been constructed and to provide insights into the principal challenges in approaching the interactome of this disease.

Tables
Supporting Table S1_Raw corpus of abstracts: we present the raw set of abstracts and associated metadata retrieved with our PubMed query and downloaded through the bibliographic manager Zotero 5.0. Rows represent abstracts and columns the metadata associated with each of them. Specifically, the eleventh column, named "AbstractNote", contains the text of the abstract itself, and the third column, named "PublicationYear", contains the year of publication.
Supporting Table S2_Curated corpus of abstracts: we present the set of abstracts as described for Supporting Table S1, after the curation steps (namely, removal of duplicates and of abstracts that study only animal models).
Supporting Table S3_Target Matrix of Cells, Supporting Table S4_Target Matrix of Molecules, Supporting Table S5_Target Matrix of Genes, Supporting Table S6_Target Matrix of Biological Processes and Modulating Factors: we present the sets of immune-endocrine cells, molecules, genes, and biological processes and modulating factors that have been implicated in human IBD in recent reviews (Graham, D.B., Xavier, R.J. Nature 578, 527–539 (2020); Konstantina E. 
Vennou et al. Genomics 112(2), 1761–1767 (2020); Lahue, K.G. et al. Genes Immun 21, 311–325 (2020)). Rows list the immune-endocrine cells, molecules, genes, and biological processes and modulating factors of interest, and columns list the semantic synonyms of each of them. Semantic synonyms were constructed using domain knowledge and catalogues (GeneCards, HGNC, HUGO, UniProt, Genetics Home Reference from the NIH). The text of the abstracts was automatically scanned for occurrences of these components and their synonyms.
Supporting Table S7_Summary matrix for the complete set of components: we present the occurrence of each of the 1218 components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 25971 abstracts. Rows represent abstracts and columns represent components. A cell contains 1 if the component, under any of its synonyms, appears in the abstract, and 0 otherwise.
Supporting Table S8_Subset summary matrix for the complete set of components and the whole time-period: we present the occurrence of the filtered set of 214 components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 25971 abstracts. This filtered set includes only those components with an accumulated frequency ≥ 10 across the abstracts of the corpus. Rows represent abstracts and columns represent components. A cell contains 1 if the component, under any of its synonyms, appears in the abstract, and 0 otherwise.
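The scanning and filtering steps behind Supporting Tables S7 and S8 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' MATLAB code (Supporting Material 2): the toy `targets` dictionary and `abstracts` list are hypothetical, and case-insensitive whole-word matching is an assumed matching rule that the description does not specify.

```python
import re

# Hypothetical target matrix: component -> semantic synonyms (cf. Tables S3-S6).
targets = {
    "TNF": ["TNF", "tumor necrosis factor", "TNF-alpha"],
    "NOD2": ["NOD2", "CARD15"],
    "macrophage": ["macrophage", "macrophages"],
}

# Hypothetical abstracts standing in for the "AbstractNote" column.
abstracts = [
    "Tumor necrosis factor drives macrophage activation in Crohn's disease.",
    "CARD15 variants and TNF-alpha levels in ulcerative colitis.",
    "Dietary fibre intake in a patient cohort.",
]


def compile_patterns(targets):
    """Build one case-insensitive, whole-word regex per component (assumed rule)."""
    patterns = {}
    for component, synonyms in targets.items():
        # Try longer synonyms first so "TNF-alpha" is preferred over "TNF".
        ordered = sorted(synonyms, key=len, reverse=True)
        alternatives = "|".join(re.escape(s) for s in ordered)
        patterns[component] = re.compile(
            r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE
        )
    return patterns


def occurrence_matrix(abstracts, targets):
    """Rows are abstracts, columns are components; 1 if any synonym appears."""
    patterns = compile_patterns(targets)
    components = list(targets)
    matrix = [
        [1 if patterns[c].search(text) else 0 for c in components]
        for text in abstracts
    ]
    return components, matrix


def filter_by_frequency(components, matrix, min_count=10):
    """Keep only components whose accumulated frequency is >= min_count."""
    totals = [sum(row[j] for row in matrix) for j in range(len(components))]
    keep = [j for j, total in enumerate(totals) if total >= min_count]
    kept_names = [components[j] for j in keep]
    kept_matrix = [[row[j] for j in keep] for row in matrix]
    return kept_names, kept_matrix


components, matrix = occurrence_matrix(abstracts, targets)
# min_count=2 only because the toy corpus is tiny; the dataset uses 10.
kept_names, kept_matrix = filter_by_frequency(components, matrix, min_count=2)
```

With the toy data, `matrix` plays the role of Table S7 and `kept_matrix` the role of Table S8.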
Supporting Table S9_Subset summary matrix for the complete set of components and the first decade: we present the occurrence of the filtered set of 214 components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 4327 abstracts published in the period 1990-1999. Rows represent abstracts and columns represent components. A cell contains 1 if the component, under any of its synonyms, appears in the abstract, and 0 otherwise.
Supporting Table S10_Subset summary matrix for the complete set of components and the second decade: we present the occurrence of the filtered set of 214 components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 7567 abstracts published in the period 2000-2009. Rows represent abstracts and columns represent components. A cell contains 1 if the component, under any of its synonyms, appears in the abstract, and 0 otherwise.
Supporting Table S11_Subset summary matrix for the complete set of components and the third decade: we present the occurrence of the filtered set of 214 components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 14077 abstracts published in the period 2010-2020. Rows represent abstracts and columns represent components. A cell contains 1 if the component, under any of its synonyms, appears in the abstract, and 0 otherwise.
Supporting Table S12_Matrix of co-occurrence for the complete set of components and the whole time-period: we present the accumulated frequency of co-occurrence for each pair of the 214 components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 25971 abstracts. 
Rows and columns both represent the 214 components. Each cell contains the frequency of co-occurrence of the corresponding pair of components.
Supporting Table S13_Matrix of co-occurrence for the complete set of components and the first decade: we present the accumulated frequency of co-occurrence for each pair of components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 4327 abstracts published in the period 1990-1999. Rows and columns both represent the 214 components. Each cell contains the frequency of co-occurrence of the corresponding pair of components.
Supporting Table S14_Matrix of co-occurrence for the complete set of components and the second decade: we present the accumulated frequency of co-occurrence for each pair of components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 7567 abstracts published in the period 2000-2009. Rows and columns both represent the 214 components. Each cell contains the frequency of co-occurrence of the corresponding pair of components.
Supporting Table S15_Matrix of co-occurrence for the complete set of components and the third decade: we present the accumulated frequency of co-occurrence for each pair of components (immune-endocrine cells, molecules, genes and biological processes and environmental factors) across the corpus of 14077 abstracts published in the period 2010-2020. Rows and columns both represent the 214 components. Each cell contains the frequency of co-occurrence of the corresponding pair of components.
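The relation between the binary summary matrices (Supporting Tables S8-S11) and the co-occurrence matrices (Supporting Tables S12-S15) can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB code (Supporting Material 3): the toy `occurrence` rows and `years` are hypothetical, and leaving each component's own frequency on the diagonal is an assumption, since the description does not say how the diagonal is handled.

```python
# Publication periods as given in the table descriptions.
PERIODS = {
    "1990-1999": (1990, 1999),
    "2000-2009": (2000, 2009),
    "2010-2020": (2010, 2020),
}


def cooccurrence(rows, n_components):
    """Count, for every pair of components, the abstracts mentioning both.

    Equivalent to X^T X for a binary abstracts-by-components matrix X.
    The diagonal holds each component's own frequency (an assumption here).
    """
    counts = [[0] * n_components for _ in range(n_components)]
    for row in rows:
        present = [j for j, flag in enumerate(row) if flag]
        for a in present:
            for b in present:
                counts[a][b] += 1
    return counts


def cooccurrence_by_period(rows, years, n_components):
    """Split abstracts by publication year and count co-occurrences per period."""
    result = {}
    for label, (start, end) in PERIODS.items():
        subset = [row for row, year in zip(rows, years) if start <= year <= end]
        result[label] = cooccurrence(subset, n_components)
    return result


# Hypothetical binary summary matrix: 4 abstracts x 3 components.
occurrence = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
]
# Hypothetical values standing in for the "PublicationYear" column.
years = [1995, 2004, 2015, 2020]

whole_period = cooccurrence(occurrence, 3)
by_period = cooccurrence_by_period(occurrence, years, 3)
```

Because every abstract falls in exactly one period, the three per-period matrices sum element-wise to the whole-period matrix, which mirrors how the abstract counts 4327, 7567 and 14077 add up to 25971.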
Supporting Table S16_Class definition within cells’ functional level, Supporting Table S17_Class definition within molecules’ functional level, Supporting Table S18_Class definition within genes’ functional level, Supporting Table S19_Class definition within biological processes and modulating factors’ functional level: to give a better understanding of the co-occurrences between components, and for visualization purposes, the components within each functional level (immune-endocrine cells, molecules, genes, and biological processes and modulating factors) were grouped into classes. The first column indicates the functional level, the second column the corresponding class, and the third column the component.
Supporting Table S20_Journals: we present the complete set of 2043 journals corresponding to the abstracts of our curated corpus.

Supporting Table S21_Individual accumulated frequency: we present the individual frequency of occurrence of each of the 1218 components we searched for in the abstracts. Rows represent components. The first column indicates the functional level (immune-endocrine cells, molecules, genes, biological processes and modulating factors), the second column the corresponding class (according to Supporting Tables S16-S19), the third column the component, and the fourth column the frequency value.

MATLAB™ custom code
Supporting Material 2: Text preprocessing, curation and automatic scanning of abstracts.
Supporting Material 3: Matrix operations and graph measures.

Associated publication: Network analysis of inflammatory bowel disease research: towards the interactome
Provider:
figshare
Created:
2022-05-09