
Genetic Associations

Databricks, indexed 2024-05-09
Download link:
https://marketplace.databricks.com/details/2b3883bc-f56c-4551-b2dc-f0f35168d775/John-Snow-Labs_Genetic-Associations
Resource description:
**Overview**

This data package contains information on genetic associations, including biochemical protein-protein interactions, genetic variation, gene-chemical interactions and a protein kinase interactome.

**Description**

This data package contains datasets on protein-protein interactions, protein post-translational modifications, and gene-chemical interactions with "target" interaction types, as well as associated annotation data obtained from the Biological General Repository for Interaction Datasets (BioGRID) for major model organism species; genetic association studies in the field of epilepsy from the Epilepsy Genetic Association Database (epiGAD) of the International League Against Epilepsy; and interactions for the human protein Mitogen-Activated Protein (MAP) kinase published in A Human MAP Kinase Interactome.

**Benefits**

- This data package can be useful for further gene research and genetic studies.

**License Information**

The use of John Snow Labs datasets is free for personal and research purposes. For commercial use, please subscribe to the [Data Library](https://www.johnsnowlabs.com/marketplace/) on the John Snow Labs website. The subscription allows you to use all John Snow Labs datasets and data packages for commercial purposes.

**Included Datasets**

- [Biochemical Protein Protein Interactions](https://www.johnsnowlabs.com/marketplace/biochemical-protein-protein-interactions) - This dataset includes all protein-protein interactions as well as associated annotation data obtained from the Biological General Repository for Interaction Datasets (BioGRID) for major model organism species, including the experimental systems used to detect each interaction. The data is curated from thousands of publications of research experiments that found a link (interaction) between two proteins.
- [Epilepsy Genetics Meta Analysis Publications](https://www.johnsnowlabs.com/marketplace/epilepsy-genetics-meta-analysis-publications) - The Epilepsy Genetic Association Database (epiGAD) of the International League Against Epilepsy is an online repository of data relating to genetic association studies in the field of epilepsy. It collects results from published and unpublished research in epilepsy genetics, providing data for meta-analyses and other scientific purposes.
- [Epilepsy Pharmacogenetics Published and Unpublished Research](https://www.johnsnowlabs.com/marketplace/epilepsy-pharmacogenetics-published-and-unpublished-research) - The Epilepsy Genetic Association Database (epiGAD) of the International League Against Epilepsy is an online repository of data relating to genetic association studies in the field of epilepsy. It collects results from published and unpublished research in epilepsy genetics, providing data for meta-analyses and other scientific purposes.
- [Epilepsy Susceptibility Genes Published and Unpublished Research](https://www.johnsnowlabs.com/marketplace/epilepsy-susceptibility-genes-published-and-unpublished-research) - The Epilepsy Genetic Association Database (epiGAD) of the International League Against Epilepsy is an online repository of data relating to genetic association studies in the field of epilepsy. It collects results from published and unpublished research in epilepsy genetics, providing data for meta-analyses and other scientific purposes.
- [Gene Alteration Clinical Condition And Intervention](https://www.johnsnowlabs.com/marketplace/gene-alteration-clinical-condition-and-intervention) - The purpose of the Clinical Genomic Database (CGD) is to correlate genetic data with clinical settings, matching diseases with their genetic cause. The database includes all diseases with a known genetic cause of a single gene alteration. The dataset includes descriptions for the gene and inheritance, as well as for the manifestations and interventions to consider. The dataset does not contain contiguous gene syndromes or somatic alterations unless these result from a germline change in the same gene.
- [Gene Chemical Interactions](https://www.johnsnowlabs.com/marketplace/gene-chemical-interactions) - This dataset includes all gene-chemical interactions with "target" interaction types as well as associated annotation data obtained from the Biological General Repository for Interaction Datasets (BioGRID) for major model organism species, noting the official symbol and aliases for every interactor as well as the experimental system used to detect the interaction. The publication reference is given as first author and PubMed ID, and the organisms involved are also specified.
- [Human Mitogen Activated Protein MAP Kinase Interactome](https://www.johnsnowlabs.com/marketplace/human-mitogen-activated-protein-map-kinase-interactome) - This dataset contains the interaction data published in A Human MAP Kinase Interactome by Bandyopadhyay S. et al. (2010), which reported all interactions for the human protein Mitogen-Activated Protein (MAP) kinase, as well as associated annotation data obtained from the Biological General Repository for Interaction Datasets (BioGRID) for humans, including the protein used to find the modification (bait) and annotations on the proteins pulled during the interaction experiment.
- [Human and Animal Cell Lines](https://www.johnsnowlabs.com/marketplace/human-and-animal-cell-lines) - This dataset consists of a collection of cell lines by DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen Institute). This collection currently comprises more than 800 different immortalized cell cultures of primate, rodent, amphibian, fish, and insect origin isolated from numerous tissues and hybridomas.
- [Protein Post Translational Modifications](https://www.johnsnowlabs.com/marketplace/protein-post-translational-modifications) - This dataset includes protein post-translational modifications as well as associated annotation data obtained from the Biological General Repository for Interaction Datasets (BioGRID) for major model organism species, including the type of modification, protein sequence and specific amino acid involved.

**Data Engineering Overview**

**We deliver high-quality data**

- Each dataset goes through 3 levels of quality review
  - 2 manual reviews are done by domain experts
  - Then, an automated set of 60+ validations enforces that every datum matches metadata & defined constraints
- Data is normalized into one unified type system
  - All dates, units, codes, currencies look the same
  - All null values are normalized to the same value
  - All dataset and field names are SQL and Hive compliant
- Data and Metadata
  - Data is available in both CSV and Apache Parquet format, optimized for high read performance on distributed Hadoop, Spark & MPP clusters
  - Metadata is provided in the open Frictionless Data standard, and its every field is normalized & validated
- Data Updates
  - Data updates support replace-on-update: outdated foreign keys are deprecated, not deleted

**Our data is curated and enriched by domain experts**

Each dataset is manually curated by our team of doctors, pharmacists, public health & medical billing experts:

- Field names, descriptions, and normalized values are chosen by people who actually understand their meaning
- Healthcare & life science experts add categories, search keywords, descriptions and more to each dataset
- Both manual and automated data enrichment supported for clinical codes, providers, drugs, and geo-locations
- The data is always kept up to date, even when the source requires manual effort to get updates
- Support for data subscribers is provided directly by the domain experts who curated the data sets
- Every data source's license is manually verified to allow for royalty-free commercial use and redistribution.

**Need Help?**

If you have questions about our products, contact us at [info@johnsnowlabs.com](mailto:info@johnsnowlabs.com).
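The normalization claims in the Data Engineering Overview (one null value, SQL and Hive compliant field names, CSV delivery) can be spot-checked on the consumer side. The following is a minimal sketch under stated assumptions, not John Snow Labs' validation logic: the identifier rule, the choice of the empty string as the null token, the helper names, and the sample columns and values are all hypothetical and should be replaced with what the package's metadata actually specifies.

```python
import csv
import io
import re

# Assumption: "SQL and Hive compliant" is read here as letters, digits and
# underscores only, starting with a letter. The source does not define the rule.
IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

# Assumption: the empty string stands in for the single normalized null value.
NULL_TOKEN = ""


def non_compliant_fields(field_names):
    """Return the field names that do not match the assumed identifier rule."""
    return [name for name in field_names if not IDENTIFIER.match(name)]


def read_rows(csv_text):
    """Parse CSV text into dicts, mapping the assumed null token to None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        {key: (None if value == NULL_TOKEN else value) for key, value in row.items()}
        for row in reader
    ]
    return reader.fieldnames, rows


# Hypothetical two-row sample; column names and values are invented.
SAMPLE = (
    "Official_Symbol_A,Official_Symbol_B,Experimental_System\n"
    "GENE1,GENE2,SystemX\n"
    "GENE3,,SystemY\n"
)

fields, rows = read_rows(SAMPLE)
bad_fields = non_compliant_fields(fields)
```

Running the same two helpers over a real CSV export from the package would show quickly whether the assumed identifier rule and null token match the delivered files.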
Provider:
John Snow Labs