introvoyz041/pgc-cross-disorder

Hugging Face | Updated 2026-04-09 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/introvoyz041/pgc-cross-disorder
Resource description:
---
license: cc-by-4.0
task_categories:
- tabular-regression
- tabular-classification
tags:
- gwas
- summary-statistics
- psychiatric-genomics
- pgc
- cdg
- mental-health
- genetics
- genomics
- biology
- health
- bioinformatics
pretty_name: PGC Cross-Disorder GWAS Summary Statistics
size_categories:
- 1M-10M
configs:
- config_name: cdg2013
  default: true
  data_files:
  - split: train
    path: data/cdg2013/*.parquet
- config_name: cdg2018-bip-scz
  data_files:
  - split: train
    path: data/cdg2018-bip-scz/*.parquet
- config_name: cdg2019
  data_files:
  - split: train
    path: data/cdg2019/*.parquet
- config_name: cdg2020-bip-mdd
  data_files:
  - split: train
    path: data/cdg2020-bip-mdd/*.parquet
- config_name: cdg2025
  data_files:
  - split: train
    path: data/cdg2025/*.parquet
language:
- en
source_datasets:
- pgc
---

# PGC Cross-Disorder: GWAS Summary Statistics

[![License: CC BY 4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey)](https://creativecommons.org/licenses/by/4.0/)

## Dataset Description

Genome-wide association study (GWAS) summary statistics for **Cross-Disorder** phenotypes from the [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/).

Each publication is available as a separate subset (config) and can be loaded independently.

## Usage

```python
from datasets import load_dataset

# Load a specific GWAS
ds = load_dataset("OpenMed/pgc-cross-disorder", "cdg2013")
print(ds)
```

### List all available subsets

```python
from datasets import get_dataset_config_names

print(get_dataset_config_names("OpenMed/pgc-cross-disorder"))
```

## Subsets

| Config | Phenotype | Journal | Year | PubMed | Rows |
|--------|-----------|---------|------|--------|------|
| `cdg2013` | Multiple Psychiatric Disorders | The Lancet | 2013 | [23453885](https://pubmed.ncbi.nlm.nih.gov/23453885/) | - |
| `cdg2018-bip-scz` | Bipolar Disorder & Schizophrenia | Cell | 2018 | [29906448](https://pubmed.ncbi.nlm.nih.gov/29906448/) | - |
| `cdg2019` | Multiple Psychiatric Disorders | Cell | 2019 | [31835028](https://pubmed.ncbi.nlm.nih.gov/31835028/) | - |
| `cdg2020-bip-mdd` | Bipolar Disorder & Major Depression | Biological Psychiatry | 2020 | [31926635](https://pubmed.ncbi.nlm.nih.gov/31926635/) | - |
| `cdg2025` | Multiple Psychiatric Disorders | Nature | 2025 | Pending | - |

## Data Format

All data is stored as **Apache Parquet** shards (10,000 rows each). Common columns:

| Column | Description |
|--------|-------------|
| `SNP` / `ID` | SNP rsID or variant identifier |
| `CHR` | Chromosome |
| `BP` / `POS` | Base-pair position (typically GRCh37/hg19) |
| `A1` | Effect allele |
| `A2` | Non-effect allele |
| `OR` / `BETA` | Odds ratio or effect size |
| `SE` | Standard error |
| `P` | P-value |
| `_source_file` | Original source filename |

> Column names vary between publications. Check each subset's schema.

## Citation

Please cite the original publication (see PubMed links above) and acknowledge the PGC:

> Data were obtained from the Psychiatric Genomics Consortium: https://pgc.unc.edu/

## Terms of Use

Released under **[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)**.

- Cite the original publication(s)
- Do not attempt to re-identify individual participants
- Comply with the PGC [data use policies](https://pgc.unc.edu/for-researchers/data-access/)

## Source

- [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/)
- [PGC Downloads](https://pgc.unc.edu/for-researchers/download-results/)

---

*Last updated: April 2026*
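
Because the Data Format section notes that column names vary between publications (`SNP` / `ID`, `BP` / `POS`, `OR` / `BETA`), it may help to harmonize a subset before combining it with others. The following is a minimal sketch, not part of the dataset itself: the helper names `build_rename_map` and `log_effect` are hypothetical, the choice of `SNP` and `BP` as canonical names is an assumption, and it assumes that `BETA`, where a subset provides it, is already on the log-odds scale.

```python
import math

# Canonical name -> spellings listed in the card's column table.
# Treating SNP and BP as the canonical names is an assumption made
# for this example, not a PGC convention.
ALIASES = {
    "SNP": ("SNP", "ID"),
    "BP": ("BP", "POS"),
}


def build_rename_map(columns):
    """Return {existing_name: canonical_name} for columns that need renaming."""
    present = set(columns)
    mapping = {}
    for canonical, names in ALIASES.items():
        if canonical in present:
            continue  # already canonical; leave any alias column untouched
        for name in names:
            if name in present:
                mapping[name] = canonical
                break
    return mapping


def log_effect(row):
    """Return the effect on the log-odds scale for one row (a dict).

    Uses BETA when present (assumed to be log-odds already), otherwise
    the natural log of OR. Returns None when neither value is usable.
    """
    beta = row.get("BETA")
    if beta is not None:
        return float(beta)
    odds_ratio = row.get("OR")
    if odds_ratio is None or float(odds_ratio) <= 0:
        return None
    return math.log(float(odds_ratio))
```

With a loaded split, a non-empty mapping from `build_rename_map(split.column_names)` can be passed to `Dataset.rename_columns`, and `log_effect` can be applied row by row. Each subset's schema should still be checked first, since the alias table above only covers the columns listed in this card.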
Provider:
introvoyz041