introvoyz041/pgc-bipolar
Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/introvoyz041/pgc-bipolar
Resource description:
---
license: cc-by-4.0
task_categories:
- tabular-regression
- tabular-classification
tags:
- gwas
- summary-statistics
- psychiatric-genomics
- pgc
- bip
- mental-health
- genetics
- genomics
- biology
- health
- bioinformatics
pretty_name: PGC Bipolar Disorder GWAS Summary Statistics
size_categories:
- 1M-10M
configs:
- config_name: bip2011
  default: true
  data_files:
  - split: train
    path: data/bip2011/*.parquet
- config_name: bip2019
  data_files:
  - split: train
    path: data/bip2019/*.parquet
- config_name: bip2021
  data_files:
  - split: train
    path: data/bip2021/*.parquet
- config_name: bip2021_noUKBB
  data_files:
  - split: train
    path: data/bip2021_noUKBB/*.parquet
- config_name: bip2024
  data_files:
  - split: train
    path: data/bip2024/*.parquet
language:
- en
source_datasets:
- pgc
---
# PGC Bipolar Disorder — GWAS Summary Statistics
[License: CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
## Dataset Description
Genome-wide association study (GWAS) summary statistics for **Bipolar Disorder** phenotypes from the [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/).
Each publication is available as a separate subset (config) and can be loaded independently.
## Usage
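```python
from datasets import load_dataset

# Load a specific GWAS subset (config) by name.
# Note: this card is hosted at introvoyz041/pgc-bipolar; use that repo ID
# if OpenMed/pgc-bipolar is not the copy you want.
ds = load_dataset("OpenMed/pgc-bipolar", "bip2011")
print(ds)
```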
### List all available subsets
```python
from datasets import get_dataset_config_names
print(get_dataset_config_names("OpenMed/pgc-bipolar"))
```
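The card's metadata maps each config to a `data/<config>/*.parquet` glob. As a minimal sketch, the helper below builds that path for a given config so the shards can be addressed directly by tools that understand `hf://` paths. The function name `parquet_glob` and the default repo ID `introvoyz041/pgc-bipolar` are assumptions for illustration; substitute whichever repository you actually load from.

```python
# Config names as declared in this card's metadata.
CONFIGS = ["bip2011", "bip2019", "bip2021", "bip2021_noUKBB", "bip2024"]


def parquet_glob(config, repo_id="introvoyz041/pgc-bipolar"):
    """Build the hf:// glob for one config's Parquet shards.

    `repo_id` is an assumed default, not something the card guarantees.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    return f"hf://datasets/{repo_id}/data/{config}/*.parquet"


for name in CONFIGS:
    print(name, parquet_glob(name))
```

Validating the config name up front turns a typo into an immediate error instead of an empty glob match later.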
## Subsets
| Config | Phenotype | Journal | Year | PubMed | Rows |
|--------|-----------|---------|------|--------|------|
| `bip2011` | Bipolar Disorder | Nature Genetics | 2011 | [21926972](https://pubmed.ncbi.nlm.nih.gov/21926972/) | — |
| `bip2019` | Bipolar Disorder | Nature Genetics | 2019 | [31043756](https://pubmed.ncbi.nlm.nih.gov/31043756/) | — |
| `bip2021` | Bipolar Disorder | Nature Genetics | 2021 | [34002096](https://pubmed.ncbi.nlm.nih.gov/34002096/) | — |
| `bip2021_noUKBB` | Bipolar Disorder (no UK Biobank) | Nature Genetics | 2021 | [34002096](https://pubmed.ncbi.nlm.nih.gov/34002096/) | — |
| `bip2024` | Bipolar Disorder | Nature | 2024 | [39843750](https://pubmed.ncbi.nlm.nih.gov/39843750/) | — |
## Data Format
All data is stored as **Apache Parquet** shards (10,000 rows each). Common columns:
| Column | Description |
|--------|-------------|
| `SNP` / `ID` | SNP rsID or variant identifier |
| `CHR` | Chromosome |
| `BP` / `POS` | Base-pair position (typically GRCh37/hg19) |
| `A1` | Effect allele |
| `A2` | Non-effect allele |
| `OR` / `BETA` | Odds ratio or effect size |
| `SE` | Standard error |
| `P` | P-value |
| `_source_file` | Original source filename |
> Column names vary between publications. Check each subset's schema.
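Because column names differ across subsets, it can help to map each row onto one set of canonical keys before comparing publications. The sketch below does this for the aliases listed in the table above, converts an odds ratio to a log-odds effect with the natural log, and recomputes a two-sided p-value from `BETA / SE` as a sanity check. The alias table, the function names, and the example rows are all hypothetical, and the code assumes `SE` is reported on the log-odds scale; verify both against each subset's actual schema.

```python
import math

# Assumed alias table, based only on the column pairs listed above;
# a given subset may use other names, so check its schema first.
ALIASES = {
    "SNP": ("SNP", "ID"),
    "CHR": ("CHR",),
    "BP": ("BP", "POS"),
    "A1": ("A1",),
    "A2": ("A2",),
    "SE": ("SE",),
    "P": ("P",),
}


def harmonize(row):
    """Return a dict with canonical keys plus BETA on the log-odds scale."""
    out = {}
    for canonical, names in ALIASES.items():
        for name in names:
            if row.get(name) is not None:
                out[canonical] = row[name]
                break
    if row.get("BETA") is not None:
        out["BETA"] = float(row["BETA"])
    elif row.get("OR") is not None:
        # An odds ratio maps to a log-odds effect via the natural log.
        out["BETA"] = math.log(float(row["OR"]))
    return out


def two_sided_p(beta, se):
    """Two-sided p-value of the Wald statistic beta / se under a normal null."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical rows for illustration; not taken from the dataset.
row_a = {"SNP": "rs0000001", "CHR": 1, "BP": 12345,
         "A1": "A", "A2": "G", "OR": 1.10, "SE": 0.02}
row_b = {"ID": "rs0000002", "CHR": 2, "POS": 67890,
         "A1": "T", "A2": "C", "BETA": -0.05, "SE": 0.01}

for row in (row_a, row_b):
    h = harmonize(row)
    print(h["SNP"], h["BP"], round(h["BETA"], 4), two_sided_p(h["BETA"], h["SE"]))
```

If the recomputed p-value disagrees strongly with the reported `P` column, the effect or standard error is probably on a different scale than assumed here.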
## Citation
Please cite the original publication (see PubMed links above) and acknowledge the PGC:
> Data were obtained from the Psychiatric Genomics Consortium — https://pgc.unc.edu/
## Terms of Use
Released under **[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)**.
- Cite the original publication(s)
- Do not attempt to re-identify individual participants
- Comply with the PGC [data use policies](https://pgc.unc.edu/for-researchers/data-access/)
## Source
- [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/)
- [PGC Downloads](https://pgc.unc.edu/for-researchers/download-results/)
---
*Last updated: April 2026*
Provider: introvoyz041



