introvoyz041/pgc-ocd-tourette
Source: Hugging Face · Updated: 2026-04-09 · Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/introvoyz041/pgc-ocd-tourette
Description:
---
license: cc-by-4.0
task_categories:
- tabular-regression
- tabular-classification
tags:
- gwas
- summary-statistics
- psychiatric-genomics
- pgc
- ocd-ts
- mental-health
- genetics
- genomics
- biology
- health
- bioinformatics
pretty_name: PGC OCD & Tourette Syndrome GWAS Summary Statistics
size_categories:
- 1M-10M
configs:
- config_name: hoarding2022
  data_files:
  - split: train
    path: data/hoarding2022/*.parquet
  default: true
- config_name: ocd2018
  data_files:
  - split: train
    path: data/ocd2018/*.parquet
- config_name: ocd2025
  data_files:
  - split: train
    path: data/ocd2025/*.parquet
- config_name: ocs2024
  data_files:
  - split: train
    path: data/ocs2024/*.parquet
- config_name: ts2019
  data_files:
  - split: train
    path: data/ts2019/*.parquet
language:
- en
source_datasets:
- pgc
---
# PGC OCD & Tourette Syndrome — GWAS Summary Statistics
[License: CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
## Dataset Description
Genome-wide association study (GWAS) summary statistics for **OCD & Tourette Syndrome** phenotypes from the [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/).
The summary statistics from each GWAS publication are provided as a separate subset (config), and each subset can be loaded independently.
## Usage
```python
from datasets import load_dataset
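# Load a specific GWAS (e.g., hoarding2022).
# Note: the examples on this card use the "OpenMed/pgc-ocd-tourette" repo ID,
# while this copy is listed as "introvoyz041/pgc-ocd-tourette".
ds = load_dataset("OpenMed/pgc-ocd-tourette", "hoarding2022")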
print(ds)
```
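
Rows of a loaded subset iterate as plain dictionaries, so simple filters need nothing beyond the standard library. The sketch below keeps variants under the conventional genome-wide significance threshold of 5e-8. It assumes the subset exposes a `P` column (column names vary between publications); `significant_rows`, `GENOME_WIDE_P`, and the toy rows are illustrative and not part of the dataset or the `datasets` library.

```python
from typing import Any, Dict, Iterable, Iterator

# Conventional genome-wide significance threshold for GWAS.
GENOME_WIDE_P = 5e-8


def significant_rows(
    rows: Iterable[Dict[str, Any]],
    p_column: str = "P",  # assumption: the p-value column is named "P"
    threshold: float = GENOME_WIDE_P,
) -> Iterator[Dict[str, Any]]:
    """Yield rows whose p-value is below `threshold`.

    Rows with a missing or non-numeric p-value are skipped.
    """
    for row in rows:
        try:
            p_value = float(row[p_column])
        except (KeyError, TypeError, ValueError):
            continue
        if p_value < threshold:
            yield row


# Toy rows standing in for ds["train"]; identifiers and values are made up.
toy_rows = [
    {"SNP": "rs_demo_1", "P": 3.2e-9},
    {"SNP": "rs_demo_2", "P": "0.04"},
    {"SNP": "rs_demo_3", "P": None},
]
hits = list(significant_rows(toy_rows))
print([row["SNP"] for row in hits])
```

With the real data, pass `ds["train"]` in place of `toy_rows`, and set `p_column` to whatever name the chosen subset uses.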
### Available Configs
```python
from datasets import get_dataset_config_names
configs = get_dataset_config_names("OpenMed/pgc-ocd-tourette")
print(configs)
```
## Subsets (Publications)
| Config | Phenotype | Journal | Year | PubMed | Rows | License |
|--------|-----------|---------|------|--------|------|---------|
| `hoarding2022` | Hoarding Symptoms | Translational Psychiatry | 2022 | [36379924](https://pubmed.ncbi.nlm.nih.gov/36379924/) | — | CC BY 4.0 |
| `ocd2018` | OCD | Molecular Psychiatry | 2018 | [28761083](https://pubmed.ncbi.nlm.nih.gov/28761083/) | 8,409,516 | CC BY 4.0 |
| `ocd2025` | OCD | Nature Genetics | 2025 | [40360802](https://pubmed.ncbi.nlm.nih.gov/40360802/) | 13,556,976 | CC BY 4.0 |
| `ocs2024` | Obsessive-Compulsive Symptoms | Molecular Psychiatry | 2024 | [38548983](https://pubmed.ncbi.nlm.nih.gov/38548983/) | 6,232,765 | CC BY 4.0 |
| `ts2019` | Tourette Syndrome | American Journal of Psychiatry | 2019 | [30818990](https://pubmed.ncbi.nlm.nih.gov/30818990/) | — | CC BY 4.0 |
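
The Data Format section states that the data are split into shards of 10,000 rows. Under the assumption that every shard except the last is full, the row counts in the table above give a rough estimate of how many Parquet files each subset contains. This is a back-of-the-envelope sketch only; `estimated_shards` is a hypothetical helper, and the actual file counts in the repository have not been checked.

```python
import math

ROWS_PER_SHARD = 10_000  # shard size stated in the Data Format section

# Row counts copied from the table above; subsets listed without a count are omitted.
ROW_COUNTS = {
    "ocd2018": 8_409_516,
    "ocd2025": 13_556_976,
    "ocs2024": 6_232_765,
}


def estimated_shards(n_rows: int, rows_per_shard: int = ROWS_PER_SHARD) -> int:
    """Return the shard count, assuming all shards but the last are full."""
    return math.ceil(n_rows / rows_per_shard)


for config, n_rows in ROW_COUNTS.items():
    print(config, estimated_shards(n_rows))
# ocd2018 841
# ocd2025 1356
# ocs2024 624
```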
## Data Format
All data have been converted to **Apache Parquet** and split into shards of 10,000 rows each. Common columns include:
| Column | Description |
|--------|-------------|
| `SNP` / `ID` | SNP rsID or variant identifier |
| `CHR` | Chromosome |
| `BP` / `POS` | Base-pair position (typically GRCh37/hg19) |
| `A1` / `ALT` | Effect allele |
| `A2` / `REF` | Non-effect (reference) allele |
| `OR` / `BETA` | Odds ratio or effect size |
| `SE` | Standard error |
| `P` | P-value |
| `INFO` | Imputation quality score |
| `FRQ` / `MAF` | Allele frequency |
| `_source_file` | Original source filename |
> **Note:** Column names vary between publications. The `_source_file` column tracks the original file each row came from.
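
Because column names differ between subsets, downstream code may want a single schema. The sketch below renames the aliases listed in the table to one canonical name per field and expresses the effect as `beta`, taking the natural log of `OR` when only an odds ratio is present. The canonical names and the `harmonize_row` helper are this example's own choices, not part of the dataset. Whether a given subset's `BETA` is on the log-odds scale depends on the original analysis, so effects should not be compared across subsets without checking the publication.

```python
import math
from typing import Any, Dict, Optional

# Alias -> canonical name, based on the column table above.
# The canonical names are illustrative. FRQ and MAF are merged here for
# convenience even though they may be defined differently in the source files.
COLUMN_ALIASES = {
    "SNP": "variant_id", "ID": "variant_id",
    "CHR": "chromosome",
    "BP": "position", "POS": "position",
    "A1": "effect_allele", "ALT": "effect_allele",
    "A2": "other_allele", "REF": "other_allele",
    "SE": "standard_error",
    "P": "p_value",
    "INFO": "info",
    "FRQ": "allele_frequency", "MAF": "allele_frequency",
}


def harmonize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known aliases and add `beta` (ln(OR) if only OR is given)."""
    out: Dict[str, Any] = {}
    for name, value in row.items():
        if name in ("OR", "BETA"):
            continue  # the effect columns are handled below
        # Unknown columns (e.g. `_source_file`) pass through unchanged.
        out[COLUMN_ALIASES.get(name, name)] = value

    beta: Optional[float] = None
    if row.get("BETA") is not None:
        beta = float(row["BETA"])
    elif row.get("OR") is not None:
        odds_ratio = float(row["OR"])
        if odds_ratio > 0:
            beta = math.log(odds_ratio)  # log odds ratio
    out["beta"] = beta
    return out


# Toy rows in two different layouts; identifiers and values are made up.
row_a = {"SNP": "rs_demo_1", "CHR": 1, "BP": 12345, "A1": "A", "A2": "G",
         "OR": 1.0, "SE": 0.02, "P": 0.5}
row_b = {"ID": "rs_demo_2", "POS": 999, "ALT": "T", "REF": "C",
         "BETA": "-0.1", "_source_file": "demo.tsv"}
print(harmonize_row(row_a))
print(harmonize_row(row_b))
```

Both toy rows come out with the same keys for the variant identifier, position, and alleles, which makes it easier to process several subsets with one code path.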
## Citation
When using any subset, please cite:
1. The **original publication** (see PubMed links above)
2. The **data DOI** from Figshare (see supplementary metadata)
3. **Acknowledge the PGC:**
> "Data were obtained from the Psychiatric Genomics Consortium — https://pgc.unc.edu/"
## Terms of Use
This dataset is released under the **[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)** license.
By using PGC summary statistics you agree to:
1. Cite the original publication(s)
2. Not attempt to re-identify individual participants
3. Comply with the PGC's [data use policies](https://pgc.unc.edu/for-researchers/data-access/)
## Source
- **Consortium:** [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/)
- **PGC Downloads:** [pgc.unc.edu/for-researchers/download-results/](https://pgc.unc.edu/for-researchers/download-results/)
---
*Last updated: April 2026*
Provider: introvoyz041



