
introvoyz041/pgc-ocd-tourette

Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/introvoyz041/pgc-ocd-tourette
Resource description:
---
license: cc-by-4.0
task_categories:
- tabular-regression
- tabular-classification
tags:
- gwas
- summary-statistics
- psychiatric-genomics
- pgc
- ocd-ts
- mental-health
- genetics
- genomics
- biology
- health
- bioinformatics
pretty_name: PGC OCD & Tourette Syndrome GWAS Summary Statistics
size_categories:
- 1M-10M
configs:
- config_name: hoarding2022
  data_files:
  - split: train
    path: data/hoarding2022/*.parquet
  default: true
- config_name: ocd2018
  data_files:
  - split: train
    path: data/ocd2018/*.parquet
- config_name: ocd2025
  data_files:
  - split: train
    path: data/ocd2025/*.parquet
- config_name: ocs2024
  data_files:
  - split: train
    path: data/ocs2024/*.parquet
- config_name: ts2019
  data_files:
  - split: train
    path: data/ts2019/*.parquet
language:
- en
source_datasets:
- pgc
---

# PGC OCD & Tourette Syndrome — GWAS Summary Statistics

[![License: CC BY 4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey)](https://creativecommons.org/licenses/by/4.0/)

## Dataset Description

Genome-wide association study (GWAS) summary statistics for **OCD & Tourette Syndrome** phenotypes from the [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/).

This dataset contains multiple GWAS publications as separate subsets (configs). Each can be loaded independently.

## Usage

```python
from datasets import load_dataset

# Load a specific GWAS (e.g., hoarding2022)
ds = load_dataset("OpenMed/pgc-ocd-tourette", "hoarding2022")
print(ds)
```

### Available Configs

```python
from datasets import get_dataset_config_names

configs = get_dataset_config_names("OpenMed/pgc-ocd-tourette")
print(configs)
```

## Subsets (Publications)

| Config | Phenotype | Journal | Year | PubMed | Rows | License |
|--------|-----------|---------|------|--------|------|---------|
| `hoarding2022` | Hoarding Symptoms | Translational Psychiatry | 2022 | [36379924](https://pubmed.ncbi.nlm.nih.gov/36379924/) | — | CC BY 4.0 |
| `ocd2018` | OCD | Molecular Psychiatry | 2018 | [28761083](https://pubmed.ncbi.nlm.nih.gov/28761083/) | 8,409,516 | CC BY 4.0 |
| `ocd2025` | OCD | Nature Genetics | 2025 | [40360802](https://pubmed.ncbi.nlm.nih.gov/40360802/) | 13,556,976 | CC BY 4.0 |
| `ocs2024` | Obsessive-Compulsive Symptoms | Molecular Psychiatry | 2024 | [38548983](https://pubmed.ncbi.nlm.nih.gov/38548983/) | 6,232,765 | CC BY 4.0 |
| `ts2019` | Tourette Syndrome | American Journal of Psychiatry | 2019 | [30818990](https://pubmed.ncbi.nlm.nih.gov/30818990/) | — | CC BY 4.0 |

## Data Format

All data has been converted to **Apache Parquet** format with shards of 10,000 rows.

Common columns include:

| Column | Description |
|--------|-------------|
| `SNP` / `ID` | SNP rsID or variant identifier |
| `CHR` | Chromosome |
| `BP` / `POS` | Base-pair position (typically GRCh37/hg19) |
| `A1` / `ALT` | Effect allele |
| `A2` / `REF` | Non-effect (reference) allele |
| `OR` / `BETA` | Odds ratio or effect size |
| `SE` | Standard error |
| `P` | P-value |
| `INFO` | Imputation quality score |
| `FRQ` / `MAF` | Allele frequency |
| `_source_file` | Original source filename |

> **Note:** Column names vary between publications. The `_source_file` column tracks the original file each row came from.

## Citation

When using any subset, please cite:

1. The **original publication** (see PubMed links above)
2. The **data DOI** from Figshare (see supplementary metadata)
3. **Acknowledge the PGC:**
   > "Data were obtained from the Psychiatric Genomics Consortium — https://pgc.unc.edu/"

## Terms of Use

This dataset is released under the **[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)** license.

By using PGC summary statistics you agree to:

1. Cite the original publication(s)
2. Not attempt to re-identify individual participants
3. Comply with the PGC's [data use policies](https://pgc.unc.edu/for-researchers/data-access/)

## Source

- **Consortium:** [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/)
- **PGC Downloads:** [pgc.unc.edu/for-researchers/download-results/](https://pgc.unc.edu/for-researchers/download-results/)

---

*Last updated: April 2026*
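
### Example: harmonizing column names (illustrative)

The Data Format section notes that column names vary between publications, so code that combines several configs usually has to map them onto one schema first. The sketch below shows one possible way to do that for a single row held as a plain dict. The canonical names, the `ALIASES` table, and the `harmonize_row` helper are assumptions made for illustration; they are not part of the dataset or of any PGC tooling. It also assumes that an `OR` column holds an odds ratio, so its natural logarithm is used as the effect size. The example row is made up.

```python
import math

# Hypothetical alias table derived from the "Common columns" list.
# The canonical names on the right are an illustrative choice.
ALIASES = {
    "SNP": "variant_id", "ID": "variant_id",
    "CHR": "chrom",
    "BP": "pos", "POS": "pos",
    "A1": "effect_allele", "ALT": "effect_allele",
    "A2": "other_allele", "REF": "other_allele",
    "SE": "se",
    "P": "p",
    "INFO": "info",
    # FRQ and MAF are grouped here only because the card lists them
    # together; check each publication for which allele they refer to.
    "FRQ": "freq", "MAF": "freq",
    "_SOURCE_FILE": "source_file",
}


def harmonize_row(row):
    """Map one summary-statistics row onto the illustrative schema."""
    upper = {name.upper(): value for name, value in row.items()}
    out = {}
    for name, value in upper.items():
        canonical = ALIASES.get(name)
        if canonical is not None:
            out[canonical] = value
    # Put the effect on one scale: keep BETA as-is, or take log(OR).
    if upper.get("BETA") is not None:
        out["beta"] = float(upper["BETA"])
    elif upper.get("OR") is not None:
        out["beta"] = math.log(float(upper["OR"]))
    return out


# Made-up row for demonstration only; not taken from the dataset.
example = {
    "SNP": "rs0000001", "CHR": 1, "BP": 12345,
    "A1": "A", "A2": "G", "OR": 1.0, "SE": 0.02, "P": 0.5,
    "_source_file": "example.tsv",
}
print(harmonize_row(example))
```

Columns that are not in the alias table are dropped, which keeps the output schema fixed; a real pipeline might prefer to keep them or to raise an error on unknown names.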
Provider:
introvoyz041