introvoyz041/pgc-anxiety
Source: Hugging Face (updated 2026-04-09, indexed 2026-04-26)
Download link: https://hf-mirror.com/datasets/introvoyz041/pgc-anxiety
Description:
---
license: cc-by-4.0
task_categories:
- tabular-regression
- tabular-classification
tags:
- gwas
- summary-statistics
- psychiatric-genomics
- pgc
- anx
- mental-health
- genetics
- genomics
- biology
- health
- bioinformatics
pretty_name: PGC Anxiety Disorder GWAS Summary Statistics
size_categories:
- 1M-10M
configs:
- config_name: anx2016
  default: true
  data_files:
  - split: train
    path: data/anx2016/*.parquet
- config_name: anx2026
  data_files:
  - split: train
    path: data/anx2026/*.parquet
- config_name: anx2026_GADsymptsQuant
  data_files:
  - split: train
    path: data/anx2026_GADsymptsQuant/*.parquet
- config_name: panic2019
  data_files:
  - split: train
    path: data/panic2019/*.parquet
language:
- en
source_datasets:
- pgc
---
# PGC Anxiety Disorder — GWAS Summary Statistics
[License: CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
## Dataset Description
Genome-wide association study (GWAS) summary statistics for **Anxiety Disorder** phenotypes from the [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/).
Each publication is available as a separate subset (config) and can be loaded independently.
## Usage
```python
from datasets import load_dataset
# Load a specific GWAS
ds = load_dataset("OpenMed/pgc-anxiety", "anx2016")
print(ds)
```
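It can be useful to look at a few rows of a subset before downloading every shard. The following is a minimal sketch, assuming that the `streaming=True` mode of `load_dataset` works with this repository's Parquet files and that network access is available. `open_stream` and `preview` are hypothetical helper names introduced here for illustration; they are not part of the dataset or of the `datasets` library.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def open_stream(config: str = "anx2016", repo: str = "OpenMed/pgc-anxiety"):
    """Open one subset lazily (assumes network access and the `datasets` package)."""
    # Imported here so that `preview` stays usable without `datasets` installed.
    from datasets import load_dataset

    # streaming=True yields rows on demand instead of downloading all shards first.
    return load_dataset(repo, config, split="train", streaming=True)


def preview(rows: Iterable[Dict[str, Any]], n: int = 5) -> List[Dict[str, Any]]:
    """Return the first `n` rows of any iterable of row dictionaries."""
    return list(islice(rows, n))
```

Calling `preview(open_stream("panic2019"), 3)` should then return the first three rows as dictionaries, whose keys show that subset's actual column names.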
### List all available subsets
```python
from datasets import get_dataset_config_names
print(get_dataset_config_names("OpenMed/pgc-anxiety"))
```
## Subsets
| Config | Phenotype | Journal | Year | PubMed | Rows |
|--------|-----------|---------|------|--------|------|
| `anx2016` | Anxiety Disorders & Factors | Molecular Psychiatry | 2016 | [26754954](https://pubmed.ncbi.nlm.nih.gov/26754954/) | — |
| `anx2026` | Anxiety Disorders (Case-Control) | Nature Genetics | 2026 | [39006447](https://pubmed.ncbi.nlm.nih.gov/39006447/) | — |
| `anx2026_GADsymptsQuant` | GAD Symptoms (Quantitative) | Nature Human Behaviour | 2026 | Pending | — |
| `panic2019` | Panic Disorder | Molecular Psychiatry | 2019 | [31712720](https://pubmed.ncbi.nlm.nih.gov/31712720/) | — |
## Data Format
All data is stored as **Apache Parquet** shards (10,000 rows each). Common columns:
| Column | Description |
|--------|-------------|
| `SNP` / `ID` | SNP rsID or variant identifier |
| `CHR` | Chromosome |
| `BP` / `POS` | Base-pair position (typically GRCh37/hg19) |
| `A1` | Effect allele |
| `A2` | Non-effect allele |
| `OR` / `BETA` | Odds ratio or effect size |
| `SE` | Standard error |
| `P` | P-value |
| `_source_file` | Original source filename |
> Column names vary between publications. Check each subset's schema.
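Because column names differ between subsets, downstream code may want to map them onto one naming scheme first. The sketch below is one possible approach, not a method prescribed by the PGC: it renames the alternate spellings listed in the table above (`ID`, `POS`) and derives `BETA` as the natural log of `OR` when only an odds ratio is present. The names `ALIASES`, `rename_map`, and `harmonize_row` are hypothetical, and the assumption that `SE` is already on the log-odds scale should be verified for each subset.

```python
import math
from typing import Any, Dict, Iterable

# Alternate spellings from the column table, mapped to one canonical name.
# Extend this mapping after inspecting each subset's real schema.
ALIASES = {"ID": "SNP", "POS": "BP"}


def rename_map(columns: Iterable[str]) -> Dict[str, str]:
    """Map each alias present in `columns` to its canonical name."""
    present = set(columns)
    # Skip an alias when the canonical column already exists, to avoid a clash.
    return {old: new for old, new in ALIASES.items()
            if old in present and new not in present}


def harmonize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Rename aliased columns and add BETA = ln(OR) when only OR is given."""
    mapping = rename_map(row.keys())
    out = {mapping.get(key, key): value for key, value in row.items()}
    if "BETA" not in out and out.get("OR") is not None:
        # Convert the odds ratio to a log-odds effect size. SE is left untouched
        # (assumption: it is already on the log-odds scale; verify per subset).
        out["BETA"] = math.log(float(out["OR"]))
    return out
```

With a loaded `datasets.Dataset` named `ds`, `ds.rename_columns(rename_map(ds.column_names))` applies the same renaming to the whole table, while `harmonize_row` can be applied to row dictionaries obtained by iterating over the dataset.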
## Citation
Please cite the original publication (see PubMed links above) and acknowledge the PGC:
> Data were obtained from the Psychiatric Genomics Consortium — https://pgc.unc.edu/
## Terms of Use
Released under **[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)**.
- Cite the original publication(s)
- Do not attempt to re-identify individual participants
- Comply with the PGC [data use policies](https://pgc.unc.edu/for-researchers/data-access/)
## Source
- [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/)
- [PGC Downloads](https://pgc.unc.edu/for-researchers/download-results/)
---
*Last updated: April 2026*
Provider: introvoyz041



