
introvoyz041/pgc-anxiety

Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/introvoyz041/pgc-anxiety
---
license: cc-by-4.0
task_categories:
- tabular-regression
- tabular-classification
tags:
- gwas
- summary-statistics
- psychiatric-genomics
- pgc
- anx
- mental-health
- genetics
- genomics
- biology
- health
- bioinformatics
pretty_name: PGC Anxiety Disorder GWAS Summary Statistics
size_categories:
- 1M-10M
configs:
- config_name: anx2016
  default: true
  data_files:
  - split: train
    path: data/anx2016/*.parquet
- config_name: anx2026
  data_files:
  - split: train
    path: data/anx2026/*.parquet
- config_name: anx2026_GADsymptsQuant
  data_files:
  - split: train
    path: data/anx2026_GADsymptsQuant/*.parquet
- config_name: panic2019
  data_files:
  - split: train
    path: data/panic2019/*.parquet
language:
- en
source_datasets:
- pgc
---

# PGC Anxiety Disorder: GWAS Summary Statistics

[![License: CC BY 4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey)](https://creativecommons.org/licenses/by/4.0/)

## Dataset Description

Genome-wide association study (GWAS) summary statistics for **Anxiety Disorder** phenotypes from the [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/). Each publication is available as a separate subset (config) and can be loaded independently.

## Usage

```python
from datasets import load_dataset

# Load a specific GWAS
ds = load_dataset("OpenMed/pgc-anxiety", "anx2016")
print(ds)
```

### List all available subsets

```python
from datasets import get_dataset_config_names

print(get_dataset_config_names("OpenMed/pgc-anxiety"))
```

## Subsets

| Config | Phenotype | Journal | Year | PubMed | Rows |
|--------|-----------|---------|------|--------|------|
| `anx2016` | Anxiety Disorders & Factors | Molecular Psychiatry | 2016 | [26754954](https://pubmed.ncbi.nlm.nih.gov/26754954/) | - |
| `anx2026` | Anxiety Disorders (Case-Control) | Nature Genetics | 2026 | [39006447](https://pubmed.ncbi.nlm.nih.gov/39006447/) | - |
| `anx2026_GADsymptsQuant` | GAD Symptoms (Quantitative) | Nature Human Behaviour | 2026 | Pending | - |
| `panic2019` | Panic Disorder | Molecular Psychiatry | 2019 | [31712720](https://pubmed.ncbi.nlm.nih.gov/31712720/) | - |

## Data Format

All data is stored as **Apache Parquet** shards (10,000 rows each). Common columns:

| Column | Description |
|--------|-------------|
| `SNP` / `ID` | SNP rsID or variant identifier |
| `CHR` | Chromosome |
| `BP` / `POS` | Base-pair position (typically GRCh37/hg19) |
| `A1` | Effect allele |
| `A2` | Non-effect allele |
| `OR` / `BETA` | Odds ratio or effect size |
| `SE` | Standard error |
| `P` | P-value |
| `_source_file` | Original source filename |

> Column names vary between publications. Check each subset's schema.

## Citation

Please cite the original publication (see PubMed links above) and acknowledge the PGC:

> Data were obtained from the Psychiatric Genomics Consortium: https://pgc.unc.edu/

## Terms of Use

Released under **[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)**.

- Cite the original publication(s)
- Do not attempt to re-identify individual participants
- Comply with the PGC [data use policies](https://pgc.unc.edu/for-researchers/data-access/)

## Source

- [Psychiatric Genomics Consortium (PGC)](https://pgc.unc.edu/)
- [PGC Downloads](https://pgc.unc.edu/for-researchers/download-results/)

---

*Last updated: April 2026*
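
## Example: reconciling column names across subsets

The Data Format section notes that column names vary between publications (`SNP` or `ID`, `BP` or `POS`, `OR` or `BETA`). The following is a minimal sketch of one way to reconcile them before combining subsets. The helper names `rename_map` and `effect_column`, the choice of `SNP` and `BP` as the canonical spellings, and the two example schemas are all assumptions made for illustration; they are not part of the dataset or of any library.

```python
from typing import Dict, Iterable

# Alternative spellings taken from the "Common columns" table. Treating the
# first spelling of each pair as canonical is a choice made for this sketch.
ALIASES = {"SNP": ("SNP", "ID"), "BP": ("BP", "POS")}


def rename_map(columns: Iterable[str]) -> Dict[str, str]:
    """Map alias column names present in `columns` to their canonical name."""
    present = set(columns)
    mapping = {}
    for canonical, spellings in ALIASES.items():
        if canonical in present:
            continue  # already canonical, nothing to rename
        for name in spellings:
            if name in present:
                mapping[name] = canonical
                break
    return mapping


def effect_column(columns: Iterable[str]) -> str:
    """Report whether a subset stores its effects under BETA or OR."""
    present = set(columns)
    for name in ("BETA", "OR"):
        if name in present:
            return name
    raise KeyError("neither OR nor BETA found; inspect the subset's schema")


# Hypothetical schemas for illustration only, not the real subset schemas.
schema_a = ["SNP", "CHR", "BP", "A1", "A2", "OR", "SE", "P", "_source_file"]
schema_b = ["ID", "CHR", "POS", "A1", "A2", "BETA", "SE", "P", "_source_file"]

print(rename_map(schema_a))     # {}
print(rename_map(schema_b))     # {'ID': 'SNP', 'POS': 'BP'}
print(effect_column(schema_b))  # BETA
```

With a loaded subset, the mapping could be applied along the lines of `ds["train"].rename_columns(rename_map(ds["train"].column_names))`. The mapping only contains names that actually change, so an already-canonical schema yields an empty mapping. Whether a given subset reports `OR` or `BETA` still has to be checked per subset, since the two are on different scales.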

Provider:
introvoyz041