lighteternal/psychiatric-mr-evidence-atlas
Source: Hugging Face. Updated 2026-04-08; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/lighteternal/psychiatric-mr-evidence-atlas
---
pretty_name: Psychiatric MR Evidence Atlas
license: cc-by-4.0
task_categories:
- tabular-classification
- tabular-regression
language:
- en
tags:
- genomics
- psychiatry
- mendelian-randomization
- cross-disorder
- evidence-atlas
- research
---
# Psychiatric MR Evidence Atlas
## Overview
This repository contains a directed cross-disorder Mendelian randomization atlas built from the harmonized psychiatric GWAS summary-statistics dataset.
It evaluates pairwise directional genetic evidence between 11 psychiatric disorder groups. The atlas is intended as a research resource for structured secondary analysis alongside the harmonized dataset and graph-based prediction models.
This repository should not be interpreted as a clinical causal knowledge base. The results are better understood as a filtered MR evidence screen under explicit assumptions and with material limitations.
## Source data
The atlas was generated from:
- harmonized dataset: [lighteternal/pgc-psychiatric-gwas-harmonized](https://huggingface.co/datasets/lighteternal/pgc-psychiatric-gwas-harmonized)
That harmonized dataset was derived from public OpenMed / PGC Hugging Face repositories, including:
- [OpenMed/pgc-adhd](https://huggingface.co/datasets/OpenMed/pgc-adhd)
- [OpenMed/pgc-anxiety](https://huggingface.co/datasets/OpenMed/pgc-anxiety)
- [OpenMed/pgc-autism](https://huggingface.co/datasets/OpenMed/pgc-autism)
- [OpenMed/pgc-bipolar](https://huggingface.co/datasets/OpenMed/pgc-bipolar)
- [OpenMed/pgc-eating-disorders](https://huggingface.co/datasets/OpenMed/pgc-eating-disorders)
- [OpenMed/pgc-mdd](https://huggingface.co/datasets/OpenMed/pgc-mdd)
- [OpenMed/pgc-ocd-tourette](https://huggingface.co/datasets/OpenMed/pgc-ocd-tourette)
- [OpenMed/pgc-other](https://huggingface.co/datasets/OpenMed/pgc-other)
- [OpenMed/pgc-ptsd](https://huggingface.co/datasets/OpenMed/pgc-ptsd)
- [OpenMed/pgc-schizophrenia](https://huggingface.co/datasets/OpenMed/pgc-schizophrenia)
- [OpenMed/pgc-substance-use](https://huggingface.co/datasets/OpenMed/pgc-substance-use)
`cross_disorder` is excluded from this atlas.
## Disorder panel
The 11 disorders considered in this release are:
- ADHD
- Anxiety
- Autism
- Bipolar disorder
- Borderline personality disorder
- Eating disorders
- Major depressive disorder
- Obsessive-compulsive disorder
- Post-traumatic stress disorder
- Schizophrenia
- Substance use
## Analysis design
For each directed exposure-outcome pair:
1. Select exposure instruments at genome-wide significance (`p < 5e-8`).
2. Greedily clump instruments to one SNP per 1 Mb window.
3. Harmonize exposure and outcome alleles on shared `variant_id`.
4. Filter instruments by approximate F-statistic (`F >= 10`).
5. Apply a PRESSO-style outlier screen.
6. Estimate directional effects using:
   - IVW
   - MR-Egger
   - Weighted median
7. Record Steiger directionality, PRESSO-style outlier statistics, and a Rucker-style method recommendation.
8. Apply Benjamini-Hochberg FDR correction to IVW p-values across the atlas.
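As a rough illustration of steps 1, 2, 4, 6 (IVW only), and 8, the following Python sketch uses only the standard library. It is not the atlas implementation. The function names, the dictionary keys `chrom`, `pos`, `beta`, `se`, and `pval`, the distance-based reading of "one SNP per 1 Mb window", the approximation `F = (beta / se)^2`, and the fixed-effect IVW formula are all assumptions made for this example. Allele harmonization, the PRESSO-style screen, MR-Egger, weighted median, and the Steiger check are omitted.

```python
import math
from statistics import NormalDist


def select_instruments(snps, p_threshold=5e-8, window_bp=1_000_000):
    """Steps 1-2 (sketch): keep genome-wide significant SNPs, then greedily clump."""
    significant = [s for s in snps if s["pval"] < p_threshold]
    kept = []
    # Visit SNPs from most to least significant; keep one only if no
    # already-kept SNP on the same chromosome lies within the window.
    for snp in sorted(significant, key=lambda s: s["pval"]):
        clash = any(
            k["chrom"] == snp["chrom"] and abs(k["pos"] - snp["pos"]) < window_bp
            for k in kept
        )
        if not clash:
            kept.append(snp)
    return kept


def filter_by_f_stat(snps, f_min=10.0):
    """Step 4 (sketch): per-SNP approximate F-statistic, assumed to be (beta / se)^2."""
    return [s for s in snps if (s["beta"] / s["se"]) ** 2 >= f_min]


def ivw_estimate(beta_exp, beta_out, se_out):
    """Step 6 (sketch): fixed-effect inverse-variance-weighted estimate."""
    denom = sum(bx * bx / (sy * sy) for bx, sy in zip(beta_exp, se_out))
    numer = sum(bx * by / (sy * sy) for bx, by, sy in zip(beta_exp, beta_out, se_out))
    beta = numer / denom
    se = math.sqrt(1.0 / denom)
    pval = 2.0 * (1.0 - NormalDist().cdf(abs(beta / se)))
    return beta, se, pval


def benjamini_hochberg(pvals):
    """Step 8 (sketch): Benjamini-Hochberg q-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals
```

In this sketch the F-statistic filter is applied after clumping, matching the step order listed above. Applying it before clumping could retain a different SNP in a window.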
## Coverage
This atlas attempted `110` directed disorder pairs (`11 x 10`).
Published results:
- unique directed pairs with estimable MR results: `69`
- method-specific rows: `207`
- methods per estimable pair: `3`
- IVW rows passing the atlas filter: `25`
The gap between the `110` attempted pairs and the `69` estimable pairs is expected under the current rules. Not every disorder yields enough usable clumped exposure instruments after harmonization and F-statistic filtering.
In this release, estimable exposure sets are concentrated in:
- ADHD
- Anxiety
- Bipolar disorder
- Borderline personality disorder
- Major depressive disorder
- Obsessive-compulsive disorder
- Schizophrenia
Disorders such as autism, eating disorders, PTSD, and substance use remain underpowered as MR exposures under the present instrument-selection rule.
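The coverage figures above can be recomputed from the result table. The sketch below assumes the table has already been loaded into a pandas `DataFrame` with the `exposure`, `outcome`, and `method` columns; the helper name `coverage_summary` is hypothetical, and the disorder labels in the toy table are placeholders, since the exact label strings used in the published file are not documented here.

```python
import pandas as pd


def coverage_summary(results: pd.DataFrame, n_disorders: int = 11) -> dict:
    """Summarize attempted vs. estimable directed pairs (illustrative helper)."""
    # One row per unique directed exposure -> outcome pair.
    pairs = results[["exposure", "outcome"]].drop_duplicates()
    return {
        # Every ordered pair of distinct disorders: n * (n - 1).
        "attempted_pairs": n_disorders * (n_disorders - 1),
        "estimable_pairs": len(pairs),
        "method_rows": len(results),
        # Number of estimable outcomes for each exposure.
        "pairs_per_exposure": pairs.groupby("exposure").size().to_dict(),
    }


# Toy table with placeholder labels: three directed pairs, three methods each.
toy = pd.DataFrame(
    [
        {"exposure": e, "outcome": o, "method": m}
        for e, o in [("ADHD", "MDD"), ("ADHD", "SCZ"), ("MDD", "ADHD")]
        for m in ["IVW", "MR-Egger", "Weighted median"]
    ]
)
summary = coverage_summary(toy)
```

On the published table, the same helper would be expected to report `110` attempted pairs, `69` estimable pairs, and `207` method rows if the file matches the counts stated above.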
## Output files
- `data/mr_results.parquet`
- `summary.json`
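One possible way to fetch and read the two files is sketched below. It assumes the `huggingface_hub` and `pandas` packages (plus a parquet engine such as `pyarrow`) are installed and that the files sit at the paths listed above inside the dataset repository. The helper names are hypothetical, and the structure of `summary.json` is not documented here, so it is returned as a plain parsed object.

```python
import json
from pathlib import Path

REPO_ID = "lighteternal/psychiatric-mr-evidence-atlas"
RESULTS_FILE = "data/mr_results.parquet"
SUMMARY_FILE = "summary.json"


def fetch_atlas_files(repo_id: str = REPO_ID) -> dict:
    """Download both files from the Hub and return their local cache paths."""
    # Imported lazily so the other helpers work without network access.
    from huggingface_hub import hf_hub_download

    return {
        name: hf_hub_download(repo_id=repo_id, filename=name, repo_type="dataset")
        for name in (RESULTS_FILE, SUMMARY_FILE)
    }


def load_results(path):
    """Read the main result table into a pandas DataFrame."""
    import pandas as pd

    return pd.read_parquet(path)


def load_summary(path):
    """Parse summary.json without assuming anything about its keys."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

A typical call sequence would be `paths = fetch_atlas_files()`, then `load_results(paths[RESULTS_FILE])` and `load_summary(paths[SUMMARY_FILE])`.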
## Result schema
The main result table contains:
- `exposure`
- `outcome`
- `method`
- `beta`
- `se`
- `pval`
- `ci_lower`, `ci_upper`
- `n_instruments`
- `n_instruments_after_presso`
- `f_stat_mean`
- `presso_global_pval`
- `presso_outliers_removed`
- `steiger_correct_direction`
- `steiger_z2_ratio`
- `rucker_recommended`
- `heterogeneity_q`, `heterogeneity_pval`
- `egger_intercept`, `egger_intercept_pval`
- `fdr_qval`
- `overall_pass`
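A small check like the following can confirm that a loaded table exposes the columns listed above before any downstream analysis. The constant and function names are hypothetical; the column names are copied from the schema list.

```python
EXPECTED_COLUMNS = [
    "exposure", "outcome", "method",
    "beta", "se", "pval", "ci_lower", "ci_upper",
    "n_instruments", "n_instruments_after_presso", "f_stat_mean",
    "presso_global_pval", "presso_outliers_removed",
    "steiger_correct_direction", "steiger_z2_ratio",
    "rucker_recommended",
    "heterogeneity_q", "heterogeneity_pval",
    "egger_intercept", "egger_intercept_pval",
    "fdr_qval", "overall_pass",
]


def missing_columns(columns) -> list:
    """Return the expected column names absent from `columns`, in schema order."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]
```

With a pandas `DataFrame` named `results`, this would be called as `missing_columns(results.columns)`; an empty list means every documented column is present.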
## Interpretation
`overall_pass = true` is a conservative atlas summary flag. In this release it means:
- method is `IVW`
- IVW FDR q-value is below `0.05`
- Steiger directionality check favors the reported exposure-to-outcome direction
- MR-Egger intercept does not indicate obvious directional pleiotropy
This flag is useful for ranking and filtering, but it is not equivalent to proof of causality.
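To make the four criteria concrete, the sketch below rebuilds a comparable flag from the other columns. It is an approximation, not the atlas code: the cutoff used to decide that the MR-Egger intercept "does not indicate obvious directional pleiotropy" is not stated in this card, so `egger_alpha=0.05` is a hypothetical choice, and `steiger_correct_direction` is assumed to be boolean. The function name is also hypothetical.

```python
import pandas as pd


def recompute_overall_pass(
    results: pd.DataFrame, q_threshold: float = 0.05, egger_alpha: float = 0.05
) -> pd.Series:
    """Approximate the overall_pass flag from its four stated criteria."""
    return (
        results["method"].eq("IVW")                       # criterion 1: IVW rows only
        & results["fdr_qval"].lt(q_threshold)             # criterion 2: FDR q < 0.05
        & results["steiger_correct_direction"].eq(True)   # criterion 3: Steiger direction
        # Criterion 4: ASSUMED cutoff; a non-significant intercept counts as a pass.
        & results["egger_intercept_pval"].ge(egger_alpha)
    )


# Toy rows: only the first satisfies all four conditions.
toy = pd.DataFrame(
    {
        "method": ["IVW", "IVW", "IVW", "IVW", "MR-Egger"],
        "fdr_qval": [0.01, 0.20, 0.01, 0.01, 0.01],
        "steiger_correct_direction": [True, True, False, True, True],
        "egger_intercept_pval": [0.60, 0.60, 0.60, 0.01, 0.60],
    }
)
flags = recompute_overall_pass(toy).tolist()
```

Comparing this recomputed series against the published `overall_pass` column is one way to learn whether the assumed Egger cutoff matches the one used in the release.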
## Limitations
- Summary-statistic MR in psychiatry is vulnerable to sample overlap and residual pleiotropy.
- The atlas is only as strong as the available exposure instruments; several disorders are underpowered as exposures.
- IVW significance can coexist with heterogeneity or method disagreement.
- The implementation uses approximate clumping and summary-statistic harmonization rather than full LD reference-panel modeling.
- Results are disorder-level research signals, not patient-level or mechanism-level claims.
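Because IVW significance can coexist with heterogeneity or method disagreement, it may help to flag such pairs before ranking. The following sketch is one way to do that with pandas. The helper name, the `alpha=0.05` cutoff, and the choice to read `heterogeneity_pval` from the IVW row are assumptions, as are the non-IVW method labels in the toy table.

```python
import pandas as pd


def flag_fragile_pairs(results: pd.DataFrame, alpha: float = 0.05) -> pd.DataFrame:
    """Per directed pair: do all methods agree on sign, and is Q significant?"""
    keys = ["exposure", "outcome"]
    # One row per pair, one column per method, holding the effect estimate.
    wide = results.pivot(index=keys, columns="method", values="beta")
    same_sign = wide.gt(0).all(axis=1) | wide.lt(0).all(axis=1)
    # Heterogeneity is read from the IVW row of each pair (assumption).
    ivw = results[results["method"] == "IVW"].set_index(keys)
    flags = pd.DataFrame(
        {
            "methods_agree_on_sign": same_sign,
            "heterogeneous": ivw["heterogeneity_pval"].lt(alpha),
        }
    )
    return flags.reset_index()


# Toy table with placeholder labels and made-up numbers.
toy = pd.DataFrame(
    {
        "exposure": ["ADHD"] * 3 + ["MDD"] * 3,
        "outcome": ["MDD"] * 3 + ["SCZ"] * 3,
        "method": ["IVW", "MR-Egger", "Weighted median"] * 2,
        "beta": [0.20, 0.10, 0.15, 0.20, -0.10, 0.10],
        "heterogeneity_pval": [0.5] * 3 + [0.001] * 3,
    }
)
fragile = flag_fragile_pairs(toy).set_index(["exposure", "outcome"])
```

In the toy data the second pair is flagged on both counts: the MR-Egger estimate has the opposite sign and the heterogeneity p-value is below the cutoff.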
## Appropriate use
Reasonable uses:
- prioritizing disorder pairs for follow-up analysis
- comparing directional genetic evidence across psychiatric disorders
- triangulating graph-model outputs against a separate summary-statistic method
- generating hypotheses for deeper causal or mechanistic work
Inappropriate uses:
- clinical recommendation
- direct biological mechanism claims without further evidence
- interpreting `overall_pass` as definitive causality
Provider:
lighteternal



