
Model-Based Causal Feature Selection for General Response Types

Source: DataCite Commons. Updated: 2026-01-23. Indexed: 2024-11-05.
Download link:
https://tandf.figshare.com/articles/dataset/Model-based_causal_feature_selection_for_general_response_types/26880657/2
Description:
Discovering causal relationships from observational data is a fundamental yet challenging task. Invariant causal prediction (ICP, Peters, Bühlmann, and Meinshausen) is a method for causal feature selection which requires data from heterogeneous settings and exploits the invariance of causal models. ICP has been extended to general additive noise models and to nonparametric settings using conditional independence tests. However, the latter often suffer from low power (or poor Type I error control), and additive noise models are not suitable for applications in which the response is not measured on a continuous scale but reflects categories or counts. Here, we develop transformation-model (tram) based ICP, allowing for continuous, categorical, count-type, and uninformatively censored responses (these model classes, generally, do not allow for identifiability when there is no exogenous heterogeneity). As an invariance test, we propose tram-GCM, based on the expected conditional covariance between environments and score residuals, with uniform asymptotic level guarantees. For the special case of linear shift trams, we also consider tram-Wald, which tests invariance based on the Wald statistic. We provide an open-source R package tramicp and evaluate our approach on simulated data and in a case study investigating causal features of survival in critically ill patients. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
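
The generic ICP search that the description refers to can be sketched as follows. This is a minimal illustration in Python, not the tramicp implementation and not the tram-GCM or tram-Wald test: the function name `icp`, the callable `invariance_pvalue`, and the toy p-values are all assumptions made for the example. The sketch only shows the outer loop, in which every candidate feature subset is tested for invariance across environments and the accepted subsets are intersected.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence


def icp(
    features: Sequence[str],
    invariance_pvalue: Callable[[FrozenSet[str]], float],
    alpha: float = 0.05,
) -> FrozenSet[str]:
    """Intersect all feature subsets whose invariance is not rejected.

    `invariance_pvalue` is a hypothetical user-supplied test: it returns a
    p-value for the null hypothesis that the model using subset S is
    invariant across environments.
    """
    accepted: List[FrozenSet[str]] = []
    # Enumerate every subset, including the empty set.
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            s = frozenset(subset)
            if invariance_pvalue(s) > alpha:  # invariance not rejected
                accepted.append(s)
    if not accepted:
        # Assumption for this sketch: report the empty set if every
        # subset is rejected.
        return frozenset()
    return frozenset.intersection(*accepted)


def toy_pvalue(s: FrozenSet[str]) -> float:
    # Toy stand-in for a real test: only subsets containing both X1 and X2
    # look invariant. These p-values are invented for illustration.
    return 0.5 if {"X1", "X2"} <= s else 0.001


result = icp(["X1", "X2", "X3"], toy_pvalue)
print(sorted(result))
```

In practice the placeholder test would be replaced by a proper invariance test computed from data. The exhaustive subset search grows exponentially with the number of features, so this sketch is only practical for small feature sets.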

Provider:
Taylor & Francis
Created:
2024-10-28