Transfer Learning with Large-Scale Quantile Regression

Name: Transfer Learning with Large-Scale Quantile Regression
Creator: Taylor & Francis
Published: 2025-05-01 05:49:47
License: No description available

DataCite Commons: updated 2025-05-01, indexed 2024-08-19

Download link:

https://tandf.figshare.com/articles/dataset/Transfer_Learning_with_Large-Scale_Quantile_Regression/25199222/1

Resource description:

Quantile regression is increasingly encountered in modern big data applications due to its robustness and flexibility. We consider the scenario of learning the conditional quantiles of a specific target population when the available data may go beyond the target and be supplemented from other sources that possibly share similarities with the target. A crucial question is how to properly distinguish and use useful information from other sources to improve the quantile estimation and inference at the target. We develop transfer learning methods for high-dimensional quantile regression by detecting informative sources whose models are similar to the target and using them to improve the target model. We show that under reasonable conditions, the detection of the informative sources based on sample splitting is consistent. Compared to the naive estimator with only the target data, the transfer learning estimator achieves a much lower error rate as a function of the sample sizes, the signal-to-noise ratios, and the similarity measures among the target and the source models. Extensive simulation studies demonstrate the superiority of our proposed approach. We apply our methods to tackle the problem of detecting hard-landing risk for flight safety and show the benefits and insights gained from transfer learning of three different types of airplanes: Boeing 737, Airbus A320, and Airbus A380.
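
The two-step idea in the description (detect informative sources by sample splitting, then pool them with the target) can be illustrated with a minimal sketch. This is not the authors' procedure: it uses an intercept-only quantile model with no covariates and no high-dimensional penalty, a deterministic interleaved split in place of a random one, and a hypothetical threshold parameter `slack`. The function names and the toy data are assumptions made for illustration only.

```python
import math


def check_loss(residuals, tau):
    """Mean check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return sum(u * (tau - (1.0 if u < 0 else 0.0)) for u in residuals) / len(residuals)


def fit_quantile(ys, tau):
    """Intercept-only fit: the ceil(n * tau)-th order statistic minimizes the check loss."""
    ordered = sorted(ys)
    k = math.ceil(tau * len(ordered) - 1e-9)  # tolerance guards against float round-off
    return ordered[min(len(ordered), max(1, k)) - 1]


def detect_informative(target, sources, tau, slack=0.05):
    """Keep a source if pooling it with half of the target does not worsen
    the check loss on the held-out half by more than `slack` (hypothetical rule)."""
    fit_half, holdout = target[0::2], target[1::2]  # interleaved stand-in for a random split

    def holdout_loss(q):
        return check_loss([y - q for y in holdout], tau)

    baseline = holdout_loss(fit_quantile(fit_half, tau))
    keep = []
    for name, ys in sources.items():
        pooled_q = fit_quantile(fit_half + ys, tau)
        if holdout_loss(pooled_q) <= (1.0 + slack) * baseline:
            keep.append(name)
    return keep


def transfer_quantile(target, sources, tau, slack=0.05):
    """Pool the full target sample with the detected informative sources and refit."""
    informative = detect_informative(target, sources, tau, slack)
    pooled = list(target)
    for name in informative:
        pooled += sources[name]
    return fit_quantile(pooled, tau), informative


# Toy data (assumed): source "A" resembles the target, source "B" is shifted upward by 30.
target = [float(i) for i in range(40)]
sources = {
    "A": [0.1 * k for k in range(400)],
    "B": [30.0 + 0.1 * k for k in range(400)],
}

q_target = fit_quantile(target, 0.9)                        # naive, target-only estimate
q_transfer, informative = transfer_quantile(target, sources, 0.9)
print(informative, q_target, q_transfer)
```

In this toy setup the shifted source inflates the held-out check loss and is screened out, while the similar source is kept and contributes its larger sample to the final fit. A regression version would replace `fit_quantile` with a penalized quantile regression solver.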

Provider:

Taylor & Francis

Created:

2024-02-09

Collected Summary

Dataset Introduction