Bayesian sampling for smoothing parameter estimation
Source: DataCite Commons | Updated: 2020-09-18 | Indexed: 2025-04-16
Download link:
https://figshare.com/articles/Bayesian_sampling_for_smoothing_parameter_estimation/4697323
Description:
Kernel density estimation is one of the most important techniques for understanding the distributional properties of data. The effectiveness of such an approach depends on the choice of a kernel function and the choice of a smoothing parameter (bandwidth). This thesis addresses several important topics in bandwidth selection for kernel density estimation for data of differing nature. The first issue revolves around selecting an appropriate bandwidth given the characteristics of the local data in a multivariate setting. In Chapter 3, the study proposes a kernel density estimator with tail-adaptive bandwidths. The study derives the posterior of the bandwidth parameters based on the Kullback-Leibler information and presents an MCMC sampling algorithm to estimate the bandwidths. The Monte Carlo simulation study shows that the kernel density estimator with tail-adaptive bandwidths estimated through the proposed sampling algorithm outperforms its competitor. The tail-adaptive kernel density estimator is applied to the estimation of the bivariate density of the paired daily returns of the Australian Ordinary index and the S&P 500 index during the global financial crisis. The results show that this estimator can capture richer dynamics in the tail area than the density estimator with a global bandwidth estimated through the normal reference rule and a Bayesian sampling algorithm.
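To make the idea of sampling a bandwidth from its posterior concrete, the following is a minimal sketch and not the thesis's algorithm. It assumes a univariate sample, a Gaussian kernel, and a single global bandwidth h rather than tail-adaptive bandwidth matrices. The likelihood is the leave-one-out likelihood (maximizing it corresponds to minimizing an estimate of the Kullback-Leibler information), the prior density proportional to 1/(1 + h^2) is an assumption chosen for illustration, and all function names are hypothetical.

```python
import math
import random


def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    n = len(data)
    norm = 1.0 / ((n - 1) * h * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, xi in enumerate(data):
        # Density at xi estimated from every observation except xi itself.
        s = sum(math.exp(-0.5 * ((xi - xj) / h) ** 2)
                for j, xj in enumerate(data) if j != i)
        total += math.log(max(s * norm, 1e-300))  # floor avoids log(0)
    return total


def log_posterior(data, log_h):
    """Unnormalized log-posterior of log(h)."""
    h = math.exp(log_h)
    log_prior = -math.log(1.0 + h * h)  # ASSUMED prior on h, for illustration
    # "+ log_h" is the Jacobian term for sampling on the log scale.
    return loo_log_likelihood(data, h) + log_prior + log_h


def sample_bandwidth(data, n_iter=2000, burn_in=500, step=0.2, seed=0):
    """Random-walk Metropolis on log(h); returns post-burn-in draws of h."""
    rng = random.Random(seed)
    log_h = 0.0  # start at h = 1
    current = log_posterior(data, log_h)
    draws = []
    for t in range(n_iter):
        proposal = log_h + rng.gauss(0.0, step)
        candidate = log_posterior(data, proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            log_h, current = proposal, candidate
        if t >= burn_in:
            draws.append(math.exp(log_h))
    return draws
```

A point estimate of the bandwidth can then be taken as the mean of the returned draws. Each posterior evaluation costs O(n^2) kernel evaluations, so this pure-Python version is only practical for small samples.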
The second research project investigates bandwidth selection for multimodal distributions or data that exhibit clustering behaviour. Chapter 4 proposes a cluster-adaptive bandwidth kernel density estimator for data with multimodality. This method employs a clustering algorithm to assign a different bandwidth to each cluster identified in the data set. The study derives a posterior of the bandwidth parameters based on the Kullback-Leibler information and presents an MCMC sampling algorithm to estimate the bandwidths. The Monte Carlo simulation study shows that when the underlying density is a mixture of normals, the kernel density estimator with cluster-adaptive bandwidths estimated through the proposed sampling algorithm outperforms its competitor. When the underlying densities are fat-tailed, the combined tail- and cluster-adaptive density estimator performs best. In an empirical study, bandwidth matrices are estimated for the cluster-adaptive kernel density estimator of eruption duration and waiting time to the next eruption collected from the Old Faithful geyser, a data set that is often analysed because of its clustering nature. The results again show a clear advantage of the proposed cluster-adaptive kernel density estimator over traditional approaches.
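The cluster-adaptive idea, one bandwidth per cluster, can be illustrated with a small sketch under simplifying assumptions. The data are univariate, the clustering step is a hand-written two-means routine, and each cluster's bandwidth comes from the normal reference rule instead of the Bayesian sampling algorithm described in the abstract. All names are hypothetical, and the thesis's clustering algorithm and bandwidth matrices are not reproduced here.

```python
import math
import statistics


def two_means_1d(data, n_iter=20):
    """Tiny 1-D k-means with k = 2; returns a label (0 or 1) per point."""
    centers = [min(data), max(data)]
    labels = [0] * len(data)
    for _ in range(n_iter):
        labels = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                  for x in data]
        for k in (0, 1):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


def cluster_bandwidths(data, labels):
    """Normal reference rule per cluster: h_k = 1.06 * sd_k * n_k^(-1/5)."""
    bandwidths = {}
    for k in set(labels):
        members = [x for x, lab in zip(data, labels) if lab == k]
        sd = statistics.stdev(members) if len(members) > 1 else 1.0
        sd = max(sd, 1e-8)  # guard against a degenerate cluster
        bandwidths[k] = 1.06 * sd * len(members) ** (-0.2)
    return bandwidths


def cluster_adaptive_kde(x, data, labels, bandwidths):
    """Gaussian KDE at x where each point uses its own cluster's bandwidth."""
    total = 0.0
    for xi, lab in zip(data, labels):
        h = bandwidths[lab]
        total += math.exp(-0.5 * ((x - xi) / h) ** 2) / (h * math.sqrt(2.0 * math.pi))
    return total / len(data)
```

Because every kernel integrates to one regardless of its bandwidth, the resulting estimate is still a proper density. A tight cluster receives a small bandwidth and a diffuse cluster a larger one, which a single global bandwidth cannot do.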
The third topic extends the Bayesian bandwidth selection method to volatility models of financial asset return series. The study is motivated by the fact that the literature has paid only limited attention to the estimation of nonparametric nonlinear volatility models through a Bayesian approach. Chapter 5 presents a new volatility model called the semiparametric nonlinear volatility (SNV) model. Based on the financial return series of major stock indices around the world, the performance of the proposed volatility model against competing models is examined in both in-sample and out-of-sample periods. The proposed model and the Bayesian estimation method show strong and convincing performance. The study also evaluates the empirical value-at-risk (VaR) performance of the competing models. The proposed volatility model shows the best performance in most cases.
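The abstract does not say which criterion is used to compare VaR performance. One common choice is the violation rate, the share of days on which the realized return falls below the forecast VaR threshold, which should be close to the nominal level alpha for a well-calibrated model. The sketch below assumes zero-mean returns, Gaussian innovations, and a series of volatility forecasts supplied by some model. The function names are hypothetical and the SNV model itself is not implemented.

```python
import statistics


def var_thresholds(sigmas, alpha=0.05):
    """One-step-ahead VaR return thresholds under a zero-mean normal assumption."""
    z = statistics.NormalDist().inv_cdf(alpha)  # negative for alpha < 0.5
    return [z * s for s in sigmas]


def violation_rate(returns, thresholds):
    """Fraction of periods in which the return falls below its VaR threshold."""
    hits = sum(1 for r, q in zip(returns, thresholds) if r < q)
    return hits / len(returns)
```

For example, with volatility forecasts of 1.0 each day and alpha = 0.05, every threshold is about -1.645, so only returns below that value count as violations.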
Provider:
Figshare
Created:
2017-02-27



