Least Ambiguous Set-Valued Classifiers With Bounded Error Levels
Source: Figshare. Updated 2017-10-30. Indexed 2026-04-29.
Download link:
https://figshare.com/articles/dataset/Least_Ambiguous_Set-Valued_Classifiers_with_Bounded_Error_Levels/5552596
Description:
In most classification tasks, there are observations that are ambiguous and therefore difficult to correctly label. Set-valued classifiers output sets of plausible labels rather than a single label, thereby giving a more appropriate and informative treatment to the labeling of ambiguous instances. We introduce a framework for multiclass set-valued classification, where the classifiers guarantee user-defined levels of coverage or confidence (the probability that the true label is contained in the set) while minimizing the ambiguity (the expected size of the output). We first derive oracle classifiers assuming the true distribution to be known. We show that the oracle classifiers are obtained from level sets of the functions that define the conditional probability of each class. Then we develop estimators with good asymptotic and finite sample properties. The proposed estimators build on existing single-label classifiers. The optimal classifier can sometimes output the empty set, but we provide two solutions to fix this issue that are suitable for various practical needs. Supplementary materials for this article are available online.
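The level-set idea in the description can be illustrated with a minimal sketch: a label is included in the output set whenever its estimated conditional probability reaches a threshold, and the threshold is chosen on held-out data so that the true label is covered with probability roughly 1 - alpha. Everything named here is an assumption for illustration, not the authors' estimator: the function names `calibrate_threshold` and `prediction_sets`, the split-sample rank rule used to pick the threshold, and the `fill_empty` fallback (adding the top-scoring label), which is only one simple way to avoid empty sets and is not claimed to be either of the two solutions the description mentions. The probability estimates are assumed to come from any existing single-label classifier.

```python
import numpy as np


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Pick a threshold t from held-out data (illustrative rank rule).

    cal_probs  : (n, K) array of estimated class probabilities
    cal_labels : (n,) array of true class indices
    alpha      : tolerated miscoverage level, e.g. 0.1
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    # Estimated probability assigned to the true class of each held-out point.
    true_scores = np.sort(cal_probs[np.arange(len(cal_labels)), cal_labels])
    n = len(true_scores)
    # Conservative rank: use the k-th smallest true-class score.
    k = int(np.floor(alpha * (n + 1)))
    if k < 1:
        # Too few held-out points for this alpha: include every label.
        return 0.0
    return float(true_scores[k - 1])


def prediction_sets(probs, threshold, fill_empty=True):
    """Return, for each row, the labels whose estimated probability >= threshold."""
    probs = np.asarray(probs, dtype=float)
    member = probs >= threshold
    if fill_empty:
        # Hypothetical fallback: an empty set receives the top-scoring label.
        for i in np.flatnonzero(~member.any(axis=1)):
            member[i, probs[i].argmax()] = True
    return [np.flatnonzero(row).tolist() for row in member]
```

Raising the threshold shrinks the sets (less ambiguity) but lowers coverage, which is the trade-off the description refers to. With `fill_empty=False` the sketch can return an empty set for a point where no class reaches the threshold.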
Created:
2017-10-30



