
Detailed results of "Exploring the impact of label-level noise on multi-label k-Nearest Neighbor classification"

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/knb2bfbj5r
Description:
Detailed experimental results for the eight datasets studied in the work: Corel5k, bibtex, birds, emotions, genbase, medical, scene, and yeast. Each dataset is provided separately as a CSV file with the following columns:
- noise: Policy used to induce noise in the dataset. Possible values: DAAS, PUMN, add, add-sub, sub, and swap.
- percen: Percentage of the dataset that may be altered by the "noise" policy. Possible values: 0, 10, 20, 30, 40, 50, and 60.
- prob: Probability of inducing noise in the dataset. Possible values: 0, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6.
- PG_method: Prototype generation method used to reduce the size of the dataset. Possible values: ALL, MChen, MRHC, MRSP1, MRSP2, and MRSP3.
- Reduction_parameter: Parameter of the prototype generation method at hand. Possible values: 1.0, 10.0, 30.0, 50.0, 70.0, and 90.0.
- Classifier: Multi-label learning model, based on the k-Nearest Neighbor rule, used for classification. Possible values: BRkNNaClassifier, LabelPowerset, and MLkNN.
- k: Number of neighbors considered by the classifier at hand. Possible values: 1.0, 3.0, 5.0, 7.0, 11.0, 15.0, 21.0, and 30.0.
- HL: Figure of merit used to assess the classification performance of the scheme.
- Size: Size of the resulting set after reduction by the prototype generation method.
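As a rough illustration of how one of these per-dataset CSV files might be read and summarized, the sketch below uses only the Python standard library. It assumes that the CSV header names match the column names listed above exactly, which the description does not confirm. The helpers `load_rows` and `mean_hl_by` are hypothetical names introduced here, and the inline sample rows are synthetic placeholders, not values taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the description. That the CSV headers match
# them exactly is an assumption.
NUMERIC_COLUMNS = ("percen", "prob", "Reduction_parameter", "k", "HL", "Size")


def load_rows(handle):
    """Read result rows from an open CSV file, converting numeric columns."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for name in NUMERIC_COLUMNS:
            # Empty cells become None instead of raising ValueError.
            row[name] = float(raw[name]) if raw[name] else None
        rows.append(row)
    return rows


def mean_hl_by(rows, *keys):
    """Average HL over all rows that share the same values for `keys`."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        if row["HL"] is None:
            continue  # skip rows without a recorded HL value
        group = tuple(row[key] for key in keys)
        totals[group][0] += row["HL"]
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Synthetic placeholder rows, NOT values from the dataset.
SAMPLE = """noise,percen,prob,PG_method,Reduction_parameter,Classifier,k,HL,Size
swap,10,0.1,ALL,1.0,MLkNN,1.0,0.25,100
swap,10,0.1,ALL,1.0,MLkNN,3.0,0.75,100
add,20,0.2,MRHC,10.0,LabelPowerset,1.0,0.5,40
"""

rows = load_rows(io.StringIO(SAMPLE))
summary = mean_hl_by(rows, "noise", "Classifier")
```

With a downloaded file, the same helpers would be called on a real file handle, for example `with open("emotions.csv", newline="") as f: rows = load_rows(f)`, where the file name is a guess based on the dataset names. Passing different column names to `mean_hl_by` (say, `"PG_method"` and `"Reduction_parameter"`) groups the results along other axes of the experiment.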

Created: 2025-12-04