Code underlying the PhD thesis: Label Alchemy: Transforming Noisy Data into Precious Insights in Deep Learning

Mendeley Data | Updated: 2024-04-28 | Indexed: 2024-06-29

Download link:

https://data.4tu.nl/datasets/b00277a6-9431-47dc-9369-e9a477031e66

Description:

Labels are essential for training Deep Neural Networks (DNNs): they supply the ground truth that guides learning. Label quality directly affects DNN performance and generalization. Accurate labels foster robust predictions and help training converge toward an accurate representation of the data distribution, whereas noisy labels introduce errors and hinder learning.

Ensuring label quality is both critical and demanding, often requiring considerable time and cost. As datasets grow, methods such as crowdsourcing have gained traction to speed up labeling, but they are inherently susceptible to errors and inaccuracies. For example, it was observed that the accuracy of AlexNet in classifying CIFAR-10 images plummeted from 77% to a mere 10% when labels were subjected to random flips. This drop illustrates how strongly corrupted labels can affect DNN performance.

Ensuring DNN robustness therefore involves strategies such as identifying and filtering noisy labels and integrating noise patterns into training to obtain resilient models. Architectural and loss-function design can also counter label-related challenges and improve DNN adaptability across applications. This thesis investigates the role of labels in DNN training and the impact of their quality on model performance. Its strategies, spanning noise recovery, robust learning frameworks, and multi-label solutions, contribute to DNN resilience against noisy labels, advancing both understanding and practical applications.
***This is the code repository for each chapter of the thesis.***
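
The repository contents are not reproduced in this listing. To illustrate the random label flips mentioned in the description, here is a minimal sketch that is not taken from the thesis code. It assumes symmetric (uniform) label noise, where each label is replaced by a different class with a fixed probability. The names `flip_labels`, `observed_noise_rate`, and `noise_rate` are hypothetical, and the toy label vector only stands in for a 10-class dataset such as CIFAR-10.

```python
import random
from typing import List, Sequence


def flip_labels(labels: Sequence[int], num_classes: int, noise_rate: float,
                seed: int = 0) -> List[int]:
    """Return a copy of `labels` where each entry is, with probability
    `noise_rate`, replaced by a different class chosen uniformly at random.

    Assumption: symmetric noise; the thesis may use other noise models.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Draw from the other num_classes - 1 classes, skipping y itself.
            wrong = rng.randrange(num_classes - 1)
            noisy.append(wrong if wrong < y else wrong + 1)
        else:
            noisy.append(y)
    return noisy


def observed_noise_rate(clean: Sequence[int], noisy: Sequence[int]) -> float:
    """Fraction of positions where the noisy label differs from the clean one."""
    return sum(c != n for c, n in zip(clean, noisy)) / len(clean)


# Toy stand-in for a 10-class label vector (hypothetical data).
clean = [i % 10 for i in range(10_000)]
noisy = flip_labels(clean, num_classes=10, noise_rate=0.4)
print(f"observed flip rate: {observed_noise_rate(clean, noisy):.3f}")
```

Training a classifier on `noisy` instead of `clean` and comparing test accuracy is one simple way to probe how sensitive a model is to this kind of corruption.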

Created:

2024-04-24
