Code underlying the PhD thesis: Label Alchemy: Transforming Noisy Data into Precious Insights in Deep Learning

Source: 4TU.ResearchData. Updated 2024-04-22; indexed 2026-04-23.

Download link:

https://data.4tu.nl/datasets/b00277a6-9431-47dc-9369-e9a477031e66/1

Resource description:

Labels are essential for training Deep Neural Networks (DNNs): they supply the ground truth that guides learning. Label quality directly affects DNN performance and generalization. Accurate labels support robust predictions and help training converge toward an accurate representation of the data distribution, whereas noisy labels introduce errors and hinder learning. Ensuring label quality is both critical and demanding, often requiring considerable time and cost. As datasets grow, methods such as crowdsourcing have gained traction to speed up labeling, but they are inherently susceptible to errors and inaccuracies. For example, it was observed that the accuracy of AlexNet in classifying CIFAR-10 images dropped from 77% to just 10% when labels were subjected to random flips. This drop illustrates how strongly corrupted labels can affect DNN performance.

Ensuring DNN robustness is therefore vital. Strategies include identifying and filtering noisy labels and integrating noise patterns into training to obtain resilient models. Architecture and loss function design can also counter label-related challenges and improve DNN adaptability across applications. This thesis investigates the role of labels in DNN training and the impact of their quality on model performance. Its strategies span noise recovery, robust learning frameworks, and multi-label solutions, which contribute to DNN resilience against noisy labels and advance both understanding and practical applications.

This is the code repository for each chapter of the thesis.
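
To make the idea of randomly flipped labels concrete, the sketch below injects symmetric label noise into an integer label array with NumPy. It is an illustration only and is not taken from the thesis code: the function name `flip_labels`, its parameters, and the 40% noise rate are assumptions chosen for the example, and the description does not state which noise model produced the AlexNet figures.

```python
import numpy as np

def flip_labels(labels, noise_rate, num_classes, seed=0):
    """Return a copy of `labels` in which a `noise_rate` fraction of entries
    is reassigned to a different class chosen uniformly at random.
    Hypothetical helper for illustration; not part of the thesis repository."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()
    n_flip = int(round(noise_rate * len(labels)))
    # Pick which samples to corrupt, without repetition.
    flip_idx = rng.choice(len(labels), size=n_flip, replace=False)
    # An offset in [1, num_classes - 1], applied modulo num_classes,
    # always lands on a class different from the original one.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy[flip_idx] = (labels[flip_idx] + offsets) % num_classes
    return noisy

# Synthetic stand-in for a 10-class label vector such as CIFAR-10's.
clean = np.random.default_rng(1).integers(0, 10, size=1000)
noisy = flip_labels(clean, noise_rate=0.4, num_classes=10)
observed_rate = float(np.mean(noisy != clean))
```

Training the same model once on `clean` and once on `noisy` is one way to measure how sensitive it is to this kind of corruption.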

Created:

2024-04-22
