Results comparison on the KvasirV1 dataset.

Source: Figshare. Updated 2025-11-03. Indexed 2026-04-28.

Download link:

https://figshare.com/articles/dataset/Results_comparison_on_the_KvasirV1_dataset_/30522294


Description:

Colorectal cancer (CRC) is a leading cause of cancer-related death and poses a significant threat to global health. Although deep learning models have been used to diagnose CRC accurately, they still struggle to capture global correlations among spatial features, especially in images with complex textures and morphologically similar structures. To address this, we propose a hybrid model, the Residual Next Transformer Network (RNTNet), which combines a residual network with a transformer encoder that uses mixed attention. RNTNet extracts spatial features from CRC images using ResNeXt, which relies on group convolution and skip connections to capture fine-grained features. A vision transformer (ViT) encoder containing a mixed attention block then applies multiscale feature aggregation to provide global attention over those spatial features. In addition, a Grad-CAM module visualizes the model's decision process to give oncologists a second opinion. Two publicly available datasets, Kather and KvasirV1, were used for model training and testing. The model achieved classification accuracies of 97.96% and 98.20% on the KvasirV1 and Kather datasets, respectively. ROC curve analysis further confirms its efficacy, with AUC values of 0.9895 and 0.9937 on the KvasirV1 and Kather datasets, respectively. Comparative results indicate that RNTNet improves on state-of-the-art methods in both accuracy and efficiency.
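For readers who want to recompute the two metrics quoted in the description, here is a minimal sketch in plain Python. It is not the authors' evaluation code. The description does not say how AUC was averaged across classes, so the macro one-vs-rest scheme below is an assumption, and the function names and toy scores are invented for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive sample
    scores higher than a random negative one (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    y_true: Sequence[int], class_scores: Sequence[Sequence[float]]
) -> float:
    """Macro-averaged one-vs-rest AUC (an assumed averaging scheme).

    class_scores[i][c] is the model's score for sample i and class c.
    """
    n_classes = len(class_scores[0])
    aucs = []
    for c in range(n_classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[c] for row in class_scores]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes


# Toy three-class example with invented scores (not data from the paper).
toy_true = [0, 1, 2, 1, 0, 2]
toy_scores = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.20, 0.30, 0.50],
    [0.50, 0.25, 0.25],  # a class-1 sample the toy model assigns to class 0
    [0.60, 0.30, 0.10],
    [0.10, 0.20, 0.70],
]
# Predicted label = index of the highest score in each row.
toy_pred = [max(range(len(row)), key=row.__getitem__) for row in toy_scores]

print("accuracy:", accuracy(toy_true, toy_pred))
print("macro one-vs-rest AUC:", macro_ovr_auc(toy_true, toy_scores))
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a sanity check. For full test sets, a rank-based computation or an established metrics library would be the more practical choice.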


Created:

2025-11-03
