HDSNE a New Unsupervised Multiple Image Database Fusion Learning Algorithm with Flexible and Crispy Production of One Database: A Proof Case Study of Lung Infection Diagnose In Chest X-ray Images

Indexed from NIAID Data Ecosystem on 2026-03-14
Download link:
https://data.mendeley.com/datasets/nttrfkg644
Description:
HDSNE is a new unsupervised learning algorithm for fusing multiple image databases, applied here to diagnosing COVID-19 from chest X-ray images. We propose a global data aggregation model built from six image databases selected from specific global resources. The MD5 hash algorithm returns a hash value for each image, and identical files yield identical values, which makes it suitable for duplicate removal. The MD5 hash and t-SNE algorithms are applied recursively, producing a balanced and uniform database with an equal number of samples per category: normal, pneumonia, and COVID-19. The dataset is also cleaned by removing non-posteroanterior (PA) views and CT images, which affect the model training process. These duplicates and out-of-scope images are removed during curation, and the refined dataset is available for download. The final aggregated dataset contains 441 frontal X-ray images per class. This database is part of the main associated article submitted to the journal BMC Medical Imaging.
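The MD5-based duplicate removal described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the helper names `md5_of_file` and `split_duplicates` are hypothetical, and it assumes images are stored as individual files and that "duplicate" means byte-identical. The t-SNE step is not covered because the description does not specify how it is applied.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file's raw bytes, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_duplicates(paths):
    """Split paths into (unique, duplicates).

    Hypothetical helper: paths are processed in sorted order and the
    first file seen for each digest is kept as the unique copy.
    """
    seen = set()
    unique, duplicates = [], []
    for path in sorted(Path(p) for p in paths):
        key = md5_of_file(path)
        if key in seen:
            duplicates.append(path)
        else:
            seen.add(key)
            unique.append(path)
    return unique, duplicates
```

Hashing the file bytes only catches exact copies. The same radiograph saved at a different resolution or in a different format would produce a different digest and would not be flagged by this sketch.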

Created:
2022-12-06