SalmonScan: A Novel Image Dataset for Machine Learning and Deep Learning Analysis in Fish Disease Detection in Aquaculture
Mendeley Data. Updated 2024-04-05; indexed 2024-06-28.
Download link:
https://data.mendeley.com/datasets/x3fz2nfm4w
Description:
The SalmonScan dataset is a collection of images of salmon, covering both healthy and infected fish. The dataset consists of two classes of images:
- Fresh salmon 🐟
- Infected salmon 🐠

The dataset is suited to computer vision tasks in machine learning and deep learning applications. Whether you are a researcher, developer, or student, SalmonScan offers a varied data source to support your projects and experiments. So, dive in and explore the fascinating world of salmon health and disease!

The SalmonScan dataset (raw) consists of 24 fresh fish and 91 infected fish. [Due to server cleaning in the past, some raw datasets have been deleted.]

The SalmonScan dataset (augmented) consists of approximately 1,208 images of salmon, classified into two classes:
- Fresh salmon (healthy fish with no visible signs of disease): 456 images
- Infected salmon (fish showing disease): 752 images

Each class contains a representative and diverse collection of images, capturing a range of perspectives, scales, and lighting conditions. The images have been curated so that they are of high quality and suitable for a variety of computer vision tasks.

## Data Preprocessing
The input images were preprocessed to improve their quality and suitability for further analysis. The following steps were taken:
1. Resizing 📏: All images were resized to a uniform 600 pixels in width and 250 pixels in height to ensure compatibility with the learning algorithm.
2. Image Augmentation 📸: To compensate for the small number of images, several augmentation techniques were applied to the input images:
   - Horizontal Flip ↩️: The images were flipped horizontally to create additional samples.
   - Vertical Flip ⬆️: The images were flipped vertically to create additional samples.
   - Rotation 🔄: The images were rotated to create additional samples.
   - Cropping 🪓: A portion of the image was randomly cropped to create additional samples.
   - Gaussian Noise 🌌: Gaussian noise was added to the images to create additional samples.
   - Shearing 🌆: The images were sheared to create additional samples.
   - Contrast Adjustment (Gamma) ⚖️: Gamma correction was applied to the images to adjust their contrast.
   - Contrast Adjustment (Sigmoid) ⚖️: A sigmoid function was applied to the images to adjust their contrast.

## Usage
To use the SalmonScan dataset in your ML and DL projects, follow these steps:
1. Clone or download the SalmonScan dataset repository from GitHub.
2. Use standard libraries such as numpy or pandas to convert the images into arrays, which can be input into a machine learning or deep learning model.
3. Split the dataset into training, validation, and test sets as per your requirement.
4. Preprocess the data as needed, such as resizing and normalizing the images.
5. Train your ML/DL model using the preprocessed training data.
6. Evaluate the model on the test set and make predictions on new, unseen data.
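The usage steps above (convert images to arrays, split, normalize) can be sketched roughly as follows. This is a minimal illustration, not the dataset authors' code: the folder names `fresh` and `infected`, the helper names, and the 70/15/15 split are all assumptions, since the description does not document the on-disk layout or a recommended split. NumPy alone does not decode image files, so the loader assumes Pillow is installed as well.

```python
import random
from pathlib import Path

# Assumed class folder names and labels; the dataset description does not
# document the actual directory layout.
CLASS_DIRS = {"fresh": 0, "infected": 1}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labelled_images(root):
    """Return (path, label) pairs found under root/<class_dir>/."""
    samples = []
    for class_dir, label in CLASS_DIRS.items():
        folder = Path(root) / class_dir
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def stratified_split(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Split (item, label) pairs so each class keeps roughly the same ratio."""
    rng = random.Random(seed)
    by_label = {}
    for item, label in samples:
        by_label.setdefault(label, []).append((item, label))
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_val = round(len(group) * val_frac)
        n_test = round(len(group) * test_frac)
        val.extend(group[:n_val])
        test.extend(group[n_val:n_val + n_test])
        train.extend(group[n_val + n_test:])
    return train, val, test


def load_image(path, size=(600, 250)):
    """Load one image as a float32 array in [0, 1] with shape (250, 600, 3)."""
    # Imported lazily so the split helpers above work without these packages.
    import numpy as np
    from PIL import Image

    with Image.open(path) as img:
        img = img.convert("RGB").resize(size)  # Pillow expects (width, height)
        return np.asarray(img, dtype=np.float32) / 255.0
```

One caution when splitting the augmented set: it was generated from a much smaller raw set, so augmented variants of the same source image can land on both sides of a random split and inflate test scores. The description does not say whether file names allow grouping by source image; if they do, splitting by source image would be the safer choice.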
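To make the augmentation operations listed under Data Preprocessing concrete, here is a dependency-free sketch that applies several of them to a tiny grayscale image stored as rows of intensities in [0, 1]. The description does not state which library or parameter values the authors used, so every function name and default here (`gamma=0.8`, `cutoff=0.5`, `gain=10.0`, `sigma=0.05`, a 90-degree rotation) is an illustrative assumption. Shearing is left out because it needs interpolation; in practice an image library would handle all of these.

```python
import math
import random


def hflip(img):
    """Horizontal flip: reverse each row."""
    return [row[::-1] for row in img]


def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def rotate90(img):
    """Rotate 90 degrees clockwise (the angle is an assumption)."""
    return [list(col) for col in zip(*img[::-1])]


def random_crop(img, out_h, out_w, seed=0):
    """Cut a random out_h x out_w window out of the image."""
    rng = random.Random(seed)
    top = rng.randint(0, len(img) - out_h)
    left = rng.randint(0, len(img[0]) - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]


def add_gaussian_noise(img, sigma=0.05, seed=0):
    """Add zero-mean Gaussian noise and clip back into [0, 1]."""
    rng = random.Random(seed)
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


def adjust_gamma(img, gamma=0.8):
    """Gamma correction: out = in ** gamma (gamma < 1 brightens)."""
    return [[p ** gamma for p in row] for row in img]


def adjust_sigmoid(img, cutoff=0.5, gain=10.0):
    """Sigmoid contrast: out = 1 / (1 + exp(gain * (cutoff - in)))."""
    return [[1.0 / (1.0 + math.exp(gain * (cutoff - p))) for p in row]
            for row in img]


# Usage on a 2x2 example image.
image = [[0.0, 0.25], [0.5, 1.0]]
augmented = [hflip(image), vflip(image), rotate90(image),
             add_gaussian_noise(image), adjust_gamma(image),
             adjust_sigmoid(image)]
```

Each call returns a new image and leaves the input untouched, so one source image yields several additional samples, which is how a small raw set can grow to the size of the augmented set.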
Created: 2024-03-01
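Background overview
The SalmonScan dataset is a collection of images of healthy and infected salmon for computer vision tasks in machine learning and deep learning. It contains raw and augmented images; the augmented set totals about 1,208 images, split into healthy and infected classes, and was produced with several preprocessing and augmentation techniques.
The summary above was collected and generated by 遇见数据集.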



