A Litopenaeus vannamei shrimp dataset with images and corresponding weight-size measurements for the development of artificial intelligence-based biomass estimation and organism detection algorithms
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/h8tcn6ykky
Description:
This dataset was compiled with the ultimate goal of developing non-invasive computer vision algorithms for assessing shrimp biometrics and biomass estimation. The main folder, labeled "DATASET," contains five sub-folders—DB1, DB2, DB3, DB4, and DB5—each filled with images of shrimps. Additionally, each sub-folder is accompanied by an Excel file that includes manually measured data for the shrimps pictured. The files are named respectively: DB1_INDUSTRIAL_FARM_1, DB2_INDUSTRIAL_FARM_2_C1, DB3_INDUSTRIAL_FARM_2_C2, DB4_ACADEMIC_POND_S1, and DB5_ACADEMIC_POND_S2.
Here’s a detailed description of the contents of each sub-folder and its corresponding Excel file:
1) DB1 includes 490 PNG images of 22 shrimps taken from one pond at an industrial farm. The associated Excel file, DB1_INDUSTRIAL_FARM_1, contains columns for:
SAMPLE: Reflecting the number of individual shrimps (22 entries or rows).
LENGTH (cm): Measuring from the rostrum (near the eyes) to the start of the tail.
WEIGHT (g): Recorded using a scale.
COMPLETE SHRIMP IMAGES: Indicates if at least one full-body image is available (1) or not (0).
2) DB2 consists of 2002 PNG images of 58 shrimps. The Excel file, DB2_INDUSTRIAL_FARM_2_C1, includes:
SAMPLE: Number of shrimps (58 entries or rows).
CEPHALOTHORAX (cm): Total length of the cephalothorax.
LENGTH (cm) and WEIGHT (g): Similar measurements as DB1.
COMPLETE SHRIMP IMAGES: Presence (1) or absence (0) of full-body images.
3) DB3 contains 1719 PNG images of 50 shrimps, with its Excel file, DB3_INDUSTRIAL_FARM_2_C2, documenting:
SAMPLE: Number of shrimps (50 entries or rows).
Measurements and categories identical to DB2.
4) DB4 encompasses 635 PNG images of 20 shrimps, detailed in the Excel file DB4_ACADEMIC_POND_S1. This includes:
SAMPLE: Number of shrimps (20 entries or rows).
CEPHALOTHORAX (cm), LENGTH (cm), WEIGHT (g), and COMPLETE SHRIMP IMAGES: Documented as in other datasets.
5) DB5 includes 661 PNG images of 20 shrimps, with DB5_ACADEMIC_POND_S2 as the corresponding Excel file. The file mirrors the structure and measurements of DB4.
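The per-folder figures listed above can be gathered into a small manifest, which is handy for sanity-checking a local copy of the data. The following is only a sketch: the names `SubsetInfo`, `MANIFEST`, and `totals` are hypothetical, and the column tuples restate the description above (their order in the actual Excel files is an assumption).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsetInfo:
    """Facts about one sub-folder, copied from the dataset description."""
    excel_name: str
    n_images: int
    n_shrimps: int
    columns: tuple


# Column order is assumed; only the column names come from the description.
BASE_COLUMNS = ("SAMPLE", "LENGTH (cm)", "WEIGHT (g)", "COMPLETE SHRIMP IMAGES")
FULL_COLUMNS = ("SAMPLE", "CEPHALOTHORAX (cm)", "LENGTH (cm)", "WEIGHT (g)",
                "COMPLETE SHRIMP IMAGES")

MANIFEST = {
    # DB1 is the only subset described without a CEPHALOTHORAX column.
    "DB1": SubsetInfo("DB1_INDUSTRIAL_FARM_1", 490, 22, BASE_COLUMNS),
    "DB2": SubsetInfo("DB2_INDUSTRIAL_FARM_2_C1", 2002, 58, FULL_COLUMNS),
    "DB3": SubsetInfo("DB3_INDUSTRIAL_FARM_2_C2", 1719, 50, FULL_COLUMNS),
    "DB4": SubsetInfo("DB4_ACADEMIC_POND_S1", 635, 20, FULL_COLUMNS),
    "DB5": SubsetInfo("DB5_ACADEMIC_POND_S2", 661, 20, FULL_COLUMNS),
}


def totals(manifest):
    """Return (total images, total shrimps) summed over all subsets."""
    images = sum(info.n_images for info in manifest.values())
    shrimps = sum(info.n_shrimps for info in manifest.values())
    return images, shrimps


print(totals(MANIFEST))
```

Comparing these expected counts against the number of files and spreadsheet rows actually found on disk is a quick way to detect an incomplete download.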
The images in each folder are named "sm_n", where m is the shrimp sample number and n is the index of the picture of that shrimp.
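To link each image back to its row in the Excel file, the "sm_n" name can be parsed into a sample number and a picture index. This is a minimal sketch under assumptions: the description gives only the "sm_n" pattern, so the ".png" extension handling, the absence of zero padding, and the helper names `parse_image_name` and `group_by_sample` are illustrative guesses rather than documented behavior.

```python
import re
from collections import defaultdict

# Assumed file-name shape: "s<m>_<n>" plus an optional extension, e.g. "s3_14.png".
NAME_RE = re.compile(r"^s(\d+)_(\d+)(?:\.\w+)?$", re.IGNORECASE)


def parse_image_name(filename):
    """Return (sample_number, picture_index), or None if the name does not match."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def group_by_sample(filenames):
    """Map each sample number to the list of its image file names."""
    groups = defaultdict(list)
    for name in filenames:
        parsed = parse_image_name(name)
        if parsed is not None:
            groups[parsed[0]].append(name)
    return dict(groups)
```

With the grouping in hand, the key of each group can be matched against the SAMPLE column of the corresponding Excel file.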
By pairing shrimp images with manually measured length and weight, the dataset supports the development of non-invasive measurement algorithms. Such algorithms could improve the precision of biomass estimation in aquaculture farming through statistical morphology analysis and machine learning techniques.
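As one illustration of how the LENGTH (cm) and WEIGHT (g) columns might feed a biomass estimate, the sketch below fits the allometric length-weight relation W = a * L^b by least squares on log-transformed values. This model is a common convention in fisheries work and is an assumption here, not a method stated by the dataset authors. The function names are hypothetical, and the demo values are synthetic, generated from a known power law rather than taken from the dataset.

```python
import math


def fit_length_weight(lengths_cm, weights_g):
    """Fit W = a * L**b via ordinary least squares on (log L, log W)."""
    xs = [math.log(length) for length in lengths_cm]
    ys = [math.log(weight) for weight in weights_g]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope in log-log space
    a = math.exp(mean_y - b * mean_x)     # intercept mapped back to linear space
    return a, b


def estimate_biomass(lengths_cm, a, b):
    """Sum the predicted weights (g) for a list of lengths (cm)."""
    return sum(a * length ** b for length in lengths_cm)


# Synthetic demo values from W = 0.01 * L**3; NOT measurements from the dataset.
demo_lengths = [8.0, 10.0, 12.0, 14.0]
demo_weights = [0.01 * length ** 3 for length in demo_lengths]
a, b = fit_length_weight(demo_lengths, demo_weights)
```

In a vision pipeline the lengths would come from image-based measurements rather than a ruler, with the manual values in the Excel files serving as ground truth for evaluating the fit.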
CHANGES FROM VERSION 1:
The cephalothorax metric is the length rather than the width. That was an error in the first version. The name in the columns also had a typo, which has been corrected (from CEPHALOTORAX to CEPHALOTHORAX).
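Code that may encounter spreadsheets from either version can normalize column headers before use. The sketch below handles the corrected spelling; the helper name `normalize_header` is hypothetical, and the exact header text in version 1 files (for example, whether it carried a unit suffix) is an assumption.

```python
# Known header misspelling from version 1, mapped to the corrected spelling.
HEADER_FIXES = {"CEPHALOTORAX": "CEPHALOTHORAX"}


def normalize_header(header):
    """Collapse stray whitespace and apply the known spelling correction."""
    cleaned = " ".join(header.split())
    for wrong, right in HEADER_FIXES.items():
        cleaned = cleaned.replace(wrong, right)
    return cleaned
```

Headers that are already correct pass through unchanged, so the function is safe to apply to files from the current version as well.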
Created:
2024-07-01



