PINGEcosystem/sss-crab-pot-detection
Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-12.
Download link: https://hf-mirror.com/datasets/PINGEcosystem/sss-crab-pot-detection
---
task_categories:
- object-detection
dataset_info:
  features:
  - name: image
    dtype: image
  - name: objects
    sequence:
    - name: bbox
      sequence: float32
      length: 4
    - name: category
      dtype: string
    - name: area
      dtype: float64
license: gpl
language:
- en
tags:
- sonar
- side-scan-sonar
- crab-pot
- derelict-fishing-gear
pretty_name: Ghost Pot Side-Scan Sonar Detection Dataset
size_categories:
- 1K<n<10K
---
# 🦀 Ghost Pot Side‑Scan Sonar Detection Dataset
**Side‑Scan Sonar Imagery & Annotations for Derelict Crab Pot Detection**
This dataset contains manually annotated side‑scan sonar (SSS) imagery collected across Delaware’s Inland Bays and Delaware Bay to support research on automated detection of derelict crab pots (“ghost pots”). It is designed for training and evaluating object‑detection models in turbid, shallow‑water environments where visual surveys are limited and acoustic mapping is essential.
The dataset accompanies the [GhostVision](https://github.com/PINGEcosystem/GhostVision) project, an open‑source pipeline for near–real‑time detection and mapping of derelict fishing gear.
## 📜 Publication
*In Progress*
## 📦 Dataset Overview
- Total images: 6,674
- Sensor type: Consumer-grade Humminbird side-scan sonar
- Spatial coverage:
  - northern Rehoboth Bay near Dewey Beach
  - western Indian River Bay
  - southern Indian River Bay near White Creek
- Annotation format: JSON Lines (JSONL)
- Annotation type: Axis-aligned bounding boxes
- Classes:
  - Crab-Pot: high-confidence derelict pot
  - Maybe-Crab-Pot: ambiguous or low-confidence pot-like target
- Coordinate system: Pixel coordinates in image space
File structure:
```text
sss-crab-pot-detection
├── train/
│   ├── <image>.jpg
│   └── metadata.jsonl
├── valid/
│   ├── <image>.jpg
│   └── metadata.jsonl
└── test/
    ├── <image>.jpg
    └── metadata.jsonl
```
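For working with a local copy of the files rather than the Hugging Face loader, the per-split `metadata.jsonl` can be read directly. The following is a minimal sketch, assuming the layout shown above; the function name `read_split_metadata` is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def read_split_metadata(split_dir):
    """Return one dict per image from <split_dir>/metadata.jsonl.

    Hypothetical helper: assumes each non-blank line is one JSON object.
    """
    records = []
    with open(Path(split_dir) / "metadata.jsonl", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Calling `read_split_metadata("sss-crab-pot-detection/train")` would return the rows for the training split, provided the dataset has been downloaded to that path.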
All annotations were created manually by trained analysts using high‑resolution sonar mosaics and cross‑validated with field notes and independent review.
## 🎯 Motivation
Derelict crab pots are a major source of ghost‑fishing mortality and benthic habitat damage. Existing removal programs rely on opportunistic sonar use during retrieval operations, meaning no formal mapping or archiving of pot locations exists. As a result, managers lack:
- basin‑wide estimates of derelict pot abundance
- spatial patterns of accumulation
- quantitative evaluation of removal efforts
This dataset enables reproducible, scalable research on automated detection, mapping, and monitoring of derelict fishing gear.
## 🗂️ Data Structure
This dataset uses JSON Lines (JSONL) format, where each line corresponds to a single image and all annotations associated with that image. This structure is natively supported by the Hugging Face Datasets library and enables efficient streaming and inspection.
### JSONL Schema
Each row has the following structure:
```json
{
"file_name": "<image filename>",
"objects": {
"bbox": [ [x, y, width, height], ... ],
"category": [ "<class>", ... ],
"area": [ <area>, ... ]
}
}
```
### Field Definitions
- `file_name`: relative path to the image file
- `objects.bbox`: list of bounding boxes in `[x, y, width, height]` format
- `objects.category`: list of class labels (`"Crab-Pot"` or `"Maybe-Crab-Pot"`)
- `objects.area`: list of pixel areas for each bounding box
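The field definitions imply a few structural invariants: the three lists under `objects` run in parallel, each box has four values, and each label is one of the two class names. A short sanity check along those lines might look like the sketch below. The function `check_record` is a hypothetical helper written for illustration, not something shipped with the dataset.

```python
ALLOWED_CATEGORIES = {"Crab-Pot", "Maybe-Crab-Pot"}


def check_record(record):
    """Return a list of problems found in one metadata row (empty if none)."""
    problems = []
    if "file_name" not in record:
        problems.append("missing file_name")
    objects = record.get("objects", {})
    boxes = objects.get("bbox", [])
    categories = objects.get("category", [])
    areas = objects.get("area", [])
    # The three lists describe the same objects, so their lengths must match.
    if not (len(boxes) == len(categories) == len(areas)):
        problems.append("bbox, category, and area lists differ in length")
    for box in boxes:
        if len(box) != 4:
            problems.append(f"box {box} does not have four values")
    for name in categories:
        if name not in ALLOWED_CATEGORIES:
            problems.append(f"unexpected category {name!r}")
    return problems
```

An image with no detections passes the check, since all three lists are empty and therefore equal in length.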
### Example: Image With No Detections
```json
{
"file_name": "Contact_334_sslo_png_jpg.rf.9c8a769f3c36deba4278527e714e09b7.jpg",
"objects": {
"bbox": [],
"category": [],
"area": []
}
}
```
### Example: Image With Annotations
```json
{
"file_name": "Contact_203_sslo_png_jpg.rf.d9b44748a1bf7d9a30126e649741d745.jpg",
"objects": {
"bbox": [
[309, 307, 26, 21],
[196, 198, 29.5, 19]
],
"category": ["Crab-Pot", "Crab-Pot"],
"area": [546, 560.5]
}
}
```
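Many detection libraries expect corner coordinates rather than `[x, y, width, height]`. The sketch below converts the two boxes from the example and confirms that each `area` value equals width times height. It assumes `(x, y)` is the top-left corner of the box, as in the COCO convention; the card does not state this explicitly, so treat it as an assumption to verify against the imagery.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max].

    Assumes (x, y) is the top-left corner of the box.
    """
    x, y, width, height = box
    return [x, y, x + width, y + height]


# Values copied from the annotated example above.
objects = {
    "bbox": [[309, 307, 26, 21], [196, 198, 29.5, 19]],
    "category": ["Crab-Pot", "Crab-Pot"],
    "area": [546, 560.5],
}

for box, area in zip(objects["bbox"], objects["area"]):
    width, height = box[2], box[3]
    print(xywh_to_xyxy(box), width * height == area)
# [309, 307, 335, 328] True
# [196, 198, 225.5, 217] True
```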
### Class Labels
- `Crab-Pot`: high-confidence derelict pot
- `Maybe-Crab-Pot`: ambiguous or low-confidence pot-like target
These are stored as strings for readability and compatibility with the Hugging Face viewer.
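Because the labels are strings, most training pipelines will need them mapped to integer IDs first. A minimal sketch follows; the ID order (`Crab-Pot` as 0, `Maybe-Crab-Pot` as 1) is an arbitrary choice made here for illustration, not one defined by the dataset.

```python
# Assumed ordering: the dataset itself does not assign integer IDs.
CLASS_NAMES = ["Crab-Pot", "Maybe-Crab-Pot"]
CLASS_TO_ID = {name: idx for idx, name in enumerate(CLASS_NAMES)}


def encode_categories(categories):
    """Map a list of string labels to integer class IDs."""
    return [CLASS_TO_ID[name] for name in categories]


print(encode_categories(["Crab-Pot", "Maybe-Crab-Pot", "Crab-Pot"]))
# [0, 1, 0]
```

An unknown label raises `KeyError`, which surfaces labeling problems early instead of silently assigning a default class.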
## 🚀 Loading the Dataset
```python
from datasets import load_dataset

# Load every split; the repository ID must be passed as a string
ds = load_dataset("PINGEcosystem/sss-crab-pot-detection")

# Get the test split
ds = ds["test"]

# Access the bounding boxes for the first image
ds[0]["objects"]["bbox"]
```
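Once a split is loaded, a quick pass over the rows gives the class balance and the number of images without any annotation. The sketch below works on any iterable of rows shaped like the JSONL schema; `summarize` is a hypothetical helper, and the two sample rows are made up for the demonstration.

```python
from collections import Counter


def summarize(rows):
    """Count labels per class and images that have no annotations."""
    class_counts = Counter()
    empty_images = 0
    for row in rows:
        categories = row["objects"]["category"]
        if not categories:
            empty_images += 1
        class_counts.update(categories)
    return class_counts, empty_images


# Made-up rows, for illustration only.
sample_rows = [
    {"objects": {"bbox": [], "category": [], "area": []}},
    {"objects": {"bbox": [[0, 0, 2, 2], [5, 5, 3, 3]],
                 "category": ["Crab-Pot", "Maybe-Crab-Pot"],
                 "area": [4, 9]}},
]
counts, empty = summarize(sample_rows)
print(dict(counts), empty)
# {'Crab-Pot': 1, 'Maybe-Crab-Pot': 1} 1
```

A loaded split can be passed in directly. Iterating a split decodes every image, so selecting only the `objects` column first (for example with `Dataset.select_columns`, if your installed `datasets` version provides it) is likely to be faster.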
## 🧪 Recommended Uses
This dataset is suitable for:
- Object detection (YOLO, DETR, RT‑DETR, RF‑DETR, etc.)
- Acoustic target classification
- Weakly supervised or semi‑supervised learning
- Spatiotemporal persistence modeling
- Benchmarking confidence/persistence fusion methods
- Research on ghost‑gear mapping and marine debris monitoring
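For YOLO-style trainers, annotations usually need to be rewritten as one text line per object in the form `class cx cy w h`, with the box center and size normalized by the image dimensions. The following is a sketch under two assumptions: that `(x, y)` in `bbox` is the top-left corner, and that the caller supplies the image width and height (the card does not state image dimensions). The helper name `to_yolo_lines` and the class mapping are hypothetical.

```python
def to_yolo_lines(objects, image_width, image_height, class_to_id):
    """Build YOLO label lines ("class cx cy w h", normalized) for one image.

    Assumes bbox is [x, y, width, height] with (x, y) at the top-left corner.
    """
    lines = []
    for (x, y, w, h), name in zip(objects["bbox"], objects["category"]):
        cx = (x + w / 2) / image_width   # normalized box center, x
        cy = (y + h / 2) / image_height  # normalized box center, y
        lines.append(
            f"{class_to_id[name]} {cx:.6f} {cy:.6f} "
            f"{w / image_width:.6f} {h / image_height:.6f}"
        )
    return lines


# Made-up box and image size, for illustration only.
demo = {"bbox": [[100, 200, 50, 40]], "category": ["Crab-Pot"], "area": [2000]}
print(to_yolo_lines(demo, 400, 400, {"Crab-Pot": 0, "Maybe-Crab-Pot": 1}))
# ['0 0.312500 0.550000 0.125000 0.100000']
```

An image with no detections yields an empty list, which corresponds to an empty label file in the usual YOLO directory layout.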
## ⚠️ Limitations
- Sonar appearance varies with substrate, tow speed, and turbidity.
- Ambiguous targets are labeled as Maybe-Crab-Pot to avoid forcing false certainty.
- Not all pots are visible in every pass; some are buried or obscured.
- Dataset is region‑specific (Delaware), though methods generalize broadly.
- Users should consider domain adaptation when applying models to other regions or sonar systems.
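How to treat `Maybe-Crab-Pot` during training is left to the user. Three common options are to keep it as a separate class, drop those boxes, or merge them into `Crab-Pot`. The sketch below implements all three for a single image's `objects` dict; the function name and the policy names are hypothetical, and which option suits a given study is a judgment call that the dataset does not prescribe.

```python
def apply_ambiguity_policy(objects, policy="keep"):
    """Return a copy of `objects` with Maybe-Crab-Pot handled per `policy`.

    policy: "keep"  -> leave labels unchanged
            "drop"  -> remove ambiguous boxes
            "merge" -> relabel ambiguous boxes as "Crab-Pot"
    """
    if policy not in {"keep", "drop", "merge"}:
        raise ValueError(f"unknown policy: {policy!r}")
    result = {"bbox": [], "category": [], "area": []}
    for box, name, area in zip(
        objects["bbox"], objects["category"], objects["area"]
    ):
        if name == "Maybe-Crab-Pot":
            if policy == "drop":
                continue
            if policy == "merge":
                name = "Crab-Pot"
        result["bbox"].append(box)
        result["category"].append(name)
        result["area"].append(area)
    return result
```

Dropping ambiguous boxes turns them into implicit background, which may penalize a model for detecting real pots, so comparing policies on the validation split is worth considering.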
## 📜 License
This dataset is released under the GPL license.
## 🙌 Acknowledgments
This dataset was developed with support from:
- University of Delaware -- Center for Coastal Sediments Hydrodynamics and Engineering Lab (CSHEL)
- Delaware Sea Grant
- 2024 Autonomous Systems Bootcamp
- NOAA's Project ABLE
- NOAA Marine Debris Program
- Delaware Department of Natural Resources and Environmental Control (DNREC)
- Community volunteers participating in ghost‑gear surveys
Provider: PINGEcosystem



