waltgrace/fiber-optic-drones-labels
Source: Hugging Face. Updated 2026-04-08; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/waltgrace/fiber-optic-drones-labels
Resource description:
---
license: apache-2.0
task_categories:
- object-detection
- image-classification
language:
- en
tags:
- drones
- fiber-optic
- object-detection
- vlm-labeled
- data-label-factory
size_categories:
- 1K<n<10K
pretty_name: Fiber-Optic Drones — Labels Only
configs:
- config_name: default
  data_files:
  - split: train
    path: data.parquet
---
# Fiber-Optic Drones — Labels Only
**Bounding-box annotations** for 2,260 fiber-optic-drone candidate images, with VLM verification verdicts. **No image pixels are included** — only labels and metadata. This is the zero-redistribution-risk release.
If you want the labels **with** the original images, see the sister dataset:
[`waltgrace/fiber-optic-drones`](https://huggingface.co/datasets/waltgrace/fiber-optic-drones).
## What's in here
- **2,260** images
- **8,759** bounding boxes, proposed by Falcon Perception
- **5,114** boxes (58%) verified YES by Qwen 2.5-VL-3B
- **5 categories**: `fiber optic spool`, `cable spool`, `drone`, `quadcopter`, `fiber optic drone`
- **5 buckets**: `positive/fiber_spool_drone`, `positive/spool_only`, `negative/drones_no_spool`, `distractor/round_things`, `background/empty`
The full pipeline that produced these labels:
[`walter-grace/data-label-factory`](https://github.com/walter-grace/data-label-factory) — runs entirely on a 16 GB Apple Silicon Mac.
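The bucket names listed above share a `group/subtype` shape, so grouping images by their top-level group is a one-line split. The sketch below is illustrative only: `bucket_group` is a hypothetical helper, not part of the dataset or the pipeline, and it assumes every bucket value follows the `group/subtype` pattern shown in the list.

```python
from collections import Counter


def bucket_group(bucket):
    """Return the top-level group of a bucket path, e.g. 'positive'."""
    return bucket.split("/", 1)[0]


# The five bucket names from this card.
buckets = [
    "positive/fiber_spool_drone",
    "positive/spool_only",
    "negative/drones_no_spool",
    "distractor/round_things",
    "background/empty",
]

groups = Counter(bucket_group(b) for b in buckets)
print(groups)
```

On the loaded dataset, the same helper could be applied to each row's `bucket` field to count images per group.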
## Schema
```python
from datasets import load_dataset
ds = load_dataset("waltgrace/fiber-optic-drones-labels", split="train")
ds[0]
# {
#   "image_id": 123,
#   "file_name": "positive/fiber_spool_drone/yt_3l6rNFzmv-o_00078.jpg",
#   "bucket": "positive/fiber_spool_drone",
#   "width": 640,
#   "height": 360,
#   "r2_key": "raw_v2/positive/fiber_spool_drone/yt_...jpg",
#   "n_bboxes": 12,
#   "n_approved": 8,
#   "bboxes": {
#     "annotation_id": [1, 2, 3, ...],
#     "category": ["cable spool", "drone", ...],
#     "x1": [...], "y1": [...], "x2": [...], "y2": [...],
#     "area": [...],
#     "vlm_verdict": ["YES", "NO", "YES", ...],  # Qwen 2.5-VL verdict
#     "vlm_reasoning": ["The main object is a cable spool.", ...],
#   },
# }
```
Bbox coordinates are in **pixel space** (not normalized), origin top-left.
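Many training frameworks expect normalized center-format boxes rather than pixel-space corners. As a minimal sketch, assuming `(x1, y1)` is the top-left corner and `(x2, y2)` the bottom-right, the conversion looks like this. The helper name `to_yolo` and the sample values are illustrative and not part of the dataset.

```python
def to_yolo(x1, y1, x2, y2, width, height):
    """Convert a pixel-space corner box to normalized (cx, cy, w, h)."""
    cx = (x1 + x2) / 2 / width   # box center, as a fraction of image width
    cy = (y1 + y2) / 2 / height  # box center, as a fraction of image height
    w = (x2 - x1) / width        # box width, as a fraction of image width
    h = (y2 - y1) / height       # box height, as a fraction of image height
    return cx, cy, w, h


# Illustrative box on a 640x360 image (values are made up).
print(to_yolo(160, 90, 480, 270, width=640, height=360))
# (0.5, 0.5, 0.5, 0.5)
```

The `width` and `height` arguments would come from the same row as the box, since the schema stores image dimensions per image.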
## Bonus files
The repo also includes the raw label JSONs for users who don't want to go through HF Datasets:
- `run2.coco.json` — original Falcon Perception COCO file (2,260 images, 8,759 annotations, 5 categories)
- `run2.verified.json` — Qwen 2.5-VL per-bbox verdicts and reasoning
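For readers who prefer the raw JSON, the sketch below counts boxes per category using only the standard library. It assumes `run2.coco.json` follows the standard COCO layout (`images`, `annotations`, `categories`, with `bbox` stored as `[x, y, width, height]`); the card does not spell this out, so check the file before relying on it. The sample dict is hand-made, not real data.

```python
import json
from collections import Counter


def load_coco(path):
    """Read a COCO-style JSON file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def boxes_per_category(coco):
    """Count annotations per category name."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(names[a["category_id"]] for a in coco["annotations"])


def xywh_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


# Tiny hand-made sample in the standard COCO layout (not real data).
sample = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 360}],
    "categories": [{"id": 1, "name": "drone"}, {"id": 2, "name": "cable spool"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [200, 40, 80, 80]},
        {"id": 3, "image_id": 1, "category_id": 2, "bbox": [300, 60, 40, 40]},
    ],
}

counts = boxes_per_category(sample)
corners = xywh_to_corners(sample["annotations"][0]["bbox"])
print(counts, corners)
```

If the file matches that layout, `boxes_per_category(load_coco("run2.coco.json"))` would give the per-category totals for the real labels.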
## How was this labeled?
Stage 1: **Falcon Perception** (TII) drew 8,759 candidate bounding boxes across 2,260 web-scraped images using 5 query prompts.
Stage 2: **Qwen 2.5-VL-3B-Instruct** (Alibaba) was shown a crop of each bbox plus surrounding context and answered "Is this a `<category>`? YES / NO / UNSURE" with reasoning.
Both stages ran locally on a base-model Apple Silicon Mac via [MLX Expert Sniper](https://huggingface.co/waltgrace/mlx-expert-sniper) (Falcon ~1.5 GB resident, Qwen ~2.5 GB).
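Because Stage 2 attaches a verdict to every box, a common first step is to keep only the boxes Qwen approved. The sketch below works on a single row shaped like the schema in this card, where `bboxes` is a dict of parallel lists. `approved_boxes` is a hypothetical helper and the sample row is hand-made.

```python
def approved_boxes(row, verdict="YES"):
    """Return the boxes in one row whose VLM verdict matches `verdict`."""
    b = row["bboxes"]
    keys = ("category", "x1", "y1", "x2", "y2")
    return [
        {k: b[k][i] for k in keys}
        for i, v in enumerate(b["vlm_verdict"])
        if v == verdict
    ]


# Hand-made row in the card's schema (values are illustrative).
row = {
    "bboxes": {
        "category": ["cable spool", "drone", "quadcopter"],
        "x1": [10, 200, 300],
        "y1": [20, 50, 60],
        "x2": [110, 400, 500],
        "y2": [120, 250, 260],
        "vlm_verdict": ["YES", "NO", "UNSURE"],
    }
}

kept = approved_boxes(row)
print(kept)
```

Passing `verdict="UNSURE"` instead would surface the boxes that may deserve a manual look.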
## Per-query agreement (Falcon ↔ Qwen)
| Query | Falcon detections | Qwen approved | Agreement |
|---|---:|---:|---:|
| cable spool | 2,798 | ~2,460 | 88% |
| quadcopter | 1,805 | ~1,460 | 81% |
| drone | 2,186 | ~1,750 | 80% |
| fiber optic spool | 1,397 | ~800 | 57% |
| fiber optic drone | 573 | ~440 | 77% |
`fiber optic spool` is the niche query: Falcon over-detects it, and Qwen rejects about 43% of its boxes.
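The agreement column is the approved count divided by the Falcon detection count, rounded to a whole percent. The short check below recomputes it from the table; the approved counts are the approximate (`~`) values shown above, so the results are approximate too.

```python
# (Falcon detections, approximate Qwen-approved count) per query, from the table.
table = {
    "cable spool": (2798, 2460),
    "quadcopter": (1805, 1460),
    "drone": (2186, 1750),
    "fiber optic spool": (1397, 800),
    "fiber optic drone": (573, 440),
}


def agreement_pct(detections, approved):
    """Share of Falcon boxes that Qwen approved, as a whole percent."""
    return round(100 * approved / detections)


agreement = {q: agreement_pct(d, a) for q, (d, a) in table.items()}
total_detections = sum(d for d, _ in table.values())

print(agreement)
print(total_detections)  # 8759, matching the box count reported in this card
```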
## License & redistribution
**Labels:** Apache 2.0. Use them however you want.
**Images:** NOT included. They were gathered from DuckDuckGo, Wikimedia, Openverse, and YouTube — original copyright belongs to each source. Use the labels with images you've gathered yourself, or use the [full dataset](https://huggingface.co/datasets/waltgrace/fiber-optic-drones) under its CC-BY-NC research-use license.
## Citation
```bibtex
@dataset{walter-grace-2026-fiber-optic-drones-labels,
author = {walter-grace},
title = {Fiber-Optic Drones — Labels Only},
year = 2026,
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/waltgrace/fiber-optic-drones-labels},
}
```
## Reproduce
```bash
pip install git+https://github.com/walter-grace/data-label-factory
data_label_factory pipeline --project projects/drones.yaml
```
Provider:
waltgrace



