
waltgrace/fiber-optic-drones-labels

Source: Hugging Face. Updated 2026-04-08; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/waltgrace/fiber-optic-drones-labels
Description:
---
license: apache-2.0
task_categories:
- object-detection
- image-classification
language:
- en
tags:
- drones
- fiber-optic
- object-detection
- vlm-labeled
- data-label-factory
size_categories:
- 1K<n<10K
pretty_name: Fiber-Optic Drones — Labels Only
configs:
- config_name: default
  data_files:
  - split: train
    path: data.parquet
---

# Fiber-Optic Drones — Labels Only

**Bounding-box annotations** for 2,260 fiber-optic-drone candidate images, with VLM verification verdicts.

**No image pixels are included**, only labels and metadata. This is the zero-redistribution-risk release. If you want the labels **with** the original images, see the sister dataset: [`waltgrace/fiber-optic-drones`](https://huggingface.co/datasets/waltgrace/fiber-optic-drones).

## What's in here

- **2,260** images
- **8,759** bounding boxes (Falcon Perception)
- **5,114** boxes (58%) verified YES by Qwen 2.5-VL-3B
- **5 categories**: `fiber optic spool`, `cable spool`, `drone`, `quadcopter`, `fiber optic drone`
- **5 buckets**: `positive/fiber_spool_drone`, `positive/spool_only`, `negative/drones_no_spool`, `distractor/round_things`, `background/empty`

The full pipeline that produced these labels is [`walter-grace/data-label-factory`](https://github.com/walter-grace/data-label-factory), which runs entirely on a 16 GB Apple Silicon Mac.

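As a quick sanity check on the summary counts, the sketch below recomputes the verified share and the average number of boxes per image. The variable names are illustrative; the three counts are copied from the list above.

```python
# Counts copied from the "What's in here" summary.
n_images = 2260
n_boxes = 8759
n_verified_yes = 5114

# Share of Falcon boxes that Qwen marked YES, and mean boxes per image.
verified_share = n_verified_yes / n_boxes
boxes_per_image = n_boxes / n_images

print(f"verified YES: {verified_share:.1%}")              # 58.4%
print(f"mean boxes per image: {boxes_per_image:.2f}")     # 3.88
```
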
## Schema

```python
from datasets import load_dataset

ds = load_dataset("waltgrace/fiber-optic-drones-labels", split="train")
ds[0]
# {
#   "image_id": 123,
#   "file_name": "positive/fiber_spool_drone/yt_3l6rNFzmv-o_00078.jpg",
#   "bucket": "positive/fiber_spool_drone",
#   "width": 640,
#   "height": 360,
#   "r2_key": "raw_v2/positive/fiber_spool_drone/yt_...jpg",
#   "n_bboxes": 12,
#   "n_approved": 8,
#   "bboxes": {
#     "annotation_id": [1, 2, 3, ...],
#     "category": ["cable spool", "drone", ...],
#     "x1": [...], "y1": [...], "x2": [...], "y2": [...],
#     "area": [...],
#     "vlm_verdict": ["YES", "NO", "YES", ...],   # Qwen 2.5-VL verdict
#     "vlm_reasoning": ["The main object is a cable spool.", ...],
#   },
# }
```

Bbox coordinates are in **pixel space** (not normalized), origin top-left.

## Bonus files

The repo also includes the raw label JSONs for users who don't want to go through HF Datasets:

- `run2.coco.json`: original Falcon Perception COCO file (2,260 images, 8,759 anns, 5 cats)
- `run2.verified.json`: Qwen 2.5-VL per-bbox verdicts and reasoning

## How was this labeled?

Stage 1: **Falcon Perception** (TII) drew 8,759 candidate bounding boxes across 2,260 web-scraped images using 5 query prompts.

Stage 2: **Qwen 2.5-VL-3B-Instruct** (Alibaba) cropped each bbox + context and answered "Is this a `<category>`? YES / NO / UNSURE" with reasoning.

Both stages ran locally on a base-model Apple Silicon Mac via [MLX Expert Sniper](https://huggingface.co/waltgrace/mlx-expert-sniper) (Falcon ~1.5 GB resident, Qwen ~2.5 GB).

## Per-query agreement (Falcon ↔ Qwen)

| Query | Falcon detections | Qwen approved | Agreement |
|---|---:|---:|---:|
| cable spool | 2,798 | ~2,460 | 88% |
| quadcopter | 1,805 | ~1,460 | 81% |
| drone | 2,186 | ~1,750 | 80% |
| fiber optic spool | 1,397 | ~800 | 57% |
| fiber optic drone | 573 | ~440 | 77% |

`fiber optic spool` is the niche query: Falcon overfires and Qwen rejects 43% of its detections.

## License & redistribution

**Labels:** Apache 2.0. Use them however you want.

**Images:** NOT included. They were gathered from DuckDuckGo, Wikimedia, Openverse, and YouTube; original copyright belongs to each source. Use the labels with images you've gathered yourself, or use the [full dataset](https://huggingface.co/datasets/waltgrace/fiber-optic-drones) under its CC-BY-NC research-use license.

## Citation

```bibtex
@dataset{walter-grace-2026-fiber-optic-drones-labels,
  author    = {walter-grace},
  title     = {Fiber-Optic Drones — Labels Only},
  year      = 2026,
  publisher = {Hugging Face},
  url       = {https://huggingface.co/datasets/waltgrace/fiber-optic-drones-labels},
}
```

## Reproduce

```bash
pip install git+https://github.com/walter-grace/data-label-factory
data_label_factory pipeline --project projects/drones.yaml
```
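
Returning to the schema documented in this card, here is a minimal sketch of one way to consume a row: keep only the boxes whose `vlm_verdict` is `YES` and convert their pixel-space corners to normalized center-format boxes, a layout many detection trainers expect. The `record` dict is a hand-made stand-in that mirrors the documented fields rather than a real row, and `approved_boxes` is a hypothetical helper name, not part of the dataset or the pipeline.

```python
from typing import Dict, List, Tuple

# Hand-made stand-in mirroring the documented schema; values are illustrative only.
record: Dict = {
    "width": 640,
    "height": 360,
    "bboxes": {
        "category": ["cable spool", "drone", "quadcopter"],
        "x1": [64.0, 320.0, 0.0],
        "y1": [36.0, 90.0, 0.0],
        "x2": [192.0, 480.0, 64.0],
        "y2": [108.0, 270.0, 36.0],
        "vlm_verdict": ["YES", "NO", "YES"],
    },
}


def approved_boxes(row: Dict) -> List[Tuple[str, float, float, float, float]]:
    """Return (category, cx, cy, w, h) for YES boxes, normalized to [0, 1]."""
    width, height = row["width"], row["height"]
    b = row["bboxes"]
    out = []
    for cat, x1, y1, x2, y2, verdict in zip(
        b["category"], b["x1"], b["y1"], b["x2"], b["y2"], b["vlm_verdict"]
    ):
        if verdict != "YES":
            continue  # drop boxes the VLM rejected or was unsure about
        cx = (x1 + x2) / 2 / width   # box center, as a fraction of image width
        cy = (y1 + y2) / 2 / height  # box center, as a fraction of image height
        w = (x2 - x1) / width
        h = (y2 - y1) / height
        out.append((cat, cx, cy, w, h))
    return out


for box in approved_boxes(record):
    print(box)
```

Under the documented schema, the same function should accept a row such as `ds[0]` from `load_dataset` directly, since `bboxes` is shown as a dict of parallel lists. Treating `UNSURE` the same as `NO` is a choice made for this sketch; a looser filter could keep those boxes for manual review.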
Provider:
waltgrace