Ultralytics/COCO8
Source: Hugging Face. Updated 2024-11-14; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/Ultralytics/COCO8
Resource description:
---
comments: true
description: >-
  Explore the Ultralytics COCO8 dataset, a versatile and manageable set of 8
  images perfect for testing object detection models and training pipelines.
keywords: >-
  COCO8, Ultralytics, dataset, object detection, YOLO11, training, validation,
  machine learning, computer vision
license: agpl-3.0
task_categories:
- object-detection
size_categories:
- n<1K
language:
- en
pretty_name: COCO8
---
# COCO8 Dataset
## Introduction
[Ultralytics](https://www.ultralytics.com/) COCO8 is a small but versatile [object detection](https://www.ultralytics.com/glossary/object-detection) dataset composed of the first 8 images of the COCO train 2017 set: 4 for training and 4 for validation. It is well suited to testing and debugging object detection models and to experimenting with new detection approaches. At 8 images it is easy to manage, yet diverse enough to surface errors in a training pipeline and to serve as a sanity check before training on larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (YAML Ain't Markup Language) file defines the dataset configuration: the dataset's paths, its classes, and other relevant settings. For COCO8, the `coco8.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml).
!!! example "ultralytics/cfg/datasets/coco8.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/coco8.yaml"
    ```
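To make the role of the configuration concrete, the sketch below represents the kind of fields a detection dataset YAML such as `coco8.yaml` carries (a dataset root, train and val image directories, and a class-name mapping) as a plain Python dictionary and resolves the split directories. The field values, the truncated class list, and the helper `resolve_splits` are illustrative assumptions, not the contents of the real file or part of the Ultralytics API.

```python
from pathlib import PurePosixPath

# Illustrative stand-in for a parsed dataset YAML. The values are assumptions;
# the real coco8.yaml lists the full set of COCO class names.
dataset_cfg = {
    "path": "datasets/coco8",   # dataset root directory
    "train": "images/train",    # training images, relative to `path`
    "val": "images/val",        # validation images, relative to `path`
    "names": {0: "person", 1: "bicycle", 2: "car"},  # class index -> name (truncated)
}


def resolve_splits(cfg):
    """Hypothetical helper: check required keys and join split dirs onto the root."""
    missing = [key for key in ("path", "train", "val", "names") if key not in cfg]
    if missing:
        raise KeyError(f"dataset config is missing keys: {missing}")
    root = PurePosixPath(cfg["path"])
    return {split: str(root / cfg[split]) for split in ("train", "val")}


splits = resolve_splits(dataset_cfg)
print(splits["train"])
print(splits["val"])
print(len(dataset_cfg["names"]), "classes listed")
```

Keeping the split directories relative to a single root is what lets the same configuration work wherever the dataset folder is placed.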
## Usage
To train a YOLO11n model on the COCO8 dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=coco8.yaml model=yolo11n.pt epochs=100 imgsz=640
        ```
## Sample Images and Annotations
The mosaiced training batch below shows images from the COCO8 dataset along with their corresponding annotations:
<img src="https://github.com/ultralytics/docs/releases/download/0/mosaiced-training-batch-1.avif" alt="Dataset sample image" width="800">
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO8 dataset and the benefits of using mosaicing during the training process.
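The core idea of mosaicing can be sketched in a few lines. The example below is a deliberately simplified illustration, not the Ultralytics implementation: it tiles four equally sized images into a fixed 2x2 grid and shifts each image's pixel-coordinate boxes by the offset of its tile. Training pipelines generally also randomize placement, scaling, and cropping, which this sketch omits. The function name `mosaic_2x2` and the `(class_id, x1, y1, x2, y2)` box layout are assumptions made for the example.

```python
def mosaic_2x2(images, boxes_per_image):
    """Tile four equally sized images into a 2x2 grid and shift their boxes.

    images: list of 4 images, each a list of rows (lists of pixel values).
    boxes_per_image: list of 4 lists of (class_id, x1, y1, x2, y2) in pixels.
    """
    if len(images) != 4 or len(boxes_per_image) != 4:
        raise ValueError("expected exactly four images and four box lists")
    height, width = len(images[0]), len(images[0][0])
    if any(len(img) != height or any(len(row) != width for row in img) for img in images):
        raise ValueError("all images must share the same size")

    # Top half: each row is image 0's row followed by image 1's row.
    top = [images[0][r] + images[1][r] for r in range(height)]
    # Bottom half: the same for images 2 and 3.
    bottom = [images[2][r] + images[3][r] for r in range(height)]
    canvas = top + bottom

    # (x, y) offset of each tile's top-left corner on the canvas.
    offsets = [(0, 0), (width, 0), (0, height), (width, height)]
    merged = []
    for (dx, dy), boxes in zip(offsets, boxes_per_image):
        for class_id, x1, y1, x2, y2 in boxes:
            merged.append((class_id, x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, merged


# Four 2x2 "images", each filled with its own index so the tiling is visible.
tiles = [[[i, i], [i, i]] for i in range(4)]
labels = [[(0, 0, 0, 1, 1)], [], [(2, 0, 0, 2, 2)], [(1, 1, 1, 2, 2)]]
canvas, merged = mosaic_2x2(tiles, labels)
for row in canvas:
    print(row)
print(merged)
```

After tiling, a single training sample carries the objects of all four source images, which is why each batch exposes the model to more varied scenes and object placements.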
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! quote ""

    === "BibTeX"

        ```bibtex
        @misc{lin2015microsoft,
              title={Microsoft COCO: Common Objects in Context},
              author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
              year={2015},
              eprint={1405.0312},
              archivePrefix={arXiv},
              primaryClass={cs.CV}
        }
        ```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
## FAQ
### What is the Ultralytics COCO8 dataset used for?
The Ultralytics COCO8 dataset is a compact yet versatile object detection dataset consisting of the first 8 images from the COCO train 2017 set, with 4 images for training and 4 for validation. It is designed for testing and debugging object detection models and for experimenting with new detection approaches. Despite its small size, COCO8 offers enough diversity to act as a sanity check for your training pipelines before you train on larger datasets. For more details, see the [`coco8.yaml` configuration](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml).
### How do I train a YOLO11 model using the COCO8 dataset?
To train a YOLO11 model using the COCO8 dataset, you can employ either Python or CLI commands. Here's how you can start:
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=coco8.yaml model=yolo11n.pt epochs=100 imgsz=640
        ```
For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
### Why should I use Ultralytics HUB for managing my COCO8 training?
Ultralytics HUB is an all-in-one web tool designed to simplify the training and deployment of YOLO models, including the Ultralytics YOLO11 models on the COCO8 dataset. It offers cloud training, real-time tracking, and seamless dataset management. HUB allows you to start training with a single click and avoids the complexities of manual setups. Discover more about [Ultralytics HUB](https://hub.ultralytics.com/) and its benefits.
### What are the benefits of using mosaic augmentation in training with the COCO8 dataset?
Mosaic augmentation, demonstrated in the COCO8 dataset, combines multiple images into a single image during training. This technique increases the variety of objects and scenes in each training batch, improving the model's ability to generalize across different object sizes, aspect ratios, and contexts. This results in a more robust object detection model. For more details, refer to the [training guide](#usage).
### How can I validate my YOLO11 model trained on the COCO8 dataset?
To validate a YOLO11 model trained on the COCO8 dataset, run validation mode from the CLI or from a Python script; it evaluates the model on the validation split and reports detection metrics. For detailed instructions, visit the [Validation](../../modes/val.md) page.
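As a rough illustration of the Python route, the sketch below wraps a validation run in a small helper. The helper name `validate_on_coco8` and the returned dictionary keys are assumptions made for this example; the calls to `YOLO(...)`, `model.val(...)`, and the `metrics.box` attributes come from the Ultralytics package, which must be installed for the function to run.

```python
def validate_on_coco8(weights="yolo11n.pt"):
    """Run Ultralytics validation on COCO8 and return two headline box metrics."""
    # Imported lazily so that defining this helper does not require the package.
    from ultralytics import YOLO

    model = YOLO(weights)                   # pretrained weights, or a path to your own checkpoint
    metrics = model.val(data="coco8.yaml")  # evaluates on the validation split
    return {"mAP50-95": metrics.box.map, "mAP50": metrics.box.map50}


# Example call (downloads the weights and the dataset on first use):
# print(validate_on_coco8())
```

The CLI counterpart would be `yolo detect val model=yolo11n.pt data=coco8.yaml`. With only 4 validation images, the resulting numbers are best read as a check that the pipeline runs end to end rather than as a measure of model quality.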
Provider:
Ultralytics
Curated Summary
Dataset Overview

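Construction
In computer vision, building efficient and representative datasets is a cornerstone of progress in object detection algorithms. The Ultralytics/COCO8 dataset derives from the well-known COCO train 2017 set and is built for compactness and practicality. It takes the first 8 images of the original training set and splits them into 4 for training and 4 for validation, forming a tiny but structurally complete subset. This strategy aims to retain the basic characteristics of the original data distribution while, through its minimal sample size, giving researchers and developers a lightweight platform for quickly validating models and pipelines.
Usage
For practitioners who want to quickly validate an object detection model or try a new method, COCO8 offers a very simple path. It can be used directly through the Python interface or the command-line tool provided by Ultralytics. A typical workflow loads a pretrained YOLO11 model and starts training by pointing it at the `coco8.yaml` configuration file, with parameters such as the number of epochs and the input image size set as needed. This highly encapsulated approach lowers the barrier to entry, letting developers concentrate on debugging the core algorithm rather than on tedious data preparation and engineering configuration.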
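Background and Challenges
Background Overview
In computer vision, large-scale annotated datasets are a cornerstone of progress in object detection. Since it was introduced by Tsung-Yi Lin and colleagues in 2015, the Microsoft COCO dataset has become a standard benchmark for evaluating how well models recognize common objects in complex scenes. As a derived subset, Ultralytics/COCO8 was built by Ultralytics from the first 8 images of the COCO train 2017 set to give researchers and engineers a lightweight, easily managed test platform. Although small, it retains core characteristics of the original data in category diversity and scene complexity. It is mainly used to quickly check the feasibility of object detection model architectures, to debug training pipelines, and to run sanity checks before committing to large-scale data, which significantly lowers the initial barrier and compute cost of experiments.
Current Challenges
Although COCO8 has distinct value for rapid prototype validation, its core challenges should not be overlooked. From the standpoint of the domain problem, object detection itself faces inherent difficulties such as widely varying object scales, heavy occlusion, and background clutter; with so little data, a model cannot learn sufficiently general features and overfits very easily, which limits the dataset's usefulness for evaluating performance in real, complex scenes. As for construction, choosing a very small subset of the vast COCO data that is as representative as possible, so that it covers key visual patterns despite its size, is a serious challenge; maintaining consistency with the original dataset in annotation quality and class distribution also requires careful design and verification, to avoid introducing bias that would undermine the reliability of tests.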
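Common Scenarios
Typical Use Cases
In computer vision, developing and testing object detection models often calls for data that is quick and convenient to work with. As a pared-down subset of COCO train 2017, Ultralytics/COCO8 is typically used for model debugging and training pipeline verification. With its 8 images (4 for training, 4 for validation), researchers can quickly run prototype tests of object detection algorithms; it is especially suited to initial training and troubleshooting of YOLO-series models. Though tiny, the dataset covers varied objects and scenes, providing an ideal baseline for evaluating model performance under limited resources.
Academic Problems Addressed
Object detection research often faces long training times and heavy compute costs on large-scale datasets. COCO8 addresses the academic need for fast iteration and algorithm verification early in model development. It lets researchers complete a training loop within minutes, so they can focus on model architecture optimization, hyperparameter tuning, and exploring training strategies. As a "sanity check" tool, it helps identify potential errors in the training pipeline, reduces the debugging complexity that comes with very large datasets, and supports efficient innovation in object detection methods.
Practical Applications
In practical engineering, COCO8 is often embedded in automated machine learning workflows to quickly verify newly deployed object detection systems. For example, in industrial visual inspection or embedded device development, engineers can use it for lightweight tests of models such as YOLO11 to confirm that the training pipeline is configured correctly, avoiding the time and cost of going straight to large-scale data. In education, its simplicity also helps show students the basic principles of object detection and the steps of model training, lowering the practical barrier to getting started with machine learning.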
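Recent Research on the Dataset
Latest Research Directions
In computer vision, rapid progress in object detection has created a need for finer-grained model debugging and validation. As a lightweight test benchmark, Ultralytics/COCO8 is at the center of work on efficient model validation and training pipeline optimization. Recent developments include iterations of the YOLO model series, such as the release of YOLO11, which have driven the dataset's use in rapid prototype validation and in testing training strategies. Researchers use COCO8's compactness to explore how data augmentation techniques such as mosaic augmentation improve model generalization, and to assess the robustness of model performance in small-sample settings. This trend both shortens the experimental cycle for new algorithms and reduces compute consumption, and offers a useful reference for developing edge computing and real-time detection systems.
The content above was collected and summarized by 遇见数据集.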



