five

An image-caption dataset

Source: DataCite Commons. Updated: 2026-05-07. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.18856453
Description:
This is an image–caption dataset comprising 30k images, 210k captions, and 304k bounding box annotations, designed to support Video Analytics Research and Applications. It contains real-world images depicting activities and events relevant to surveillance, public safety, and abnormal behavior detection.

Contents:
- 30,000 images
- 210,000 captions (seven captions per image)
- 304,043 bounding box annotations (multiple bounding boxes per image)

File Formats:
- Images: PNG, JPG, JPEG
- Bounding Boxes: TXT
- Captions: CSV

The dataset is intended for:
- Vision–Language Modeling
- Real-time Object Detection Model Development
- Text-based Image Retrieval for Video Analytics
- Multimodal Reasoning
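As a quick sanity check after downloading, the file formats listed in the description can be used to inventory an extracted copy of the archive. The sketch below is only illustrative: the listing does not document the directory layout, the CSV columns, or the TXT annotation format, so the code classifies files by extension alone. The function name `inventory` and the path `five_dataset` are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Extensions taken from the "File Formats" section of the description.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def inventory(root):
    """Count files under `root` by role, using the file extension only.

    Assumption: the archive has been extracted to `root`. The actual
    folder layout is not documented in the listing, so the whole tree
    is walked recursively.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext in IMAGE_EXTS:
            counts["images"] += 1
        elif ext == ".txt":
            counts["bounding_box_files"] += 1
        elif ext == ".csv":
            counts["caption_files"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # "five_dataset" is a hypothetical extraction directory.
    for role, n in sorted(inventory("five_dataset").items()):
        print(f"{role}: {n}")
```

The reported image count can then be compared with the 30,000 images stated in the description. Checking the caption and bounding box totals would require knowing the CSV and TXT schemas, which the listing does not specify.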
Provider:
Zenodo
Created:
2026-03-04