
SACo-VEval

ModelScope community; updated 2026-01-06; indexed 2025-11-22
Download link:
https://modelscope.cn/datasets/facebook/SACo-VEval
Resource description:
# SA-Co/VEval Dataset

**License**: each domain has its own license.

* SA-Co/VEval - SA-V: CC-BY-NC 4.0
* SA-Co/VEval - YT-Temporal-1B: CC-BY-NC 4.0
* SA-Co/VEval - SmartGlasses: CC-BY 4.0

**SA-Co/VEval** is an evaluation dataset comprising 3 domains. Each domain has a val and a test split.

* SA-Co/VEval - SA-V: videos are from the [SA-V dataset](https://ai.meta.com/datasets/segment-anything-video/)
* SA-Co/VEval - YT-Temporal-1B: videos are from [YT-Temporal-1B](https://cove.thecvf.com/datasets/704)
* SA-Co/VEval - SmartGlasses: egocentric videos from [Smart Glasses](https://huggingface.co/datasets/facebook/SACo-VEval/blob/main/media/saco_sg.tar.gz)

This Hugging Face dataset repo contains the following:

```
datasets/facebook/SACo-VEval/tree/main/
├── annotation/
│   ├── saco_veval_sav_test.json
│   ├── saco_veval_sav_val.json
│   ├── saco_veval_smartglasses_test.json
│   ├── saco_veval_smartglasses_val.json
│   ├── saco_veval_yt1b_test.json
│   └── saco_veval_yt1b_val.json
└── media/
    ├── saco_sg.tar.gz
    └── yt1b_start_end_time.json
```

* annotation
  * all the GT (ground truth) json files
* media
  * `saco_sg.tar.gz`: the preprocessed JPEGImages for SA-Co/VEval - SmartGlasses
  * `yt1b_start_end_time.json`: the YouTube video ids and the start and end times used in SA-Co/VEval - YT-Temporal-1B

Details on preparing the complete SA-Co/VEval dataset can be found in the [SAM 3 GitHub](https://github.com/facebookresearch/sam3/tree/main/scripts/eval/veval).

## Annotation Format

The format is similar to the [YTVIS](https://youtube-vos.org/dataset/vis/) format.

Each annotation json, e.g. `saco_veval_sav_test.json`, has 5 fields:

* info
  * A dict containing the dataset info
  * E.g. `{'version': 'v1', 'date': '2025-09-24', 'description': 'SA-Co/VEval SA-V Test'}`
* videos
  * A list of the videos used in the current annotation json
  * Each entry contains {id, video_name, file_names, height, width, length}
* annotations
  * A list of **positive** masklets and their related info
  * Each entry contains {id, segmentations, bboxes, areas, iscrowd, video_id, height, width, category_id, noun_phrase}
    * video_id should match the `videos - id` field above
    * category_id should match the `categories - id` field below
    * segmentations is a list of [RLE](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/mask.py)
* categories
  * A **global** noun phrase id map, shared across all 3 domains
  * Each entry contains {id, name}
    * name is the noun phrase
* video_np_pairs
  * A list of the video-np pairs, both **positive** and **negative**, used in the current annotation json
  * Each entry contains {id, video_id, category_id, noun_phrase, num_masklets}
    * video_id should match the `videos - id` field above
    * category_id should match the `categories - id` field above
    * when `num_masklets > 0` it is a positive video-np pair, and its masklets can be found in the annotations field
    * when `num_masklets = 0` it is a negative video-np pair, meaning no masklet is present at all

```
data {
    "info": info
    "videos": [video]
    "annotations": [annotation]
    "categories": [category]
    "video_np_pairs": [video_np_pair]
}
video {
    "id": int
    "video_name": str  # e.g. sav_000000
    "file_names": List[str]
    "height": int
    "width": int
    "length": int
}
annotation {
    "id": int
    "segmentations": List[RLE]
    "bboxes": List[List[int, int, int, int]]
    "areas": List[int]
    "iscrowd": int
    "video_id": str
    "height": int
    "width": int
    "category_id": int
    "noun_phrase": str
}
category {
    "id": int
    "name": str
}
video_np_pair {
    "id": int
    "video_id": str
    "category_id": int
    "noun_phrase": str
    "num_masklets": int
}
```

The notebook [sam3/examples/saco_veval_vis_example.ipynb](https://github.com/facebookresearch/sam3/blob/main/examples/saco_veval_vis_example.ipynb) in the SAM 3 GitHub shows examples of the data format and data visualization.
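
The relationship between `video_np_pairs` and `annotations` described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the official SAM 3 tooling. The helper names (`load_annotations`, `group_masklets`, `split_pairs`) and every value in the toy record are invented for the example, and the `counts` string is a placeholder rather than a valid RLE. The sketch also assumes that the masklets of a positive pair are the annotations sharing its `video_id` and `category_id`. Because the schema lists `video.id` as `int` but `video_id` as `str`, ids are normalised to `str` before comparison.

```python
import json
from collections import defaultdict


def load_annotations(path):
    """Read one annotation JSON file, e.g. saco_veval_sav_test.json."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def group_masklets(data):
    """Map (video_id, category_id) -> list of annotation dicts (masklets)."""
    groups = defaultdict(list)
    for ann in data["annotations"]:
        # Normalise video_id to str: the schema lists video.id as int
        # but annotation.video_id as str.
        groups[(str(ann["video_id"]), ann["category_id"])].append(ann)
    return groups


def split_pairs(data):
    """Split video_np_pairs into positives (num_masklets > 0) and negatives."""
    positives, negatives = [], []
    for pair in data["video_np_pairs"]:
        (positives if pair["num_masklets"] > 0 else negatives).append(pair)
    return positives, negatives


# Toy record in the documented shape. All values are invented for
# illustration; "counts" is a placeholder, not a decodable RLE string.
toy = {
    "info": {"version": "v1", "date": "2025-09-24", "description": "toy"},
    "videos": [
        {"id": 0, "video_name": "sav_000000", "file_names": ["00000.jpg"],
         "height": 4, "width": 4, "length": 1},
    ],
    "annotations": [
        {"id": 1,
         "segmentations": [{"size": [4, 4], "counts": "placeholder"}],
         "bboxes": [[0, 0, 2, 2]], "areas": [4], "iscrowd": 0,
         "video_id": "0", "height": 4, "width": 4,
         "category_id": 1, "noun_phrase": "dog"},
    ],
    "categories": [{"id": 1, "name": "dog"}, {"id": 2, "name": "red car"}],
    "video_np_pairs": [
        {"id": 1, "video_id": "0", "category_id": 1,
         "noun_phrase": "dog", "num_masklets": 1},
        {"id": 2, "video_id": "0", "category_id": 2,
         "noun_phrase": "red car", "num_masklets": 0},
    ],
}

groups = group_masklets(toy)
positives, negatives = split_pairs(toy)

for pair in positives:
    key = (str(pair["video_id"]), pair["category_id"])
    print(pair["noun_phrase"], "masklets found:", len(groups[key]))
for pair in negatives:
    print(pair["noun_phrase"], "is a negative pair (no masklets)")
```

To run this against a real file, replace `toy` with `load_annotations("annotation/saco_veval_sav_val.json")` after downloading the repo. Decoding the RLE entries into binary masks would additionally need `pycocotools`, which is left out here to keep the sketch dependency-free.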

Provider:
maas
Created:
2025-11-20
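Collected summary
Dataset introduction
Background and challenges
Background overview
SACo-VEval is a multi-domain evaluation dataset containing video data from three domains: SA-V, YT-Temporal-1B, and SmartGlasses, each with a validation and a test split. The dataset provides detailed annotations, including video metadata, object segmentation masks, bounding boxes, and category labels, and is suited to evaluating video understanding and object segmentation tasks.
The summary above was collected and generated by 遇见数据集.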