
BM-Bench

ModelScope community · Updated 2025-12-04 · Indexed 2025-05-31
Download link:
https://modelscope.cn/datasets/ByteDance-Seed/BM-Bench
Resource description:
[![Paper](https://img.shields.io/badge/%20arXiv-Paper-red)](https://arxiv.org/abs/2506.03107)
[![Project-Page](https://img.shields.io/badge/%20Project-Website-blue)](https://boese0601.github.io/bytemorph/)
[![Benchmark](https://img.shields.io/badge/🤗%20Huggingface-Benchmark-yellow)](https://huggingface.co/datasets/ByteDance-Seed/BM-Bench)
[![Dataset-Demo](https://img.shields.io/badge/🤗%20Huggingface-Dataset_Demo-yellow)](https://huggingface.co/datasets/ByteDance-Seed/BM-6M-Demo)
[![Dataset](https://img.shields.io/badge/🤗%20Huggingface-Dataset-yellow)](https://huggingface.co/datasets/ByteDance-Seed/BM-6M)
[![Gradio-Demo](https://img.shields.io/badge/🤗%20Huggingface-Gradio_Demo-yellow)](https://huggingface.co/spaces/Boese0601/ByteMorpher-Demo)
[![Checkpoint](https://img.shields.io/badge/🤗%20Huggingface-Checkpoint-yellow)](https://huggingface.co/ByteDance-Seed/BM-Model)
[![Code](https://img.shields.io/badge/%20Github-Code-blue)](https://github.com/ByteDance-Seed/BM-code)

# Dataset Card for ByteMorph-Bench

ByteMorph-Bench is a benchmark dataset for evaluating instruction-guided image editing models, focusing on the challenging task of non-rigid image manipulation. It contains image editing pairs that showcase a wide variety of non-rigid motion types.

## Dataset Details

### Description

We categorize editing based on non-rigid motion into 5 types, according to the editing capability each one requires:

(0) Camera Zoom: The camera that took the images moves closer (zoom in) or further away (zoom out).

(1) Camera Motion: The camera that took the images moves left, right, up, or down.

(2) Object Motion: The object or objects in the images move or undergo non-rigid motion.

(3) Human Motion: The person or people in the images move, change body pose, or change facial expression.

(4) Interaction: The person or people interact with objects, or people and objects interact with each other.
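The numbering of the five types above appears to match the prefix of the `edit_type` field (the dataset card shows `"0_camera_zoom"` for Camera Zoom). The following is a minimal sketch of how one might map `edit_type` strings back to category names and count examples per category. The helper names `category_of` and `count_by_category` are hypothetical, and the assumption that every `edit_type` starts with its category index followed by an underscore is inferred from that single example, not confirmed by the source.

```python
from collections import Counter

# Category names as listed in the description, keyed by their index.
EDIT_CATEGORIES = {
    0: "Camera Zoom",
    1: "Camera Motion",
    2: "Object Motion",
    3: "Human Motion",
    4: "Interaction",
}


def category_of(edit_type: str) -> str:
    """Map an edit_type such as "0_camera_zoom" to its category name.

    Assumption: the string starts with the category index, then "_".
    """
    index_text, _, _ = edit_type.partition("_")
    return EDIT_CATEGORIES[int(index_text)]


def count_by_category(edit_types):
    """Count how many edit_type values fall into each of the 5 categories."""
    counts = Counter(category_of(t) for t in edit_types)
    return {name: counts.get(name, 0) for name in EDIT_CATEGORIES.values()}


# Only "0_camera_zoom" appears in the dataset card; "3_human_motion" is a
# hypothetical value used purely for illustration.
sample = ["0_camera_zoom", "0_camera_zoom", "3_human_motion"]
print(count_by_category(sample))
```

With a loaded dataset, the same function could be applied to the `edit_type` column to check how the benchmark is distributed across the five types.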
### Dataset Sources

Original videos are generated by [Seaweed](https://seaweed.video/) and sampled into frames to form source-target image editing pairs. These frames are then captioned by a VLM and categorized into the 5 editing types according to the captions.

## Intended use

Primary intended uses: The primary use of ByteMorph is research on text-to-image generation and instruction-based image editing.

Primary intended users: The dataset's primary intended users are researchers and hobbyists in computer vision, image generation, image processing, and AIGC.

## Dataset Structure

```
{
  "edit_type": "0_camera_zoom", # editing type
  "image_id": "100893989", # original video name for sampled image pairs
  "src_img": "...", # source image
  "tgt_img": "...", # target image after editing
  "edit_prompt": "The camera angle shifts to a closer view, more people appear in the frame, and the individuals are now engaged in a discussion or negotiation.", # VLM caption of the editing
  "edit_prompt_rewrite_instruction": "Zoom in the camera angle, add more people to the frame, and adjust the individuals' actions to show them engaged in a discussion or negotiation.", # the VLM caption rewritten as an editing instruction
  "src_img_caption": "Several individuals are present, including three people wearing camouflage uniforms, blue helmets, and blue vests labeled \"UN.\" ... ", # the caption of the source image
  "tgt_img_caption": "Several individuals are gathered in an outdoor setting. Two people wearing blue helmets and blue vests with \"UN\" written on them are engaged in a discussion. ... ", # the caption of the target image
}
```

### How to use ByteMorph-Bench

Preprocess this evaluation dataset and visualize the images with the following script.
```python
import json
import os

from datasets import load_dataset
from PIL import Image
from tqdm import tqdm

# Load dataset
ds = load_dataset("Boese0601/ByteMorph-Bench", split="test")

# Define output root directory
output_root = "./output_bench"

for example in tqdm(ds):
    edit_type = example["edit_type"]
    image_id = example["image_id"]

    # Make subfolder by editing type
    subfolder = os.path.join(output_root, edit_type)
    os.makedirs(subfolder, exist_ok=True)

    # Source and target images (used below as PIL images)
    source_img = example["src_img"]
    target_img = example["tgt_img"]

    # Concatenate side by side: source on the left, target on the right
    w, h = source_img.size
    combined = Image.new("RGB", (w * 2, h))
    combined.paste(source_img, (0, 0))
    combined.paste(target_img, (w, 0))

    # Save combined image
    out_img_path = os.path.join(subfolder, f"{image_id}.png")
    combined.save(out_img_path)

    # Save the prompts and captions as a JSON file next to the image
    out_json_path = os.path.join(subfolder, f"{image_id}.json")
    json_content = {
        "edit": example["edit_prompt"],
        "edit_rewrite": example["edit_prompt_rewrite_instruction"],
        "input": example["src_img_caption"],
        "output": example["tgt_img_caption"],
    }
    with open(out_json_path, "w") as f:
        json.dump(json_content, f, indent=2)
```

Then use the script in [this repo](https://github.com/ByteDance-Seed/BM-code/tree/main/ByteMorph-Eval) for quantitative evaluation.

## Bibtex citation

```bibtex
@article{chang2025bytemorph,
  title={ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions},
  author={Chang, Di and Cao, Mingdeng and Shi, Yichun and Liu, Bo and Cai, Shengqu and Zhou, Shijie and Huang, Weilin and Wetzstein, Gordon and Soleymani, Mohammad and Wang, Peng},
  journal={arXiv preprint arXiv:2506.03107},
  year={2025}
}
```

## Disclaimer

Your access to and use of this dataset are at your own risk. We do not guarantee the accuracy of this dataset.
The dataset is provided “as is” and we make no warranty or representation to you with respect to it and we expressly disclaim, and hereby expressly waive, all warranties, express, implied, statutory or otherwise. This includes, without limitation, warranties of quality, performance, merchantability or fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. In no event will we be liable to you on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this public license or use of the licensed material. The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability.

Provider: maas
Created: 2025-05-30