Complex Image Manipulation via Natural Language Instructions (CIM-NLI)
Source: arXiv. Updated: 2023-10-25. Indexed: 2024-06-21.
Download link:
https://github.com/dair-iitd/NeuroSIM
Resource description:
CIM-NLI is a new dataset developed by a research team at the Indian Institute of Technology Delhi for complex image manipulation via natural language instructions. It contains 18,000 entries, each consisting of a source image, a natural language instruction, and a target image. The images contain multiple objects, and the instructions cover adding, removing, and changing objects. The source and target images were generated with Blender, and the natural language instructions were generated from templates. The dataset is intended primarily for training and testing image manipulation models that must understand and execute complex, multi-hop natural language instructions, addressing both natural language understanding and complex operation execution in image editing.
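As an illustration of the entry structure described above, the following is a minimal Python sketch of one record as a (source image, instruction, target image) triple tagged with one of the three operation types. All class names, field names, file names, and the sample instruction are hypothetical placeholders chosen for illustration; they are not taken from the NeuroSIM repository, whose actual file layout may differ.

```python
from dataclasses import dataclass
from enum import Enum


class Operation(Enum):
    """The three operation types named in the description."""
    ADD = "add"
    REMOVE = "remove"
    CHANGE = "change"


@dataclass(frozen=True)
class Entry:
    """One dataset entry. Field names are assumptions, not the official schema."""
    source_image: str     # path to the image before manipulation
    instruction: str      # natural language instruction
    target_image: str     # path to the image after manipulation
    operation: Operation  # which kind of manipulation the instruction asks for


# Invented placeholder values, not a real record from the dataset.
entry = Entry(
    source_image="source_000001.png",
    instruction="Remove the small object behind the large one.",
    target_image="target_000001.png",
    operation=Operation.REMOVE,
)

print(entry.operation.value, "|", entry.instruction)
```

Keeping the operation type as an explicit field is only a convenience for filtering or per-operation evaluation; the description does not say whether the dataset stores it separately from the instruction text.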
Provider:
Indian Institute of Technology Delhi
Created:
2023-05-24



