
JourneyDB

ModelScope Community | Updated 2026-05-22 | Indexed 2025-05-24
Download link:
https://modelscope.cn/datasets/AI-ModelScope/JourneyDB
Resource description:
---
task_categories:
- image-to-text
language:
- en
size_categories:
- 1M<n<10M
---

# JourneyDB

[[Project Page]](https://journeydb.github.io) [[Paper]](https://arxiv.org/abs/2307.00716) [[Code]](https://github.com/JourneyDB/JourneyDB) [[HuggingFace]](https://huggingface.co/datasets/JourneyDB/JourneyDB) [[OpenDataLab]]()

![image](./assets/jdb_teaser_small.jpg)

## Dataset Description

### Summary

**JourneyDB** is a large-scale dataset for generated-image understanding. It contains **4,429,295** high-resolution Midjourney images, annotated with the corresponding **text prompt**, **image caption**, and **visual question answering** pairs.

### Supported Tasks

**JourneyDB** supports **4** downstream tasks: **Prompt Inversion**, **Style Retrieval**, **Image Caption**, and **Visual Question Answering**. We evaluate many existing methods on these tasks and provide a comprehensive benchmark. Please see our [Paper](https://arxiv.org/abs/2307.00716) for more details.

## Dataset Details

### Data Collection

For each image, we collect the text prompt that was used to generate it with Midjourney. We then use GPT-3.5 to generate the caption and VQA ground truth.

![image](./assets/jdb_data_collection.jpg)

### Data Instances

We provide several examples to show the contents of each dataset instance.

![image](./assets/jdb_samples_small.jpeg)

### Data Splits

The following table gives detailed statistics for each split. We randomly split the dataset at a ratio of roughly 20 : 1 to obtain the training and validation sets. The training set contains 4,189,737 labeled images and 1,385,317 labeled prompts. The validation set contains 234,156 images and 82,093 prompts. We additionally sample a testing set for manual filtering; it contains 5,402 images and 5,171 prompts.

|                |   Image   |  Prompt   | Labeled Image | Labeled Prompt | Style QA  | Content QA |
|----------------|:---------:|:---------:|:-------------:|:--------------:|:---------:|:----------:|
| Training Set   | 4,453,193 | 1,643,375 |   4,189,737   |   1,385,317    | 7,056,394 | 8,775,971  |
| Validation Set |  234,156  |  82,093   |    234,156    |     82,093     |  311,569  |  374,310   |
| Testing Set    |   5,402   |   5,171   |     5,402     |     5,171      |  10,040   |   11,369   |
| Total          | 4,692,751 | 1,730,639 |   4,429,295   |   1,472,581    | 7,378,003 | 9,161,650  |

## Acquirements

### License

The JourneyDB dataset is available under the customised [Terms of Usage](./assets/Terms_of_Usage.md).

### Citation

```
@article{sun2023journeydb,
  title={Journeydb: A benchmark for generative image understanding},
  author={Sun, Keqiang and Pan, Junting and Ge, Yuying and Li, Hao and Duan, Haodong and Wu, Xiaoshi and Zhang, Renrui and Zhou, Aojun and Qin, Zipeng and Wang, Yi and others},
  journal={Advances in neural information processing systems},
  volume={36},
  pages={49659--49678},
  year={2023}
}
```

### Contributions

[Junting Pan](https://junting.github.io)\*, [Keqiang Sun](https://keqiangsun.github.io)\*, [Yuying Ge](https://geyuying.github.io), [Hao Li](https://cpsxhao.github.io), [Haodong Duan](https://kennymckormick.github.io), [Xiaoshi Wu](https://github.com/tgxs002), [Renrui Zhang](https://github.com/ZrrSkywalker), [Aojun Zhou](https://scholar.google.com/citations?user=cC8lXi8AAAAJ&hl=en), [Zipeng Qin](https://www.linkedin.cn/incareer/in/zipeng-bruce-qin-846a65119), [Yi Wang](https://shepnerd.github.io), [Jifeng Dai](https://jifengdai.org), [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao/), [Hongsheng Li](https://www.ee.cuhk.edu.hk/~hsli/)<sup>+</sup>

(\* equal contribution, <sup>+</sup> corresponding author)

### Contact

If you have any problems or suggestions, please feel free to open an issue or email the contributors.
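
The counts in the Data Splits table can be sanity-checked with a few lines of Python. The sketch below is only an illustration: it hard-codes the numbers from the table, and the names `COLUMNS`, `SPLITS`, `REPORTED_TOTAL`, `column_totals`, and `train_val_ratio` are hypothetical helpers, not part of any JourneyDB tooling. It checks that each column of the three splits sums to the reported "Total" row, and prints the training-to-validation ratio behind the "roughly 20 : 1" description.

```python
# Counts copied by hand from the Data Splits table (assumed to be transcribed correctly).
# Column order follows the table header.
COLUMNS = ("image", "prompt", "labeled_image", "labeled_prompt", "style_qa", "content_qa")

SPLITS = {
    "train": (4_453_193, 1_643_375, 4_189_737, 1_385_317, 7_056_394, 8_775_971),
    "validation": (234_156, 82_093, 234_156, 82_093, 311_569, 374_310),
    "test": (5_402, 5_171, 5_402, 5_171, 10_040, 11_369),
}

# The "Total" row as reported in the table.
REPORTED_TOTAL = (4_692_751, 1_730_639, 4_429_295, 1_472_581, 7_378_003, 9_161_650)


def column_totals(splits):
    """Sum every column across all splits."""
    return tuple(sum(column) for column in zip(*splits.values()))


def train_val_ratio(splits, column):
    """Ratio of training to validation counts for one named column."""
    idx = COLUMNS.index(column)
    return splits["train"][idx] / splits["validation"][idx]


if __name__ == "__main__":
    for name, got, want in zip(COLUMNS, column_totals(SPLITS), REPORTED_TOTAL):
        status = "ok" if got == want else "MISMATCH"
        print(f"{name:>15}: {got:>10,} (reported {want:>10,}) {status}")
    print(f"train/val ratio, all images:     {train_val_ratio(SPLITS, 'image'):.1f}")
    print(f"train/val ratio, labeled images: {train_val_ratio(SPLITS, 'labeled_image'):.1f}")
```

Keeping the counts in one dictionary keyed by split makes it easy to re-run the check if the table is updated.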

Provider:
maas
Created:
2025-05-20