undefined443/JourneyDB-recaption

Name: undefined443/JourneyDB-recaption
Creator: undefined443
Published: 2026-03-31 08:49:31
License: No description provided

Hugging Face | Updated: 2026-03-31 | Indexed: 2026-03-29

Download link:

https://hf-mirror.com/datasets/undefined443/JourneyDB-recaption

Description:

---
license: cc-by-nc-4.0
task_categories:
- image-to-text
- text-to-image
language:
- en
tags:
- journeydb
- midjourney
- recaption
- vision-language
- ai-generated
size_categories:
- 1M<n<10M
configs:
- config_name: default
  data_files:
  - split: train
    path: data.parquet
---

# JourneyDB Recaption

Recaptioned version of the [JourneyDB](https://huggingface.co/datasets/JourneyDB/JourneyDB) dataset using Qwen vision-language models.

## Dataset Description

JourneyDB is a large-scale dataset of AI-generated images from Midjourney. This recaptioned version provides detailed visual descriptions generated by a vision-language model, which are more accurate than the original generation prompts for describing actual image content.

### Statistics

| Metric     | Count             |
| ---------- | ----------------- |
| Total rows | 3,389,605         |
| File size  | ~246 MB (Parquet) |

### Columns

| Column            | Type   | Description                            |
| ----------------- | ------ | -------------------------------------- |
| `img_path`        | string | Relative path to image in JourneyDB    |
| `width`           | int    | Image width in pixels                  |
| `height`          | int    | Image height in pixels                 |
| `aesthetic_score` | float  | Aesthetic score (may be null for some) |
| `caption`         | string | Generated visual description           |
| `model`           | string | Model used for recaptioning            |

### Recaption Models

| Model                                                                             | Count     |
| --------------------------------------------------------------------------------- | --------- |
| [Qwen/Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct)     | 2,249,747 |
| [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) | 1,139,858 |

- **Prompt style**: COCO-style short caption ("Describe this image in one simple sentence...")

### Preprocessing

Before recaptioning, the following filters were applied to the original JourneyDB dataset:

- **Resolution filter**: min(width, height) >= 512
- **Aspect ratio filter**: max(width, height) / min(width, height) <= 2.0

### Example

```python
{
    "img_path": "./000/728deb7c-a5e2-463c-8f75-5f62dae521ac.jpg",
    "width": 1024,
    "height": 1024,
    "aesthetic_score": 7.276855,
    "caption": "A girl sits at a table with books, looking directly at the camera.",
    "model": "Qwen/Qwen2.5-VL-7B-Instruct"
}
```

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("undefined443/JourneyDB-recaption")
```

Access the data:

```python
for sample in dataset["train"]:
    img_path = sample["img_path"]
    caption = sample["caption"]
    width, height = sample["width"], sample["height"]
    aesthetic_score = sample["aesthetic_score"]
    # Use with JourneyDB images
```

## License

This dataset inherits the license from JourneyDB (CC-BY-NC-4.0).

## Related

- [JourneyDB](https://huggingface.co/datasets/JourneyDB/JourneyDB) - Original dataset
- [Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct) - Vision-language model used for recaptioning
- [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) - Vision-language model used for recaptioning
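
The preprocessing filters, the relative `img_path` column, and the nullable `aesthetic_score` column described above can be handled with a few small helpers. The following is a minimal sketch, not the author's actual pipeline: the function names, the local directory `/data/JourneyDB`, and the score threshold are hypothetical and chosen only for illustration. It assumes that the JourneyDB images have been downloaded separately and that `img_path` is relative to the directory that contains them.

```python
from pathlib import Path
from typing import Optional

# Thresholds as stated in the Preprocessing section.
MIN_SIDE = 512          # min(width, height) >= 512
MAX_ASPECT_RATIO = 2.0  # max(width, height) / min(width, height) <= 2.0


def passes_filters(width: int, height: int) -> bool:
    """Return True if an image of this size satisfies both documented filters."""
    short_side, long_side = min(width, height), max(width, height)
    if short_side < MIN_SIDE:
        return False
    return long_side / short_side <= MAX_ASPECT_RATIO


def resolve_image_path(journeydb_root: str, img_path: str) -> Path:
    """Join a relative img_path onto a local JourneyDB image directory.

    journeydb_root is a hypothetical local path; pathlib collapses the
    leading "./" that appears in the dataset's img_path values.
    """
    return Path(journeydb_root) / img_path


def has_min_score(aesthetic_score: Optional[float], threshold: float) -> bool:
    """Return True only for non-null scores at or above the threshold."""
    return aesthetic_score is not None and aesthetic_score >= threshold


if __name__ == "__main__":
    # Sample row copied from the Example section.
    sample = {
        "img_path": "./000/728deb7c-a5e2-463c-8f75-5f62dae521ac.jpg",
        "width": 1024,
        "height": 1024,
        "aesthetic_score": 7.276855,
    }
    print(passes_filters(sample["width"], sample["height"]))
    print(resolve_image_path("/data/JourneyDB", sample["img_path"]))
    print(has_min_score(sample["aesthetic_score"], threshold=6.0))
```

Because every row in this dataset has already passed the two size filters, `passes_filters` is mainly useful for checking other JourneyDB images against the same criteria. `has_min_score` treats a null score as a failure, which is one possible policy; keeping rows with null scores is equally defensible, depending on the use case.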

Provided by:

undefined443
