Fhrozen/relaion-synthetic

Name: Fhrozen/relaion-synthetic
Creator: Fhrozen
Published: 2026-01-28 08:32:13
License: apache-2.0 (declared in the dataset card; the listing itself gives no description)

Hugging Face · Updated 2026-01-28 · Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/Fhrozen/relaion-synthetic

Description:

---
license: apache-2.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: image
    dtype: image
  - name: filename
    dtype: string
  - name: url
    dtype: string
  - name: text
    dtype: string
  - name: width
    dtype: int64
  - name: height
    dtype: int64
  - name: top_caption
    dtype: string
  - name: all_captions
    list: string
  - name: dense_caption
    dtype: string
  - name: vqa
    dtype: string
  - name: objects
    dtype: string
  - name: text_content
    dtype: string
  splits:
  - name: train
    num_bytes: 344524450516
    num_examples: 5144102
  download_size: 330644002120
  dataset_size: 344524450516
---

# Relaion Synthetic - LLM-Annotated

[Original Source](https://huggingface.co/datasets/laion/relaion-synthetic-115m)

## 📌 Introduction

This dataset comprises images and annotations from the original Relaion Synthetic Dataset. Out of the 115M images, a subset of **5.1M images** has been annotated with automatic methods (image-text-to-text models).

## Captions

The annotations comprise four columns, all obtained from a Qwen3 VLM (https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Thinking-FP8):

- `dense_caption`: A dense caption describing the image.
- `vqa`: Visual question-answer pairs related to the image. JSON dictionary embedded as a string.
- `objects`: Objects found in the image. JSON dictionary embedded as a string.
- `text_content`: OCRed text found in the image. JSON dictionary embedded as a string.

**System Prompt:**

```python
sys_prompt = """You are a professional JSON data generator. Your responses must ALWAYS be valid, parseable JSON.

CRITICAL RULES:
- Output ONLY valid JSON, no additional text before or after
- Use double quotes for all strings
- Escape special characters properly (\\n, \\", \\\\)
- Boolean values must be lowercase: true, false
- Null values must be lowercase: null
- Do not use trailing commas
- Ensure all brackets and braces are properly closed"""
```

**User Prompt:**

```python
prompt = """Analyze this image and provide a detailed annotation in VALID JSON format.

STEP 1: CHECK FOR WATERMARKS
If you detect significant watermarks (Getty Images, shutterstock logos, large copyright overlays), respond with:
{"watermark_detected": true, "status": "rejected"}

Otherwise, proceed to STEP 2.

STEP 2: GENERATE COMPREHENSIVE ANNOTATION
Return a JSON object with these exact fields:

{
  "watermark_detected": false,
  "dense_caption": "<Write a detailed 3-5 sentence paragraph describing the scene. Include: overall atmosphere, main objects and their spatial locations (left/right/center, foreground/background), colors, textures, lighting, relationships between objects, and any actions or emotions conveyed.>",
  "objects": [
    {"object_name": "<name>", "attributes": "<color, material, condition>", "location_hint": "<position in frame>"}
  ],
  "text_content": {
    "has_text": <true or false>,
    "transcription": "<actual text from signs, labels, books, etc. or null>",
    "context": "<what the text is on or null>"
  },
  "vqa_dataset": [
    {"question": "<perception/counting/reasoning question>", "answer": "<answer>", "type": "<Perception|Counting|Reasoning|OCR>"}
  ]
}

REQUIREMENTS:
- Generate 5-10 VQA pairs covering different question types
- Do NOT ask about watermarks, timestamps, or camera metadata
- List 3-10 key objects with their attributes
- Keep all text in a single line (no literal newlines in strings)
- Ensure the response is ONLY the JSON object, nothing else

OUTPUT ONLY VALID JSON - NO MARKDOWN, NO EXPLANATIONS."""
```

The request JSON is:

```python
# encoded_image: the base64-encoded JPEG bytes of the image, defined elsewhere
data = {
    "model": "llm-model",
    "messages": [
        {"role": "system", "content": [{"type": "text", "text": sys_prompt}]},
        {"role": "user", "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}},
        ]},
    ],
    "stream": False,
    "temperature": 0.7,
    "max_completion_tokens": 8192,
}
```

## Licensing

The generated prompts and descriptions are licensed under the Apache 2.0 license. The images obtained from the original repository remain under their respective licenses. In the event of any license issue, the affected image will be removed without prior notice.

## 🙏 Acknowledgement

All credits to the original Laion team.
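
## Example: decoding the annotation columns

Because `vqa`, `objects`, and `text_content` are stored as JSON embedded in strings, a consumer has to parse them before use. The following is a minimal sketch of that step, not the author's own tooling. The helper name `decode_annotations` and the sample row are hypothetical, and the sample's field shapes simply mirror the schema requested in the user prompt; the shapes actually stored in the dataset may differ.

```python
import json
from typing import Any

# Columns the dataset card describes as JSON embedded in a string.
JSON_COLUMNS = ("vqa", "objects", "text_content")


def decode_annotations(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `row` with the JSON-string columns parsed.

    Missing, empty, or unparseable values become None rather than raising,
    since LLM-generated JSON is not guaranteed to be well-formed.
    """
    decoded = dict(row)
    for column in JSON_COLUMNS:
        raw = row.get(column)
        if not raw:
            decoded[column] = None
            continue
        try:
            decoded[column] = json.loads(raw)
        except json.JSONDecodeError:
            decoded[column] = None
    return decoded


# Hypothetical row for illustration only; not taken from the dataset.
sample_row = {
    "filename": "example.jpg",
    "dense_caption": "A red bicycle leans against a brick wall.",
    "vqa": json.dumps(
        [{"question": "What color is the bicycle?", "answer": "Red", "type": "Perception"}]
    ),
    "objects": json.dumps(
        [{"object_name": "bicycle", "attributes": "red, metal", "location_hint": "center"}]
    ),
    "text_content": json.dumps(
        {"has_text": False, "transcription": None, "context": None}
    ),
}

decoded = decode_annotations(sample_row)
print(decoded["vqa"][0]["answer"])
print(decoded["text_content"]["has_text"])
```

In practice the rows would likely come from the `datasets` library, for example `load_dataset("Fhrozen/relaion-synthetic", split="train", streaming=True)`; streaming avoids fetching the full download of roughly 330 GB before iterating.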

Provider:

Fhrozen
