Synthetic-Cyclic-Perception_exp1
Source: ModelScope community (updated 2025-12-05, indexed 2025-07-19)
Download link:
https://modelscope.cn/datasets/DevQuasar/Synthetic-Cyclic-Perception_exp1
Description:
[<img src="https://raw.githubusercontent.com/csabakecskemeti/devquasar/main/dq_logo_black-transparent.png" width="200"/>](https://devquasar.com)
'Make knowledge free for everyone'
<a href='https://ko-fi.com/L4L416YX7C' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi6.png?v=6' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
# Synthetic-Cyclic-Perception

Synthetic-Cyclic-Perception is a synthetic visual dataset created by iteratively generating and describing images in a cyclic fashion. Each batch begins with a seed prompt, which is fed to a diffusion model to generate an initial image. A vision model then writes a detailed description of that image, and the description becomes the prompt for the next cycle. Repeating this process across multiple cycles and batches yields a dataset in which images and their descriptions evolve progressively.
## Dataset Composition
- Seed Prompts: The dataset generation starts with carefully selected, action-oriented prompts (e.g., “A person kayaking on a calm river”).
- Cycles: Each batch begins with a unique prompt and progresses through a set number of cycles, updating the image based on the most recent description.
- Metadata: Each generated image has associated metadata (the prompt, the vision model's description, the filename, and batch details), saved in a single centralized metadata JSON file for easy reference and analysis.
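
The card does not document the schema of the metadata JSON file, so the following is only a sketch of what one record might look like. The field names (`batch`, `cycle`, `filename`, `prompt`, `description`) and the filename pattern are assumptions chosen for illustration, not the dataset's actual keys.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ImageRecord:
    """One entry in the metadata file (hypothetical field names)."""
    batch: int        # index of the batch, i.e. which seed prompt started it
    cycle: int        # position of this image within the batch
    filename: str     # image file the record refers to
    prompt: str       # prompt given to the diffusion model
    description: str  # vision model's description of the generated image


def dump_metadata(records):
    """Serialize a list of ImageRecord objects to a JSON string."""
    return json.dumps([asdict(r) for r in records], indent=2, ensure_ascii=False)


records = [
    ImageRecord(
        batch=0,
        cycle=0,
        filename="batch0_cycle0.png",  # assumed naming pattern
        prompt="A person kayaking on a calm river",
        description="(vision model output would go here)",
    )
]
metadata_json = dump_metadata(records)
```

Keeping every record in one list makes it straightforward to group by `batch` and sort by `cycle` when tracing how a prompt drifts over time.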
## Methodology
- Image Generation: Using the [stabilityai/stable-diffusion-3-medium](https://huggingface.co/stabilityai/stable-diffusion-3-medium) model, an image is generated based on the current prompt.
- Vision Description: The [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) model describes the generated image in detail.
- Cyclic Prompt Update: The description is parsed and used as the next prompt in the cycle, thus creating an evolving image-description sequence.
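
The three steps above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the author's generation script: `generate_image`, `describe_image`, and `parse_description` are hypothetical callables standing in for the diffusion model, the vision model, and whatever parsing the pipeline applies, and the record keys are invented for the example.

```python
from typing import Any, Callable, Dict, List


def run_batch(
    seed_prompt: str,
    num_cycles: int,
    generate_image: Callable[[str], Any],   # stand-in for the diffusion model
    describe_image: Callable[[Any], str],   # stand-in for the vision model
    parse_description: Callable[[str], str] = str.strip,  # assumed parsing step
    batch_id: int = 0,
) -> List[Dict[str, Any]]:
    """Run one batch: generate, describe, then feed the description back in."""
    records = []
    prompt = seed_prompt
    for cycle in range(num_cycles):
        image = generate_image(prompt)
        description = describe_image(image)
        records.append(
            {
                "batch": batch_id,
                "cycle": cycle,
                "prompt": prompt,
                "description": description,
            }
        )
        # Cyclic prompt update: the parsed description drives the next cycle.
        prompt = parse_description(description)
    return records


# Stub models so the loop can be exercised without loading any weights.
def fake_generate(prompt: str) -> str:
    return f"<image of: {prompt}>"


def fake_describe(image: str) -> str:
    return f"a description of {image} "


demo = run_batch("A person kayaking on a calm river", 3, fake_generate, fake_describe)
```

Passing the models in as callables keeps the loop independent of any particular inference library; in a real run the stubs would be replaced by wrappers around the two models named above.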
Provider: maas
Created: 2025-07-14



