
Synthetic-Cyclic-Perception_exp1

Source: ModelScope community. Updated 2025-12-05; indexed 2025-07-19.
Download link:
https://modelscope.cn/datasets/DevQuasar/Synthetic-Cyclic-Perception_exp1
Resource overview:
[<img src="https://raw.githubusercontent.com/csabakecskemeti/devquasar/main/dq_logo_black-transparent.png" width="200"/>](https://devquasar.com)

'Make knowledge free for everyone'

<a href='https://ko-fi.com/L4L416YX7C' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi6.png?v=6' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>

# Synthetic-Cyclic-Perception

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e6d37e02dee9bcb9d9fa18/rfwG2vGbmo-cJ0zPOtsSK.png)

Synthetic-Cyclic-Perception is a synthetic visual dataset created by iteratively generating and describing images in a cyclic fashion. Each cycle begins with a seed prompt that feeds into a diffusion model to generate an initial image. A vision model then provides a detailed description of the generated image, which becomes the prompt for the next cycle. This process is repeated across multiple cycles and batches, creating a dataset where images and their descriptions evolve progressively.

## Dataset Composition

- Seed Prompts: The dataset generation starts with carefully selected, action-oriented prompts (e.g., “A person kayaking on a calm river”).
- Cycles: Each batch begins with a unique prompt and progresses through a set number of cycles, updating the image based on the most recent description.
- Metadata: Each generated image includes metadata such as the prompt, the vision model's description, filename, and batch details, saved in a centralized metadata JSON file for easy reference and analysis.

## Methodology

- Image Generation: Using the [stabilityai/stable-diffusion-3-medium](https://huggingface.co/stabilityai/stable-diffusion-3-medium) model, an image is generated based on the current prompt.
- Vision Description: The [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) model describes the image, offering a nuanced textual interpretation.
- Cyclic Prompt Update: The description is parsed and used as the next prompt in the cycle, thus creating an evolving image-description sequence.
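
The generate, describe, and update loop described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's actual script: `run_cycles`, `CycleRecord`, the `batchNNN_cycleNNN.png` filename pattern, and the `metadata.json` layout are all hypothetical, and the two `fake_*` functions are stand-ins for the Stable Diffusion 3 and Llama 3.2 Vision calls so that the sketch runs without any model downloads.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class CycleRecord:
    """One image and its description (hypothetical metadata layout)."""
    batch: int
    cycle: int
    prompt: str
    description: str
    filename: str


def run_cycles(
    seed_prompts: List[str],
    num_cycles: int,
    generate_image: Callable[[str], bytes],
    describe_image: Callable[[bytes], str],
    out_dir: str,
) -> List[CycleRecord]:
    """Run one batch per seed prompt, feeding each description back as the next prompt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    records: List[CycleRecord] = []
    for batch, seed in enumerate(seed_prompts):
        prompt = seed
        for cycle in range(num_cycles):
            image = generate_image(prompt)           # diffusion model step
            filename = f"batch{batch:03d}_cycle{cycle:03d}.png"
            (out / filename).write_bytes(image)
            description = describe_image(image)      # vision model step
            records.append(CycleRecord(batch, cycle, prompt, description, filename))
            prompt = description.strip()             # cyclic prompt update
    # Assumed "centralized" metadata file: one JSON list covering all batches.
    (out / "metadata.json").write_text(
        json.dumps([asdict(r) for r in records], indent=2), encoding="utf-8"
    )
    return records


# Stand-ins for the real models (assumptions for illustration only).
def fake_generate(prompt: str) -> bytes:
    return prompt.encode("utf-8")


def fake_describe(image: bytes) -> str:
    return "An image of: " + image.decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        seeds = ["A person kayaking on a calm river"]
        for rec in run_cycles(seeds, 3, fake_generate, fake_describe, tmp):
            print(rec.cycle, rec.prompt)
```

Passing the two models in as plain callables keeps the loop independent of any particular inference library; in a real run they would wrap the diffusion and vision models named in the Methodology list.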

Provider:
maas
Created:
2025-07-14