117k_human_coherence_flux1.0_V_flux1.1Blueberry

Name: 117k_human_coherence_flux1.0_V_flux1.1Blueberry
Creator: maas
Published: 2025-12-05 16:21:42
License: No description provided

ModelScope community. Updated: 2025-12-05. Indexed: 2025-02-01.

Download link:

https://modelscope.cn/datasets/Rapidata/117k_human_coherence_flux1.0_V_flux1.1Blueberry

Resource description:

# Rapidata Image Generation Alignment Dataset

<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="400" alt="Dataset visualization">
</a>

This dataset is one third of a 340k human annotation dataset that was split into three modalities: Preference, Coherence, and Text-to-Image Alignment.

- Link to the Preference dataset: https://huggingface.co/datasets/Rapidata/117k_human_preferences_flux1.0_V_flux1.1Blueberry
- Link to the Text-to-Image Alignment dataset: https://huggingface.co/datasets/Rapidata/117k_human_alignment_flux1.0_V_flux1.1Blueberry

It was collected in ~2 days using the Rapidata Python API: https://docs.rapidata.ai

If you get value from this dataset and would like to see more in the future, please consider liking it.

## Overview

This dataset focuses on human comparative evaluations of AI-generated images. Participants were shown two images, one generated by Flux 1.0 and the other by Flux 1.1Blueberry, and asked, "Which image is more plausible to exist and has fewer odd or impossible-looking things?" Each pair of images was reviewed by at least 26 participants, yielding 117,000+ individual votes.

## Key Features

- **Massive Scale**: 117,000+ individual human preference votes from all over the world
- **Diverse Prompts**: 281 carefully curated prompts testing various aspects of image generation
- **Leading Models**: Comparisons between two state-of-the-art image generation models
- **Rigorous Methodology**: Uses pairwise comparisons with built-in quality controls
- **Rich Demographic Data**: Includes annotator information about age, gender, and geographic location

## Applications

This dataset is invaluable for:

- Training and fine-tuning image generation models
- Understanding global preferences in AI-generated imagery
- Developing better evaluation metrics for generative models
- Researching cross-cultural aesthetic preferences
- Benchmarking new image generation models

## Data Collection Powered by Rapidata

What traditionally would take weeks or months of data collection was accomplished in just 24 hours through Rapidata's innovative annotation platform. Our technology enables:

- Lightning-fast data collection at massive scale
- Global reach across 145+ countries
- Built-in quality assurance mechanisms
- Comprehensive demographic representation
- Cost-effective large-scale annotation

## About Rapidata

Rapidata's technology makes collecting human feedback at scale faster and more accessible than ever before. Visit [rapidata.ai](https://www.rapidata.ai/) to learn more about how we're revolutionizing human feedback collection for AI development.

We created the dataset using our in-house developed [API](https://docs.rapidata.ai/), which you can access to gain near-instant human intelligence at your fingertips.
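
As a rough illustration of how the pairwise votes described in the Overview might be summarized, the sketch below tallies votes per image pair and computes each model's share of all votes. The record layout (`pair_id`, chosen model label), the label strings, and the helper names are assumptions made for this example; they are not the dataset's actual schema, and the sample votes are invented placeholders rather than real data.

```python
from collections import Counter, defaultdict

# Hypothetical vote records: (pair_id, label of the image the annotator chose).
# The field layout and label strings are assumptions, not the real schema.
sample_votes = [
    ("pair-001", "flux1.0"),
    ("pair-001", "flux1.1Blueberry"),
    ("pair-001", "flux1.1Blueberry"),
    ("pair-002", "flux1.0"),
    ("pair-002", "flux1.1Blueberry"),
]


def summarize_votes(votes):
    """Return (per-pair vote counts, each model's share of all votes)."""
    per_pair = defaultdict(Counter)
    for pair_id, model in votes:
        per_pair[pair_id][model] += 1
    totals = Counter(model for _, model in votes)
    n_votes = sum(totals.values())
    shares = {model: count / n_votes for model, count in totals.items()}
    return dict(per_pair), shares


def pair_winner(counts):
    """Return the model with the most votes for one pair, or None on a tie."""
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


per_pair, shares = summarize_votes(sample_votes)
winners = {pair_id: pair_winner(counts) for pair_id, counts in per_pair.items()}
```

With the real data, the same tally could be applied after mapping the dataset's own columns onto `(pair_id, model)` tuples; a tie is reported as `None` rather than being broken arbitrarily.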

Provider:

maas

Created:

2025-01-25
