117k_human_alignment_flux1.0_V_flux1.1Blueberry

Name: 117k_human_alignment_flux1.0_V_flux1.1Blueberry
Creator: maas
Published: 2025-12-05 16:21:41
License: No description available

ModelScope community. Updated: 2025-12-05. Indexed: 2025-02-01.

Download link:

https://modelscope.cn/datasets/Rapidata/117k_human_alignment_flux1.0_V_flux1.1Blueberry


Resource description:

# Rapidata Image Generation Alignment Dataset

<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="400" alt="Dataset visualization">
</a>

This dataset is one third of a 340k human annotation dataset that was split into three modalities: Preference, Coherence, and Text-to-Image Alignment.

- Link to the Preference dataset: https://huggingface.co/datasets/Rapidata/117k_human_preferences_flux1.0_V_flux1.1Blueberry
- Link to the Coherence dataset: https://huggingface.co/datasets/Rapidata/117k_human_coherence_flux1.0_V_flux1.1Blueberry

It was collected in ~2 days using the Rapidata Python API: https://docs.rapidata.ai

If you get value from this dataset and would like to see more in the future, please consider liking it.

## Overview

This dataset focuses on human comparative evaluations of AI-generated images. Given a prompt, participants were shown two images, one generated by Flux 1.0 and the other by Flux 1.1Blueberry, and asked, "Which image better fits the description?" Each pair of images was reviewed by at least 26 participants, generating a robust set of 117,000+ individual votes.

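The pairwise votes described in the Overview can be summarized per image pair and overall. The following is a minimal sketch, assuming each vote can be reduced to a `(pair_id, chosen_model)` tuple; that record layout, the sample votes, and the helper names `tally_votes`, `pair_winners`, and `vote_share` are illustrative assumptions, not the dataset's actual schema or Rapidata's method.

```python
from collections import Counter, defaultdict

# Hypothetical vote records: (pair_id, chosen_model).
# The real dataset's columns may differ; inspect it before adapting this.
votes = [
    ("pair_0", "flux1.0"),
    ("pair_0", "flux1.1Blueberry"),
    ("pair_0", "flux1.1Blueberry"),
    ("pair_1", "flux1.0"),
    ("pair_1", "flux1.0"),
    ("pair_1", "flux1.1Blueberry"),
    ("pair_2", "flux1.1Blueberry"),
    ("pair_2", "flux1.1Blueberry"),
    ("pair_2", "flux1.1Blueberry"),
]


def tally_votes(votes):
    """Count how many votes each model received within each image pair."""
    per_pair = defaultdict(Counter)
    for pair_id, model in votes:
        per_pair[pair_id][model] += 1
    return per_pair


def pair_winners(per_pair):
    """Return the majority-vote winner per pair, or None when the top two tie."""
    winners = {}
    for pair_id, counts in per_pair.items():
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            winners[pair_id] = None
        else:
            winners[pair_id] = ranked[0][0]
    return winners


def vote_share(votes, model):
    """Fraction of all individual votes that went to the given model."""
    total = len(votes)
    if total == 0:
        return 0.0
    return sum(1 for _, chosen in votes if chosen == model) / total


per_pair = tally_votes(votes)
winners = pair_winners(per_pair)
blueberry_share = vote_share(votes, "flux1.1Blueberry")
```

Majority voting per pair discards the margin of each decision, while the overall vote share keeps it; which summary is appropriate depends on the analysis.
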
## Key Features

- **Massive Scale**: 117,000+ individual human preference votes from all over the world
- **Diverse Prompts**: 281 carefully curated prompts testing various aspects of image generation
- **Leading Models**: Comparisons between two state-of-the-art image generation models
- **Rigorous Methodology**: Uses pairwise comparisons with built-in quality controls
- **Rich Demographic Data**: Includes annotator information about age, gender, and geographic location

## Applications

This dataset is invaluable for:

- Training and fine-tuning image generation models
- Understanding global preferences in AI-generated imagery
- Developing better evaluation metrics for generative models
- Researching cross-cultural aesthetic preferences
- Benchmarking new image generation models

## Data Collection Powered by Rapidata

What traditionally would take weeks or months of data collection was accomplished in just 24 hours through Rapidata's innovative annotation platform. Our technology enables:

- Lightning-fast data collection at massive scale
- Global reach across 145+ countries
- Built-in quality assurance mechanisms
- Comprehensive demographic representation
- Cost-effective large-scale annotation

## About Rapidata

Rapidata's technology makes collecting human feedback at scale faster and more accessible than ever before. Visit [rapidata.ai](https://www.rapidata.ai/) to learn more about how we're revolutionizing human feedback collection for AI development.

We created the dataset using our in-house developed [API](https://docs.rapidata.ai/), which you can access to gain near-instant human intelligence at your fingertips.
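
Returning to the demographic fields listed under Key Features (age, gender, and geographic location), one way to study cross-cultural differences is to compute a model's vote share within each annotator group. This is a rough sketch only: the dictionary keys `country`, `age_group`, and `choice`, the sample records, and the helper `share_by_group` are hypothetical and may not match the dataset's real column names.

```python
from collections import defaultdict

# Hypothetical per-vote records with assumed field names.
demo_votes = [
    {"country": "DE", "age_group": "18-29", "choice": "flux1.1Blueberry"},
    {"country": "DE", "age_group": "30-39", "choice": "flux1.1Blueberry"},
    {"country": "DE", "age_group": "18-29", "choice": "flux1.0"},
    {"country": "BR", "age_group": "18-29", "choice": "flux1.0"},
    {"country": "BR", "age_group": "40-49", "choice": "flux1.1Blueberry"},
    {"country": "IN", "age_group": "30-39", "choice": "flux1.0"},
]


def share_by_group(votes, group_key, model):
    """Return {group value: fraction of that group's votes cast for `model`}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for vote in votes:
        group = vote[group_key]
        totals[group] += 1
        if vote["choice"] == model:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


by_country = share_by_group(demo_votes, "country", "flux1.1Blueberry")
by_age = share_by_group(demo_votes, "age_group", "flux1.1Blueberry")
```

With real data, groups containing only a handful of votes give unstable shares, so it is worth checking group sizes before comparing them.
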


Provider: maas
Created: 2025-01-25
