xAI_Aurora_t2i_human_preferences
Source: ModelScope community. Updated 2025-12-05; indexed 2025-02-01.
Download link:
https://modelscope.cn/datasets/Rapidata/xAI_Aurora_t2i_human_preferences
Description:
<style>
.vertical-container {
display: flex;
flex-direction: column;
gap: 60px;
}
.image-container img {
max-height: 250px; /* Set the desired height */
margin:0;
object-fit: contain; /* Ensures the aspect ratio is maintained */
width: auto; /* Adjust width automatically based on height */
box-sizing: content-box;
}
.image-container {
display: flex; /* Aligns images side by side */
justify-content: space-around; /* Space them evenly */
align-items: center; /* Align them vertically */
gap: .5rem
}
.container {
width: 90%;
margin: 0 auto;
}
.text-center {
text-align: center;
}
.score-amount {
margin: 0;
margin-top: 10px;
}
.score-percentage {
font-size: 12px;
font-weight: 600; /* semi-bold; "semi-bold" is not a valid CSS keyword */
}
</style>
# Rapidata Aurora Preference
<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="400" alt="Dataset visualization">
</a>
This text-to-image (T2I) dataset contains over 400k human responses from over 86k individual annotators, collected in about two days using the [Rapidata Python API](https://docs.rapidata.ai), which is accessible to anyone and well suited to large-scale evaluation.
It evaluates Aurora across three categories: preference, coherence, and alignment.
Explore our latest model rankings on our [website](https://www.rapidata.ai/benchmark).
If you get value from this dataset and would like to see more in the future, please consider liking it.
## Overview
This T2I dataset contains over 400k human responses from over 86k individual annotators, collected in about two days.
It evaluates Aurora across three categories: preference, coherence, and alignment.
The evaluation consists of 1v1 comparisons between Aurora and six other models: Imagen-3, Flux-1.1-pro, Flux-1-pro, DALL-E 3, Midjourney-5.2, and Stable Diffusion 3.
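
As a rough illustration of how 1v1 results like these could be summarized, the sketch below pools votes per category and opponent and reports the share that went to Aurora. The record layout and field names (`category`, `opponent`, `aurora_votes`, `opponent_votes`) and the vote counts are hypothetical, chosen only for this example; they are not the dataset's actual schema or data.

```python
from collections import defaultdict

# Hypothetical records, one per 1v1 comparison. Field names and counts are
# illustrative assumptions, not the dataset's real schema or values.
comparisons = [
    {"category": "preference", "opponent": "Imagen-3", "aurora_votes": 7, "opponent_votes": 3},
    {"category": "preference", "opponent": "Imagen-3", "aurora_votes": 2, "opponent_votes": 8},
    {"category": "alignment", "opponent": "DALL-E 3", "aurora_votes": 6, "opponent_votes": 4},
]


def aurora_win_rates(rows):
    """Return {(category, opponent): share of pooled votes that went to Aurora}."""
    totals = defaultdict(lambda: [0, 0])  # [votes for Aurora, all votes]
    for row in rows:
        key = (row["category"], row["opponent"])
        totals[key][0] += row["aurora_votes"]
        totals[key][1] += row["aurora_votes"] + row["opponent_votes"]
    # Skip groups with no votes to avoid dividing by zero.
    return {key: a / n for key, (a, n) in totals.items() if n > 0}


rates = aurora_win_rates(comparisons)
for (category, opponent), rate in sorted(rates.items()):
    print(f"{category:<12} Aurora vs {opponent:<10} {rate:.0%}")
```

Pooling votes before dividing weights each comparison by how many responses it received; averaging per-pair rates instead would weight every pair equally. Which of the two suits a given analysis is a judgment call.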
## Data collection
Since Aurora is not available through an API, the images were collected manually through the user interface. The date following each model name indicates when the images were generated.
## Alignment
The alignment score quantifies how well an image matches its prompt. Users were asked: "Which image matches the description better?".
<div class="vertical-container">
<div class="container">
<div class="text-center">
<q>A black colored banana.</q>
</div>
<div class="image-container">
<div>
<h3 class="score-amount">Aurora </h3>
<div class="score-percentage">Score: 100%</div>
<img style="border: 3px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/yQPSJEcWRfivcjrGp8krU.png" width=500>
</div>
<div>
<h3 class="score-amount">Midjourney </h3>
<div class="score-percentage">Score: 0%</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/son8CR5OVrdegg7I4Mbco.jpeg" width=500>
</div>
</div>
</div>
<div class="container">
<div class="text-center">
<q>A blue and white cat next to a blanket and shelf with grey bottle.</q>
</div>
<div class="image-container">
<div>
<h3 class="score-amount">Aurora</h3>
<div class="score-percentage">Score: 0%</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/gYDtYOZUhmSRyi-V_3OlJ.png" width=500>
</div>
<div>
<h3 class="score-amount">Flux-1.1</h3>
<div class="score-percentage">Score: 100%</div>
<img style="border: 3px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/hX4mEeCyeL7rYCqin_gYD.jpeg" width=500>
</div>
</div>
</div>
</div>
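
The percentages in the examples above sum to 100% within each pair, which is consistent with each score being the share of responses an image received. A minimal sketch under that assumption follows; the function name `vote_shares` and the vote counts are hypothetical, and the sketch treats every response equally, which may differ from how the published scores were computed.

```python
def vote_shares(votes_a: int, votes_b: int) -> tuple[float, float]:
    """Split 100% between two images in proportion to the votes each received."""
    total = votes_a + votes_b
    if total == 0:
        raise ValueError("at least one vote is required")
    return votes_a / total, votes_b / total


# Hypothetical vote counts; the real per-pair counts are not stated in this card.
aurora, midjourney = vote_shares(votes_a=12, votes_b=0)
print(f"Aurora: {aurora:.0%}, Midjourney: {midjourney:.0%}")
```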
## Coherence
The coherence score measures whether the generated image is logically consistent and free from artifacts or visual glitches. Without seeing the original prompt, users were asked: "Which image feels less weird or unnatural when you look closely? I.e., has fewer strange-looking visual errors or glitches?"
<div class="vertical-container">
<div class="container">
<div class="image-container">
<div>
<h3 class="score-amount">Aurora </h3>
<div class="score-percentage">Score: 100%</div>
<img style="border: 3px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/k10zjbxD-gAaluB1awhHr.png" width=500>
</div>
<div>
<h3 class="score-amount">DALL-E 3</h3>
<div class="score-percentage">Score: 0%</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/3qB-eDmruMhSwJZz2oyXf.jpeg" width=500>
</div>
</div>
</div>
<div class="container">
<div class="image-container">
<div>
<h3 class="score-amount">Aurora </h3>
<div class="score-percentage">Score: 0%</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/GOixvtA3ZFKhVCQzRtc_e.png" width=500>
</div>
<div>
<h3 class="score-amount">Flux-1</h3>
<div class="score-percentage">Score: 100%</div>
<img style="border: 3px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/muNHD0yHGJdXmxsKzLqAz.jpeg" width=500>
</div>
</div>
</div>
</div>
## Preference
The preference score reflects how visually appealing participants found each image, independent of the prompt. Users were asked: "Which image do you prefer?"
<div class="vertical-container">
<div class="container">
<div class="image-container">
<div>
<h3 class="score-amount">Aurora</h3>
<div class="score-percentage">Score: 100%</div>
<img style="border: 3px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/PQIoMJrzWEb-p0GHVFQvY.png" width=500>
</div>
<div>
<h3 class="score-amount">Stable-diffusion</h3>
<div class="score-percentage">Score: 0%</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/mDaGW2DP3rWdTvxUfta94.jpeg" width=500>
</div>
</div>
</div>
<div class="container">
<div class="image-container">
<div>
<h3 class="score-amount">Aurora </h3>
<div class="score-percentage">Score: 0%</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/BwSswQHcos9N12ZtLj_qO.png" width=500>
</div>
<div>
<h3 class="score-amount">Flux-1 </h3>
<div class="score-percentage">Score: 100%</div>
<img style="border: 3px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/HSufN3gB6lFywcZue2Stw.jpeg" width=500>
</div>
</div>
</div>
</div>
## About Rapidata
Rapidata's technology makes collecting human feedback at scale faster and more accessible than ever before. Visit [rapidata.ai](https://www.rapidata.ai/) to learn more about how we're revolutionizing human feedback collection for AI development.
Provider:
maas
Created:
2025-01-31



