
ArtEmis v2.0

OpenDataLab | Updated 2026-05-17 | Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/ArtEmis_v2_dot_0
Description:
Datasets that capture the connections among vision, language, and emotion are scarce, which limits our understanding of the affective dimension of human intelligence. As a step toward closing this gap, the ArtEmis dataset was recently introduced as a large-scale corpus of affective responses to images, paired with linguistic explanations of the chosen emotions. We observe a pronounced bias toward well-represented emotion categories, which causes trained neural speakers to generate less accurate descriptions for underrepresented emotions. We demonstrate that collecting more data in the same way does not effectively mitigate this bias. To address the issue, we propose a contrastive data collection method that balances ArtEmis with a new complementary dataset, in which pairs of similar images are assigned contrasting emotions (one positive and one negative). We collected 260,533 instances using the proposed method and combined them with ArtEmis to create the second iteration of the dataset. The combined dataset, named ArtEmis v2.0, has a balanced emotion distribution, and its explanations reveal more details about the associated paintings. Our experiments show that neural speakers trained on the new dataset improve by 20% and 7% on the CIDEr and METEOR evaluation metrics, respectively, compared with speakers trained on the biased dataset. Finally, we show that per-emotion performance of neural speakers improves across all emotion categories, with especially large gains on underrepresented emotions.
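The notion of a "balanced emotional distribution" can be made concrete with a small check. The sketch below is only an illustration, not the authors' procedure: the function name `emotion_balance`, the toy labels, and the choice of normalized entropy as a balance measure are all assumptions introduced here. It reports the share of positive labels and how close the label distribution is to uniform (1.0 means perfectly even).

```python
from collections import Counter
from math import log


def emotion_balance(labels, positive):
    """Return (positive_share, normalized_entropy) for a list of emotion labels.

    `labels` is any iterable of emotion names; `positive` is the set of names
    treated as positive. Both are hypothetical inputs for illustration only.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")

    # Fraction of annotations whose emotion is in the positive set.
    pos_share = sum(n for lab, n in counts.items() if lab in positive) / total

    # With a single category, entropy is zero and there is nothing to balance.
    if len(counts) < 2:
        return pos_share, 0.0

    # Shannon entropy divided by its maximum, log(number of categories).
    entropy = -sum((n / total) * log(n / total) for n in counts.values())
    return pos_share, entropy / log(len(counts))


# Toy data only; these are not counts from ArtEmis.
skewed = ["awe", "awe", "awe", "fear"]
even = ["awe", "fear"]
print(emotion_balance(skewed, {"awe"}))
print(emotion_balance(even, {"awe"}))
```

On real annotations, a positive share near 0.5 together with a normalized entropy near 1.0 would suggest the kind of balance the description refers to, under the assumptions stated above.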
Provider:
OpenDataLab
Created:
2023-02-13
Collected Summary
Dataset Introduction
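Background and Challenges
Background Overview
ArtEmis v2.0 is a large-scale dataset that aims to advance understanding of the affective dimension of human intelligence by connecting vision, language, and emotion. It uses a contrastive data collection method to balance the emotion distribution, adding 260,533 new instances. Neural speakers trained on it score higher on the CIDEr and METEOR evaluation metrics and perform better across all emotion categories.
The content above was collected and summarized by 遇见数据集.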