CUBE-MT: A Cultural Benchmark for Multimodal Knowledge Graph Construction with Generative AI
Source: Figshare | Updated: 2025-05-13 | Indexed: 2026-04-08
Download link:
https://figshare.com/articles/dataset/CUBE-MT_A_Cultural_Benchmark_for_Multimodal_Knowledge_Graph_Construction_with_Generative_AI/29052230/1
Description:
Cultural heritage institutions (GLAM) have a societal role of guaranteeing access to their collections regardless of location or background, as a pledge for knowledge equity. However, around one billion people (15% of the world's population) experience some form of disability that hinders such access, especially in light of the multimedia gap: differences in how different types of media are used to convey content to different audiences. Knowledge Graphs are becoming increasingly multimodal by supporting images and text. However, the extent to which they address the multimedia gap for people with disabilities is not well understood, owing to the lack of appropriate cultural evaluation frameworks. To address this, we propose CUBE-MT, a benchmark and dataset that leverages generative AI to build Multimodal Knowledge Graphs (MMKGs) with surrogate multimedia representations that adapt to the sensory capacities of cultural heritage collection users. We extend the CUBE (CUltural BEnchmark for Text-to-Image models) benchmark and provide 7 modalities (including text, images, Braille, speech, music, and 3D models); a collection of prompts to account for their cultural awareness; and a dataset with the resulting MMKG linked to Wikidata. We show usage and evaluate the effectiveness of our approach in an expert survey and a user study with people with aphasia, focusing on perceptual and comprehension differences between original MMKG objects and those generated by AI.
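As a rough illustration of what "an MMKG linked to Wikidata with surrogate representations" could look like in code, the sketch below models one cultural entity with per-modality surrogates. It is only a minimal sketch: the class name `CulturalEntity`, its fields, the `hasSurrogate/...` predicate strings, and the file paths are hypothetical and are not taken from the CUBE-MT dataset schema. Only the modality names (from the description above) and the Wikidata entity URI pattern are grounded.

```python
from dataclasses import dataclass, field

# Modality names as listed in the description above.
MODALITIES = ("text", "image", "braille", "speech", "music", "3d_model")


@dataclass
class CulturalEntity:
    """One MMKG node. Field names are hypothetical, not the dataset's schema."""

    wikidata_id: str  # Wikidata QID, e.g. "Q12418" (Mona Lisa)
    label: str
    # Maps a modality name to a literal value or a (hypothetical) file path.
    surrogates: dict = field(default_factory=dict)

    def add_surrogate(self, modality: str, value: str) -> None:
        """Attach a surrogate representation for one known modality."""
        if modality not in MODALITIES:
            raise ValueError(f"unknown modality: {modality}")
        self.surrogates[modality] = value

    def missing_modalities(self) -> list:
        """Return the modalities that have no surrogate yet, in listing order."""
        return [m for m in MODALITIES if m not in self.surrogates]

    def triples(self):
        """Yield (subject, predicate, object) tuples for each surrogate."""
        subject = f"http://www.wikidata.org/entity/{self.wikidata_id}"
        for modality, value in self.surrogates.items():
            # The predicate string is a placeholder, not a real vocabulary term.
            yield (subject, f"hasSurrogate/{modality}", value)


if __name__ == "__main__":
    entity = CulturalEntity("Q12418", "Mona Lisa")
    entity.add_surrogate("text", "Portrait painted by Leonardo da Vinci.")
    entity.add_surrogate("speech", "audio/Q12418.wav")  # hypothetical path
    print(entity.missing_modalities())
    for triple in entity.triples():
        print(triple)
```

Listing the missing modalities per entity is one simple way to inspect the multimedia gap the description refers to: an entity that only has text and image surrogates offers nothing to a user who relies on Braille or speech.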
Provided by:
Penuela, Albert Merono; Jain, Nitisha
Created:
2025-05-13



