CMDBench

Name: CMDBench
Creator: Megagon Labs
Published: 2024-06-02 09:10:41
License: Not specified

Source: arXiv. Updated: 2024-06-02. Indexed: 2024-06-21.

Download link:

https://github.com/megagonlabs/CMDBench

Description:

CMDBench is a multimodal data discovery benchmark created by Megagon Labs, focusing on coarse-to-fine data discovery in compound AI systems. The dataset integrates text documents and tables from Wikipedia and introduces knowledge graphs extracted from Wikidata as an additional data modality. By simulating the complexity of an enterprise data platform, CMDBench aims to evaluate how multimodal data retrievers perform in realistic settings. It targets knowledge-intensive tasks such as search, question answering, chat, and fact-checking, and is designed to address the challenge of discovering multimodal data sources in enterprise data platforms.

Provider:

Megagon Labs

Created:

2024-06-02
