
GePBench | Geometric Perception Dataset | Multimodal Large Language Model Dataset

arXiv · Updated 2024-12-31 · Indexed 2025-01-02
Geometric perception
Multimodal large language models
Download link:
http://arxiv.org/abs/2412.21036v1
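Overview:
GePBench is a novel benchmark dataset developed by the State Key Laboratory for Novel Software Technology at Nanjing University to evaluate the geometric perception abilities of multimodal large language models. It contains 20,000 geometric figures and 250,000 multiple-choice questions covering six key dimensions, including spatial perception and shape understanding. A dedicated data synthesis engine generates structured text descriptions, renders them as geometric figures, and then produces the corresponding multiple-choice questions. GePBench is aimed mainly at strengthening the basic geometric perception of multimodal large language models, laying a foundation for complex visual reasoning and decision-making tasks.
Provided by:
State Key Laboratory for Novel Software Technology, Nanjing University
Created:
2024-12-31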
AI-compiled summary
Dataset introduction
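Construction
GePBench is built with a systematic data synthesis pipeline. First, predefined geometric shapes are randomly sampled and assigned attributes such as position, size, and orientation, producing a structured text description. Next, the descriptions are rendered as geometric figures with the Matplotlib library, and visual noise such as Gaussian noise and salt-and-pepper noise is added to raise the difficulty. Finally, a template-based process uses the figures and their text descriptions to build multiple-choice questions covering six key dimensions: existence, counting, location, size, reference, and relationship. The full dataset contains 20,000 geometric figures and 250,000 multiple-choice questions, split into easy and hard difficulty levels.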
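The first and last steps of this pipeline can be illustrated with a minimal sketch. It is not the authors' synthesis engine: the shape vocabulary, the attribute names, the value ranges, and the question template below are all assumptions made for illustration, and the Matplotlib rendering and noise steps are omitted.

```python
import random

# Hypothetical shape vocabulary; the real benchmark's shape set may differ.
SHAPES = ["circle", "triangle", "square", "pentagon"]


def sample_description(rng, n_shapes=4):
    """Sample a structured description: one attribute dict per shape."""
    return [
        {
            "shape": rng.choice(SHAPES),
            # Normalized canvas coordinates in [0, 1] (an assumption).
            "center": (round(rng.uniform(0.1, 0.9), 2),
                       round(rng.uniform(0.1, 0.9), 2)),
            "size": round(rng.uniform(0.05, 0.2), 2),
            "rotation_deg": rng.randrange(0, 360),
        }
        for _ in range(n_shapes)
    ]


def counting_question(description, target, rng, n_options=4):
    """Fill a counting template and attach distinct distractor options."""
    answer = sum(1 for s in description if s["shape"] == target)
    # Candidate distractors: other non-negative counts; the range is wide
    # enough that n_options - 1 distinct distractors always exist.
    candidates = [c for c in range(len(description) + n_options) if c != answer]
    options = rng.sample(candidates, n_options - 1) + [answer]
    rng.shuffle(options)
    return {
        "dimension": "counting",
        "question": f"How many {target}s are in the image?",
        "options": options,
        "answer_index": options.index(answer),
    }


rng = random.Random(0)
description = sample_description(rng, n_shapes=5)
question = counting_question(description, "circle", rng)
```

Because the question is derived from the structured description rather than from the rendered image, the correct answer is known by construction, which is what makes a template-based approach practical at this scale.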
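Usage
GePBench can be used in several ways. First, researchers can use it to evaluate the geometric perception of multimodal large language models, analyzing performance on the six key dimensions to identify weaknesses in spatial perception and shape understanding. Second, the dataset can be used for training and fine-tuning, introducing geometric perception tasks to improve model performance on complex visual tasks. Researchers can also compare models on the easy and hard splits to examine how their abilities differ on more complex geometric problems. Finally, the results can guide the optimization of model architectures and help build stronger multimodal large language models.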
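A per-dimension, per-difficulty breakdown of the kind described above could be computed along the following lines. This is a minimal sketch: the record fields (`dimension`, `difficulty`, `predicted`, `answer`) are hypothetical and do not reflect the dataset's actual file format.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy keyed by (dimension, difficulty)."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        key = (r["dimension"], r["difficulty"])
        totals[key] += 1
        correct[key] += int(r["predicted"] == r["answer"])
    return {key: correct[key] / totals[key] for key in totals}


# Hypothetical model outputs, for illustration only.
records = [
    {"dimension": "counting", "difficulty": "easy", "predicted": "B", "answer": "B"},
    {"dimension": "counting", "difficulty": "easy", "predicted": "A", "answer": "C"},
    {"dimension": "size", "difficulty": "hard", "predicted": "D", "answer": "D"},
]
scores = accuracy_by_group(records)
```

Grouping on both keys keeps the easy and hard splits separate, so a dimension-level weakness is not hidden by averaging across difficulty levels.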
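Background and challenges
Background
GePBench is a new benchmark proposed in 2024 by a research team at the State Key Laboratory for Novel Software Technology, Nanjing University, to evaluate the geometric perception abilities of multimodal large language models (MLLMs). Although MLLMs have made notable progress in combining visual and language understanding, existing benchmarks focus mainly on complex real-life scenes and overlook basic perceptual skills that are essential in non-everyday settings, geometric perception in particular. GePBench generates geometric figures with corresponding multiple-choice questions and focuses on how well models understand spatial relationships and abstract visual patterns. The dataset is built with a dedicated data synthesis engine that produced 20K images and 250K multiple-choice questions covering six key dimensions: location, size, existence, counting, reference, and relationship. Experiments show that current state-of-the-art MLLMs fall well short on geometric perception tasks, while models trained on GePBench improve markedly on a range of downstream tasks, underscoring the importance of geometric perception as a foundation for advanced multimodal applications.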
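Current challenges
The challenges GePBench faces fall into two areas. First, in terms of the domain problem, existing multimodal large language models perform poorly on geometric perception tasks, with marked deficiencies in basic visual abilities such as spatial reasoning and shape understanding. For example, GPT-4o and Claude-3.5-Sonnet reach accuracies of only 23.3% and 20.8%, respectively, on the size judgment task, far below the level of random guessing. This indicates that current models lack basic spatial perception when handling geometric figures. Second, in terms of data construction, GePBench generates geometric figures and their multiple-choice questions with a data synthesis engine, which must ensure that the figures are diverse and complex while avoiding excessive overlap and shapes that extend beyond the image boundary, so that the data remain interpretable. In addition, to raise the difficulty, GePBench adds visual noise such as Gaussian noise and salt-and-pepper noise to some figures, which further tests models under visually degraded conditions. These challenges highlight the value of GePBench for evaluating and improving the geometric perception of multimodal large language models.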
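One common way to meet the placement constraint mentioned above is rejection sampling: propose a random shape and keep it only if it stays inside the canvas and does not overlap earlier shapes too much. The sketch below is an assumption about how such a check might look, not the authors' implementation. It approximates every shape by a bounding circle on a unit canvas, and the overlap threshold and radius range are arbitrary illustrative values.

```python
import math
import random


def fits(candidate, placed, max_overlap=0.25):
    """Check that a bounding circle is in bounds and not overly overlapping."""
    x, y, r = candidate
    # Reject shapes that cross the unit-square canvas boundary.
    if x - r < 0 or x + r > 1 or y - r < 0 or y + r > 1:
        return False
    for px, py, pr in placed:
        dist = math.hypot(x - px, y - py)
        # Penetration depth relative to the smaller radius (negative if apart).
        overlap = (r + pr - dist) / min(r, pr)
        if overlap > max_overlap:
            return False
    return True


def place_shapes(rng, n, max_tries=1000):
    """Rejection-sample up to n bounding circles as (x, y, radius) tuples."""
    placed = []
    tries = 0
    while len(placed) < n and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(0, 1), rng.uniform(0, 1), rng.uniform(0.05, 0.15))
        if fits(candidate, placed):
            placed.append(candidate)
    return placed


layout = place_shapes(random.Random(1), n=5)
```

The `max_tries` cap keeps the loop from running indefinitely when the canvas is too crowded to fit another shape; in that case the function returns fewer than `n` placements.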
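Common scenarios
Typical use
GePBench is used mainly to evaluate multimodal large language models (MLLMs) on geometric perception tasks. By generating images of geometric figures with corresponding multiple-choice questions, it systematically tests basic visual abilities such as spatial perception and shape understanding. The dataset is particularly suited to studying how models handle abstract geometric figures, and it fills a gap that existing benchmarks leave in geometric perception.
Academic problems addressed
GePBench addresses the lack of evaluation for multimodal large language models on geometric perception tasks. Existing benchmarks focus mainly on complex tasks in real-world scenes and overlook basic geometric perception. By concentrating on geometric figures, GePBench provides a comprehensive assessment of core abilities such as spatial perception and shape understanding, reveals marked deficiencies in current advanced models on these tasks, and offers a useful reference for future model improvements.
Practical applications
GePBench has broad practical relevance, particularly in fields that require precise spatial perception and an understanding of abstract visual patterns. In medical image analysis, for example, physicians need to identify and interpret complex geometric structures accurately; in fossil classification, researchers rely on a precise understanding of geometric shapes. By improving model performance on such tasks, GePBench provides technical support for automation in these fields and advances the practical use of multimodal large language models.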
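Recent research on the dataset
Latest research directions
As a new benchmark for the geometric perception of multimodal large language models (MLLMs), GePBench has focused recent research on exposing the marked shortcomings of current advanced models on geometric perception tasks. Although MLLMs have made notable progress in combining visual and language understanding, they perform poorly on basic perceptual tasks involving geometric shapes and spatial relationships. GePBench generates structured geometric figures with corresponding multiple-choice questions to assess models across dimensions such as spatial perception and shape understanding. Results show that even the best-performing model, Gemini-1.5-pro, reaches an average accuracy of only 69.4% on geometric perception tasks, underscoring the pressing need to improve basic geometric understanding. In addition, models trained on GePBench data improve markedly on a range of downstream tasks, further confirming the importance of geometric perception for multimodal applications.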
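Related papers
  • 1
    GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models. State Key Laboratory for Novel Software Technology, Nanjing University · 2024
The content above was collected and summarized by AI.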