Do Large Vision Language Models Understand 3D Shapes? 3D object shape matching benchmark.
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://zenodo.org/record/14681298
Description:
Sample of the 3D shape matching benchmark used in "Do Large Vision Language Models Understand 3D Shapes?"
GitHub project path
Benchmark for testing the ability of Large Vision Language Models (LVLMs) to recognize and match 3D objects that have exactly the same 3D shape but differ in orientation, materials, textures, environments, and lighting conditions.
Files:
Test_Images.zip:
Sample of the test images used for the benchmark. Each folder contains images used for object/shape matching tests. The folder is structured as “class_dir/object_dir/image.jpg”. The images in each leaf subfolder all belong to the exact same 3D object instance, but with different orientation/materials/environment. In addition to each jpg image, a png image with the same name is supplied that gives the mask of the object in the jpg image.
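The folder layout described above can be indexed with a few lines of Python. This is only a minimal sketch, assuming Test_Images.zip has been extracted locally and that the layout is exactly “class_dir/object_dir/image.jpg” with lowercase extensions; the function name `index_test_images` is hypothetical and not part of the benchmark code.

```python
from collections import defaultdict
from pathlib import Path


def index_test_images(root):
    """Map (class_dir, object_dir) -> list of (jpg_path, mask_path_or_None).

    Assumes the layout root/class_dir/object_dir/image.jpg, with an optional
    same-named .png mask next to each jpg.
    """
    index = defaultdict(list)
    for jpg in sorted(Path(root).glob("*/*/*.jpg")):
        mask = jpg.with_suffix(".png")  # mask shares the jpg's base name
        key = (jpg.parent.parent.name, jpg.parent.name)
        index[key].append((jpg, mask if mask.exists() else None))
    return dict(index)


# Example usage (the path is a placeholder for wherever the zip was extracted):
# objects = index_test_images("Test_Images")
# for (class_name, object_name), views in objects.items():
#     print(class_name, object_name, len(views))
```

Since all images in one object folder show the same 3D shape, any two entries under the same key form a matching pair, while entries under different keys can serve as non-matching candidates.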
Example_Tests.zip:
Examples of full image tests. Each subfolder contains an example test composed of a 4-panel image with panels marked A-D, where one (and only one) of panels B-D contains an object identical in 3D shape to the object in panel A, but with some difference in orientation, texture, background, or illumination. The test is to guess which panel.
The name of the image contains the correct answer. For each jpg image there is an additional .txt file that contains the query and the response of GPT-4o when given this question. This is a sample of the tests given in the paper.
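To browse the example tests, each jpg can be paired with its text file. The following is a rough sketch, assuming Example_Tests.zip has been extracted locally and that each .txt file shares the base name of its jpg (an assumption; the description only says an additional .txt file exists). The function name `load_example_tests` is hypothetical. How the correct answer is encoded in the image name is not specified here, so the sketch only exposes the name and leaves its parsing to the reader.

```python
from pathlib import Path


def load_example_tests(root):
    """Return one record per jpg test image found anywhere under root.

    Assumption: the query/response transcript is stored next to the image
    as a .txt file with the same base name.
    """
    records = []
    for jpg in sorted(Path(root).rglob("*.jpg")):
        txt = jpg.with_suffix(".txt")
        transcript = None
        if txt.exists():
            transcript = txt.read_text(encoding="utf-8", errors="replace")
        records.append({
            "image": jpg,
            "name": jpg.stem,          # the image name contains the correct answer
            "transcript": transcript,  # query and GPT-4o response, if present
        })
    return records


# Example usage (the path is a placeholder for wherever the zip was extracted):
# for rec in load_example_tests("Example_Tests"):
#     print(rec["name"], "has transcript:", rec["transcript"] is not None)
```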
Benchmark_Generation_Scripts.zip:
Code used to generate the benchmark.
The images supplied here are small samples of the benchmark. To generate the full benchmark use the code. For updated code see this repo.
Evaluation_Scripts.zip:
Scripts used for evaluating various LVLMs (GPT, Llama, Claude, Gemini) on the benchmark.
Even more images: Google Drive, pCloud
Created:
2025-01-23



