
8Planetterraforming/Cube-Multi-Object-Consistency-Dataset

Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/8Planetterraforming/Cube-Multi-Object-Consistency-Dataset
Description:
---
license: mit
---

# Cube Multi-Object Consistency Dataset

This project explores a structured visual reasoning problem: maintaining exact object count, geometry, indexing, and attribute consistency across a multi-object scene.

## Task Description

The reference scene consists of **26 isometric cubes** arranged in a strict layout:

- 6 cubes
- 6 cubes
- 6 cubes
- 6 cubes
- 2 cubes

Each cube must:

- preserve its position
- preserve spacing
- preserve geometry
- contain a unique index from **1 to 26**
- optionally include an additional attribute (e.g., letter or object like a planet)

---

## Goal

The goal is to evaluate whether generative models can:

- maintain exact object count
- preserve spatial structure
- correctly assign symbolic labels
- maintain consistency when adding per-object detail

---

## Key Observation

As visual and object-level complexity increases, model reliability decreases.

When the model generates:

- only cubes → mostly correct
- cubes + numbers → small errors
- cubes + numbers + letters → more errors
- cubes + numbers + letters + unique objects → frequent failures

This indicates that **multi-object consistency breaks as constraints increase**.

---

## Failure Modes Observed

Across generated images, the following errors were observed:

- duplicated numbers (e.g. repeated "2")
- missing numbers (e.g. no "22")
- incorrect ordering
- incorrect row structure
- merged or overlapping cubes
- broken spacing
- attribute mismatch (letter does not match number)
- inconsistent mapping between object and label
- hallucinated values (e.g. "29" instead of 24)

---

## Image Set (7 examples)

### 1. ✅ Correct Reference (Colab)

**`image_01_colab_reference.png`**

- generated programmatically
- perfect geometry
- correct layout: `6 / 6 / 6 / 6 / 2`
- correct numbering 1–26
- no duplicates, no missing values

![Error 1](Corect-1.png)

This image is the **ground truth**.

---

### 2. ⚠️ ChatGPT Generated (from scratch, not editing reference)

**`image_02_chatgpt_generated.png`**

![Error 2](image_02_chatgpt_generated.png)

- visually similar style
- but incorrect:
  - duplicated numbers
  - missing numbers
  - broken layout

Shows that **visual plausibility ≠ structural correctness**

---

### 3. ❌ Stylized Version (parchment attempt)

**`image_03_stylized_fail.png`**

![Error 3](image_03_stylized_fail.png)

- attempted aesthetic transformation
- structure not preserved
- numbering corrupted

Failure cause:

> model re-generated scene instead of preserving it

---

### 4. ❌ Multi-object (letters + numbers)

**`image_04_letters_fail.png`**

![Error 4](image_04_letters_fail.png)

- added letter labels (A–Z)
- errors:
  - mismatch between letters and numbers
  - shifted assignments

Failure cause:

> increased symbolic complexity

---

### 5. ❌ Multi-object + visual attributes (planets)

**`image_05_planets_fail.png`**

![Error 5](image_05_planets_fail.png)

- each cube assigned unique planet-like object
- errors increase significantly:
  - incorrect numbering
  - duplicated indices
  - attribute mismatch

Failure cause:

> per-object visual uniqueness breaks global consistency

---

### 6. ❌ High-detail multi-object scene

**`image_06_complex_fail.png`**

![Error 6](iimage_06_complex_fail.png)

- more detail per object
- increased variation

Observed:

- structural drift
- loss of alignment
- incorrect assignments

---

### 7. ❌ Extreme case (combined constraints)

**`image_07_extreme_fail.png`**

![Error 7](image_07_extreme_fail.png)

- multiple constraints combined:
  - geometry
  - numbering
  - letters
  - unique objects

Result:

- model fails across multiple dimensions simultaneously

---

## Key Insight

This dataset demonstrates a critical limitation:

> Models can produce visually convincing outputs while failing at exact structured reasoning.

As the number of simultaneous constraints increases, the probability of failure rises.

---

## Why This Matters

Most visual evaluations focus on realism. However, real-world applications often require:

- exact counting
- exact indexing
- strict spatial consistency
- correct object-to-label mapping

This dataset exposes failures that are:

- subtle visually
- but critical logically

---

## Conclusion

The experiments show that:

- simple scenes → mostly correct
- structured scenes → partially correct
- multi-object structured scenes → unstable

This highlights the need for benchmarks that measure **precision, not just appearance**.

---

## Files
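
---

## Example: Checking Index Consistency

Several of the failure modes listed earlier (duplicated, missing, and hallucinated numbers, and incorrect row structure) can be checked mechanically once the indices have been read off an image. The following is a minimal sketch, not part of the dataset itself: it assumes the indices were already transcribed (by hand or by OCR) into one list per row, and the names `reference_rows` and `check_indices` as well as the input format are hypothetical.

```python
from collections import Counter

# Layout and index range taken from the task description: 6 / 6 / 6 / 6 / 2, indices 1..26.
EXPECTED_ROWS = (6, 6, 6, 6, 2)
EXPECTED_TOTAL = sum(EXPECTED_ROWS)  # 26


def reference_rows():
    """Build the reference numbering, assuming indices run in reading order."""
    it = iter(range(1, EXPECTED_TOTAL + 1))
    return [[next(it) for _ in range(n)] for n in EXPECTED_ROWS]


def check_indices(rows):
    """Compare transcribed indices (one list of ints per row) with the reference."""
    flat = [n for row in rows for n in row]
    counts = Counter(flat)
    expected = set(range(1, EXPECTED_TOTAL + 1))
    return {
        # Does each row hold the expected number of cubes?
        "row_structure_ok": tuple(len(r) for r in rows) == EXPECTED_ROWS,
        # Indices that appear more than once.
        "duplicated": sorted(n for n, c in counts.items() if c > 1),
        # Indices from 1..26 that never appear.
        "missing": sorted(expected - set(flat)),
        # Values outside 1..26.
        "hallucinated": sorted(set(flat) - expected),
        # Strictest check: exactly 1..26 in reading order (an assumed ordering).
        "exact_match": flat == list(range(1, EXPECTED_TOTAL + 1)),
    }


if __name__ == "__main__":
    # Illustrative transcription with a repeated "2" and a "29" in place of 24.
    sample = reference_rows()
    sample[0][2] = 2
    sample[3][5] = 29
    print(check_indices(sample))
```

For the illustrative sample, the report flags `2` as duplicated, `3` and `24` as missing, and `29` as hallucinated, while the row structure still passes. Geometry, spacing, and letter or object attributes are outside the scope of this sketch.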
Provider:
8Planetterraforming