amirali1985/synthetic-shapes-3x6x7
Source: Hugging Face. Updated 2026-03-24; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/amirali1985/synthetic-shapes-3x6x7
Resource summary:
---
license: mit
task_categories:
- image-classification
- zero-shot-image-classification
- feature-extraction
tags:
- synthetic
- shapes
- clip
- steering-vectors
- representation-alignment
size_categories:
- 10K<n<100K
---
# Synthetic Shapes 3×6×7
A fully deterministic synthetic dataset of simple geometric shapes rendered as SVG images, with precomputed CLIP (ViT-B-32) embeddings for both text and images.
## Purpose
This dataset is designed for controlled experiments in **representation alignment** and **steering vector evaluation**. Because images are generated deterministically from a known combinatorial space, it provides a clean testbed where ground-truth structure is fully known.
## Construction
Each image contains **3 shapes** drawn from a pool of:
- **6 shape types**: circle, square, triangle, pentagon, hexagon, star
- **7 colors**: red, blue, green, yellow, orange, purple, black
Each combination is a **multiset** (an unordered selection with replacement) of size 3 drawn from the 42 (shape, color) pairs, yielding **C(42 + 3 − 1, 3) = C(44, 3) = 13,244** unique images.
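The count can be checked in two ways: with the closed-form multiset coefficient and by brute-force enumeration. This is a minimal sketch; the list names and the order of the shape and color lists are illustrative assumptions and are not taken from the generator script.

```python
from itertools import combinations_with_replacement
from math import comb

# Pools as listed above; the variable names are illustrative.
SHAPES = ["circle", "square", "triangle", "pentagon", "hexagon", "star"]
COLORS = ["red", "blue", "green", "yellow", "orange", "purple", "black"]

pairs = [(shape, color) for shape in SHAPES for color in COLORS]  # 42 pairs
k = 3  # shapes per image

# Multisets of size k from n items: C(n + k - 1, k)
closed_form = comb(len(pairs) + k - 1, k)

# Brute force: enumerate every unordered selection with replacement
brute_force = sum(1 for _ in combinations_with_replacement(pairs, k))

print(closed_form, brute_force)  # 13244 13244
```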
Shapes are placed at fixed positions (left, center, right) on a 224×224 white background. The left-to-right order of shapes in the image follows the combinatorial enumeration order, whereas the text description is always **alphabetically sorted** (e.g., "blue triangle, red circle, yellow star"), so the order of phrases in the text need not match the order of shapes in the image.
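To make the layout concrete, here is a rough sketch of how three shapes could be placed in fixed slots on a 224×224 white canvas as SVG. Everything geometric here is an assumption made for illustration: the slot x-coordinates, the radius, the star's inner-radius ratio, the use of CSS color keywords, and the function names `shape_svg` and `render_svg`. The actual geometry is defined by the generator script and is not reproduced here.

```python
import math

SIZE = 224                                        # canvas size from the card
CENTERS_X = [SIZE / 6, SIZE / 2, 5 * SIZE / 6]    # assumed left/center/right slots
RADIUS = 28                                       # assumed shape radius
SIDES = {"triangle": 3, "square": 4, "pentagon": 5, "hexagon": 6}

def shape_svg(shape, color, cx, cy, r=RADIUS):
    """Return one SVG element for a shape centered at (cx, cy)."""
    if shape == "circle":
        return f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="{r}" fill="{color}"/>'
    if shape == "star":
        radii = [r, r * 0.4] * 5                  # alternate outer/inner vertices
    else:
        radii = [r] * SIDES[shape]
    n = len(radii)
    # Rotate the square so its sides are axis-aligned instead of a diamond.
    start = -math.pi / 2 + (math.pi / 4 if shape == "square" else 0.0)
    pts = []
    for i, rad in enumerate(radii):
        angle = start + 2 * math.pi * i / n
        pts.append(f"{cx + rad * math.cos(angle):.1f},{cy + rad * math.sin(angle):.1f}")
    points = " ".join(pts)
    return f'<polygon points="{points}" fill="{color}"/>'

def render_svg(combo):
    """Render a 3-tuple of (shape, color) pairs, left to right."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{SIZE}" height="{SIZE}">',
        f'<rect width="{SIZE}" height="{SIZE}" fill="white"/>',
    ]
    for (shape, color), cx in zip(combo, CENTERS_X):
        parts.append(shape_svg(shape, color, cx, SIZE / 2))
    parts.append("</svg>")
    return "\n".join(parts)

svg = render_svg((("circle", "red"), ("triangle", "blue"), ("star", "yellow")))
```

The resulting string can be written to a `.svg` file and rasterized to PNG with any SVG renderer.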
### Generation pipeline
1. Enumerate all unordered multisets of size 3 from 42 (shape × color) pairs
2. Render each combination as an SVG → PNG (224×224)
3. Generate text descriptions (sorted alphabetically)
4. Compute CLIP ViT-B-32 (OpenAI) embeddings for all texts and images
5. **L2-normalize** all embeddings (float32)
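Steps 1 and 3 of the pipeline can be sketched in a few lines. This is an illustration rather than the generator's own code: the helper names `enumerate_combos` and `describe` are hypothetical, and the enumeration order (shape-major, in the order the shapes and colors are listed above) is an assumption.

```python
from itertools import combinations_with_replacement

SHAPES = ["circle", "square", "triangle", "pentagon", "hexagon", "star"]
COLORS = ["red", "blue", "green", "yellow", "orange", "purple", "black"]

def enumerate_combos():
    """Step 1: all multisets of size 3 over the 42 (shape, color) pairs."""
    pairs = [(shape, color) for shape in SHAPES for color in COLORS]
    return list(combinations_with_replacement(pairs, 3))

def describe(combo):
    """Step 3: '<color> <shape>' phrases, sorted alphabetically and comma-joined."""
    return ", ".join(sorted(f"{color} {shape}" for shape, color in combo))

combos = enumerate_combos()
texts = [describe(combo) for combo in combos]

example = describe((("circle", "red"), ("star", "yellow"), ("triangle", "blue")))
print(example)                        # blue triangle, red circle, yellow star
print(len(combos), len(set(texts)))   # 13244 13244
```

Because the phrases are sorted before joining, each multiset maps to exactly one description, which is why the number of unique texts equals the number of images.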
## Schema
| Column | Type | Description |
|--------|------|-------------|
| `text` | string | Alphabetically sorted description, e.g. "blue triangle, red circle, yellow star" |
| `image` | PIL Image | 224×224 RGB rendering of the 3 shapes on white background |
| `clip_embedding_text` | list[float] | L2-normalized CLIP ViT-B-32 text embedding (dim 512) |
| `clip_embedding_image` | list[float] | L2-normalized CLIP ViT-B-32 image embedding (dim 512) |
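Because both embedding columns are L2-normalized, the cosine similarity between a text and an image reduces to a plain dot product. The sketch below illustrates this with random stand-in vectors of the same shape and dtype; it does not load the dataset, and the helper `l2_normalize` is a hypothetical name.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale vectors to unit L2 norm along the given axis."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Stand-in data: 4 random "text" and 4 random "image" vectors, dim 512, float32.
rng = np.random.default_rng(0)
text_emb = l2_normalize(rng.normal(size=(4, 512)).astype(np.float32))
image_emb = l2_normalize(rng.normal(size=(4, 512)).astype(np.float32))

# Sanity check that would also apply to the real columns: unit norm per row.
norms = np.linalg.norm(text_emb, axis=1)
assert np.allclose(norms, 1.0, atol=1e-5)

# For unit vectors, the dot product is the cosine similarity.
similarity = text_emb @ image_emb.T           # shape (4, 4)
best_image_per_text = similarity.argmax(axis=1)
```

With the real data, stacking `clip_embedding_text` and `clip_embedding_image` into two arrays and applying the same matrix product gives the full text-to-image similarity matrix.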
## Statistics
| Metric | Value |
|--------|-------|
| Total samples | 13,244 |
| Unique texts | 13,244 |
| Shapes per image | 3 |
| Shape types | 6 (circle, square, triangle, pentagon, hexagon, star) |
| Colors | 7 (red, blue, green, yellow, orange, purple, black) |
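| Shape occurrences | 6,622 per shape type (uniform; counted over all 3 × 13,244 = 39,732 shape slots) |
| Color occurrences | 5,676 per color (uniform; counted over the same 39,732 slots) |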
| Embedding model | OpenAI CLIP ViT-B-32 |
| Embedding dim | 512 |
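The occurrence counts follow from symmetry: 3 × 13,244 = 39,732 shape slots are shared evenly among 6 shape types (39,732 / 6 = 6,622) and 7 colors (39,732 / 7 = 5,676). A short enumeration confirms this; the variable names are illustrative and the code does not depend on the dataset files.

```python
from collections import Counter
from itertools import combinations_with_replacement

SHAPES = ["circle", "square", "triangle", "pentagon", "hexagon", "star"]
COLORS = ["red", "blue", "green", "yellow", "orange", "purple", "black"]
pairs = [(shape, color) for shape in SHAPES for color in COLORS]

shape_counts = Counter()
color_counts = Counter()
total_slots = 0

# Count every slot of every image, so repeated shapes count with multiplicity.
for combo in combinations_with_replacement(pairs, 3):
    for shape, color in combo:
        shape_counts[shape] += 1
        color_counts[color] += 1
        total_slots += 1

print(total_slots)                  # 39732
print(set(shape_counts.values()))   # {6622}
print(set(color_counts.values()))   # {5676}
```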
## Usage
```python
import numpy as np
from datasets import load_dataset

ds = load_dataset("amirali1985/synthetic-shapes-3x6x7", split="train")

# Access a sample
sample = ds[0]
print(sample["text"])   # "red circle, red circle, red circle"
sample["image"].show()  # PIL Image

# Use precomputed embeddings
text_emb = np.array(sample["clip_embedding_text"])   # (512,)
img_emb = np.array(sample["clip_embedding_image"])   # (512,)
```
## Source Code
Generated using the code in [NirmalenduPrakash/rep_alignment](https://github.com/NirmalenduPrakash/rep_alignment); see `src/full_evals/synthetic_shapes/generate.py`.
## License
MIT
Provider:
amirali1985



