
ScaDSAI/tedrasim

Source: Hugging Face. Updated 2026-03-19. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/ScaDSAI/tedrasim
Description:
---
license: mit
task_categories:
- image-to-text
- image-text-to-text
language:
- en
tags:
- synthetic
- multimodal
- scene-graph
- spatial-reasoning
- json
- computer-vision
- 3d
pretty_name: TEDRASIM Dataset
size_categories:
- 10K<n<100K
configs:
- config_name: default
  drop_labels: true
---

# TEDRASIM Dataset

## Dataset Summary

This dataset is a multimodal training corpus for fine-tuning vision-language models to generate structured JSON scene-graph descriptions from rendered images.

It contains:

- a synthetic dataset of 10,000 scenes, each rendered from 2 views (20,000 images total),
- a small real-world dataset of 120 images.

Each example consists of:

- one image showing an object assembly,
- a multi-turn chat-style prompt structure,
- a target JSON string describing the scene in a canonical structured format.

It is intended for research and development on structured visual reasoning, spatial reasoning, scene understanding, and image-to-JSON generation.

---

## Data Description

The dataset contains images of a very specific class of toy-like 3D objects. These objects are:

- composed of geometric primitives such as cubes, spheres, and cones,
- arranged in simple spatial configurations,
- rendered from multiple viewpoints.

---

## Task Definition

The model is expected to generate a JSON scene graph that describes the object through relative spatial relationships between a finite set of known, fixed primitives.

Relationships are defined locally between touching primitives, for example:

- "the blue cube is behind the green cone"
- "the red cube is left of the orange cylinder"

These relations are encoded explicitly in the JSON structure.

---

## Example Target Representation

A simplified example of a scene description:

```json
{
  "primitive_counts": {
    "red_cube": 2,
    "yellow_sphere": 1
  },
  "primitives": [
    {
      "id": "P1",
      "color": "red",
      "shape": "cube",
      "neighbors": {
        "front": "P2",
        "back": "Empty Space",
        "left": "Empty Space",
        "right": "Empty Space",
        "up": "Empty Space",
        "down": "Empty Space"
      }
    },
    {
      "id": "P2",
      "color": "yellow",
      "shape": "sphere",
      "neighbors": {
        "front": "Empty Space",
        "back": "P1",
        "left": "Empty Space",
        "right": "Empty Space",
        "up": "Empty Space",
        "down": "Empty Space"
      }
    }
  ]
}
```

---

## Dataset Structure

### Data Instances

Each record in the JSONL files has the following structure:

```json
{
  "image": "train/shard_0000/scene_000000_00.png",
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "{\"primitive_counts\": ..., \"primitives\": ...}"}
  ],
  "meta": {
    "scene_id": "scene_000000",
    "scene_hash": "...",
    "split": "train",
    "shard": "shard_0000",
    "view_id": "view_00",
    "num_primitives_in_scene": 4,
    "min_primitives": 1,
    "max_primitives": 6,
    "seed": 42,
    "attempt_index": 1,
    "accepted_index": 0
  }
}
```

### Data Fields

- image: relative path to the rendered image
- messages: chat-style training structure
  - system: task instruction
  - user: input prompt
  - assistant: target JSON
- meta: auxiliary metadata for traceability

---

## Splits

The synthetic dataset is divided into:

- train.jsonl
- val.jsonl
- test.jsonl

The real dataset contains validation data only:

- val.jsonl

---

## Repository Layout

- synthetic/: synthetic dataset described here
- real/: real-world dataset component
- train.jsonl / val.jsonl / test.jsonl: split manifests

---

## License

MIT
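
---

The record layout described under Data Instances can be read with the Python standard library alone. The sketch below is a minimal, unofficial example: `parse_record` and `iter_manifest` are hypothetical helper names, and the sample record is a made-up, shortened stand-in that only mirrors the documented field names. The one detail worth noting is that the assistant message holds the target as a JSON string, so it needs a second decoding step.

```python
import json


def parse_record(line):
    """Decode one JSONL line into (image path, target scene dict, meta dict)."""
    record = json.loads(line)
    # The target is the content of the assistant turn, stored as a JSON string.
    assistant_turns = [m for m in record["messages"] if m["role"] == "assistant"]
    target = json.loads(assistant_turns[-1]["content"])
    return record["image"], target, record["meta"]


def iter_manifest(path):
    """Yield parsed records from a split manifest, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_record(line)


# Hypothetical sample record; values are illustrative, not taken from the dataset.
sample_target = {
    "primitive_counts": {"red_cube": 1},
    "primitives": [
        {
            "id": "P1",
            "color": "red",
            "shape": "cube",
            "neighbors": {
                "front": "Empty Space", "back": "Empty Space",
                "left": "Empty Space", "right": "Empty Space",
                "up": "Empty Space", "down": "Empty Space",
            },
        }
    ],
}
sample_line = json.dumps({
    "image": "train/shard_0000/scene_000000_00.png",
    "messages": [
        {"role": "system", "content": "(task instruction)"},
        {"role": "user", "content": "(input prompt)"},
        {"role": "assistant", "content": json.dumps(sample_target)},
    ],
    "meta": {"scene_id": "scene_000000", "split": "train", "view_id": "view_00"},
})

image, target, meta = parse_record(sample_line)
print(image, meta["view_id"], len(target["primitives"]))
```

With a local copy of the repository, `iter_manifest` would be pointed at one of the split manifests. The exact location of those files relative to `synthetic/` and `real/` should be checked against the downloaded layout, since the card does not spell it out.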
Provider:
ScaDSAI