Dhanush944/ManiTwin-100K
Source: Hugging Face. Updated 2026-03-20; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Dhanush944/ManiTwin-100K
Resource description:
---
license: apache-2.0
task_categories:
- robotics
- image-to-3d
- visual-question-answering
tags:
- robotics
- manipulation
- 3d-assets
- grasping
- simulation
- digital-twins
size_categories:
- 100K<n<1M
---
# ManiTwin-100K: Manipulation-Ready Digital Object Twins
<p align="center">
<a href="https://manitwin.github.io/"><b>Project Page</b></a> |
<a href="https://arxiv.org/abs/2603.16866"><b>Paper</b></a>
</p>
ManiTwin-100K is a large-scale dataset of manipulation-ready digital object twins designed for robotic manipulation research. Each object includes simulation-ready 3D meshes, physical properties, functional point annotations, grasp configurations, and rich language descriptions—all validated through physics-based simulation.
> **Note:** We are currently releasing approximately **1K sample objects** with a subset of the annotations for early access. The remaining objects will be released soon. Stay tuned!
## Key Features
- **Simulation-Ready**: All meshes are watertight, collision-ready, and directly deployable in physics simulators (Isaac Sim, SAPIEN, PyBullet)
- **Rich Annotations**: Functional points, grasp points, physical properties, and language descriptions
- **Verified Grasps**: 6-DoF grasp poses validated through physics simulation
- **Diverse Categories**: Kitchen items, tools, electronics, personal care, office supplies, household objects, and more
- **Real-World Scale**: Object dimensions span 5 to 50 cm, covering typical manipulation scenarios
## Data Structure
Each object follows this directory structure:
```
{category}/{object_id}/
├── base_rescale.glb # Simulation-ready 3D mesh (GLB format)
├── base_rescale.usdz # 3D mesh (USDZ format)
├── caption.json # Language descriptions
└── manipulation_annotations.json # Consolidated manipulation annotations
```
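The layout above can be walked with the standard library alone. The following is a minimal sketch, assuming the dataset has been downloaded to a local root folder that contains the `{category}/{object_id}/` directories; the helper names `iter_objects` and `missing_files` are hypothetical and not part of any released tooling. Because the early-access release ships only a subset of the annotations, checking for absent files before loading may be useful.

```python
from pathlib import Path

# File names taken from the directory layout described above.
REQUIRED_FILES = (
    "base_rescale.glb",
    "base_rescale.usdz",
    "caption.json",
    "manipulation_annotations.json",
)


def iter_objects(root):
    """Yield (category, object_id, path) for each {category}/{object_id}/ directory."""
    root = Path(root)
    for category_dir in sorted(root.iterdir()):
        # Skip plain files and hidden folders such as download caches.
        if not category_dir.is_dir() or category_dir.name.startswith("."):
            continue
        for object_dir in sorted(category_dir.iterdir()):
            if object_dir.is_dir():
                yield category_dir.name, object_dir.name, object_dir


def missing_files(object_dir):
    """Return the expected file names that are absent from one object directory."""
    object_dir = Path(object_dir)
    return [name for name in REQUIRED_FILES if not (object_dir / name).is_file()]
```

A typical use would be to loop over `iter_objects("path/to/dataset")` and skip any object for which `missing_files` returns a non-empty list.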
## Annotation Format
### manipulation_annotations.json
The annotation file contains three top-level sections: `active` (manipulation actions), `passive` (container/placement targets), and `bounding_box` (geometric bounds).
```json
{
  "active": {
    "grasp": {
      "id_0": {
        "raw_id": 15,
        "grasp_type": "enveloping",
        "confidence": 0.95,
        "rationale": "middle stable grip",
        "grasp_scenario": "daily holding and transportation",
        "ranking": ["grasp_37", "grasp_98", "grasp_54"]
      }
    },
    "grasp_group": {
      "format": "isaac_grasp",
      "format_version": "1.0",
      "grasps": {
        "grasp_37": {
          "confidence": 0.971,
          "position": [0.099, -0.0001, -0.002],
          "orientation": {
            "w": 0.566,
            "xyz": [0.446, -0.541, -0.431]
          },
          "tcp_position": [-0.002, -0.004, -0.0008],
          "score": 0.0
        }
      }
    },
    "place": {
      "id_0": {
        "position": [0.0, 0.0, -0.03],
        "rotation": [0.0, 0.0, 1.0],
        "face": "-z",
        "dimensions": [0.017, 0.017, 0.06],
        "volume": 9.7e-06
      }
    },
    "tool_use": {
      "id_0": {
        "id": 27,
        "function": "cap seal",
        "confidence": 0.98,
        "rationale": "cap seals bottle",
        "caption": "cap seal"
      }
    }
  },
  "passive": {
    "placement": {
      "id_0": {
        "id": 1,
        "description": "bottle base",
        "confidence": 0.95,
        "rationale": "flat bottom surface"
      }
    },
    "mesh_info": {
      "num_vertices": 247644,
      "num_faces": 82548,
      "is_watertight": false
    }
  },
  "bounding_box": {
    "min_bounds": [-0.008, -0.008, -0.03],
    "max_bounds": [0.008, 0.008, 0.03],
    "dimensions": [0.017, 0.017, 0.06],
    "center": [0.0, 0.0, 0.0],
    "placement_center": [0.0, 0.0, -0.03],
    "placement_face": "-z",
    "volume": 9.7e-06
  }
}
```
**Key Fields:**
- `active.grasp`: VLM-selected grasp points with type, confidence, and ranked grasp IDs
- `active.grasp_group`: Simulation-verified 6-DoF grasp poses in Isaac format
- `active.place`: Placement position for scene layout generation
- `active.tool_use`: Functional points (handle, spout, cap, etc.)
- `passive.placement`: Container placement points for receiving objects
- `bounding_box`: Object bounds for collision detection
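The two grasp sections are linked through grasp IDs: `active.grasp.*.ranking` lists IDs that appear as keys in `active.grasp_group.grasps`. The sketch below shows one way to resolve a ranking into 4x4 homogeneous poses. It assumes the quaternion is scalar-first (as the `w` / `xyz` keys suggest), that `position` is in meters, and that poses are expressed in the object's mesh frame; the frame convention is an assumption not confirmed by the card. The function names are hypothetical, and ranked IDs without a verified pose are skipped.

```python
import math


def quat_to_matrix(w, xyz):
    """3x3 rotation matrix (nested lists) from a scalar-first quaternion."""
    x, y, z = xyz
    # Sample values in the card are rounded, so renormalize before converting.
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def grasp_to_pose(grasp):
    """4x4 homogeneous transform for one entry of active.grasp_group.grasps."""
    rot = quat_to_matrix(grasp["orientation"]["w"], grasp["orientation"]["xyz"])
    px, py, pz = grasp["position"]
    return [
        rot[0] + [px],
        rot[1] + [py],
        rot[2] + [pz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def ranked_poses(annotations, grasp_point="id_0"):
    """Return [(grasp_id, pose)] in ranking order, skipping IDs with no verified pose."""
    ranking = annotations["active"]["grasp"][grasp_point]["ranking"]
    verified = annotations["active"]["grasp_group"]["grasps"]
    return [(gid, grasp_to_pose(verified[gid])) for gid in ranking if gid in verified]
```

Calling `ranked_poses(annotations)` on a loaded `manipulation_annotations.json` returns the candidate poses for grasp point `id_0` in the ranked order.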
### caption.json
Contains diverse language descriptions for each object, split into `seen` (training) and `unseen` (zero-shot evaluation) sets.
```json
{
"seen": [
"small reflective object",
"globe commonly found in compact size",
"dark gray glass perfect sphere marble",
"sphere for paperweight activities",
"compact black reflective marble"
],
"unseen": [
"compact round object",
"sphere designed for paperweight",
"dark gray object used for decoration"
]
}
```
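One way to use the two splits is to sample instructions from `seen` during training and keep `unseen` for evaluation only. The sketch below assumes a caption dictionary shaped like the example above; `pick_caption` and `caption_overlap` are hypothetical helper names. The overlap check is a simple sanity test that no description appears in both splits.

```python
import random


def pick_caption(captions, split="seen", rng=None):
    """Pick one description from the requested split ("seen" or "unseen")."""
    options = captions.get(split, [])
    if not options:
        raise ValueError(f"no captions available in split {split!r}")
    # Accept a seeded random.Random for reproducible sampling.
    rng = rng or random
    return rng.choice(options)


def caption_overlap(captions):
    """Descriptions present in both splits; ideally empty for zero-shot evaluation."""
    return sorted(set(captions.get("seen", [])) & set(captions.get("unseen", [])))
```

Passing a seeded `random.Random` instance as `rng` keeps training runs reproducible.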
## Usage
### Loading with Python
```python
import json

import trimesh

# Load mesh (GLB files may load as a trimesh.Scene; pass force="mesh" for a single Trimesh)
mesh = trimesh.load("kitchen/bottle_001/base_rescale.glb")

# Load annotations
with open("kitchen/bottle_001/manipulation_annotations.json") as f:
    annotations = json.load(f)

# Access verified grasp poses
grasp_group = annotations["active"]["grasp_group"]
for grasp_id, grasp in grasp_group["grasps"].items():
    position = grasp["position"]  # [x, y, z] in meters
    orientation = grasp["orientation"]  # {"w": qw, "xyz": [qx, qy, qz]}
    confidence = grasp["confidence"]

# Access functional points
tool_use = annotations["active"].get("tool_use", {})
for point_id, point in tool_use.items():
    function = point["function"]  # e.g., "cap seal", "liquid outlet"

# Load captions
with open("kitchen/bottle_001/caption.json") as f:
    caption = json.load(f)
seen_descriptions = caption["seen"]
unseen_descriptions = caption["unseen"]
```
### Integration with Isaac Sim
```python
from omni.isaac.core.utils.stage import add_reference_to_stage
# Load asset into Isaac Sim
asset_path = "kitchen/bottle_001/base_rescale.usdz"
prim_path = "/World/Objects/bottle_001"
add_reference_to_stage(asset_path, prim_path)
```
## Applications
ManiTwin-100K supports various downstream applications:
- **Manipulation Data Generation**: Generate large-scale grasp and manipulation trajectories
- **Scene Layout Synthesis**: Create diverse multi-object scenes using placement annotations
- **Robotics VQA**: Train vision-language models for manipulation-focused question answering
- **Affordance Learning**: Train models to predict functional regions and grasp locations
- **Sim-to-Real Transfer**: Pre-train manipulation policies in simulation
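As an illustration of the scene layout use case, the `bounding_box` section can drive a coarse overlap test before objects are placed in a simulator. The sketch below is not the ManiTwin pipeline's method. It assumes each object is only translated (not rotated) by a hypothetical scene offset in meters, so the box stays axis-aligned, and the function names are illustrative.

```python
def world_aabb(bounding_box, offset):
    """Translate an object's min/max bounds by a scene offset [x, y, z] in meters."""
    lo = [m + o for m, o in zip(bounding_box["min_bounds"], offset)]
    hi = [m + o for m, o in zip(bounding_box["max_bounds"], offset)]
    return lo, hi


def aabbs_overlap(a, b, margin=0.0):
    """True if two (lo, hi) boxes overlap or sit closer than `margin` on every axis."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(
        a_lo[i] - margin < b_hi[i] and b_lo[i] - margin < a_hi[i]
        for i in range(3)
    )
```

A layout sampler could reject any candidate offset for which `aabbs_overlap` returns `True` against an already placed object, and fall back to mesh-level collision checks in the simulator for objects that are rotated.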
## Citation
If this dataset helps your research, please consider citing:
```bibtex
@misc{ManiTwin2026,
  title={ManiTwin: Scaling Data-Generation-Ready Digital Object Dataset to 100K},
  author={Kaixuan Wang and Tianxing Chen and Jiawei Liu and Honghao Su and Shaolong Zhu and Minxuan Wang and Zixuan Li and Yue Chen and Huan-ang Gao and Yusen Qin and Jiawei Wang and Qixuan Zhang and Lan Xu and Jingyi Yu and Yao Mu and Ping Luo},
  year={2026},
  eprint={2603.16866},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2603.16866},
}
```
## Acknowledgments
ManiTwin-100K was constructed using the [ManiTwin](https://manitwin.github.io/) automated pipeline, which leverages state-of-the-art 3D generation models, vision-language models for annotation, and physics simulation for verification.
Provider:
Dhanush944



