cua-lite/Aguvis
Source: Hugging Face. Updated 2026-04-20; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/cua-lite/Aguvis
Description:
---
license: other
tags:
- cua-lite
- gui
- sft
task_categories:
- image-text-to-text
configs:
- config_name: default
  data_files:
  - split: train
    path:
    - "*/*/train*parquet"
    - "*/*/train/*.parquet"
    - "*/*/train/*/*.parquet"
  - split: validation
    path:
    - "*/*/validation*parquet"
    - "*/*/validation/*.parquet"
    - "*/*/validation/*/*.parquet"
- config_name: desktop
  data_files:
  - split: train
    path:
    - "desktop/*/train*parquet"
    - "desktop/*/train/*.parquet"
    - "desktop/*/train/*/*.parquet"
  - split: validation
    path:
    - "desktop/*/validation*parquet"
    - "desktop/*/validation/*.parquet"
    - "desktop/*/validation/*/*.parquet"
- config_name: mobile
  data_files:
  - split: train
    path:
    - "mobile/*/train*parquet"
    - "mobile/*/train/*.parquet"
    - "mobile/*/train/*/*.parquet"
  - split: validation
    path:
    - "mobile/*/validation*parquet"
    - "mobile/*/validation/*.parquet"
    - "mobile/*/validation/*/*.parquet"
- config_name: web
  data_files:
  - split: train
    path:
    - "web/*/train*parquet"
    - "web/*/train/*.parquet"
    - "web/*/train/*/*.parquet"
  - split: validation
    path:
    - "web/*/validation*parquet"
    - "web/*/validation/*.parquet"
    - "web/*/validation/*/*.parquet"
- config_name: desktop-grounding-action
  data_files:
  - split: train
    path:
    - "desktop/grounding-action/train*parquet"
    - "desktop/grounding-action/train/*.parquet"
    - "desktop/grounding-action/train/*/*.parquet"
  - split: validation
    path:
    - "desktop/grounding-action/validation*parquet"
    - "desktop/grounding-action/validation/*.parquet"
    - "desktop/grounding-action/validation/*/*.parquet"
- config_name: mobile-grounding-action
  data_files:
  - split: train
    path:
    - "mobile/grounding-action/train*parquet"
    - "mobile/grounding-action/train/*.parquet"
    - "mobile/grounding-action/train/*/*.parquet"
  - split: validation
    path:
    - "mobile/grounding-action/validation*parquet"
    - "mobile/grounding-action/validation/*.parquet"
    - "mobile/grounding-action/validation/*/*.parquet"
- config_name: mobile-trajectory
  data_files:
  - split: train
    path:
    - "mobile/trajectory/train*parquet"
    - "mobile/trajectory/train/*.parquet"
    - "mobile/trajectory/train/*/*.parquet"
  - split: validation
    path:
    - "mobile/trajectory/validation*parquet"
    - "mobile/trajectory/validation/*.parquet"
    - "mobile/trajectory/validation/*/*.parquet"
- config_name: web-grounding-action
  data_files:
  - split: train
    path:
    - "web/grounding-action/train*parquet"
    - "web/grounding-action/train/*.parquet"
    - "web/grounding-action/train/*/*.parquet"
  - split: validation
    path:
    - "web/grounding-action/validation*parquet"
    - "web/grounding-action/validation/*.parquet"
    - "web/grounding-action/validation/*/*.parquet"
- config_name: web-trajectory
  data_files:
  - split: train
    path:
    - "web/trajectory/train*parquet"
    - "web/trajectory/train/*.parquet"
    - "web/trajectory/train/*/*.parquet"
  - split: validation
    path:
    - "web/trajectory/validation*parquet"
    - "web/trajectory/validation/*.parquet"
    - "web/trajectory/validation/*/*.parquet"
---
# cua-lite/Aguvis
A cua-lite preprocessed version of Aguvis that merges xlangai/aguvis-stage1 and xlangai/aguvis-stage2 into one repo. Stage 1 contributes the grounding:action sub-datasets (OmniAct, RICO, UI-RefExp, GUIEnv, SeeClick, WebUI); stage 2 contributes the trajectory data (AITW, Android-Control, CoAT, GUIDE, MiniWoB). Both stages share the unified cua-lite SFT schema, and the original stage boundary is preserved in `metadata.others`.
## Origin
- [https://huggingface.co/datasets/xlangai/aguvis-stage1](https://huggingface.co/datasets/xlangai/aguvis-stage1)
- [https://huggingface.co/datasets/xlangai/aguvis-stage2](https://huggingface.co/datasets/xlangai/aguvis-stage2)
## Load via `datasets`
```python
from datasets import load_dataset
# entire dataset
ds = load_dataset("cua-lite/Aguvis")
# just one platform
ds = load_dataset("cua-lite/Aguvis", "desktop")
# just one (platform, task_type) cohort
ds = load_dataset("cua-lite/Aguvis", "desktop-grounding-action")
```
You can also filter by `metadata.platform` / `metadata.task_type` /
`metadata.others.*` after loading; every row carries a rich `metadata`
struct (see schema below).
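As a rough illustration of post-load filtering, the sketch below defines a predicate over the `metadata` struct. The helper name `matches` and the hand-written `sample_rows` are hypothetical; only the field names `platform` and `task_type` come from the schema described in this card.

```python
from typing import Any, Dict, Optional


def matches(row: Dict[str, Any],
            platform: Optional[str] = None,
            task_type: Optional[str] = None) -> bool:
    """Return True if the row's metadata matches every criterion that was given."""
    meta = row["metadata"]
    if platform is not None and meta["platform"] != platform:
        return False
    if task_type is not None and meta["task_type"] != task_type:
        return False
    return True


# Hand-written stand-ins for real rows; only the fields used here are filled in.
sample_rows = [
    {"metadata": {"platform": "mobile", "task_type": "trajectory", "split": "train"}},
    {"metadata": {"platform": "mobile", "task_type": "grounding:action", "split": "train"}},
    {"metadata": {"platform": "web", "task_type": "trajectory", "split": "train"}},
]

mobile_trajectories = [r for r in sample_rows if matches(r, "mobile", "trajectory")]
print(len(mobile_trajectories))
```

On a loaded split, the same predicate should be usable with `Dataset.filter`, for example `ds["train"].filter(lambda r: matches(r, platform="mobile"))`. Note that `task_type` values use the colon form (`grounding:action`), not the hyphenated directory name.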
## Schema
Each row has these columns:
| column | type | notes |
|---|---|---|
| `image_ids` | list[string] | content-addressed ids (`<sha256>.<ext>`), enables cross-parquet / cross-dataset dedup |
| `images` | list[Image] | bytes embedded at HF push time; matches `image_ids` index-for-index |
| `messages` | list[struct] | OpenAI-style turns with `role` + structured `content` |
| `metadata` | struct | `{platform, task_type, split, others{...}}` |
Coordinate values in `messages` are normalized to `[0, 1000]` integers.
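To draw or replay an action on the original screenshot, the normalized values have to be mapped back to pixels. The sketch below is one plausible conversion, assuming that x scales with image width, y with image height, and the origin is the top-left corner; none of these conventions are stated in this card, and the helper name `to_pixels` is hypothetical.

```python
from typing import Tuple


def to_pixels(x_norm: int, y_norm: int, width: int, height: int) -> Tuple[int, int]:
    """Map [0, 1000] normalized coordinates to pixel indices in a width x height image.

    Assumption: a value v corresponds to the fraction v / 1000 of the axis length.
    Results are clamped so that v == 1000 lands on the last valid pixel.
    """
    x = min(round(x_norm / 1000 * width), width - 1)
    y = min(round(y_norm / 1000 * height), height - 1)
    return x, y


# Centre of a 1920x1080 screenshot.
print(to_pixels(500, 500, 1920, 1080))
```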
## Layout
```
<platform>/<task_type>/<split>.parquet # single-variant cohort
<platform>/<task_type>/<split>/<variant>.parquet # multi-variant cohort
<platform>/<task_type>/<split>/shard-NNNNN-of-NNNNN.parquet # + sharded single-variant
<platform>/<task_type>/<split>/<variant>/shard-NNNNN-of-NNNNN.parquet # + sharded multi-variant
```
- `platform` ∈ {desktop, mobile, web}
- `task_type` directory uses a hyphen where the metadata value uses a colon: `grounding-action/` → `grounding:action`
- `split` ∈ {train, validation} — `validation` is an in-distribution held-out slice (never used in training); `test` is reserved for out-of-distribution benchmark datasets
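The four layout forms can be mapped back to cohort fields with a small parser. This is a sketch under the layout described above; the names `Cohort` and `parse_repo_path` are hypothetical, and converting every hyphen in the directory name to a colon is an assumption that holds for the task types listed in this card.

```python
import re
from typing import NamedTuple, Optional

# Matches shard file stems such as "shard-00000-of-00004".
SHARD_RE = re.compile(r"^shard-\d+-of-\d+$")


class Cohort(NamedTuple):
    platform: str
    task_type: str          # metadata form, e.g. "grounding:action"
    split: str
    variant: Optional[str]  # None for single-variant cohorts


def parse_repo_path(path: str) -> Cohort:
    """Parse a repo-relative parquet path into its cohort fields."""
    parts = path.split("/")
    if not 3 <= len(parts) <= 5 or not parts[-1].endswith(".parquet"):
        raise ValueError(f"unexpected layout: {path}")
    parts[-1] = parts[-1][: -len(".parquet")]
    platform, task_dir, split, *rest = parts
    # A trailing shard name carries no cohort information, so drop it.
    if rest and SHARD_RE.match(rest[-1]):
        rest = rest[:-1]
    if len(rest) > 1:
        raise ValueError(f"unexpected layout: {path}")
    variant = rest[0] if rest else None
    return Cohort(platform, task_dir.replace("-", ":"), split, variant)


print(parse_repo_path("desktop/grounding-action/train/omniact.parquet"))
```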
## Stats
| platform | task_type | variant | train | validation |
|---|---|---|---:|---:|
| desktop | grounding:action | omniact | 5,392 | 99 |
| mobile | grounding:action | ricoig16k | 15,774 | 359 |
| mobile | grounding:action | ricosca | 171,212 | 2,000 |
| mobile | grounding:action | ui_refexp | 15,268 | 356 |
| mobile | grounding:action | widget_cap | 99,485 | 1,940 |
| mobile | trajectory | aitw | 1,698 | 30 |
| mobile | trajectory | android_control | 12,318 | 263 |
| mobile | trajectory | coat | 1,306 | 23 |
| mobile | trajectory | guide | 595 | 12 |
| web | grounding:action | guienv | 325,972 | 2,000 |
| web | grounding:action | seeclick | 269,121 | 2,000 |
| web | grounding:action | seeclick_mi | 269,119 | 2,000 |
| web | grounding:action | webui | 56,302 | 1,087 |
| web | trajectory | miniwob | 1,775 | 31 |
## Image storage
Images are content-addressed by SHA-256 and deduplicated within this repo.
The `images` column on HuggingFace embeds raw bytes so the Hub viewer
renders thumbnails and `datasets.load_dataset` works out of the box.
For local workflows (SFT export, cross-dataset dedup, split rebalancing),
run [`reverse.py`](https://github.com/cua-lite/cua-lite/tree/main/scripts/hf_upload)
on a cloned repo: it extracts each unique `image_id` once to a shared
`image_store/<hash[:2]>/<hash>.<ext>` and rewrites the parquets to drop
the `images` column, so rows reference images by hash id only. The shared
store is reusable across datasets — the same image in two repos lands in
one file.
- Total unique images: **516,962**
- Store size: **209.74 GB**
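The id and store-path conventions above can be sketched as follows. This is not the code from `reverse.py`; it assumes the SHA-256 is taken over the raw image file bytes and written as lowercase hex, which this card does not state explicitly. The helper names `image_id_for` and `store_path` are hypothetical.

```python
import hashlib


def image_id_for(data: bytes, ext: str) -> str:
    """Build a content-addressed id of the form <sha256>.<ext> from raw bytes."""
    return f"{hashlib.sha256(data).hexdigest()}.{ext}"


def store_path(image_id: str, root: str = "image_store") -> str:
    """Return the shared-store location <root>/<hash[:2]>/<hash>.<ext> for an id."""
    digest = image_id.split(".", 1)[0]
    return f"{root}/{digest[:2]}/{image_id}"


# Two byte-identical images get the same id, hence the same file in the store.
first = image_id_for(b"example image bytes", "png")
second = image_id_for(b"example image bytes", "png")
print(first == second, store_path(first))
```

Sharding the store by the first two hex characters keeps any single directory from growing too large, which matters at the scale of several hundred thousand files.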
## Notes
Sub-datasets (variants) may have heterogeneous source licenses. See `metadata.others.source` for provenance. Many Aguvis sub-datasets overlap with standalone cua-lite datasets (Mind2Web, AMEX, etc.); deduplicate before mixing for training.
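One conservative way to deduplicate before mixing is to drop rows from a second dataset whenever any of their `image_ids` already appears in the first. The sketch below illustrates that policy on in-memory rows; it is an assumption about what deduplication should mean here, not the procedure used by cua-lite, and the helper names are hypothetical. Rows inside a single dataset are left alone, since several grounding rows may legitimately share one screenshot.

```python
from typing import Any, Dict, Iterable, List, Set

Row = Dict[str, Any]


def collect_image_ids(rows: Iterable[Row]) -> Set[str]:
    """Gather every image id referenced by the given rows."""
    seen: Set[str] = set()
    for row in rows:
        seen.update(row["image_ids"])
    return seen


def drop_overlapping(rows: Iterable[Row], seen: Set[str]) -> List[Row]:
    """Keep only rows whose images are all absent from `seen`."""
    return [row for row in rows if not any(i in seen for i in row["image_ids"])]


# Hand-written stand-ins; real ids have the form <sha256>.<ext>.
aguvis_rows = [{"image_ids": ["aaa.png"]}, {"image_ids": ["bbb.png"]}]
other_rows = [{"image_ids": ["bbb.png"]}, {"image_ids": ["ccc.png"]}]

kept = drop_overlapping(other_rows, collect_image_ids(aguvis_rows))
print(kept)
```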
## License & citation
See the original datasets (xlangai/aguvis-stage1 and xlangai/aguvis-stage2) and the project page at https://aguvis-project.github.io/ for license terms and citation details.
Provider:
cua-lite



