
e1879/showui-web-processed

Hugging Face, updated 2026-03-24, indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/e1879/showui-web-processed
Resource description:
---
language: [en]
license: cc-by-4.0
task_categories: [image-classification]
tags: [ui-grounding, web, showui, processed]
source_datasets: [showlab/ShowUI-web]
size_categories: [10K<n<100K]
---

# ShowUI-Web Processed

Flattened, normalized, and scenario-split version of [showlab/ShowUI-web](https://huggingface.co/datasets/showlab/ShowUI-web). Each row is a single (instruction, UI element) pair with normalized bounding-box coordinates.

## Schema

| Column | Type | Description |
|---|---|---|
| `sample_id` | string | Unique row identifier (`{row}_{element}`) |
| `screenshot_id` | string | Groups elements from the same screenshot |
| `image_relpath` | string | Relative path to the screenshot image |
| `scenario` | string | Website/domain inferred from the image path |
| `instruction` | string | Natural-language grounding instruction |
| `bbox_xyxy` | list[float] | Normalized bounding box `[x1, y1, x2, y2]` in `[0, 1]` |
| `point_xy` | list[float] or null | Normalized click point `[x, y]` |
| `element_type` | string or null | UI element type label |

## Splits

| Split | Rows | Strategy |
|---|---|---|
| train | majority | Scenario-based holdout |
| validation | ~10% scenarios | Domain holdout |
| test | ~15% scenarios | Domain holdout |

## Repository Layout

The dataset repo contains both row-level parquet artifacts and image files:

- `flat.parquet`: full flattened table (all rows)
- `splits/train.parquet`: train split
- `splits/val.parquet`: validation split
- `splits/test.parquet`: test split
- `splits/splits.json`: split metadata
- `images/...`: screenshot and UI metadata files

## Images

Screenshot images are hosted in the `images/` directory of this repository. Use `image_relpath` to construct the path or fetch individual images on demand:

```python
from huggingface_hub import hf_hub_download
from PIL import Image

# `row` is a single dataset record (a dict with an `image_relpath` key)
path = hf_hub_download(
    repo_id="e1879/showui-web-processed",
    repo_type="dataset",
    filename=f"images/{row['image_relpath']}",
)
img = Image.open(path)
```

## Usage

Load the dataset via `datasets` and read from the train split:

```python
from datasets import load_dataset

ds = load_dataset("e1879/showui-web-processed")
print(ds["train"][0]["instruction"])
```

Or load parquet artifacts directly from the dataset repo:

```python
import pandas as pd
from huggingface_hub import hf_hub_download

train_path = hf_hub_download(
    repo_id="e1879/showui-web-processed",
    repo_type="dataset",
    filename="splits/train.parquet",
)
train_df = pd.read_parquet(train_path)
print(train_df.shape)
```

## Credit

Source dataset: [showlab/ShowUI-web](https://huggingface.co/datasets/showlab/ShowUI-web)
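
## Example: Converting Normalized Coordinates to Pixels

The schema stores `bbox_xyxy` and `point_xy` as values normalized to `[0, 1]`, so they usually need to be scaled back to pixels before cropping or drawing on a screenshot. The following is a minimal sketch, not part of the dataset's own tooling. It assumes x values are normalized by image width and y values by image height, which the card implies but does not state outright. The helper names `bbox_to_pixels` and `point_to_pixels` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def _clamp01(value: float) -> float:
    """Clamp a value into [0, 1] to guard against slightly out-of-range data."""
    return max(0.0, min(1.0, value))


def bbox_to_pixels(
    bbox_xyxy: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Scale a normalized [x1, y1, x2, y2] box to integer pixel coordinates.

    Assumption: x is normalized by image width and y by image height.
    """
    x1, y1, x2, y2 = (_clamp01(v) for v in bbox_xyxy)
    return (
        round(x1 * width),
        round(y1 * height),
        round(x2 * width),
        round(y2 * height),
    )


def point_to_pixels(
    point_xy: Optional[Sequence[float]], width: int, height: int
) -> Optional[Tuple[int, int]]:
    """Scale a normalized [x, y] click point; pass through None (the column is nullable)."""
    if point_xy is None:
        return None
    x, y = (_clamp01(v) for v in point_xy)
    return (round(x * width), round(y * height))
```

With a Pillow image, `img.size` is `(width, height)`, so a crop of the target element could be taken with `img.crop(bbox_to_pixels(row["bbox_xyxy"], *img.size))`.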
Provider:
e1879