Sheelu1246/MolmoWeb-HumanSkills
Source: Hugging Face | Updated: 2026-03-31 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/Sheelu1246/MolmoWeb-HumanSkills
Description:
---
dataset_info:
  features:
  - name: sample_id
    dtype: string
  - name: instruction
    dtype: string
  - name: trajectory
    dtype: string
  - name: images
    sequence: image
  - name: image_paths
    sequence: string
  splits:
  - name: train
    num_examples: 115637
  - name: preview
    num_examples: 10
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*.parquet
  - split: preview
    path: data/preview-00000.parquet
---
# MolmoWeb-HumanSkills
A dataset of human-collected web-navigation skills, where a skill is a trajectory for a very low-level task (e.g., `find_and_open`, `fill_form`). Each example pairs an instruction with a sequence of webpage screenshots and the corresponding agent actions (clicks, typing, scrolling, etc.).
## Dataset Usage
```python
from datasets import load_dataset
# load the default config; without `split`, this returns a DatasetDict with `train` and `preview`
ds = load_dataset("allenai/MolmoWeb-HumanSkills")
```
### Working with images and trajectories
Each row has an `images` field (list of raw image bytes) and a corresponding `image_paths` field (list of filenames). Use `image_paths` to match screenshots to trajectory steps:
```python
import json

# load_dataset without `split` returns a DatasetDict, so pick a split first
row = ds["train"][0]
traj = json.loads(row["trajectory"])

# build a lookup from filename -> image bytes
image_by_path = dict(zip(row["image_paths"], row["images"]))

for step_id in sorted(traj.keys(), key=int):
    screenshot_name = traj[step_id].get("screenshot")
    if not screenshot_name:
        continue
    img_bytes = image_by_path.get(screenshot_name)
    # img_bytes is the raw PNG/JPEG data for this step
```
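Depending on how the `image` feature is decoded at load time, an entry of `images` may arrive as raw bytes, as a `{"bytes", "path"}` struct, or as an already decoded PIL image. The following is a minimal sketch of a helper that normalizes all three to raw bytes. The function name `to_image_bytes` is hypothetical, and the PNG re-encoding of decoded images is an assumption made for illustration, not something this dataset card specifies.

```python
import io


def to_image_bytes(image):
    """Return raw encoded bytes for one entry of row["images"] (hypothetical helper)."""
    if isinstance(image, (bytes, bytearray)):
        # Already raw image data
        return bytes(image)
    if isinstance(image, dict) and image.get("bytes") is not None:
        # Undecoded struct with `bytes` and `path` keys
        return image["bytes"]
    if hasattr(image, "save"):
        # Assumed to be a decoded PIL image; re-encode it as PNG
        buffer = io.BytesIO()
        image.save(buffer, format="PNG")
        return buffer.getvalue()
    raise TypeError(f"unsupported image entry: {type(image).__name__}")


# Placeholder bytes for illustration only, not a real screenshot
fake_png = b"\x89PNG\r\n\x1a\n"
print(to_image_bytes(fake_png) == fake_png)
print(to_image_bytes({"bytes": fake_png, "path": "step_0.png"}) == fake_png)
```

An entry that carries only a `path` and no `bytes` raises `TypeError` in this sketch, since reading from disk is out of scope here.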
## Dataset Structure
### Features
| Field | Type | Description |
|---|---|---|
| `sample_id` | `string` | Unique hash identifying the trajectory |
| `instruction` | `string` | JSON-encoded task instruction (contains a `low_level` key or similar) |
| `trajectory` | `string` | JSON-encoded trajectory: a dict keyed by step index, each entry containing the agent's parsed action and screenshot filename |
| `images` | `list[image]` | List of screenshot structs; `bytes` is the raw image data, `path` is the filename used to match against trajectory steps |
| `image_paths` | `list[string]` | List of screenshot filenames, aligned with `images`, used to match against trajectory steps |
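Because `instruction` is stored as a JSON string, it needs decoding before use. The sketch below returns the `low_level` entry when present and falls back to the whole decoded object otherwise, since the table only says the key is `low_level` "or similar". The helper name `read_instruction` and the sample instruction text are made up for illustration.

```python
import json


def read_instruction(instruction_json, key="low_level"):
    """Decode the `instruction` field and return data[key] if it exists (hypothetical helper)."""
    data = json.loads(instruction_json)
    if isinstance(data, dict) and key in data:
        return data[key]
    # Unknown layout: hand back everything that was decoded
    return data


# Made-up instruction, shaped after the table above
example = json.dumps({"low_level": "Open the pricing page"})
print(read_instruction(example))
```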
### Trajectory step structure
Each step in `trajectory` (keyed by step index) contains:
| Field | Type | Description |
|---|---|---|
| `screenshot` | `string` | Filename matching an entry in the `image_paths` list |
| `action` | `dict` | The agent action: `action_str` (parseable action string), `action_description` (natural language), and `action_output` (structured dict with `thought`, `action_name`, and action parameters) |
| `other_obs` | `dict` | Browser state: current `url`, `page_index`, `open_pages_titles`, `open_pages_urls` |
| `action_timestamp` | `float` | Unix timestamp of the action |
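The step fields above can be flattened into a per-step summary, which is handy for quickly inspecting a trajectory. This is a minimal sketch that assumes the nesting described in the table (`action.action_output.action_name`, `other_obs.url`). The helper name `summarize_steps` and every value in the sample trajectory, including the action names `click` and `type`, are invented for illustration.

```python
import json


def summarize_steps(trajectory_json):
    """Return one small dict per step, ordered by numeric step index (hypothetical helper)."""
    traj = json.loads(trajectory_json)
    summary = []
    for step_id in sorted(traj, key=int):
        step = traj[step_id]
        action = step.get("action") or {}
        output = action.get("action_output") or {}
        obs = step.get("other_obs") or {}
        summary.append({
            "step": int(step_id),
            "action_name": output.get("action_name"),
            "description": action.get("action_description"),
            "url": obs.get("url"),
            "screenshot": step.get("screenshot"),
        })
    return summary


# Made-up two-step trajectory, shaped after the table above
example = json.dumps({
    "1": {
        "screenshot": "step_1.png",
        "action": {
            "action_description": "Type the query",
            "action_output": {"thought": "Fill the box", "action_name": "type"},
        },
        "other_obs": {"url": "https://example.com/search"},
    },
    "0": {
        "screenshot": "step_0.png",
        "action": {
            "action_description": "Click the search box",
            "action_output": {"thought": "Focus the box", "action_name": "click"},
        },
        "other_obs": {"url": "https://example.com/"},
    },
})

for item in summarize_steps(example):
    print(item["step"], item["action_name"], item["url"])
```

Sorting with `key=int` matters because the step keys are strings: a plain string sort would place `"10"` before `"2"`.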
## License
This dataset is licensed under ODC-BY 1.0. It is intended for research and educational use in accordance with [Ai2's Responsible Use Guidelines](https://allenai.org/responsible-use). Instruction data was generated using GPT models, which are subject to [OpenAI's Terms of Use](https://openai.com/policies/row-terms-of-use/).
Provider:
Sheelu1246



