Sheelu1246/MolmoWeb-HumanSkills

Name: Sheelu1246/MolmoWeb-HumanSkills
Creator: Sheelu1246
Published: 2026-03-31 15:50:38
License: No description provided

Source: Hugging Face. Updated 2026-03-31; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/Sheelu1246/MolmoWeb-HumanSkills

Resource description:

---
dataset_info:
  features:
  - name: sample_id
    dtype: string
  - name: instruction
    dtype: string
  - name: trajectory
    dtype: string
  - name: images
    sequence: image
  - name: image_paths
    sequence: string
  splits:
  - name: train
    num_examples: 115637
  - name: preview
    num_examples: 10
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*.parquet
  - split: preview
    path: data/preview-00000.parquet
---

# MolmoWeb-HumanSkills

A dataset of human-collected web-navigation skills, where a skill is a trajectory for a very low-level task (e.g., `find_and_open`, `fill_form`). Each example pairs an instruction with a sequence of webpage screenshots and the corresponding agent actions (clicks, typing, scrolling, etc.).

## Dataset Usage

```python
from datasets import load_dataset

# load every split (train and preview) as a DatasetDict
ds = load_dataset("allenai/MolmoWeb-HumanSkills")
```

### Working with images and trajectories

Each row has an `images` field (list of raw image bytes) and a corresponding `image_paths` field (list of filenames).
Use `image_paths` to match screenshots to trajectory steps:

```python
import json

# ds is the DatasetDict returned by load_dataset; index a split before a row
row = ds["train"][0]
traj = json.loads(row["trajectory"])

# build a lookup from filename -> image bytes
image_by_path = dict(zip(row["image_paths"], row["images"]))

for step_id in sorted(traj.keys(), key=int):
    screenshot_name = traj[step_id].get("screenshot")
    if not screenshot_name:
        continue
    img_bytes = image_by_path.get(screenshot_name)
    # img_bytes is the raw PNG/JPEG data for this step
```

## Dataset Structure

### Features

| Field | Type | Description |
|---|---|---|
| `sample_id` | `string` | Unique hash identifying the trajectory |
| `instruction` | `string` | JSON-encoded task instruction (contains a `low_level` key or similar) |
| `trajectory` | `string` | JSON-encoded trajectory: a dict keyed by step index, each entry containing the agent's parsed action and screenshot filename |
| `images` | `list[bytes]` | List of screenshot structs; `bytes` is the raw image data, `path` is the filename used to match against trajectory steps |
| `image_paths` | `list[string]` | List of screenshot filenames, in the same order as `images`, used to match against trajectory steps |

### Trajectory step structure

Each step in `trajectory` (keyed by step index) contains:

| Field | Type | Description |
|---|---|---|
| `screenshot` | `string` | Filename matching an entry in the `image_paths` list |
| `action` | `dict` | The agent action: `action_str` (parseable action string), `action_description` (natural language), and `action_output` (structured dict with `thought`, `action_name`, and action parameters) |
| `other_obs` | `dict` | Browser state: current `url`, `page_index`, `open_pages_titles`, `open_pages_urls` |
| `action_timestamp` | `float` | Unix timestamp of the action |

## License

This dataset is licensed under ODC-BY 1.0. It is intended for research and educational use in accordance with [Ai2's Responsible Use Guidelines](https://allenai.org/responsible-use).
Instruction data was generated using GPT models, which are subject to [OpenAI's Terms of Use](https://openai.com/policies/row-terms-of-use/).
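
## Example: reading step fields

The following is a minimal sketch of how the documented step fields (`screenshot`, `action`, `other_obs`) might be read together with `image_paths`. It runs on a made-up row rather than the real dataset: the action names, URLs, filenames, and placeholder bytes are assumptions for illustration, and `summarize_steps` is a hypothetical helper, not part of any library.

```python
import json

# Hypothetical row that mimics the documented schema; real rows come from load_dataset.
row = {
    "instruction": json.dumps({"low_level": "Open the pricing page"}),
    "trajectory": json.dumps({
        "0": {
            "screenshot": "step_0.png",
            "action": {
                "action_str": "click(120, 340)",
                "action_description": "Click the Pricing link",
                "action_output": {"thought": "Pricing is in the nav bar",
                                  "action_name": "click"},
            },
            "other_obs": {"url": "https://example.com/", "page_index": 0},
            "action_timestamp": 1700000000.0,
        },
        "1": {
            "screenshot": "step_1.png",
            "action": {"action_output": {"action_name": "scroll"}},
            "other_obs": {"url": "https://example.com/pricing", "page_index": 0},
            "action_timestamp": 1700000002.5,
        },
        "2": {
            "screenshot": None,  # a step without a screenshot
            "action": {"action_output": {"action_name": "send_msg_to_user"}},
            "other_obs": {"url": "https://example.com/pricing", "page_index": 0},
            "action_timestamp": 1700000004.0,
        },
    }),
    "images": [b"<image bytes 0>", b"<image bytes 1>"],  # placeholders, not real images
    "image_paths": ["step_0.png", "step_1.png"],
}


def summarize_steps(row):
    """Return (step_index, action_name, url, has_image) for each step, in numeric order."""
    traj = json.loads(row["trajectory"])
    image_by_path = dict(zip(row["image_paths"], row["images"]))
    summary = []
    # Step keys are strings, so sort them numerically rather than lexically.
    for step_id in sorted(traj, key=int):
        step = traj[step_id]
        action_output = (step.get("action") or {}).get("action_output") or {}
        other_obs = step.get("other_obs") or {}
        summary.append((
            int(step_id),
            action_output.get("action_name"),
            other_obs.get("url"),
            step.get("screenshot") in image_by_path,
        ))
    return summary


instruction = json.loads(row["instruction"])
steps = summarize_steps(row)
for step in steps:
    print(step)
```

Missing keys are read with `.get` because the card describes the instruction as containing "a `low_level` key or similar", which suggests the exact fields may vary between rows.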

Provided by:

Sheelu1246
