luca0621/amex-gelab-448
Source: Hugging Face. Updated 2026-04-02. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/luca0621/amex-gelab-448
Description:
---
pretty_name: AMEX SFT
task_categories:
- image-text-to-text
- text-generation
tags:
- gui
- mobile
- navigation
- multimodal
- ge-lab
size_categories:
- 10K<n<100K
---
# AMEX SFT
This dataset is a packaged export of the local `amex_sft` directory for uploading to the Hugging Face Hub as a dataset repository.
## Source
- Source dataset roots:
  - `/home1/irteam/data-vol1/amex_sft_hf_448` (3046 trajectories)
- Number of trajectory folders: `3046`
- Number of tar shards: `61`
- Trajectories per shard: `50`
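
The counts above can be cross-checked with a little arithmetic. The sketch below assumes the trajectory folders were packed sequentially, 50 per shard, with the remainder in a final partial shard; the card does not state how the folders were distributed, so treat the per-shard sizes as illustrative.

```python
NUM_TRAJECTORIES = 3046  # from the card
PER_SHARD = 50           # from the card


def shard_sizes(total: int, per_shard: int) -> list:
    """Shard sizes under sequential packing (an assumption, not stated in the card)."""
    full, rest = divmod(total, per_shard)
    return [per_shard] * full + ([rest] if rest else [])


sizes = shard_sizes(NUM_TRAJECTORIES, PER_SHARD)
print(len(sizes), sizes[-1])  # number of shards, size of the last shard
```

Under this assumption the result is 61 shards, matching the listed shard count, with 46 trajectories in the last one.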
## Layout
- `shards/*.tar`: tar shards containing trajectory folders
- `manifest.jsonl`: trajectory-to-shard index
- `dataset_info.json`: high-level metadata
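
The field names inside `manifest.jsonl` are not documented here, so a reasonable first step is to inspect them. The following is a minimal sketch using only the standard library; the helper names `read_jsonl` and `field_names` are hypothetical, and it assumes each non-empty line is a JSON object.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def field_names(records, limit=100):
    """Return the sorted set of keys seen in the first `limit` records.

    Assumes every record is a JSON object (dict).
    """
    keys = set()
    for i, rec in enumerate(records):
        if i >= limit:
            break
        keys.update(rec.keys())
    return sorted(keys)
```

Calling `field_names(read_jsonl("manifest.jsonl"))` on a downloaded copy lists the available keys, which should reveal how trajectories are mapped to shards.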
Each tar shard preserves the original trajectory folder layout. For example:
```text
<source_root>/<trajectory_id>/
  ui_structure.json
  ui_structure_layer.json
  trajectory_assets_manifest.json
  action_coord/...
  extracted_assets/...
```
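
To see which trajectories a shard holds without unpacking it, the archive members can be grouped by their leading folder. This sketch assumes that the first path component of each member is the trajectory id. Whether the archive paths also carry a `<source_root>` prefix is not stated, so adjust the index if they do. The function name `group_by_trajectory` is hypothetical.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_trajectory(tar_path):
    """Map each top-level folder name to the sorted file paths inside it."""
    groups = defaultdict(list)
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directory entries, links, etc.
            parts = PurePosixPath(member.name).parts
            if len(parts) < 2:
                continue  # file at the archive root, not inside a folder
            # Assumption: parts[0] is the trajectory id.
            groups[parts[0]].append("/".join(parts[1:]))
    return {name: sorted(paths) for name, paths in groups.items()}
```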
## Why tar shards?
The raw source contains a very large number of small PNG files. Packaging them into tar shards reduces the number of files in the repository, which makes Hub uploads and downstream downloads much more reliable.
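
A side benefit of tar packaging is that the images can be read directly from a shard, without recreating the many small files on disk. The sketch below streams PNG bytes with the standard `tarfile` module; the function name `iter_png_bytes` is hypothetical, and it assumes images can be identified by a `.png` suffix.

```python
import tarfile


def iter_png_bytes(tar_path):
    """Yield (member_name, raw_bytes) for each .png file in a tar shard."""
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if member.isfile() and member.name.lower().endswith(".png"):
                fobj = tar.extractfile(member)
                if fobj is not None:
                    yield member.name, fobj.read()
```

The raw bytes can then be passed to any image decoder, so a full extraction is only needed when the on-disk folder layout itself is required.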
## Notes
- This repository is intended for storage and reuse of the packaged dataset.
- If you want Hub-native previews and lighter access patterns, consider a future Parquet/WebDataset conversion.
Provider:
luca0621



