
macpaw-research/Screen2AX-Task

Hugging Face | Updated 2025-11-19 | Indexed 2026-01-03
Download link:
https://hf-mirror.com/datasets/macpaw-research/Screen2AX-Task
Resource overview:
---
license: apache-2.0
dataset_info:
  features:
  - name: image
    dtype: image
  - name: x1
    dtype: float32
  - name: y1
    dtype: float32
  - name: x2
    dtype: float32
  - name: y2
    dtype: float32
  - name: image_width
    dtype: int32
  - name: image_height
    dtype: int32
  - name: command
    dtype: string
  - name: visual_description
    dtype: string
  splits:
  - name: train
    num_bytes: 1463276646.793
    num_examples: 5933
  download_size: 712698894
  dataset_size: 1463276646.793
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
language:
- en
pretty_name: Screen2AX-Task
size_categories:
- 1K<n<10K
---

# 📦 Screen2AX-Task

Screen2AX-Task is part of the **Screen2AX** dataset suite, a research-driven collection for advancing accessibility in macOS applications using computer vision and deep learning.

This dataset focuses on **UI task grounding**, pairing macOS application screenshots with task descriptions and their corresponding visual references. It is designed for training and evaluating models that connect natural language commands to on-screen UI regions.

---

## 🧠 Dataset Summary

Each sample in the dataset consists of:

- An application **screenshot** (`image`)
- A **bounding box** for the target UI region:
  - `x1`, `y1`, `x2`, `y2`: absolute coordinates
  - `image_width`, `image_height`: dimensions of the original image
- A **task description** (`command`): natural language command for a specific UI action
- A **visual description** (`visual_description`): caption of the UI target

This dataset supports tasks such as **language grounding**, **UI element linking**, and **vision-language model training** for accessibility applications.

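The bounding-box fields listed above can be turned into resolution-independent targets, which is a common preprocessing step for grounding models. The sketch below is illustrative only: it assumes `x1`, `y1` is the top-left corner and `x2`, `y2` the bottom-right corner in the same units as `image_width` and `image_height` (the card only says "absolute coordinates"), and the helper names `normalize_box` and `box_center` as well as the example record are hypothetical, not part of the dataset or its tooling.

```python
from typing import Dict, Tuple


def normalize_box(sample: Dict) -> Tuple[float, float, float, float]:
    """Scale the absolute box to the [0, 1] range using the image dimensions.

    Assumes (x1, y1) is top-left and (x2, y2) is bottom-right.
    """
    width = sample["image_width"]
    height = sample["image_height"]
    return (
        sample["x1"] / width,
        sample["y1"] / height,
        sample["x2"] / width,
        sample["y2"] / height,
    )


def box_center(sample: Dict) -> Tuple[float, float]:
    """Return the center of the target region in absolute coordinates."""
    return (
        (sample["x1"] + sample["x2"]) / 2,
        (sample["y1"] + sample["y2"]) / 2,
    )


# Hypothetical record with made-up values, shaped like one dataset row
# (the `image` field is omitted because it is not needed here).
example = {
    "x1": 100.0,
    "y1": 50.0,
    "x2": 300.0,
    "y2": 150.0,
    "image_width": 1000,
    "image_height": 500,
    "command": "Open the settings panel",            # made-up command
    "visual_description": "Gear icon in a toolbar",  # made-up caption
}

normalized = normalize_box(example)
center = box_center(example)
print(normalized)
print(center)
```

The same helpers should work on a row returned by `load_dataset`, since each row is a plain dictionary with these keys, provided the coordinate assumption above holds for the real data.
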
**Split:**
- `train`

**Language:**
- English (`en`)

**Task Category:**
- Vision-language / UI task grounding

---

## 📚 Usage

### Load with `datasets` library

```python
from datasets import load_dataset

dataset = load_dataset("macpaw-research/Screen2AX-Task")
```

### Example structure

```python
sample = dataset["train"][0]
print(sample.keys())
# dict_keys(['image', 'x1', 'y1', 'x2', 'y2', 'image_width', 'image_height', 'command', 'visual_description'])
```

---

## 📜 License

This dataset is licensed under the **Apache 2.0 License**.

---

## 🔗 Related Projects

- [Screen2AX Main Project Page](https://github.com/MacPaw/Screen2AX)
- [Screen2AX HuggingFace Collection](https://huggingface.co/collections/macpaw-research/screen2ax)

---

## ✍️ Citation

If you use this dataset, please cite the Screen2AX paper:

```bibtex
@misc{muryn2025screen2axvisionbasedapproachautomatic,
  title={Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation},
  author={Viktor Muryn and Marta Sumyk and Mariya Hirna and Sofiya Garkot and Maksym Shamrai},
  year={2025},
  eprint={2507.16704},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2507.16704},
}
```

---

## 🌐 MacPaw Research

Learn more at [https://research.macpaw.com](https://research.macpaw.com)
Provider:
macpaw-research