RICO-Screen2Words

ModelScope community | Updated 2025-12-05 | Indexed 2025-11-08
Download link:
https://modelscope.cn/datasets/rootsautomation/RICO-Screen2Words
Resource description:
# Dataset Card for Screen2Words

Screen2Words is a dataset providing screen summaries (i.e., image captions for mobile screens). It uses the RICO image database.

## Dataset Details

### Dataset Description

- **Curated by:** Google Research, UIUC, Northwestern, University of Toronto
- **Funded by:** Google Research
- **Shared by:** Google Research
- **Language(s) (NLP):** English
- **License:** CC-BY-4.0

### Dataset Sources

- **Repository:**
  - [google-research-datasets/screen2words](https://github.com/google-research-datasets/screen2words)
  - [RICO raw downloads](http://www.interactionmining.org/rico.html)
- **Paper:**
  - [Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning](https://arxiv.org/abs/2108.03353)
  - [Rico: A Mobile App Dataset for Building Data-Driven Design Applications](https://dl.acm.org/doi/10.1145/3126594.3126651)

## Uses

This dataset is for developing multimodal automations for mobile screens.

### Direct Use

- Automatic screen summarization & description
- Language-based UI retrieval (given a UI, retrieve similar interfaces)
- Enhancing screen readers
- Screen indexing
- Conversational mobile applications

## Dataset Structure

- `screenId`: Unique RICO screen ID
- `image`: RICO screenshot
- `image_icon`: Google Play Store icon for the app
- `image_semantic`: Semantic RICO screenshot; details are abstracted away to the main visual UI elements
- `file_name`: Local filename of the screenshot image
- `file_name_icon`: Local filename of the icon image
- `file_name_semantic`: Local filename of the semantically annotated screenshot image
- `captions`: A list of string captions
- `app_package_name`: Android package name
- `play_store_name`: Google Play Store name
- `category`: Category of the app
- `number_of_downloads`: Number of downloads of the app (as a coarse range string)
- `number_of_ratings`: Number of ratings of the app on the Google Play Store (as of collection)
- `average_rating`: Average rating of the app on the Google Play Store (as of collection)
- `semantic_annotations`: Reduced view hierarchy, limited to the semantically relevant portions of the full view hierarchy. It corresponds to what is visualized in `image_semantic` and contains detailed information about what is on screen. It is stored as a JSON object string.
- `view_hierarchy`: Full view hierarchy

## Dataset Creation

### Curation Rationale

- RICO rationale: Create a broad dataset that can be used for UI automation. An explicit goal was to develop automation software that can validate an app's design and assess whether it achieves its stated goal.
- Screen2Words rationale: Create a dataset that facilitates the distillation of screenshots into concise summaries.

### Source Data

- RICO: Mobile app screenshots, collected on Android devices.
- Screen2Words: Human-annotated screen summaries from paid contractors.

#### Data Collection and Processing

- RICO: Human and automated collection of Android screens from ~9.8k free apps on the Google Play Store.
- Screen2Words: Takes the subset of screens used in RICO-SCA, which eliminates screens with missing or inaccurate view hierarchies.

#### Who are the source data producers?

- RICO: 13 human workers (10 from the US, 3 from the Philippines) through UpWork.
- Screen2Words: 85 professional annotators.

## Citation

### RICO

**BibTeX:**

```misc
@inproceedings{deka2017rico,
  title={Rico: A mobile app dataset for building data-driven design applications},
  author={Deka, Biplab and Huang, Zifeng and Franzen, Chad and Hibschman, Joshua and Afergan, Daniel and Li, Yang and Nichols, Jeffrey and Kumar, Ranjitha},
  booktitle={Proceedings of the 30th annual ACM symposium on user interface software and technology},
  pages={845--854},
  year={2017}
}
```

**APA:**

Deka, B., Huang, Z., Franzen, C., Hibschman, J., Afergan, D., Li, Y., ... & Kumar, R. (2017, October). Rico: A mobile app dataset for building data-driven design applications. In Proceedings of the 30th annual ACM symposium on user interface software and technology (pp. 845-854).

### Screen2Words

**BibTeX:**

```misc
@inproceedings{wang2021screen2words,
  title={Screen2words: Automatic mobile UI summarization with multimodal learning},
  author={Wang, Bryan and Li, Gang and Zhou, Xin and Chen, Zhourong and Grossman, Tovi and Li, Yang},
  booktitle={The 34th Annual ACM Symposium on User Interface Software and Technology},
  pages={498--510},
  year={2021}
}
```

**APA:**

Wang, B., Li, G., Zhou, X., Chen, Z., Grossman, T., & Li, Y. (2021, October). Screen2words: Automatic mobile UI summarization with multimodal learning. In The 34th Annual ACM Symposium on User Interface Software and Technology (pp. 498-510).

## Dataset Card Authors

Hunter Heidenreich, Roots Automation

## Dataset Card Contact

hunter "DOT" heidenreich "AT" rootsautomation "DOT" com
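
## Example: Reading a Record

The Dataset Structure section says `semantic_annotations` is stored as a JSON object string, so it has to be decoded before the hierarchy can be traversed. The sketch below shows one way to do that for a single record. The nested keys `children` and `componentLabel`, and every value in `sample_record`, are assumptions made for illustration; they are not confirmed by this card, so check a real record before relying on them.

```python
import json
from collections import Counter


def iter_nodes(node):
    """Yield a hierarchy node and all of its descendants, depth first."""
    # Skip anything that is not a JSON object (e.g. null placeholders).
    if not isinstance(node, dict):
        return
    yield node
    # "children" is an assumed key name for the nested elements.
    for child in node.get("children") or []:
        yield from iter_nodes(child)


def component_label_counts(record):
    """Count UI component labels in one record's semantic annotations."""
    # The field is a JSON string, not a dict, so decode it first.
    root = json.loads(record["semantic_annotations"])
    # "componentLabel" is an assumed key name for the element type.
    return Counter(
        node["componentLabel"]
        for node in iter_nodes(root)
        if "componentLabel" in node
    )


# Hypothetical record: field names follow the card, values are made up.
sample_record = {
    "screenId": 0,
    "captions": [
        "sign in page of a shopping app",
        "login screen with email and password fields",
    ],
    "semantic_annotations": json.dumps(
        {
            "class": "android.widget.FrameLayout",
            "children": [
                {"componentLabel": "Input"},
                {"componentLabel": "Input"},
                {"componentLabel": "Text Button"},
                None,
            ],
        }
    ),
}

counts = component_label_counts(sample_record)
print(sample_record["captions"][0], dict(counts))
```

Decoding once per record and walking the tree with a generator keeps memory use small, which may matter because `view_hierarchy` holds the full, larger hierarchy; whether the same traversal applies to that field is not stated in the card.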

Provider:
maas
Created:
2025-10-14
Collected summary
Dataset introduction
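Background and challenges
Background overview
RICO-Screen2Words is a mobile screen summarization dataset built on the RICO image database. It is intended for developing multimodal automation applications such as automatic screen summarization and UI retrieval. The dataset includes rich structural information and detailed annotations, and was curated jointly by Google Research and several other institutions.
The summary above was collected and generated by 遇见数据集.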