
RICO-WidgetCaptioning

ModelScope community · Updated 2025-12-05 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/rootsautomation/RICO-WidgetCaptioning
Description:
# Dataset Card for RICO Widget Captioning

Widget Captioning is a dataset for providing captions for UI elements on mobile screens. It uses the RICO image database.

## Dataset Details

### Dataset Description

- **Curated by:** Google Research, UIUC, Northwestern, Georgia Tech
- **Funded by:** Google Research
- **Shared by:** Google Research
- **Language(s) (NLP):** English
- **License:** CC-BY-4.0

### Dataset Sources

- **Repository:**
  - [google-research-datasets/widget-caption](https://github.com/google-research-datasets/widget-caption)
  - [RICO raw downloads](http://www.interactionmining.org/rico.html)
- **Paper:**
  - [Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements](https://arxiv.org/abs/2010.04295)
  - [Rico: A Mobile App Dataset for Building Data-Driven Design Applications](https://dl.acm.org/doi/10.1145/3126594.3126651)

## Uses

This dataset is for developing multimodal automations for mobile screens.

### Direct Use

- Enhancing screen readers
- Screen indexing
- Conversational mobile applications
- Q&A on screens

## Dataset Structure

- `screenId`: Unique RICO screen ID
- `image`: RICO screenshot
- `image_icon`: Google Play Store icon for the app
- `image_semantic`: Semantic RICO screenshot; details are abstracted away, leaving the main visual UI elements
- `file_name`: Local filename of the screenshot image
- `file_name_icon`: Local filename of the icon image
- `file_name_semantic`: Local filename of the semantically annotated screenshot
- `captions`: A list of string captions
- `bbox`: The bounding box of the widget being captioned, scaled relative to the image size so that coordinates are in [0, 1]
- `app_package_name`: Android package name
- `play_store_name`: Google Play Store name
- `category`: Category of the app
- `number_of_downloads`: Number of downloads of the app (as a coarse range string)
- `number_of_ratings`: Number of ratings of the app on the Google Play Store (as of collection)
- `average_rating`: Average rating of the app on the Google Play Store (as of collection)
- `semantic_annotations`: View hierarchy reduced to the semantically relevant portions of the full view hierarchy. It corresponds to what is visualized in `image_semantic` and has a lot of details about what's on screen. It is stored as a JSON object string.

## Dataset Creation

### Curation Rationale

- RICO rationale: Create a broad dataset that can be used for UI automation. An explicit goal was to develop automation software that can validate an app's design and assess whether it achieves its stated goal.
- Widget Captioning rationale: Create a dataset that helps machines reason about UI elements on screens.

### Source Data

- RICO: Mobile app screenshots, collected on Android devices.
- Widget Captioning: Human-annotated concise captions for widgets on screen.

#### Data Collection and Processing

- RICO: Human and automated collection of Android screens. ~9.8k free apps from the Google Play Store.
- Widget Captioning: Takes the subset of screens used in RICO and eliminates screens with missing or inaccurate view hierarchies.

#### Who are the source data producers?

- RICO: 13 human workers (10 from the US, 3 from the Philippines) through UpWork.
- Widget Captioning: 5.4k annotators through Amazon Mechanical Turk.

## Citation

### RICO

**BibTeX:**

```bibtex
@inproceedings{deka2017rico,
  title={Rico: A mobile app dataset for building data-driven design applications},
  author={Deka, Biplab and Huang, Zifeng and Franzen, Chad and Hibschman, Joshua and Afergan, Daniel and Li, Yang and Nichols, Jeffrey and Kumar, Ranjitha},
  booktitle={Proceedings of the 30th annual ACM symposium on user interface software and technology},
  pages={845--854},
  year={2017}
}
```

**APA:**

Deka, B., Huang, Z., Franzen, C., Hibschman, J., Afergan, D., Li, Y., ... & Kumar, R. (2017, October). Rico: A mobile app dataset for building data-driven design applications. In Proceedings of the 30th annual ACM symposium on user interface software and technology (pp. 845-854).

### Widget Captioning

**BibTeX:**

```bibtex
@inproceedings{li2020widget,
  title={Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements},
  author={Li, Yang and Li, Gang and He, Luheng and Zheng, Jingjie and Li, Hong and Guan, Zhiwei},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={5495--5510},
  year={2020}
}
```

**APA:**

Li, Y., Li, G., He, L., Zheng, J., Li, H., & Guan, Z. (2020, November). Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 5495-5510).

## Dataset Card Authors

Hunter Heidenreich, Roots Automation

## Dataset Card Contact

hunter "DOT" heidenreich "AT" rootsautomation "DOT" com
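
## Example: Reading `bbox` and `semantic_annotations`

The field list under Dataset Structure says that `bbox` is normalized to [0, 1] and that `semantic_annotations` is a JSON object string. The sketch below shows one way a record might be turned into pixel coordinates and a parsed hierarchy. It is illustrative only: the card does not state the coordinate order of `bbox`, so the (left, top, right, bottom) order is an assumption, and the sample record, its values, the JSON keys, and the helper names `bbox_to_pixels` and `parse_semantic_annotations` are all hypothetical.

```python
import json
from typing import Any, Dict, Sequence, Tuple


def bbox_to_pixels(
    bbox: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Scale a normalized bounding box to integer pixel coordinates.

    ASSUMPTION: bbox is ordered (left, top, right, bottom). The dataset card
    only states that the coordinates lie in [0, 1]; verify the order against
    real records before relying on this.
    """
    left, top, right, bottom = bbox
    for value in (left, top, right, bottom):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"expected a coordinate in [0, 1], got {value}")
    return (
        round(left * width),
        round(top * height),
        round(right * width),
        round(bottom * height),
    )


def parse_semantic_annotations(raw: str) -> Dict[str, Any]:
    """Decode the JSON object string stored in `semantic_annotations`."""
    return json.loads(raw)


# Hypothetical record shaped like the documented fields; values are made up.
record = {
    "captions": ["go back", "back button"],
    "bbox": [0.0, 0.125, 0.25, 0.25],
    "semantic_annotations": '{"class": "android.widget.FrameLayout", "children": []}',
}

# The screenshot size here is an illustrative assumption, not a dataset fact.
pixels = bbox_to_pixels(record["bbox"], width=1440, height=2560)
tree = parse_semantic_annotations(record["semantic_annotations"])

print(pixels)               # (0, 320, 360, 640)
print(sorted(tree.keys()))  # ['children', 'class']
```

Keeping the box normalized until the last step means the same record can be drawn on a resized screenshot by passing that image's width and height.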
Provider: maas
Created: 2025-10-14