
WebGuard

ModelScope community, updated 2025-12-05, indexed 2025-08-02
Download link:
https://modelscope.cn/datasets/osunlp/WebGuard
Description:
# WebGuard Annotation Dataset

This dataset contains web safety annotations for browser interactions. Each entry represents an annotated action on a website with a risk level.

## Dataset Summary

This dataset contains 5,999 web safety annotations for browser interactions.

## Data Fields

- `url`: The URL where the action was performed
- `description`: Description of the action (may be null)
- `tagHead`: HTML tag type of the target element
- `Screenshot`: Google Drive link to screenshot view
- `Annotation`: Review classification (SAFE/UNSAFE/LOW/HIGH)
- `website`: Website name/category

## Usage

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("osunlp/WebGuard")

# Access the data
for example in dataset["train"]:
    print(f"URL: {example['url']}")
    print(f"Description: {example['description']}")
    print(f"Tag: {example['tagHead']}")
    print(f"Screenshot: {example['Screenshot']}")
    print(f"Annotation: {example['Annotation']}")
    print(f"Website: {example['website']}")
    print("---")
```

## Citation

```bibtex
@article{zheng2025webguard,
  title={WebGuard: Building a Generalizable Guardrail for Web Agents},
  author={Zheng, Boyuan and Liao, Zeyi and Salisbury, Scott and Liu, Zeyuan and Lin, Michael and Zheng, Qinyuan and Wang, Zifan and Deng, Xiang and Song, Dawn and Sun, Huan and others},
  journal={arXiv preprint arXiv:2507.14293},
  year={2025}
}

@inproceedings{zheng-etal-2024-webolympus,
  title = "{W}eb{O}lympus: An Open Platform for Web Agents on Live Websites",
  author = "Zheng, Boyuan and Gou, Boyu and Salisbury, Scott and Du, Zheng and Sun, Huan and Su, Yu",
  editor = "Hernandez Farias, Delia Irazu and Hope, Tom and Li, Manling",
  booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
  month = nov,
  year = "2024",
  address = "Miami, Florida, USA",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2024.emnlp-demo.20",
  pages = "187--197",
}
```

## License

Creative Commons Attribution-NonCommercial 4.0 International
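
## Example: Summarizing Annotations

The fields listed above lend themselves to simple per-label summaries. The following is a minimal sketch, not part of the official dataset card: the helper names (`count_annotations`, `filter_by_annotation`, `describe`) and the sample rows are hypothetical, invented only to show the row shape. It assumes each row behaves like a dictionary keyed by the documented field names and that `description` may be `None`.

```python
from collections import Counter

# Label names as listed in the dataset card; the card does not define
# their exact semantics, so this sketch treats them as opaque strings.
KNOWN_LABELS = {"SAFE", "UNSAFE", "LOW", "HIGH"}


def count_annotations(rows):
    """Count how many rows carry each Annotation value."""
    return Counter(row["Annotation"] for row in rows)


def filter_by_annotation(rows, labels):
    """Return the rows whose Annotation value is in `labels`."""
    wanted = set(labels)
    return [row for row in rows if row["Annotation"] in wanted]


def describe(row):
    """Build a one-line summary, tolerating a null description."""
    description = row.get("description") or "(no description)"
    return f"[{row['Annotation']}] <{row['tagHead']}> {description}"


# Hypothetical rows for illustration only; these are NOT real dataset entries.
sample_rows = [
    {"url": "https://example.com/cart", "description": "Click the checkout button",
     "tagHead": "button", "Screenshot": "https://example.com/shot-1",
     "Annotation": "HIGH", "website": "example"},
    {"url": "https://example.com/home", "description": None,
     "tagHead": "a", "Screenshot": "https://example.com/shot-2",
     "Annotation": "SAFE", "website": "example"},
    {"url": "https://example.com/profile", "description": "Open the settings menu",
     "tagHead": "div", "Screenshot": "https://example.com/shot-3",
     "Annotation": "LOW", "website": "example"},
]

counts = count_annotations(sample_rows)
high_rows = filter_by_annotation(sample_rows, {"HIGH"})

for row in sample_rows:
    print(describe(row))
print(dict(counts))
```

With the real data, the same helpers should accept `dataset["train"]` from the Usage section directly, since iterating a `datasets` split yields one dictionary per row.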

Provider:
maas
Created:
2025-07-25