
Medli1108/Open_CaptchaWorld

Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Medli1108/Open_CaptchaWorld
Resource overview:
---
language:
- en
license: apache-2.0
size_categories:
- n<1K
task_categories:
- visual-document-retrieval
tags:
- Multimodal_Agents
- Open_Source_CAPTCHAs
---

# Open CaptchaWorld Dataset

This dataset accompanies the paper [Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents](https://huggingface.co/papers/2505.24878). It contains 20 distinct CAPTCHA types, each testing different visual reasoning capabilities. The dataset is designed for evaluating the visual reasoning and interaction capabilities of Multimodal Large Language Model (MLLM)-powered agents.

[Project Page](https://huggingface.co/spaces/OpenCaptchaWorld/platform) | [Github](https://github.com/Yaxin9Luo/Open_CaptchaWorld)

The dataset includes:

* **20 CAPTCHA Types**: A diverse set of visual puzzles testing various capabilities. See the Github repository for a full list.
* **Web Interface**: A clean, intuitive interface for human or AI interaction.
* **API Endpoints**: Programmatic access to puzzles and verification.

This dataset is useful for benchmarking and improving multimodal AI agents' performance on CAPTCHA-like challenges, a crucial step in deploying web agents for real-world tasks. The data is structured for easy integration into research and development pipelines.
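As a rough illustration of the benchmarking use case, the sketch below tallies a per-type pass rate from a log of agent attempts. The record format (a CAPTCHA type name paired with a solved flag), the function name `pass_rate_by_type`, and the placeholder type names are all assumptions made for this example; they are not taken from the dataset or its API.

```python
from collections import defaultdict


def pass_rate_by_type(results):
    """Return {captcha_type: fraction solved} for (captcha_type, solved) pairs."""
    totals = defaultdict(int)
    solved_counts = defaultdict(int)
    for captcha_type, solved in results:
        totals[captcha_type] += 1
        if solved:
            solved_counts[captcha_type] += 1
    return {t: solved_counts[t] / totals[t] for t in totals}


# Hypothetical attempt log; the type names are placeholders,
# not the dataset's real CAPTCHA type labels.
example_results = [
    ("type_a", True),
    ("type_a", False),
    ("type_b", True),
]
rates = pass_rate_by_type(example_results)
```

Reporting the rate per type rather than one overall number keeps weak spots visible, since the CAPTCHA types test different visual reasoning capabilities.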
Provider:
Medli1108