stanford-oval/churro
Hugging Face | Updated 2025-09-09 | Indexed 2025-09-13
Download link:
https://hf-mirror.com/datasets/stanford-oval/churro
Description:
CHURRO is a multimodal dataset of images paired with text, covering multiple languages, including Chinese and English. Each record carries the image width and height, the file name, a text transcription, the main language and the full list of languages, the main script and the full list of scripts, the document type, and a unique dataset identifier. The data is split into training, validation, and test sets, falls in the 10K<n<100K size category, and is suited to image-to-text tasks.
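As a rough illustration of the features listed above, a single record could be modeled as below. This is a minimal sketch, not the dataset's actual schema: the field names (`file_name`, `main_language`, `dataset_id`, and so on) and the sample values are hypothetical, chosen only to mirror the description. The real column names should be checked on the dataset page.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChurroRecord:
    """One example; field names are hypothetical, mirroring the listed features."""
    file_name: str
    width: int                      # image width in pixels
    height: int                     # image height in pixels
    transcription: str              # text transcription of the image
    main_language: str
    languages: List[str] = field(default_factory=list)
    main_script: str = ""
    scripts: List[str] = field(default_factory=list)
    document_type: str = ""
    dataset_id: str = ""            # unique dataset identifier


def count_by_main_language(records):
    """Count how many records share each main language."""
    return Counter(r.main_language for r in records)


# Made-up sample records, used only to exercise the helper.
samples = [
    ChurroRecord("a.png", 800, 1200, "text a", "English", ["English"], "Latin", ["Latin"]),
    ChurroRecord("b.png", 640, 900, "text b", "Chinese", ["Chinese"], "Han", ["Han"]),
    ChurroRecord("c.png", 700, 1000, "text c", "English", ["English", "Latin"], "Latin", ["Latin"]),
]

counts = count_by_main_language(samples)
```

With the Hugging Face `datasets` library installed, the real data would typically be fetched with `datasets.load_dataset("stanford-oval/churro")`, after which the actual column names can be read from the returned splits.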
Provider:
stanford-oval



