LCO-Embedding/SeaDoc
Source: Hugging Face. Updated 2025-10-16; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/LCO-Embedding/SeaDoc
Description:
SeaDoc is a challenging dataset for visual document retrieval in Southeast Asian languages. It is intended to evaluate and improve language-centric omnimodal embedding frameworks, particularly in low-resource settings that combine diverse languages with visual document understanding. The dataset provides several configurations, including corpus, default, qrels, and query, each with its own features and splits.
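The corpus, query, and qrels configurations suggest the usual retrieval-benchmark layout, in which qrels rows link each query to its relevant corpus documents. The sketch below illustrates how such judgments are typically scored with recall@k. It uses toy in-memory data only: the field names (`query_id`, `corpus_id`, `score`), the ids, and the `recall_at_k` helper are all hypothetical and are not taken from the dataset's actual schema.

```python
from collections import defaultdict

# Toy stand-ins for rows of a "qrels" configuration.
# Field names and ids are hypothetical, not the dataset's real schema.
qrels = [
    {"query_id": "q1", "corpus_id": "d1", "score": 1},
    {"query_id": "q1", "corpus_id": "d3", "score": 1},
    {"query_id": "q2", "corpus_id": "d2", "score": 1},
]

# Ranked document ids that some retrieval model might return per query.
rankings = {
    "q1": ["d3", "d2", "d1"],
    "q2": ["d1", "d3", "d2"],
}


def recall_at_k(qrels, rankings, k):
    """Mean fraction of each query's relevant documents found in its top-k."""
    # Group the relevant document ids by query.
    relevant = defaultdict(set)
    for row in qrels:
        if row["score"] > 0:
            relevant[row["query_id"]].add(row["corpus_id"])

    # Score each query, then average over queries.
    per_query = []
    for qid, docs in relevant.items():
        top_k = set(rankings.get(qid, [])[:k])
        per_query.append(len(docs & top_k) / len(docs))
    return sum(per_query) / len(per_query)


print(recall_at_k(qrels, rankings, k=2))
```

With real data, the toy lists would be replaced by rows read from the qrels configuration and by rankings produced by the embedding model under evaluation, after checking the actual column names.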
Provider:
LCO-Embedding



