GreenNode/scidocs-vn
Source: Hugging Face. Updated: 2026-01-08. Indexed: 2025-08-09.
Download link:
https://hf-mirror.com/datasets/GreenNode/scidocs-vn
Description:
This is a Vietnamese text retrieval dataset with three configurations: corpus, default, and queries. The corpus configuration contains each document's title and text along with their original versions. The default configuration contains the relevance judgments as a query ID, a corpus ID, and a score. The queries configuration contains each query's text and its original version. The dataset is split into training and test sets and can be used to evaluate text embedding models on MTEB retrieval tasks.
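As a rough illustration of how the default configuration's relevance judgments might be consumed, the sketch below groups rows into a nested mapping from query ID to corpus ID to score. The column names `query-id`, `corpus-id`, and `score` are assumptions based on the field names in the description, and the sample rows are made up for the example; check the actual dataset schema before relying on them.

```python
from collections import defaultdict


def build_qrels(rows):
    """Group relevance rows into {query_id: {corpus_id: score}}.

    Assumes each row is a dict with the (hypothetical) keys
    "query-id", "corpus-id", and "score".
    """
    qrels = defaultdict(dict)
    for row in rows:
        qrels[row["query-id"]][row["corpus-id"]] = int(row["score"])
    return dict(qrels)


# Made-up rows in the shape the description suggests; not real dataset content.
sample_rows = [
    {"query-id": "q1", "corpus-id": "d1", "score": 1},
    {"query-id": "q1", "corpus-id": "d2", "score": 0},
    {"query-id": "q2", "corpus-id": "d3", "score": 1},
]

qrels = build_qrels(sample_rows)
```

The same function could be applied to rows read from the default configuration, once the real column names are confirmed.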
Provider:
GreenNode



