GreenNode/dbpedia-vn
Source: Hugging Face. Updated: 2026-01-08. Indexed: 2026-02-07.
Download link:
https://hf-mirror.com/datasets/GreenNode/dbpedia-vn
Resource description:
DBPedia-VN is a Vietnamese translation of DBpedia-Entity, a standard test collection for entity search over the DBpedia knowledge base. It is part of the Vietnamese Massive Text Embedding Benchmark (VN-MTEB). The dataset was built in three stages: translating the source with large language models (LLMs), specifically Cohere's Aya model; filtering the translations with embedding models; and scoring sample quality against multiple criteria with an LLM-as-a-judge. It ships in several configurations (corpus, default, qrels, queries), each with its own features and splits. It is licensed under cc-by-sa-4.0, lists mteb/dbpedia and GreenNode/dbpedia-vn as its source datasets, and is intended for text retrieval tasks in Vietnamese.
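As a rough illustration of how the corpus, queries, and qrels configurations fit together in a retrieval evaluation, here is a minimal sketch on toy in-memory rows. The field names (`_id`, `title`, `text`, `query-id`, `corpus-id`, `score`) are assumptions based on the common BEIR/MTEB layout and are not confirmed by this listing; the rows themselves are invented, and the word-overlap ranker is only a stand-in for a real embedding model.

```python
from collections import defaultdict

# Toy rows. Field names are ASSUMED to follow the usual BEIR/MTEB layout;
# the content is invented for illustration and is not taken from the dataset.
corpus = [
    {"_id": "d1", "title": "Hà Nội", "text": "Hà Nội là thủ đô của Việt Nam."},
    {"_id": "d2", "title": "Huế", "text": "Huế là một thành phố ở miền Trung Việt Nam."},
    {"_id": "d3", "title": "Sông Hồng", "text": "Sông Hồng chảy qua Hà Nội."},
]
queries = [{"_id": "q1", "text": "thủ đô Việt Nam"}]
qrels = [
    {"query-id": "q1", "corpus-id": "d1", "score": 2},
    {"query-id": "q1", "corpus-id": "d2", "score": 0},
]


def build_relevant(qrels_rows):
    """Map each query id to the set of corpus ids judged relevant (score > 0)."""
    relevant = defaultdict(set)
    for row in qrels_rows:
        if row["score"] > 0:
            relevant[row["query-id"]].add(row["corpus-id"])
    return dict(relevant)


def rank_by_overlap(query_text, corpus_rows):
    """Hypothetical stand-in ranker: order documents by shared lowercase tokens."""
    q_tokens = set(query_text.lower().split())
    scored = []
    for row in corpus_rows:
        d_tokens = set((row["title"] + " " + row["text"]).lower().split())
        scored.append((len(q_tokens & d_tokens), row["_id"]))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


relevant = build_relevant(qrels)
for query in queries:
    ranked = rank_by_overlap(query["text"], corpus)
    score = recall_at_k(ranked, relevant.get(query["_id"], set()), k=1)
    print(query["_id"], ranked, score)
```

With the Hugging Face `datasets` library, the real rows could presumably be loaded with calls such as `load_dataset("GreenNode/dbpedia-vn", "corpus")`, using the configuration names listed above; the available split names should be checked against the dataset card.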
Provider:
GreenNode



