mteb/StackOverflowDupQuestions-VN
Source: Hugging Face. Updated 2025-10-16; indexed 2025-10-18.
Download link:
https://hf-mirror.com/datasets/mteb/StackOverflowDupQuestions-VN
Description:
StackOverflowDupQuestions-VN is a Vietnamese text embedding benchmark dataset translated from the Stack Overflow Duplicate Questions task, covering questions tagged Java, JavaScript, and Python. The translation was produced with Cohere's Aya model, and the translated data was filtered and assessed for quality using embedding models and an LLM-as-a-judge scoring system.
Provider:
mteb



