aisingapore/MultiTurn-Chat-MT-Bench
Source: Hugging Face. Updated: 2024-12-20. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/aisingapore/MultiTurn-Chat-MT-Bench
Description:
SEA-MTBench evaluates a model's ability to engage in multi-turn (2-turn) conversations and respond in ways that align with human needs. It uses gpt-4-1106-preview as the judge model and compares responses against gpt-3.5-turbo-0125 as the baseline model. The dataset is based on MT-Bench and was manually translated by native speakers into Indonesian (id), Javanese (jv), Sundanese (su), and Vietnamese (vi). The Thai split uses MT-Bench Thai from the ThaiLLM leaderboard.
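The description says responses are judged against a baseline model but does not say how the judge's verdicts are aggregated into a score. As an illustration only, the sketch below assumes each comparison yields one of three hypothetical labels ("win", "tie", "loss") and computes a win rate with ties counted as half. The function name, the labels, and the tie weighting are assumptions, not taken from the dataset or its evaluation code.

```python
from collections import Counter


def win_rate(verdicts):
    """Fraction of comparisons won against the baseline; a tie counts as half a win.

    `verdicts` is a list of hypothetical labels: "win", "tie", or "loss".
    """
    if not verdicts:
        raise ValueError("no verdicts to aggregate")
    counts = Counter(verdicts)
    unknown = set(counts) - {"win", "tie", "loss"}
    if unknown:
        raise ValueError(f"unexpected verdict labels: {sorted(unknown)}")
    return (counts["win"] + 0.5 * counts["tie"]) / len(verdicts)


# Made-up verdicts for illustration: two wins, one tie, one loss.
example_verdicts = ["win", "loss", "tie", "win"]
example_score = win_rate(example_verdicts)  # (2 + 0.5 * 1) / 4 = 0.625
```

Computing this separately for each language split (id, jv, su, vi, th) would give one score per language. Whether the actual benchmark weights ties or turns differently is not stated in this listing.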
Provider:
aisingapore



