nthakur/mirage-bench-pairwise-judgments
Source: Hugging Face. Updated: 2025-03-21. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/nthakur/mirage-bench-pairwise-judgments
Description:
A multilingual dataset of pairwise judgments. Each language subset has the fields query_id, judge, model_A, model_B, output, and verdict, which record the process and result of a judge comparing two models. Each language subset has a development split, which can be used for model development and testing.
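To make the record layout concrete, here is a minimal sketch of one row and a simple tally over verdicts. The field names come from the description above. Everything else is an assumption made for illustration: the string types, the `PairwiseJudgment` class, the `tally_verdicts` helper, and the sample values (including the verdict labels "A" and "B") are hypothetical and are not taken from the dataset itself.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseJudgment:
    """One row, using the field names listed in the description.

    The str types are an assumption; the source does not state field types.
    """
    query_id: str
    judge: str
    model_A: str
    model_B: str
    output: str
    verdict: str


def tally_verdicts(rows):
    """Count how often each (model_A, model_B, verdict) combination occurs."""
    return Counter((r.model_A, r.model_B, r.verdict) for r in rows)


# Hypothetical rows for illustration only; real values will differ.
sample_rows = [
    PairwiseJudgment("q1", "judge-x", "model-1", "model-2", "rationale text", "A"),
    PairwiseJudgment("q2", "judge-x", "model-1", "model-2", "rationale text", "B"),
    PairwiseJudgment("q3", "judge-x", "model-1", "model-2", "rationale text", "A"),
]

counts = tally_verdicts(sample_rows)
for (model_a, model_b, verdict), n in sorted(counts.items()):
    print(model_a, model_b, verdict, n)
```

Grouping by the model pair as well as the verdict keeps the counts meaningful when one subset contains comparisons between several different pairs of models.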
Provided by:
nthakur



