ru-alpaca-eval
Source: ModelScope community. Updated 2025-12-05. Indexed 2025-07-26.
Download link:
https://modelscope.cn/datasets/t-tech/ru-alpaca-eval
Description:
# ru-alpaca-eval
**ru-alpaca-eval** is a Russian translation of [alpaca_eval](https://huggingface.co/datasets/tatsu-lab/alpaca_eval/blob/main/alpaca_eval.json). The original dataset was translated manually. In addition, every task was reviewed for the correctness of its statement and for compliance with moral and ethical standards. The dataset can therefore be used to evaluate how well language models support the Russian language. The baseline responses were regenerated with GPT-4o and were also reviewed.
### Overview of the Dataset
- Original dataset: [alpaca_eval](https://huggingface.co/datasets/tatsu-lab/alpaca_eval/blob/main/alpaca_eval.json)
- Number of tasks in original dataset: **805**
- Number of tasks in this dataset: **799**
- Format: **JSON**
### Usage
To use this dataset for model evaluation, follow these steps:
1. Download this [json file](https://huggingface.co/datasets/t-tech/ru-alpaca-eval/blob/main/data/alpaca_eval.json).
2. Use it with the [original codebase](https://github.com/tatsu-lab/alpaca_eval). For example:
```bash
alpaca_eval evaluate_from_model \
--model_configs models_configs/custom_model \
--annotators_config 'alpaca_eval_gpt4_turbo_fn' \
--evaluation_dataset=$PATH_TO_JSON_FILE
```
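Before running the evaluation, it can help to confirm that the downloaded file parses and has the expected shape. The following is a minimal sketch, assuming the file is a JSON array of records with the four fields shown in the sample entry; `load_tasks` and `REQUIRED_FIELDS` are hypothetical names introduced here and are not part of the alpaca_eval codebase.

```python
import json
from pathlib import Path

# Assumption: every record carries the four fields shown in the sample entry.
REQUIRED_FIELDS = ("instruction", "output", "generator", "dataset")


def load_tasks(path):
    """Load the evaluation file and check that each record has the expected fields."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of task records")
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
    return records
```

Calling `len(load_tasks(path))` on the downloaded file should then give the task count listed in the overview, if the layout assumption holds.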
### Sample entry
```json
{
"instruction": "Как штаты США получили свои названия?",
"output": "Названия штатов США имеют различное происхождение...",
"generator": "gpt-4o",
"dataset": "helpful_base"
}
```
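Here, **instruction** is the prompt that the evaluated model must answer, and **output** is the baseline response it is compared against. **generator** names the model that produced the baseline response, and **dataset** carries over the source subset label from the original alpaca_eval.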
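To illustrate how these fields relate to a model's own answers, the sketch below builds one record per task in the same layout as the sample entry. It is an illustration under stated assumptions rather than part of the dataset or of alpaca_eval: `build_model_outputs` is a hypothetical helper, and `generate` stands for whatever callable wraps the model being evaluated.

```python
def build_model_outputs(tasks, generate, generator_name):
    """Pair each instruction with a model answer, mirroring the sample entry's fields.

    tasks: list of records with "instruction" and "dataset" keys.
    generate: hypothetical callable mapping an instruction string to a response string.
    generator_name: label stored in the "generator" field of every record.
    """
    outputs = []
    for task in tasks:
        outputs.append(
            {
                "instruction": task["instruction"],
                "output": generate(task["instruction"]),
                "generator": generator_name,
                "dataset": task["dataset"],
            }
        )
    return outputs
```

Keeping the instruction text unchanged in each record makes it straightforward to line a model answer up with the baseline response for the same task.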
Provider:
maas
Created:
2025-07-19



