MINT
Hosted on the ModelScope community (魔搭社区). Updated 2025-10-09; indexed 2025-03-22.
Download link:
https://modelscope.cn/datasets/MBZUAI/MINT
Description:
# Overview
Evaluation uses [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval), a toolkit for evaluating models across multiple tasks and languages.
# Installation
To install `lmms-eval`, execute the following commands:
```bash
git clone https://github.com/EvolvingLMMs-Lab/lmms-eval
cd lmms-eval
pip install -e .
```
For model-specific dependencies, refer to the [lmms-eval repository](https://github.com/EvolvingLMMs-Lab/lmms-eval).
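# Preparing the MINT Task Files
Download the required MINT task files into the `lmms-eval` tasks directory. Run these commands from the root of the `lmms-eval` checkout so that the files land under `lmms_eval/tasks/`: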
```bash
# For mintmcq
huggingface-cli download MBZUAI/MINT --repo-type dataset --include lmms_eval/tasks/mintmcq/ --local-dir ./
# For mintoe
huggingface-cli download MBZUAI/MINT --repo-type dataset --include lmms_eval/tasks/mintoe/ --local-dir ./
```
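After the download, it can be useful to confirm that both task directories actually exist before starting an evaluation. The following is a minimal sketch, assuming the commands above were run from the `lmms-eval` root; `check_task_dirs` is a hypothetical helper, not part of `lmms-eval` or `huggingface-cli`.

```bash
# check_task_dirs ROOT  (hypothetical helper, not part of lmms-eval)
# Reports whether the two MINT task directories exist under ROOT.
# Returns the number of missing directories (0 means both are present).
check_task_dirs() {
  root="${1:-.}"
  missing=0
  for task in mintmcq mintoe; do
    if [ -d "$root/lmms_eval/tasks/$task" ]; then
      echo "found:   $root/lmms_eval/tasks/$task"
    else
      echo "missing: $root/lmms_eval/tasks/$task"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Check the current directory; print a hint if anything is missing.
check_task_dirs . || echo "Re-run the download commands from the lmms-eval root."
```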
# Running Evaluations
## Tasks to Evaluate
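Select what to evaluate with the `--tasks` option. For each task family, pass either the comma-separated per-language task names or the corresponding `_val` task name: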
### For `mintmcq`:
```bash
--tasks mintmcq_english,mintmcq_arabic,mintmcq_bengali,mintmcq_chinese,mintmcq_french,mintmcq_hindi,mintmcq_japanese,mintmcq_sinhala,mintmcq_spanish,mintmcq_tamil,mintmcq_urdu
# OR
--tasks mintmcq_val
```
### For `mintoe`:
```bash
--tasks mintoe_english,mintoe_arabic,mintoe_bengali,mintoe_chinese,mintoe_french,mintoe_hindi,mintoe_japanese,mintoe_sinhala,mintoe_spanish,mintoe_tamil,mintoe_urdu
# OR
--tasks mintoe_val
```
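To evaluate only a subset of languages, the comma-separated `--tasks` value can be assembled from a language list instead of typed by hand. This is a sketch under the assumption that task names always follow the `<prefix>_<language>` pattern shown above; `build_tasks` is a hypothetical helper introduced here for illustration.

```bash
# Languages taken from the task names listed above.
LANGUAGES="english arabic bengali chinese french hindi japanese sinhala spanish tamil urdu"

# build_tasks PREFIX LANG...  (hypothetical helper, not part of lmms-eval)
# Prints PREFIX_LANG names joined by commas, the format --tasks expects.
# The body runs in a subshell so its variables do not leak to the caller.
build_tasks() (
  prefix="$1"; shift
  out=""
  for lang in "$@"; do
    out="${out:+$out,}${prefix}_${lang}"
  done
  printf '%s\n' "$out"
)

MCQ_TASKS="$(build_tasks mintmcq $LANGUAGES)"    # all eleven languages
OE_SUBSET="$(build_tasks mintoe english hindi)"  # a two-language subset

echo "--tasks $MCQ_TASKS"
echo "--tasks $OE_SUBSET"
```

The resulting strings can be passed directly as the `--tasks` argument, for example `--tasks "$OE_SUBSET"`.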
# Example: Evaluating `llava_onevision`
## Clone the Repository
Clone the LLaVA-NeXT repository, which hosts the LLaVA-OneVision code:
```bash
git clone https://github.com/LLaVA-VL/LLaVA-NeXT
```
## Download the Dataset
Download the full dataset with `huggingface-cli`, which fetches files in parallel:
```bash
huggingface-cli download MBZUAI/MINT --repo-type dataset
```
## Run the Evaluation
Export the necessary environment variables, filling in the values for your setup:
```bash
export HF_HOME=
export PYTHONPATH=
```
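Since both variables are left blank above, a quick guard can catch the case where one of them was never filled in. The sketch below only checks that each variable is non-empty; it makes no assumption about what the correct values are, and `require_var` is a hypothetical helper.

```bash
# require_var NAME  (hypothetical helper)
# Succeeds only if the variable called NAME is set and non-empty.
require_var() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Warn about each variable that still needs a value.
for name in HF_HOME PYTHONPATH; do
  require_var "$name" || echo "Set $name before running the evaluation."
done
```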
Run the evaluation command:
```bash
accelerate launch --num_processes 8 -m lmms_eval \
--model llava_onevision \
--model_args pretrained="lmms-lab/llava-onevision-qwen2-7b-ov-chat" \
--tasks mintmcq_val,mintoe_val \
--batch_size 1 \
--log_samples \
--output_path ./logs/ \
--verbosity INFO
```
## Output
Because `--log_samples` is set, the model responses are saved under the `./logs/` directory (the `--output_path` value) once the evaluation finishes.
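To see what a run produced, the output directory can simply be listed. This sketch deliberately assumes nothing about the file names or layout that `lmms-eval` writes; `list_outputs` is a hypothetical helper that prints every file it finds.

```bash
# list_outputs DIR  (hypothetical helper)
# Lists every file under DIR, sorted by path. Returns 1 if DIR does not exist.
list_outputs() {
  dir="${1:-./logs}"
  if [ -d "$dir" ]; then
    find "$dir" -type f | sort
  else
    echo "no output directory at $dir" >&2
    return 1
  fi
}

# Inspect the default output path used in the command above.
list_outputs ./logs || true
```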
Provider: maas
Created: 2025-03-17



