MINT_BAK

Name: MINT_BAK
Creator: maas
Published: 2025-09-01 16:25:17
License: not specified

ModelScope community. Updated 2025-09-01; indexed 2025-03-22.

Download link:

https://modelscope.cn/datasets/MBZUAI/MINT_BAK


Description:

# Overview

Evaluation uses the [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval) toolkit, which evaluates models across multiple tasks and languages.

# Installation

To install `lmms-eval`, execute the following commands:

```bash
git clone https://github.com/EvolvingLMMs-Lab/lmms-eval
cd lmms-eval
pip install -e .
```

For additional model-specific dependencies, refer to the [lmms-eval repository](https://github.com/EvolvingLMMs-Lab/lmms-eval).

# Preparing the MINT Task Files

Download the required MINT task files into the `lmms-eval` tasks directory. Because the commands use `--local-dir ./`, run them from the root of the `lmms-eval` checkout so the files land under `lmms_eval/tasks/`:

```bash
# For mintmcq
huggingface-cli download MBZUAI/MINT --repo-type dataset --include lmms_eval/tasks/mintmcq/ --local-dir ./

# For mintoe
huggingface-cli download MBZUAI/MINT --repo-type dataset --include lmms_eval/tasks/mintoe/ --local-dir ./
```

# Running Evaluations

## Tasks to Evaluate

To select the tasks to evaluate, pass one of the following options:

### For `mintmcq`:

```bash
--tasks mintmcq_english,mintmcq_arabic,mintmcq_bengali,mintmcq_chinese,mintmcq_french,mintmcq_german,mintmcq_hindi,mintmcq_japanese,mintmcq_russian,mintmcq_sinhala,mintmcq_spanish,mintmcq_swedish,mintmcq_tamil,mintmcq_urdu
```

### For `mintoe`:

```bash
--tasks mintoe_english,mintoe_arabic,mintoe_bengali,mintoe_chinese,mintoe_french,mintoe_german,mintoe_hindi,mintoe_japanese,mintoe_russian,mintoe_sinhala,mintoe_spanish,mintoe_swedish,mintoe_tamil,mintoe_urdu
```

# Example: Evaluating `llavaonevision`

## Clone the Repository

Clone the `llavaonevision` repository:

```bash
git clone https://github.com/LLaVA-VL/LLaVA-NeXT
```

## Download the Dataset

Use `huggingface-cli` to download the dataset in parallel:

```bash
huggingface-cli download MBZUAI/MINT --repo-type dataset
```

## Run the Evaluation

Export the necessary environment variables, setting each to the value appropriate for your environment:

```bash
export HF_HOME=
export PYTHONPATH=
```

Run the evaluation command:

```bash
accelerate launch --num_processes 8 -m lmms_eval \
    --model llava_onevision \
    --model_args pretrained="lmms-lab/llava-onevision-qwen2-7b-ov-chat" \
    --tasks mintmcq_english \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/ \
    --verbosity INFO
```

## Output

After the evaluation finishes, the model responses are saved in the `logs` directory.
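
The long `--tasks` values listed under "Tasks to Evaluate" follow a single pattern: a task family prefix (`mintmcq` or `mintoe`), an underscore, and a language name. As a convenience, the list can be assembled in the shell rather than typed by hand. The following is a minimal sketch, not part of the MINT or `lmms-eval` tooling; the names `languages`, `build_tasks`, `MCQ_TASKS`, and `OE_TASKS` are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
# Languages that appear in the MINT task names listed in this README.
languages=(english arabic bengali chinese french german hindi
           japanese russian sinhala spanish swedish tamil urdu)

# build_tasks PREFIX
# Hypothetical helper: prints PREFIX_<language> for every language,
# joined with commas, in the form expected by the --tasks option.
build_tasks() {
  local prefix="$1" lang out=""
  for lang in "${languages[@]}"; do
    # Prepend a comma only when the list is already non-empty.
    out+="${out:+,}${prefix}_${lang}"
  done
  printf '%s\n' "$out"
}

MCQ_TASKS="$(build_tasks mintmcq)"
OE_TASKS="$(build_tasks mintoe)"

echo "$MCQ_TASKS"
echo "$OE_TASKS"
```

Under these assumptions, the evaluation command shown above could take `--tasks "$MCQ_TASKS"` (or `--tasks "$OE_TASKS"`) in place of a single task name such as `mintmcq_english`.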


Provider:

maas

Created:

2025-03-17
