MINT

ModelScope community · Updated 2025-10-09 · Indexed 2025-03-22
Download link:
https://modelscope.cn/datasets/MBZUAI/MINT
Resource description:
# Overview

MINT is evaluated with the [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval) toolkit, which supports evaluating models across multiple tasks and languages.

# Installation

To install `lmms-eval`, run the following commands:

```bash
git clone https://github.com/EvolvingLMMs-Lab/lmms-eval
cd lmms-eval
pip install -e .
```

For model-specific dependencies, refer to the [lmms-eval repository](https://github.com/EvolvingLMMs-Lab/lmms-eval).

# Preparing the MINT Task Files

Download the required MINT task files into the `lmms-eval` tasks directory:

```bash
# For mintmcq
huggingface-cli download MBZUAI/MINT --repo-type dataset --include lmms_eval/tasks/mintmcq/ --local-dir ./

# For mintoe
huggingface-cli download MBZUAI/MINT --repo-type dataset --include lmms_eval/tasks/mintoe/ --local-dir ./
```

# Running Evaluations

## Tasks to Evaluate

Select the tasks to evaluate with one of the following `--tasks` options.

### For `mintmcq`:

```bash
--tasks mintmcq_english,mintmcq_arabic,mintmcq_bengali,mintmcq_chinese,mintmcq_french,mintmcq_hindi,mintmcq_japanese,mintmcq_sinhala,mintmcq_spanish,mintmcq_tamil,mintmcq_urdu
```

or

```bash
--tasks mintmcq_val
```

### For `mintoe`:

```bash
--tasks mintoe_english,mintoe_arabic,mintoe_bengali,mintoe_chinese,mintoe_french,mintoe_hindi,mintoe_japanese,mintoe_sinhala,mintoe_spanish,mintoe_tamil,mintoe_urdu
```

or

```bash
--tasks mintoe_val
```

# Example: Evaluating `llavaonevision`

## Clone the Repository

Clone the LLaVA-NeXT repository, which provides `llavaonevision`:

```bash
git clone https://github.com/LLaVA-VL/LLaVA-NeXT
```

## Download the Dataset

Download the dataset with `huggingface-cli`, which fetches files in parallel:

```bash
huggingface-cli download MBZUAI/MINT --repo-type dataset
```

## Run the Evaluation

Export the necessary environment variables, setting each to the value for your environment:

```bash
export HF_HOME=
export PYTHONPATH=
```

Run the evaluation command:

```bash
accelerate launch --num_processes 8 -m lmms_eval \
    --model llava_onevision \
    --model_args pretrained="lmms-lab/llava-onevision-qwen2-7b-ov-chat" \
    --tasks mintmcq_val,mintoe_val \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/ \
    --verbosity INFO
```

## Output

After the evaluation finishes, the model responses are saved in the `logs` directory.
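The per-language `--tasks` values above are long and easy to mistype. As a minimal sketch, the list can be generated from the task prefix and the eleven language names given in this README. The helper name `build_task_list` and the `LANGUAGES` array are hypothetical and are not part of `lmms-eval` or MINT.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of lmms-eval): builds the comma-separated
# value for --tasks from a task prefix and the languages listed in this README.

LANGUAGES=(english arabic bengali chinese french hindi japanese sinhala spanish tamil urdu)

# build_task_list PREFIX
# Prints PREFIX_<language> for every entry in LANGUAGES, joined by commas.
build_task_list() {
  local prefix="$1"
  local tasks=()
  local lang
  for lang in "${LANGUAGES[@]}"; do
    tasks+=("${prefix}_${lang}")
  done
  # Joining "${tasks[*]}" uses the first character of IFS as the separator.
  local IFS=','
  printf '%s\n' "${tasks[*]}"
}

# Print the task list for the multiple-choice tasks.
build_task_list mintmcq
```

The printed string can then be passed to the evaluation command, for example as `--tasks "$(build_task_list mintoe)"`.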

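Before launching an evaluation, it can help to confirm that the MINT task files were actually downloaded. The sketch below assumes the download commands were run from the root of the `lmms-eval` checkout, so that the files land under `lmms_eval/tasks/mintmcq` and `lmms_eval/tasks/mintoe`. The function name `check_task_dirs` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of lmms-eval).
# Assumption: the task files were downloaded with --local-dir ./ from ROOT,
# so they sit under ROOT/lmms_eval/tasks/<task>.

# check_task_dirs [ROOT]
# Returns 0 only if both MINT task directories exist under ROOT (default: .).
check_task_dirs() {
  local root="${1:-.}"
  local missing=0
  local name
  for name in mintmcq mintoe; do
    if [ ! -d "${root}/lmms_eval/tasks/${name}" ]; then
      echo "missing: ${root}/lmms_eval/tasks/${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_task_dirs .; then
  echo "MINT task files found"
else
  echo "MINT task files not found; run the download commands first"
fi
```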
Provider: maas
Created: 2025-03-17