EvalAnything-AMU
Source: ModelScope community (updated 2025-12-05, indexed 2025-02-08)
Download link: https://modelscope.cn/datasets/PKU-Alignment/EvalAnything-AMU
Description:
# All-Modality Understanding
<span style="color: red;">All-Modality Understanding benchmark evaluates a model's ability to simultaneously process and integrate information from multiple modalities (text, images, videos, and audio) to answer open-ended questions comprehensively.</span>
[🏠 Homepage](https://github.com/PKU-Alignment/align-anything) | [👍 Our Official Code Repo](https://github.com/PKU-Alignment/align-anything)
[🤗 All-Modality Understanding Benchmark](https://huggingface.co/datasets/PKU-Alignment/EvalAnything-AMU)
[🤗 All-Modality Generation Benchmark (Instruction Following Part)](https://huggingface.co/datasets/PKU-Alignment/EvalAnything-InstructionFollowing)
[🤗 All-Modality Generation Benchmark (Modality Selection and Synergy Part)](https://huggingface.co/datasets/PKU-Alignment/EvalAnything-Selection_Synergy)
[🤗 All-Modality Generation Reward Model](https://huggingface.co/PKU-Alignment/AnyRewardModel)
## Data Example
<div align="center">
<img src="example-amu.png" width="100%"/>
</div>
## Load dataset
By default, all AMU data is loaded with:
```python
from datasets import load_dataset

# Load the full AMU benchmark (default configuration).
data = load_dataset(
    "PKU-Alignment/EvalAnything-AMU",
    trust_remote_code=True
)
```
or, naming the `all` configuration explicitly:
```python
data = load_dataset(
    "PKU-Alignment/EvalAnything-AMU",
    name='all',
    trust_remote_code=True
)
```
Because images and videos are processed differently, the AMU dataset provides separate test subsets: one that loads the visual information as images and one that loads it as videos. They are loaded with
```python
data = load_dataset(
    "PKU-Alignment/EvalAnything-AMU",
    name='image',
    trust_remote_code=True
)
```
and
```python
data = load_dataset(
    "PKU-Alignment/EvalAnything-AMU",
    name='video',
    trust_remote_code=True
)
)
```
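
The three configuration names above (`all`, `image`, `video`) can be wrapped in a small helper so that a typo fails early instead of at download time. This is only a sketch: `AMU_CONFIGS`, `amu_load_kwargs`, and `load_amu` are hypothetical names introduced here, not part of the dataset or of align-anything, and it assumes an installed `datasets` version that still accepts `trust_remote_code`.

```python
from typing import Any, Dict

AMU_REPO = "PKU-Alignment/EvalAnything-AMU"
# Configuration names taken from the loading examples in this card.
AMU_CONFIGS = ("all", "image", "video")


def amu_load_kwargs(config: str = "all") -> Dict[str, Any]:
    """Build the keyword arguments for load_dataset (hypothetical helper)."""
    if config not in AMU_CONFIGS:
        raise ValueError(
            f"Unknown AMU config {config!r}; expected one of {AMU_CONFIGS}"
        )
    return {"path": AMU_REPO, "name": config, "trust_remote_code": True}


def load_amu(config: str = "all"):
    """Load one AMU configuration; downloads data on first call."""
    # Imported lazily so the validation helper works without `datasets`.
    from datasets import load_dataset

    return load_dataset(**amu_load_kwargs(config))
```

For example, `load_amu("video")` would request the video subset, while `load_amu("audio")` raises a `ValueError` before any network access.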
## Model Evaluation
Run model evaluation with the [eval_anything/amu/example.py](https://github.com/PKU-Alignment/align-anything/blob/main/align_anything/evaluation/eval_anything/amu/example.py) script. You must complete the model inference code before using it. For the evaluation prompts, see [eval_anything/amu/amu_eval_prompt.py](https://github.com/PKU-Alignment/align-anything/blob/main/align_anything/evaluation/eval_anything/amu/amu_eval_prompt.py).
**Note:** The current code is a sample script for the All-Modality Understanding subtask of Eval Anything. We plan to integrate Eval Anything's evaluation into the framework to make it easier for the community to use.
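
Since the inference code is left for the user to complete, one possible way to organize it is to keep the model call behind a single function and collect its outputs in a loop. The sketch below is not taken from `example.py`; `my_model_infer` and `collect_responses` are hypothetical names, and the samples are treated as opaque dictionaries because their fields are not described here.

```python
from typing import Any, Callable, Dict, Iterable, List


def my_model_infer(sample: Dict[str, Any]) -> str:
    """Placeholder for the user's own model call (hypothetical)."""
    raise NotImplementedError("Fill in the inference code for your model.")


def collect_responses(
    samples: Iterable[Dict[str, Any]],
    infer: Callable[[Dict[str, Any]], str],
) -> List[Dict[str, Any]]:
    """Run `infer` on every sample and keep the responses in input order."""
    results = []
    for index, sample in enumerate(samples):
        results.append({"index": index, "response": infer(sample)})
    return results
```

Passing the inference function as an argument keeps the loop independent of any particular model, so the same loop can be reused once `my_model_infer` is filled in.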
## Citation
Please cite our work if you use our benchmark or model in your paper.
```bibtex
@inproceedings{ji2024align,
title={Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback},
author={Jiaming Ji and Jiayi Zhou and Hantao Lou and Boyuan Chen and Donghai Hong and Xuyao Wang and Wenqi Chen and Kaile Wang and Rui Pan and Jiahao Li and Mohan Wang and Josef Dai and Tianyi Qiu and Hua Xu and Dong Li and Weipeng Chen and Jun Song and Bo Zheng and Yaodong Yang},
year={2024},
url={https://arxiv.org/abs/2412.15838}
}
```
Provider: maas
Created: 2025-02-07



