
smgjch/MeowBench

Source: Hugging Face. Updated 2026-04-22; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/smgjch/MeowBench
Resource overview:
MeowBench is a high-fidelity, expert-verified quad-modal benchmark for evaluating Multimodal Large Language Models (MLLMs) on feline intention decoding. By correlating external observational data (video/audio) with internal biological markers (ECG/EEG/IMU), it addresses the "semantic aliasing" problem in animal behaviour and tests whether models can move beyond superficial pattern matching to genuine latent state reasoning. Each sample is structured as a Multiple Choice Question (MCQ) comprising a synchronized (or intent-matched) triplet of video, audio, and time-series data, a natural language prompt, and four options. The dataset's construction and verification process includes Next-Behaviour Prediction (NBP) logic, Intent-Matched Synthesis, and manual review by eight professional feline ethologists.
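The sample layout described above (a video/audio/time-series triplet, a prompt, and four options) could be modelled roughly as follows. This is only an illustrative sketch: the class name, every field name, the presence of an answer index, and the accuracy helper are assumptions made for the example and are not taken from the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class MeowBenchSample:
    """Hypothetical container for one MCQ sample; field names are assumed."""

    video_path: str        # external observation: video clip
    audio_path: str        # external observation: audio track
    timeseries_path: str   # internal markers (ECG/EEG/IMU) as a time series
    prompt: str            # natural language question
    options: Tuple[str, str, str, str]  # exactly four answer options
    answer_index: int      # assumed 0-based index of the correct option

    def __post_init__(self) -> None:
        # Enforce the four-option MCQ shape stated in the description.
        if len(self.options) != 4:
            raise ValueError("each sample must have exactly four options")
        if not 0 <= self.answer_index < 4:
            raise ValueError("answer_index must be in the range 0..3")


def accuracy(samples: Sequence[MeowBenchSample], predictions: Sequence[int]) -> float:
    """Fraction of samples whose predicted option index matches the answer."""
    if len(samples) != len(predictions):
        raise ValueError("samples and predictions must have the same length")
    if not samples:
        return 0.0
    correct = sum(1 for s, p in zip(samples, predictions) if s.answer_index == p)
    return correct / len(samples)
```

Under these assumptions, a model's output would be reduced to one option index per sample and scored with `accuracy(samples, predictions)`; the dataset's own evaluation protocol may differ.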
Provider:
smgjch