Baby Intuitions Benchmark (BIB)
Source: arXiv | Updated: 2022-02-12 | Indexed: 2024-06-21
Download link:
https://kanishkgandhi.com/bib
Description:
Baby Intuitions Benchmark (BIB) is a dataset designed to test how well machines understand and reason about the behavior of other agents. It was created by a research team at New York University and comprises 1000 evaluation tasks, each based on experimental stimuli from developmental cognitive science and intended to mirror infants' intuitive understanding of everyday situations. BIB adopts the violation-of-expectation (VOE) paradigm, an experimental method widely used in infant research that assesses an observer's understanding by comparing responses to expected and unexpected outcomes. Because the tasks follow the design of infant studies, model results can be compared directly with the performance of human infants, which makes it possible to assess how well machines grasp agents' intentions.
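The VOE comparison described above can be illustrated with a minimal sketch. It assumes a model assigns a scalar "surprise" score to each outcome and counts a trial as correct when the unexpected outcome scores strictly higher than the expected one. The function name `voe_pairwise_accuracy` and the scores below are hypothetical and are not taken from the BIB release or its official evaluation code.

```python
from typing import Sequence, Tuple


def voe_pairwise_accuracy(pairs: Sequence[Tuple[float, float]]) -> float:
    """Return the fraction of (expected, unexpected) surprise-score pairs
    in which the unexpected outcome received the strictly higher score."""
    if not pairs:
        raise ValueError("need at least one (expected, unexpected) pair")
    correct = sum(1 for expected, unexpected in pairs if unexpected > expected)
    return correct / len(pairs)


# Hypothetical surprise scores: (expected outcome, unexpected outcome).
scores = [(0.10, 0.80), (0.40, 0.30), (0.20, 0.90), (0.50, 0.50)]
accuracy = voe_pairwise_accuracy(scores)
print(accuracy)
```

Ties are counted as incorrect here; that is a design choice of this sketch, since a model that is equally surprised by both outcomes has not distinguished them.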
Provided by:
New York University
Created:
2021-02-24



