Baby Intuitions Benchmark (BIB)

Name: Baby Intuitions Benchmark (BIB)
Creator: New York University
Published: 2022-02-12 06:57:16
License: No description available

Source: arXiv. Updated 2022-02-12; indexed 2024-06-21.

Download link:

https://kanishkgandhi.com/bib

Description:

Baby Intuitions Benchmark (BIB) is a dataset designed to test whether machines can understand and reason about the behavior of other agents. Created by a research team at New York University, it comprises 1000 evaluation tasks, each based on experimental stimuli from developmental cognitive science and intended to reflect infants' intuitive understanding of everyday events. BIB adopts the violation of expectation (VOE) paradigm, an experimental method widely used in infant research that assesses an observer's understanding by comparing responses to expected and unexpected outcomes. The dataset is suited to evaluating machine learning models and also allows direct comparison with the performance of human infants, which makes it possible to assess how well a machine comprehends agents' intentions.
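As an illustration of how a VOE comparison can be scored, here is a minimal Python sketch. It assumes a model assigns each video a scalar "surprise" value and that trials come in expected/unexpected pairs. The function name `voe_pairwise_accuracy` and the scores are hypothetical, made up for this example; this is not BIB's official evaluation code.

```python
from typing import Sequence


def voe_pairwise_accuracy(
    expected_surprise: Sequence[float],
    unexpected_surprise: Sequence[float],
) -> float:
    """Fraction of trial pairs where the unexpected outcome is rated more surprising.

    Assumption: index i in both sequences refers to the same trial pair.
    """
    if len(expected_surprise) != len(unexpected_surprise):
        raise ValueError("each expected trial needs a matching unexpected trial")
    if not expected_surprise:
        raise ValueError("at least one trial pair is required")
    # A pair counts as correct when the model is more surprised by the
    # unexpected outcome than by the expected one.
    correct = sum(
        1 for e, u in zip(expected_surprise, unexpected_surprise) if u > e
    )
    return correct / len(expected_surprise)


# Made-up surprise scores for four hypothetical trial pairs.
expected = [0.10, 0.40, 0.25, 0.30]
unexpected = [0.55, 0.35, 0.60, 0.90]
accuracy = voe_pairwise_accuracy(expected, unexpected)
print(accuracy)  # 0.75: three of the four pairs are ordered correctly
```

A score near 0.5 would mean the model does not distinguish expected from unexpected outcomes in this pairwise sense.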

Provider:

New York University

Created:

2021-02-24
