ntt-icml2021

Name: ntt-icml2021
Creator: maas
Published: 2025-08-29 16:39:35
License: No description provided

ModelScope community: updated 2025-08-29, indexed 2025-07-26

Download link:

https://modelscope.cn/datasets/microsoft/ntt-icml2021

Resource description:

# Overview

The Navigation Turing Test dataset is an annotated set of human and agent navigation trajectories in a 3D game world. It was developed as a research benchmark of human-like navigation behavior in a 3D video game environment. For research uses and further information, see the related [GitHub - microsoft/NTT: Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation](https://github.com/microsoft/NTT) [ICML 2021].

A detailed discussion of the Navigation Turing Test dataset, including how it was developed and evaluated, can be found in our paper: [Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation - Microsoft Research](https://www.microsoft.com/en-us/research/publication/navigation-turing-test-ntt-learning-to-evaluate-human-like-navigation/)

## Intended Uses

The Navigation Turing Test dataset is best suited for research into the development and evaluation of human-like navigation in 3D video games. It is being shared with the research community to facilitate reproduction of our results and foster further research in this area.

## Out of Scope Uses

The Navigation Turing Test dataset is not representative of 3D navigation outside of the specific game in which it was collected.

We do not recommend using the Navigation Turing Test dataset in commercial or real-world applications without further testing and development. It is being released for research purposes only.

We do not recommend using the Navigation Turing Test dataset in the context of high-risk decision making (e.g. in law enforcement, legal, finance, or healthcare).

# Dataset Details

## Dataset Contents

The Navigation Turing Test dataset consists of 40 instances of trajectories, where each trajectory represents a human or a machine learning agent navigating to a given goal location in a 3D video game environment.
Multiple feature representations are provided for each trajectory. They correspond to the feature representations explored in the paper:

- MP4: video showing a game character navigating the 3D environment; this is what a player would see while playing the game.
- Barcodes: a 2D compressed summary of each video, as detailed in the paper.
- Symbolic representation: the game state or telemetry, including xyz positions of the game character at each timestep and locations of other game objects.
- Topdown: a 2D top-down (mini map) projection of the trajectory.

The trajectories were annotated in 2 user studies. The resulting annotations are provided in HNTT_data. Most importantly, this includes each study participant’s judgment on which of a pair of trajectories more likely originated from a human player rather than from the machine-learned model. Further details about the study setup and all questions that were answered by participants are included in the supplemental material of our paper [Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation](https://proceedings.mlr.press/v139/devlin21a.html). The survey form itself is also provided.

All materials were produced between December 2020 and February 2021. All annotations were collected in January 2021.

The Navigation Turing Test dataset does not contain links to external data sources.

Data points in the Navigation Turing Test dataset correspond to individual people’s opinions about pairs of trajectories. The dataset does not include data pertaining to children.

Measures have been taken to remove potentially identifying information and sensitive or private data. All records were manually reviewed.

## Data Creation & Processing

The Navigation Turing Test dataset is an original dataset created by annotating pairs of trajectories that represent human players or machine learning agents.
All survey materials used to create these annotations are provided with the dataset. Data collection was performed by Microsoft employees who were not part of the project team. Creating the Navigation Turing Test dataset did not involve existing data.

The data was annotated with participants’ opinions on which of a pair of trajectories more likely reflected human navigation behavior, as opposed to behavior generated by a machine learning model.

The Navigation Turing Test dataset is not believed to contain information that:

- could be used to directly or indirectly identify a person;
- might be considered sensitive or private;
- might be considered offensive or insulting, or otherwise cause emotional distress.

# How To Get Started

To begin using the Navigation Turing Test dataset, see the sample code and documentation: [GitHub - microsoft/NTT: Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [ICML 2021]](https://github.com/microsoft/NTT)

# Validation

The validity of the Navigation Turing Test was assessed in follow-up research, where the team replicated the study setup at a larger scale by working with Mechanical Turk workers instead of Microsoft employees. The results of the validation studies show high robustness, indicating that assessments of what constitutes more human-like behavior are relatively stable across populations of human assessors. Additional details are provided here: [Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games | Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems](https://dl.acm.org/doi/full/10.1145/3544548.3581348)

# Limitations

The Navigation Turing Test dataset provides annotations that are based on the study participants’ opinions about human-like behavior.
As such, there were no right or wrong answers. Agreement between participants as well as their confidence is reported in our publication: [Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation](https://proceedings.mlr.press/v139/devlin21a.html) (see Figure 7).

The Navigation Turing Test dataset has not been systematically evaluated for sociocultural, economic, or demographic bias. Developers should consider the potential for bias as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness concerns specific to each intended downstream use.

The Navigation Turing Test dataset was developed for research and experimental purposes. Further testing and validation are needed before considering its application in commercial or real-world scenarios.

The Navigation Turing Test dataset should not be used in highly regulated domains where inaccurate or incomplete outputs could suggest actions that lead to injury or negatively impact an individual's legal, financial, or life opportunities.

The Navigation Turing Test dataset was used to develop the models and approaches described in [Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation](https://proceedings.mlr.press/v139/devlin21a.html). See the paper and documentation at [GitHub - microsoft/NTT: Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [ICML 2021]](https://github.com/microsoft/NTT) to understand the capabilities and limitations of the resulting models.
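
Because the annotations are opinions rather than ground-truth labels, one way to read them is through per-pair agreement. The sketch below is illustrative only: the `(pair_id, choice)` record layout and the function name `pairwise_agreement` are assumptions made for this example, not the actual schema of HNTT_data or the analysis used in the paper. For each pair it reports the majority choice and the fraction of participants who made it.

```python
from collections import Counter, defaultdict


def pairwise_agreement(judgments):
    """Return {pair_id: (majority_choice, agreement_fraction)}.

    `judgments` is an iterable of (pair_id, choice) tuples, where `choice`
    is "A" or "B": the trajectory a participant judged more human-like.
    This record layout is hypothetical, not the HNTT_data schema.
    """
    votes = defaultdict(Counter)
    for pair_id, choice in judgments:
        votes[pair_id][choice] += 1

    summary = {}
    for pair_id, counts in votes.items():
        # On a tie, most_common keeps the choice that was seen first.
        majority_choice, majority_votes = counts.most_common(1)[0]
        summary[pair_id] = (majority_choice, majority_votes / sum(counts.values()))
    return summary


# Made-up judgments for two trajectory pairs (illustration only).
example = [
    ("pair_1", "A"), ("pair_1", "A"), ("pair_1", "B"), ("pair_1", "A"),
    ("pair_2", "B"), ("pair_2", "B"),
]
print(pairwise_agreement(example))
```

An agreement fraction near 0.5 marks a pair on which participants were split, which is expected for some pairs given that there is no right or wrong answer.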
# Best Practices

Given the relatively small data set size, we recommend cross validation for the training and evaluation of machine learning approaches as shown in our sample code: [microsoft/NTT: Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [ICML 2021]](https://github.com/microsoft/NTT?tab=readme-ov-file#training-antt-models-section-33)

It is the user’s responsibility to ensure that the use of the Navigation Turing Test dataset complies with relevant data protection regulations and organizational guidelines.

# License

This dataset is licensed under the Microsoft Research License Agreement (MSR-LA) for data. See [LICENSE](https://huggingface.co/datasets/microsoft/ntt-icml2021/blob/main/LICENSE.md).

# Ethics

Data collection activities were conducted with approval from Microsoft’s Institutional Review Board.

Data collection and annotation activities were conducted with the informed consent of the study participants.

Data annotators were compensated for their participation.

# Contact

We welcome feedback and collaboration from our audience. If you have suggestions, questions, or observe unexpected/problematic data in our dataset, please contact us.

If the team receives reports of undesired content or identifies issues independently, we will update this repository with appropriate mitigations.
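
The cross-validation recommendation under Best Practices can be sketched as follows. This is a minimal standard-library illustration, not the sample code from the microsoft/NTT repository: the helper `k_fold_splits`, the choice of `k=5`, and the placeholder integer IDs for the 40 trajectories are all assumptions, and the repository may organize its folds differently.

```python
import random


def k_fold_splits(item_ids, k=5, seed=0):
    """Yield (train_ids, test_ids) for each of k folds.

    Items are shuffled once with a fixed seed, then dealt round-robin into
    k folds so that every item is held out exactly once.
    """
    ids = list(item_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i, test_ids in enumerate(folds):
        train_ids = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train_ids, test_ids


# Placeholder IDs standing in for the 40 trajectories (hypothetical naming).
trajectory_ids = range(40)
for fold, (train_ids, test_ids) in enumerate(k_fold_splits(trajectory_ids, k=5)):
    print(f"fold {fold}: {len(train_ids)} train / {len(test_ids)} test")
```

With so few trajectories, averaging a metric over all folds gives a steadier estimate than a single train/test split, since every trajectory is used for evaluation once.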

Provider:

maas

Created:

2025-07-22
