Hi-ToM/Hi-ToM_Dataset
Source: Hugging Face. Updated 2023-10-29; indexed 2024-03-04.
Download link: https://hf-mirror.com/datasets/Hi-ToM/Hi-ToM_Dataset
Resource description:
# Hi-ToM Dataset
This is the dataset for the paper "Hi-ToM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models".
<img src=media/Picture1.png height=430>
### The `Hi-ToM_data` folder
Contains ToMh data consisting of story-question pairs and the corresponding answers.
The names of subfolder branches have the following meanings:
- `Tell` / `No_Tell`: whether or not the stories contain communications among agents.
- `MC` / `CoT`: the prompting style. `MC` corresponds to Vanilla Prompting (VP) in the paper, while `CoT` stands for Chain-of-Thought Prompting (CoTP).
- `length_n`: the story length, i.e. the number of chapters in a story, ranging from 1 to 3.
- `sample_n`: the numbering of different sample stories.
- `order_n`: the ToM order of the question, ranging from 0 to 4.
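
The naming scheme above can be decoded programmatically when collecting results per condition. The following is a minimal sketch, not part of the dataset's tooling: the function name `parse_condition`, the example path, and its nesting order and `.txt` extension are all assumptions made for illustration. The parser only relies on the component names listed above and does not depend on the order in which they appear.

```python
import re
from pathlib import PurePosixPath

# Matches the numbered components `length_n`, `sample_n`, `order_n`.
# A trailing file extension (e.g. "order_2.txt") is tolerated.
_NUMBERED = re.compile(r"(length|sample|order)_(\d+)")


def parse_condition(path):
    """Collect the condition encoded in a Hi-ToM style path (hypothetical helper)."""
    info = {}
    for part in PurePosixPath(path).parts:
        if part in ("Tell", "No_Tell"):
            # True when the story contains communication among agents.
            info["communication"] = part == "Tell"
        elif part in ("MC", "CoT"):
            # MC = Vanilla Prompting (VP), CoT = Chain-of-Thought Prompting (CoTP).
            info["prompting"] = part
        else:
            match = _NUMBERED.match(part)
            if match:
                info[match.group(1)] = int(match.group(2))
    return info


# Hypothetical path; the real folder nesting and file names may differ.
example = "Hi-ToM_data/Tell/CoT/length_1/sample_0/order_2.txt"
condition = parse_condition(example)
```

Because the parser ignores any path component it does not recognize, it should keep working if the actual layout nests the folders differently from the example.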
### The `Hi-ToM_prompt` folder
Contains prompt files that can be passed directly to an API.
Its contents are almost the same as those of `Hi-ToM_data`, except that the answers are removed.
### Generate new data and prompts
Run the script `generate_tomh.sh`.
Provided by:
Hi-ToM