purplesquirrelnetworks/thot-pocket-gaze
Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link: https://hf-mirror.com/datasets/purplesquirrelnetworks/thot-pocket-gaze
---
license: mit
task_categories:
- text-classification
- time-series-forecasting
language:
- en
tags:
- gaze-behavior
- eye-contact
- ai-avatar
- conversation-dynamics
- nonverbal-communication
- thot-pocket
pretty_name: Thot Pocket Gaze Behavior Dataset
size_categories:
- n<1K
---
# Thot Pocket Gaze Behavior Dataset
Training data for **Thot Pocket** -- an AI avatar eye contact intelligence system that generates naturalistic gaze behavior during conversations.
**Main repo:** [github.com/ExpertVagabond/thot-pocket](https://github.com/ExpertVagabond/thot-pocket)
**Crate:** [crates.io/crates/thot-pocket](https://crates.io/crates/thot-pocket) (v0.1.0)
The training pipeline is included in the GitHub repo under `train/`; it trains a GazeTransformer model (4 layers, 128 dimensions, 3 output heads).
## Purpose
Human eye contact follows complex, culturally informed patterns during conversation. Speakers and listeners avert their gaze at different rates, in different directions, and for different durations depending on cognitive load, conversational role, and cultural norms. This dataset captures timestamped gaze annotations from real and simulated conversations to train AI avatars that exhibit believable eye contact behavior rather than the uncanny-valley stare of most virtual agents.
## Schema
Each row in the JSONL data files contains:
| Field | Type | Description |
|---|---|---|
| `timestamp_ms` | `int` | Milliseconds from conversation start |
| `conversation_id` | `string` | Unique conversation identifier |
| `speaker_role` | `string` | Who is producing the current speech segment: `avatar` or `user` |
| `conversation_state` | `string` | Current avatar state: `idle`, `listening`, `thinking`, `speaking` |
| `user_gaze_zone` | `string` | Where the user is looking: `at_me`, `away`, `down` |
| `avatar_looking_at_user` | `bool` | Whether the avatar is making eye contact |
| `aversion_direction` | `string \| null` | If avatar is not looking at user, the aversion direction: `up_right`, `up_left`, `right`, `left`, `down_right`, `down_left`. Null when `avatar_looking_at_user` is true. |
| `contact_duration_ms` | `int` | How long the current gaze state has been held (ms) |
| `culture` | `string` | Cultural context label: `western`, `east_asian`, `middle_eastern`, `latin_american`, `south_asian`, `african` |
| `transcript_segment` | `string` | The speech content at this timestamp |
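For readers who want typed access to rows, the schema above can be mirrored as a Python `TypedDict`. This is only a sketch: the `GazeRow` class name and the sample values are illustrative assumptions and are not taken from the dataset or the Thot Pocket code base.

```python
import json
from typing import Optional, TypedDict


class GazeRow(TypedDict):
    """One JSONL row, mirroring the schema table (hypothetical class name)."""
    timestamp_ms: int
    conversation_id: str
    speaker_role: str                  # "avatar" or "user"
    conversation_state: str            # "idle", "listening", "thinking", "speaking"
    user_gaze_zone: str                # "at_me", "away", "down"
    avatar_looking_at_user: bool
    aversion_direction: Optional[str]  # None while the avatar holds eye contact
    contact_duration_ms: int
    culture: str
    transcript_segment: str


# Illustrative values only; this is not an actual row from the dataset.
sample_line = json.dumps({
    "timestamp_ms": 1200,
    "conversation_id": "conv_demo",
    "speaker_role": "user",
    "conversation_state": "listening",
    "user_gaze_zone": "at_me",
    "avatar_looking_at_user": True,
    "aversion_direction": None,
    "contact_duration_ms": 800,
    "culture": "western",
    "transcript_segment": "So what do you think?",
})

# JSON null becomes Python None, so the nullable field round-trips cleanly.
row: GazeRow = json.loads(sample_line)
```

`TypedDict` adds no runtime checks; it only documents the expected keys for type checkers and editors.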
### Conversation States
- **idle** -- No active conversation; avatar is in resting gaze pattern
- **listening** -- User is speaking; avatar should maintain higher eye contact
- **thinking** -- Avatar is formulating a response; gaze aversion increases naturally
- **speaking** -- Avatar is delivering speech; gaze follows natural speaker patterns
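One way to check whether data follows the tendencies described for each state is to compute the share of rows with eye contact per `conversation_state`. The sketch below assumes rows are plain dictionaries with the schema's field names; the function name and demo rows are made up for illustration. It counts rows rather than weighting by time, which is a simplification.

```python
from collections import defaultdict


def eye_contact_ratio_by_state(rows):
    """Return {conversation_state: fraction of rows where the avatar looks at the user}."""
    looking = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        state = row["conversation_state"]
        total[state] += 1
        looking[state] += int(row["avatar_looking_at_user"])
    return {state: looking[state] / total[state] for state in total}


# Made-up rows for illustration; only the two fields used here are included.
demo_rows = [
    {"conversation_state": "listening", "avatar_looking_at_user": True},
    {"conversation_state": "listening", "avatar_looking_at_user": True},
    {"conversation_state": "thinking", "avatar_looking_at_user": False},
    {"conversation_state": "thinking", "avatar_looking_at_user": True},
]
ratios = eye_contact_ratio_by_state(demo_rows)
```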
### Gaze Zone Labels
User gaze zones observed from the avatar's perspective:
- **at_me** -- User is looking at the avatar's face/eyes
- **away** -- User is looking to the side or at something else
- **down** -- User is looking downward (phone, notes, etc.)
### Aversion Directions
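When the avatar breaks eye contact, its gaze shifts in one of six directions. The associations below follow the eye-accessing-cue scheme from neuro-linguistic programming and serve as labeling conventions: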
- **up_right** -- Associated with visual construction/imagination
- **up_left** -- Associated with visual recall
- **right** -- Associated with auditory construction
- **left** -- Associated with auditory recall
- **down_right** -- Associated with kinesthetic processing
- **down_left** -- Associated with internal dialogue
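To drive an avatar's eyes from these labels, each direction can be decomposed into a horizontal and a vertical component. The mapping below is a hypothetical sketch: the names `AVERSION_OFFSETS` and `aversion_offset` are not part of the dataset or the crate, and the card does not state whose left and right the labels refer to, so the sign convention (+1 for right and up) is an assumption.

```python
# Hypothetical (horizontal, vertical) components: +1 = right/up, -1 = left/down.
AVERSION_OFFSETS = {
    "up_right": (1, 1),
    "up_left": (-1, 1),
    "right": (1, 0),
    "left": (-1, 0),
    "down_right": (1, -1),
    "down_left": (-1, -1),
}


def aversion_offset(direction):
    """Map an aversion_direction label to a component pair.

    None (the avatar is looking at the user) maps to (0, 0).
    Unknown labels raise KeyError so that typos surface early.
    """
    if direction is None:
        return (0, 0)
    return AVERSION_OFFSETS[direction]
```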
## Data Files
```
data/
  train.jsonl    # Training split
```
## How to Load
```python
from datasets import load_dataset

# Fetch the dataset from the Hugging Face Hub.
dataset = load_dataset("purplesquirrelnetworks/thot-pocket-gaze")
```
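Because every row carries a `conversation_id` and a `timestamp_ms`, a common first step after loading is to regroup rows into per-conversation timelines. The helper below is a minimal sketch that works on any iterable of row dictionaries; the function name and the demo rows are illustrative.

```python
from collections import defaultdict


def group_by_conversation(rows):
    """Return {conversation_id: rows sorted by timestamp_ms}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["conversation_id"]].append(row)
    for conversation_rows in grouped.values():
        conversation_rows.sort(key=lambda r: r["timestamp_ms"])
    return dict(grouped)


# Made-up rows for illustration; only the two fields used here are included.
demo_rows = [
    {"conversation_id": "a", "timestamp_ms": 200},
    {"conversation_id": "b", "timestamp_ms": 0},
    {"conversation_id": "a", "timestamp_ms": 0},
]
timelines = group_by_conversation(demo_rows)
```

With the loader above, `group_by_conversation(dataset["train"])` should behave the same way, assuming the single `train.jsonl` file is exposed as a `train` split.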
## Contributing Data
We welcome contributions of annotated gaze behavior data. To contribute:
1. Fork this repository
2. Add your annotated data in JSONL format following the schema above
3. Ensure each row includes all required fields
4. Include a `culture` label for cultural context
5. Open a pull request with a description of your data source and annotation methodology
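Before opening a pull request, step 3 can be checked mechanically. The sketch below lists the field names from the schema table and reports which ones a row lacks; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and this is not an official validation tool for the repository.

```python
REQUIRED_FIELDS = {
    "timestamp_ms",
    "conversation_id",
    "speaker_role",
    "conversation_state",
    "user_gaze_zone",
    "avatar_looking_at_user",
    "aversion_direction",
    "contact_duration_ms",
    "culture",
    "transcript_segment",
}


def missing_fields(row):
    """Return the sorted names of schema fields absent from a row dictionary.

    A key whose value is None (for example a null aversion_direction)
    still counts as present.
    """
    return sorted(REQUIRED_FIELDS - set(row))
```

Running this over each parsed line of a JSONL file and collecting the non-empty results gives a quick list of rows to fix.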
### Annotation Guidelines
- Timestamps should be relative to conversation start (first utterance = 0ms)
- Gaze zone labels should be annotated at a granularity of 100 ms or finer
- Aversion direction must be null when `avatar_looking_at_user` is true
- Cultural labels should reflect the cultural background of the human participant
- Transcript segments should capture the speech content at the annotation timestamp
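Several of these guidelines can be checked automatically. The sketch below covers three of them for one conversation's rows, sorted by `timestamp_ms`: non-negative timestamps, a null `aversion_direction` during eye contact, and annotation spacing. Reading the granularity guideline as "consecutive rows are at most 100 ms apart" is an assumption, as are the function name and the `max_gap_ms` parameter.

```python
def guideline_violations(rows, max_gap_ms=100):
    """Return (row_index, message) pairs for one conversation's sorted rows."""
    problems = []
    previous_ts = None
    for index, row in enumerate(rows):
        ts = row["timestamp_ms"]
        if ts < 0:
            problems.append((index, "timestamp_ms is negative"))
        if row["avatar_looking_at_user"] and row["aversion_direction"] is not None:
            problems.append((index, "aversion_direction must be null during eye contact"))
        # Assumed reading of the granularity guideline: gaps of at most max_gap_ms.
        if previous_ts is not None and ts - previous_ts > max_gap_ms:
            problems.append((index, f"gap of {ts - previous_ts} ms exceeds {max_gap_ms} ms"))
        previous_ts = ts
    return problems


# Made-up rows for illustration; only the fields used here are included.
demo_rows = [
    {"timestamp_ms": 0, "avatar_looking_at_user": True, "aversion_direction": None},
    {"timestamp_ms": 100, "avatar_looking_at_user": True, "aversion_direction": "left"},
    {"timestamp_ms": 350, "avatar_looking_at_user": False, "aversion_direction": "up_right"},
]
problems = guideline_violations(demo_rows)
```

In the demo, the second row is flagged for a non-null direction during eye contact and the third for a 250 ms gap.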
### Data Sources We Accept
- Webcam-recorded conversations with manual gaze annotation
- Eye-tracker data from conversation studies
- Expert-annotated synthetic conversation scenarios
- Published research datasets re-formatted to this schema (with appropriate licensing)
## License
MIT
## Citation
```bibtex
@dataset{thot_pocket_gaze_2026,
title={Thot Pocket Gaze Behavior Dataset},
author={Purple Squirrel Networks},
year={2026},
url={https://huggingface.co/datasets/purplesquirrelnetworks/thot-pocket-gaze},
license={MIT}
}
```
Provider: purplesquirrelnetworks



