Emotiontalk

Name: Emotiontalk
Creator: maas
Published: 2026-05-16 00:06:36
License: No description provided

ModelScope community. Updated 2026-05-16; indexed 2025-11-15.

Download link:

https://modelscope.cn/datasets/BAAI/Emotiontalk

Resource description:

# EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations

[![Hugging Face Datasets](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Datasets-yellow)](https://huggingface.co/datasets/BAAI/Emotiontalk)
[![arXiv](https://img.shields.io/badge/arXiv-2502.18913-b31b1b.svg)](https://arxiv.org/pdf/2505.23018)
[![License: CC BY-NC-SA-4.0](https://img.shields.io/badge/License-CC%20BY--SA--NC%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![Github](https://img.shields.io/badge/Github-EmotionTalk-blue)](https://github.com/flageval-baai/EmotionTalk)

## Introduction

**EmotionTalk** is an interactive Chinese multimodal emotion dataset with rich annotations. This dataset provides multimodal information from 19 actors participating in dyadic conversation settings, incorporating acoustic, visual, and textual modalities. It includes 23.6 hours of speech (19,250 utterances), annotations for 7 utterance-level emotion categories (happy, surprise, sad, disgust, anger, fear, and neutral), 5-dimensional sentiment labels (negative, weakly negative, neutral, weakly positive, and positive) and 4-dimensional speech captions (speaker, speaking style, emotion and overall).

The dataset is released under a **CC BY-NC-SA 4.0 license**, meaning it is available for non-commercial use.

## Dataset Details

This dataset contains 23.6 hours of spontaneous dialogue recordings. Key features of the dataset include:

* **Speakers:** 19 speakers.
* **Audio Format:** WAV files with a 44.1kHz sampling rate.
* **Label:** Happy, angry, sad, disgusted, fear, surprise, neutral.
* **Annotations:** The dataset includes annotations for each modality.
  * **Text modality:** `data` (each annotator's labeling results), `emotion_result`, `speaker_id`, `file_name` (file path), `content` (transcription).
  * **Audio modality:** `data` (each annotator's labeling results), `emotion_result`, `speaker_id`, `paragraphs` (timestamp), `sourceAttr` (caption), `file_name` (file path), `content` (transcription).
  * **Video modality:** `data` (each annotator's labeling results), `emotion_result`, `speaker_id`, `file_name` (file path).
  * **Multimodal:** `data` (each annotator's labeling results), `emotion_result`, `Continuous label_result`, `speaker_id`, `file_name` (file path).

### Dataset Structure

The dataset file structure is as follows.

```
data
├── audio/*.tar
├── Text/*.tar
├── Video/*.tar
└── Multimodal/*.tar
```

### Dataset Statistics

The dataset is split into three subsets:

| | Angry | Disgusted | Fearful | Happy | Neutral | Sad | Surprised | Total |
| :------- | :---- | :-------- | :------ | :---- | :------ | :--- | :-------- | :----- |
| Train | 2950 | 1142 | 672 | 2986 | 5377 | 919 | 1367 | 15413 |
| Val (G01/G12) | 409 | 95 | 125 | 360 | 675 | 111 | 133 | 1908 |
| Test (G03/G15) | 339 | 134 | 125 | 246 | 801 | 123 | 161 | 1929 |
| **Total** | **3698** | **1371** | **922** | **3592** | **6853** | **1153** | **1661** | **19250** |

For more details, please refer to our paper [EmotionTalk](https://arxiv.org/pdf/2505.23018).

## 📚 Cite me

```
@article{sun2025emotiontalk,
  title={EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations},
  author={Sun, Haoqin and Wang, Xuechen and Zhao, Jinghua and Zhao, Shiwan and Zhou, Jiaming and Wang, Hui and He, Jiabei and Kong, Aobo and Yang, Xi and Wang, Yequan and others},
  journal={arXiv preprint arXiv:2505.23018},
  year={2025}
}
```
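
The layout under "Dataset Structure" ships each modality as a set of `.tar` archives. As a minimal sketch, assuming the archives have already been downloaded to a local `data/` directory, the following helper unpacks them into one output folder per modality. The function name `extract_modality_archives` and the output layout are illustrative choices, not part of the dataset's tooling, and the contents of each archive are not assumed.

```python
import tarfile
from pathlib import Path


def _is_safe(member, target):
    """Accept only regular files and directories that stay inside `target`."""
    if not (member.isfile() or member.isdir()):
        return False
    root = target.resolve()
    dest = (target / member.name).resolve()
    return dest == root or root in dest.parents


def extract_modality_archives(data_root, out_root):
    """Extract every *.tar in data_root/<modality>/ into out_root/<modality>/.

    Returns a dict mapping each modality folder name to the number of
    regular files extracted from its archives.
    """
    data_root, out_root = Path(data_root), Path(out_root)
    extracted = {}
    for modality_dir in sorted(p for p in data_root.iterdir() if p.is_dir()):
        target = out_root / modality_dir.name
        target.mkdir(parents=True, exist_ok=True)
        n_files = 0
        for archive in sorted(modality_dir.glob("*.tar")):
            with tarfile.open(archive) as tar:
                # Skip links, devices, and paths that would escape `target`.
                members = [m for m in tar.getmembers() if _is_safe(m, target)]
                tar.extractall(target, members=members)
                n_files += sum(1 for m in members if m.isfile())
        extracted[modality_dir.name] = n_files
    return extracted
```

A call such as `extract_modality_archives("data", "extracted")` would then leave `extracted/audio/`, `extracted/Text/`, and so on, mirroring the folder names used in the archive tree.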
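
The counts under "Dataset Statistics" show a noticeable class imbalance: Neutral has roughly eight times as many training utterances as Fearful. The sketch below copies the table into Python, checks that the rows add up to the stated totals, and derives inverse-frequency class weights from the training split. Class weighting is one common way to handle imbalance and is shown here only as an illustration; the source does not say which strategy, if any, the authors use.

```python
# Per-class utterance counts copied from the Dataset Statistics table.
SPLIT_COUNTS = {
    "Train": {"Angry": 2950, "Disgusted": 1142, "Fearful": 672, "Happy": 2986,
              "Neutral": 5377, "Sad": 919, "Surprised": 1367},
    "Val": {"Angry": 409, "Disgusted": 95, "Fearful": 125, "Happy": 360,
            "Neutral": 675, "Sad": 111, "Surprised": 133},
    "Test": {"Angry": 339, "Disgusted": 134, "Fearful": 125, "Happy": 246,
             "Neutral": 801, "Sad": 123, "Surprised": 161},
}


def split_totals(split_counts):
    """Return the number of utterances in each split."""
    return {split: sum(counts.values()) for split, counts in split_counts.items()}


def inverse_frequency_weights(counts):
    """Weight each class by total / (n_classes * class_count).

    Rare classes receive weights above 1 and frequent classes below 1,
    so that every class contributes equally to a weighted loss.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


if __name__ == "__main__":
    print(split_totals(SPLIT_COUNTS))
    for label, weight in inverse_frequency_weights(SPLIT_COUNTS["Train"]).items():
        print(f"{label:<10} {weight:.3f}")
```

The resulting dictionary can be turned into a weight vector for a weighted cross-entropy loss, provided the class order matches the label encoding used during training.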


Provider:

maas

Created:

2025-11-07
