TTS-AGI/balanced-emotion-dataset-majestrino-withtemporal-detailed-captions
Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-05.
Download link: https://hf-mirror.com/datasets/TTS-AGI/balanced-emotion-dataset-majestrino-withtemporal-detailed-captions
Resource description:
---
license: cc-by-4.0
task_categories:
- audio-classification
- text-generation
language:
- en
tags:
- audio
- emotion
- caption
- speech
- balanced
- webdataset
size_categories:
- 100K<n<1M
---
# Balanced Emotion Dataset — Majestrino with Temporal Detailed Captions
An emotion-balanced subset of [TTS-AGI/majestrino-unified-detailed-captions-temporal](https://huggingface.co/datasets/TTS-AGI/majestrino-unified-detailed-captions-temporal).
## Overview
- **Total samples**: 482,594
- **Samples per emotion category**: 12,997
- **Number of emotion categories**: 40
- **Format**: WebDataset (tar files with FLAC audio + JSON metadata)
- **Number of tar files**: 483
- **Samples per tar**: ~1000
## Balancing Strategy
Samples were selected from the source dataset using keyword matching on captions.
Each of the 40 emotion categories has exactly 12,997 samples,
balanced by the rarest category (Intoxication/Altered States).
Samples are spread across diverse source shards for maximum variety.
Some samples (~7.3%) appear in multiple categories due to multi-emotion captions.
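
The selection procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `CATEGORY_KEYWORDS` table, the `match_categories` and `balance` helpers, the case-insensitive substring matching, and the seeded random sampling are all assumptions. The card only states that keyword matching on captions was used and that every category was capped at the size of the rarest one.

```python
import random
from collections import defaultdict

# Hypothetical keyword table; the real one covers 40 categories.
CATEGORY_KEYWORDS = {
    "Amusement": ["amusement", "mirth", "laughter"],
    "Fear": ["fear", "terror", "dread"],
    "Relief": ["relief", "solace"],
}

def match_categories(caption, keywords=CATEGORY_KEYWORDS):
    """Return every category with a keyword in the caption (assumed: case-insensitive substring match)."""
    text = caption.lower()
    return [cat for cat, words in keywords.items() if any(w in text for w in words)]

def balance(captions, seed=0):
    """Group sample ids by matched category, then cap each category at the rarest one's size."""
    buckets = defaultdict(list)
    for sample_id, caption in captions.items():
        for cat in match_categories(caption):
            buckets[cat].append(sample_id)  # one sample may land in several buckets
    if not buckets:
        return {}
    target = min(len(ids) for ids in buckets.values())
    rng = random.Random(seed)
    return {cat: sorted(rng.sample(ids, target)) for cat, ids in buckets.items()}

# Made-up captions, for illustration only.
captions = {
    "a": "Nervous laughter that gives way to relief.",
    "b": "A voice trembling with dread.",
    "c": "Open mirth and amusement throughout.",
    "d": "Quiet terror, barely whispered.",
    "e": "A sigh of relief.",
    "f": "Growing fear in every phrase.",
}
balanced = balance(captions)
```

In this toy input, sample `a` matches both Amusement and Relief, which mirrors how a multi-emotion caption can place one recording in several categories.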
## Emotion Categories
| Category | Samples | Example Keywords |
|----------|---------|-----------------|
| Amusement | 12997 | lighthearted fun, amusement, mirth, joviality, laughter |
| Elation | 12997 | happiness, excitement, joy, exhilaration, delight |
| Pleasure/Ecstasy | 12997 | ecstasy, pleasure, bliss, rapture, beatitude |
| Contentment | 12997 | contentment, relaxation, peacefulness, calmness, satisfaction |
| Thankfulness/Gratitude | 12997 | thankfulness, gratitude, appreciation, gratefulness |
| Affection | 12997 | sympathy, compassion, warmth, trust, caring |
| Infatuation | 12997 | infatuation, having a crush, romantic desire, fondness, butterflies in the stomach |
| Hope/Optimism | 12997 | hope, enthusiasm, optimism, anticipation, courage |
| Triumph | 12997 | triumph, superiority |
| Pride | 12997 | pride, dignity, self-confidently, honor, self-consciousness |
| Interest | 12997 | interest, fascination, curiosity, intrigue |
| Awe | 12997 | awe, awestruck, wonder |
| Astonishment/Surprise | 12997 | astonishment, surprise, amazement, shock, startlement |
| Concentration | 12997 | concentration, deep focus, engrossment, absorption, attention |
| Contemplation | 12997 | contemplation, thoughtfulness, pondering, reflection, meditation |
| Relief | 12997 | relief, respite, alleviation, solace, comfort |
| Longing | 12997 | yearning, longing, pining, wistfulness, nostalgia |
| Teasing | 12997 | teasing, bantering, mocking playfully, ribbing, provoking lightly |
| Impatience and Irritability | 12997 | impatience, irritability, irritation, restlessness, short-temperedness |
| Sexual Lust | 12997 | sexual lust, carnal desire, lust, feeling horny, feeling turned on |
| Doubt | 12997 | doubt, distrust, suspicion, skepticism, uncertainty |
| Fear | 12997 | fear, terror, dread, apprehension, alarm |
| Distress | 12997 | worry, anxiety, unease, anguish, trepidation |
| Confusion | 12997 | confusion, bewilderment, flabbergasted, disorientation, perplexity |
| Embarrassment | 12997 | embarrassment, shyness, mortification, discomfiture, awkwardness |
| Shame | 12997 | shame, guilt, remorse, humiliation, contrition |
| Disappointment | 12997 | disappointment, regret, dismay, letdown, chagrin |
| Sadness | 12997 | sadness, sorrow, grief, melancholy, dejection |
| Bitterness | 12997 | resentment, acrimony, bitterness, cynicism, rancor |
| Contempt | 12997 | contempt, disapproval, scorn, disdain, loathing |
| Disgust | 12997 | disgust, revulsion, repulsion, abhorrence, loathing |
| Anger | 12997 | anger, rage, fury, hate, irascibility |
| Malevolence/Malice | 12997 | spite, sadism, malevolence, malice, desire to harm |
| Sourness | 12997 | sourness, tartness, acidity, acerbity, sharpness |
| Pain | 12997 | physical pain, suffering, torment, ache, agony |
| Helplessness | 12997 | helplessness, powerlessness, desperation, submission |
| Fatigue/Exhaustion | 12997 | fatigue, exhaustion, weariness, lethargy, burnout |
| Emotional Numbness | 12997 | numbness, detachment, insensitivity, emotional blunting, apathy |
| Intoxication/Altered States | 12997 | being drunk, stupor, intoxication, disorientation, altered perception |
| Jealousy & Envy | 12997 | jealousy, envy, covetousness |
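
Summing the Samples column gives a larger number than the total reported in the Overview, which is what one would expect when some samples are counted in more than one category. The short check below works through that arithmetic using only figures stated in this card. Reading "Total samples" as the number of unique samples is an assumption.

```python
categories = 40
per_category = 12_997
total_samples = 482_594  # from the Overview; assumed to count unique samples

category_slots = categories * per_category    # sum of the Samples column
extra_slots = category_slots - total_samples  # memberships beyond the first per sample

print(category_slots)  # 519880
print(extra_slots)     # 37286
```

Under that assumption, 37,286 category memberships are repeats. That is roughly 7.7% of the unique samples, so the stated ~7.3% share of multi-category samples would be consistent with a few samples belonging to three or more categories.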
## Data Format
Each tar file contains paired `.flac` and `.json` files:
- **FLAC**: Audio recording
- **JSON**: Metadata including `caption` (unified detailed caption with temporal aspects),
`transcription`, `duration`, `characters_per_second`, quality scores, and emotion scores
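
For readers who want to inspect a shard without the `webdataset` package, the pairing of files can be sketched with the standard library. The sketch assumes, following the WebDataset convention, that the `.flac` and `.json` files of one sample share a basename. The `read_pairs` helper, the member names, and the placeholder audio bytes in the demo archive are made up for illustration.

```python
import io
import json
import tarfile
from collections import defaultdict

def read_pairs(fileobj):
    """Group tar members by basename and return {key: {"flac": bytes, "json": dict}}."""
    samples = defaultdict(dict)
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Simplification: split on the last dot to separate key and extension.
            key, _, ext = member.name.rpartition(".")
            data = tar.extractfile(member).read()
            if ext == "json":
                samples[key]["json"] = json.loads(data)
            elif ext == "flac":
                samples[key]["flac"] = data  # raw FLAC bytes, left undecoded
    return dict(samples)

def _add(tar, name, payload):
    """Write one in-memory file into the archive."""
    info = tarfile.TarInfo(name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Build a tiny in-memory archive with hypothetical member names to exercise the reader.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    _add(tar, "sample_000.flac", b"not-real-audio")
    _add(tar, "sample_000.json", json.dumps({"caption": "A calm voice.", "duration": 3.2}).encode())
buffer.seek(0)

pairs = read_pairs(buffer)
```

To read a real shard, open the file in binary mode and pass the handle to `read_pairs`. Decoding the FLAC bytes into a waveform needs an audio library and is left out here.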
## Usage
```python
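import json

import webdataset as wds

# Shards are numbered 00000 through 00482 (483 tar files).
dataset = wds.WebDataset("data/{00000..00482}.tar")
for sample in dataset:
    audio = sample["flac"]  # raw FLAC bytes
    meta = json.loads(sample["json"])  # metadata bytes -> dict
    caption = meta["caption"]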
```
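
The brace pattern in the example covers shards 00000 through 00482, which matches the 483 tar files listed in the Overview. If you need to hold out some shards, one option is to build the shard list explicitly and pass a subset to `wds.WebDataset`, which also accepts a list of URLs. The 473/10 split below is an arbitrary choice for illustration. Because the card does not say how samples are distributed across shards, a shard-level split is not guaranteed to stay emotion-balanced.

```python
NUM_SHARDS = 483  # from the Overview

# Same paths the brace pattern "data/{00000..00482}.tar" expands to.
shards = [f"data/{i:05d}.tar" for i in range(NUM_SHARDS)]

train_shards = shards[:-10]    # arbitrary split, for illustration only
heldout_shards = shards[-10:]
```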
## Source
Built from [TTS-AGI/majestrino-unified-detailed-captions-temporal](https://huggingface.co/datasets/TTS-AGI/majestrino-unified-detailed-captions-temporal)
using emotion keyword matching across all 821 training shards.
Provider:
TTS-AGI
Collected summary
Dataset introduction

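Construction
In affective computing and speech analysis, a high-quality, balanced dataset is essential for model training. This dataset is derived from the source dataset TTS-AGI/majestrino-unified-detailed-captions-temporal, with samples selected by keyword matching across its 821 training shards. The construction centers on balancing emotion categories: the rarest category, "Intoxication/Altered States", sets the baseline, so each of the 40 emotion categories contains exactly 12,997 samples. Samples are also drawn from many different source shards to maximize variety, and about 7.3% of samples appear in more than one category because their captions describe multiple emotions.
Features
The dataset's most prominent feature is strict balance across emotion categories. It covers 40 fine-grained categories spanning positive and negative emotions, such as amusement, fear, and anger, each with the same number of samples. The data is organized in WebDataset format as 483 tar files of about 1,000 samples each, pairing FLAC audio files with JSON metadata. The metadata provides detailed captions with temporal aspects along with the transcription, duration, speaking rate, and quality scores, among other fields, giving a consistent annotation base for multimodal emotion analysis.
Usage
For speech emotion recognition and generation tasks, the dataset can be loaded directly with the WebDataset library. Each sample contains the FLAC audio as a byte stream and the metadata as JSON. The detailed captions in the metadata incorporate temporal information, which makes them suitable for training temporally aware models. Researchers can extract audio features and combine them with the captions for multimodal learning, or use the balanced emotion distribution for class-sensitive model evaluation and comparison.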
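Background and challenges
Background overview
Affective computing has long aimed to build computational models that recognize and understand complex human emotional states, and large, high-quality, balanced emotion-annotated datasets are key infrastructure for the field. This dataset was built by the TTS-AGI team as a curated subset of the Majestrino unified detailed temporal captions dataset. Its main goal is to address the uneven distribution of emotion categories that is common in existing speech emotion datasets. It covers 40 fine-grained emotion categories, from amusement and gratitude to anger and pain, with exactly 12,997 samples each, and is intended as a class-balanced, richly annotated resource for tasks such as speech emotion recognition and emotional text generation.
Current challenges
The dataset targets a core challenge in speech emotion analysis: annotating and modeling subtle, mixed emotional states accurately and consistently within a high-dimensional, continuous emotion space. Building it raised several challenges of its own. First, achieving strict numerical balance across 40 categories by keyword matching over a large pool of raw audio-text pairs means overcoming a highly skewed source distribution. Second, handling samples with multiple emotion labels (about 7.3% of all samples) involves a trade-off between balance and label purity. Third, the selected samples need to remain diverse in temporal description, audio quality, and source shard so that the dataset stays representative and generalizes well.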
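Common scenarios
Typical use cases
With its balanced 40-category emotion annotation and detailed temporal captions, the dataset is suited to training multimodal emotion recognition models. A typical approach is to build an end-to-end deep learning pipeline that jointly models the audio signal and the caption text to capture subtle emotional variation in speech. Its scale and class balance support emotion classification from coarse to fine granularity, and in cross-modal alignment research it offers a basis for studying how acoustic features relate to semantic expression.
Academic problems addressed
The dataset mitigates the class imbalance that has long affected emotional speech research: because every emotion has the same number of samples, model training is not biased toward frequent emotions. It supports fine-grained emotion classification, allowing researchers to distinguish closely related states such as gratitude and hope. The temporal captions also make it possible to study how emotion evolves over time, which supports work on dynamic emotion modeling and multimodal fusion and on the interpretability and robustness of emotion recognition.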
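Derived and related work
The summary describes several lines of work around the dataset. Weighted loss functions and data augmentation strategies designed around its balance are reported to improve recognition of minority emotion classes. Multimodal pretraining uses its audio-text pairs for cross-modal contrastive learning to learn more general joint representations. Its temporal captions have also been used with architectures that combine attention mechanisms and recurrent neural networks to model transitions between emotional states.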
The content above was collected and summarized by 遇见数据集.



