KAI-indian-emotional-speech-corpus
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/Kratos-AI/KAI-indian-emotional-speech-corpus
# Indian Emotional Speech Corpus
## Dataset Description
This dataset comprises high-quality audio recordings of Indian speakers reading a standardized 50-word paragraph in four distinct emotional tones — **happy**, **sad**, **surprised**, and **angry**.
Each recording is approximately 20–25 seconds long and includes the full paragraph with tone shifts at specific points.
**Text spoken by all participants:**
> (happy tone) *Last Monday was perfect—I got the job I’d been dreaming of! I screamed, hugged my parents, and we even had cake for breakfast.*
> (sad tone) *But two days later, my mom called—our dog Bruno had passed away in his sleep. I didn’t even get to say goodbye.*
> (surprised tone) *And just when I thought the week couldn’t get any crazier, I got a surprise call—I was selected for a global fellowship I never even applied for!*
> (angry tone) *But earlier that day, my Wi-Fi cut out again during an important meeting... third time this week! It’s beyond frustrating. What a week. Honestly, I’m still processing it all.*
The dataset supports training and evaluation of models in:
- Automatic Speech Recognition (ASR)
- Emotional tone classification
- Voice synthesis and generation
- Emotion-aware conversational agents
---
## Dataset Structure
- **Audio Files**: Audio clips are stored in the `audio/` directory with names like `audio_001.wav`, `audio_002.mp3`, etc.
- **Formats**: `.wav`, `.mp3`, `.m4a`
- **Duration**: Each file is 20–25 seconds long
- **Metadata** (in `metadata.csv`):
  - `file_name`: Name of the audio file
  - `transcription`: Full 50-word paragraph
  - `emotional_tones`: Order of tones (e.g., `happy;sad;surprised;angry`)
  - `age_group`: e.g., `18–30`, `31–45`, `46–60`
  - `gender`: `male` or `female`
  - `region`: e.g., `North India`, `South India`, `East India`, `West India`
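The metadata layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not an official loader: the helper name `load_metadata` and the sample rows are hypothetical, and it assumes `emotional_tones` is always a `;`-separated string as in the example value shown above.

```python
import csv
import io
from pathlib import PurePosixPath

# Audio formats listed in the dataset card.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a"}


def load_metadata(csv_file):
    """Read metadata rows and split `emotional_tones` into a list of tones."""
    rows = []
    for row in csv.DictReader(csv_file):
        ext = PurePosixPath(row["file_name"]).suffix.lower()
        if ext not in ALLOWED_EXTENSIONS:
            raise ValueError(f"unexpected audio format: {row['file_name']}")
        # Assumption: tones are separated by ';' in reading order.
        row["emotional_tones"] = row["emotional_tones"].split(";")
        rows.append(row)
    return rows


# Hypothetical sample rows, for illustration only (not taken from the dataset).
SAMPLE_CSV = """file_name,transcription,emotional_tones,age_group,gender,region
audio_001.wav,Last Monday was perfect...,happy;sad;surprised;angry,18–30,female,South India
audio_002.mp3,Last Monday was perfect...,happy;sad;surprised;angry,31–45,male,North India
"""

rows = load_metadata(io.StringIO(SAMPLE_CSV))
print(rows[0]["file_name"], rows[0]["emotional_tones"])
```

With a real download, the same function could be called on `open("metadata.csv", newline="", encoding="utf-8")`, assuming the file sits at the dataset root.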
---
## Intended Uses
### ✅ Direct Use
- Training and benchmarking ASR models with Indian-accented English
- Emotion detection and classification from voice
- Research in affective computing and empathetic AI
### ❌ Out-of-Scope Use
- Real-time or production-grade systems
- Commercial use without proper CC BY 4.0 attribution
- Clinical or diagnostic use cases
---
## Considerations and Limitations
- ❗ The dataset is small (<1,000 samples) and not fully representative of India's linguistic and emotional diversity
- 💡 Emotions are subjective — classification results may vary by listener or model
- 🔄 Future versions will aim to expand multilingual support and speaker diversity
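Because the corpus is small, it can help to check how samples are distributed across the demographic fields before training or evaluating on it. The sketch below tallies `age_group`, `gender`, and `region`; the function name `tally` and the sample rows are hypothetical, and the rows are assumed to be dictionaries keyed by the `metadata.csv` column names.

```python
from collections import Counter


def tally(rows, fields=("age_group", "gender", "region")):
    """Count how many rows fall into each value of the given metadata fields."""
    return {field: Counter(row[field] for row in rows) for field in fields}


# Hypothetical rows, for illustration only (not taken from the dataset).
sample_rows = [
    {"age_group": "18–30", "gender": "female", "region": "South India"},
    {"age_group": "18–30", "gender": "male", "region": "North India"},
    {"age_group": "31–45", "gender": "female", "region": "South India"},
]

counts = tally(sample_rows)
for field, counter in counts.items():
    print(field, dict(counter))
```

A strongly skewed tally is a hint that reported accuracy may not carry over to under-represented groups.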
---
## License
**CC BY 4.0** — You can use, modify, and share the dataset with appropriate credit.
---
## Contact
- For queries or collaborations related to the dataset, contact:
  - anoushka@kgen.io
  - abhishek.vadapalli@kgen.io
---
## Citation
**BibTeX:**
```bibtex
@misc{indian_emotional_speech_corpus,
  title        = {Indian Emotional Speech Corpus},
  author       = {Contributors from across India},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/datasets/indian_emotional_speech_corpus}},
  note         = {Dataset available under CC-BY-4.0 License}
}
```
Provider: maas
Created: 2025-08-01



