KAI-indian-emotional-speech-corpus

Name: KAI-indian-emotional-speech-corpus
Creator: maas
Published: 2025-12-05 16:44:15
License: No description provided

Source: ModelScope community. Updated: 2025-12-05. Indexed: 2025-12-06.

Download link:

https://modelscope.cn/datasets/Kratos-AI/KAI-indian-emotional-speech-corpus

Resource description:

# Indian Emotional Speech Corpus

## Dataset Description

This dataset comprises high-quality audio recordings of Indian speakers reading a standardized 50-word paragraph in four distinct emotional tones: **happy**, **sad**, **surprised**, and **angry**. Each recording is approximately 20–25 seconds long and includes the full paragraph with tone shifts at specific points.

**Text spoken by all participants:**

> (happy tone) *Last Monday was perfect—I got the job I’d been dreaming of! I screamed, hugged my parents, and we even had cake for breakfast.*
>
> (sad tone) *But two days later, my mom called—our dog Bruno had passed away in his sleep. I didn’t even get to say goodbye.*
>
> (surprised tone) *And just when I thought the week couldn’t get any crazier, I got a surprise call—I was selected for a global fellowship I never even applied for!*
>
> (angry tone) *But earlier that day, my Wi-Fi cut out again during an important meeting... third time this week! It’s beyond frustrating. What a week. Honestly, I’m still processing it all.*

The dataset supports training and evaluation of models in:

- Automatic Speech Recognition (ASR)
- Emotional tone classification
- Voice synthesis and generation
- Emotion-aware conversational agents

---

## Dataset Structure

- **Audio Files**: Audio clips are stored in the `audio/` directory with names like `audio_001.wav`, `audio_002.mp3`, etc.
- **Formats**: `.wav`, `.mp3`, `.m4a`
- **Duration**: Each file is 20–25 seconds long
- **Metadata** (in `metadata.csv`):
  - `file_name`: Name of the audio file
  - `transcription`: Full 50-word paragraph
  - `emotional_tones`: Order of tones (e.g., `happy;sad;surprised;angry`)
  - `age_group`: e.g., `18–30`, `31–45`, `46–60`
  - `gender`: `male` or `female`
  - `region`: e.g., `North India`, `South India`, `East India`, `West India`

---

## Intended Uses

### ✅ Direct Use

- Training and benchmarking ASR models with Indian-accented English
- Emotion detection and classification from voice
- Research in affective computing and empathetic AI

### ❌ Out-of-Scope Use

- Real-time or production-grade systems
- Commercial use without proper CC BY 4.0 attribution
- Clinical or diagnostic use cases

---

## Considerations and Limitations

- ❗ The dataset is small (<1,000 samples) and not fully representative of India's linguistic and emotional diversity
- 💡 Emotions are subjective: classification results may vary by listener or model
- 🔄 Future versions will aim to expand multilingual support and speaker diversity

---

## License

**CC BY 4.0**: You can use, modify, and share the dataset with appropriate credit.

---

## Contact

- For queries or collaborations related to datasets, contact:
  - anoushka@kgen.io
  - abhishek.vadapalli@kgen.io

---

## Citation

**BibTeX:**

```bibtex
@misc{indian_emotional_speech_corpus,
  title = {Indian Emotional Speech Corpus},
  author = {Contributors from across India},
  year = {2025},
  howpublished = {\url{https://huggingface.co/datasets/indian_emotional_speech_corpus}},
  note = {Dataset available under CC-BY-4.0 License}
}
```
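
## Example: Reading the Metadata

The `metadata.csv` layout described under Dataset Structure can be read with the Python standard library alone. The following is a minimal sketch, not an official loader. It assumes that `file_name` holds a bare file name relative to `audio/`, that the CSV has a header row with the six documented column names, and that `emotional_tones` is always semicolon-separated. The helper `parse_metadata` and the sample rows are hypothetical and only illustrate the documented fields.

```python
import csv
import io
from collections import Counter
from pathlib import PurePosixPath

# Formats and tone labels as listed in the dataset card.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a"}
KNOWN_TONES = {"happy", "sad", "surprised", "angry"}


def parse_metadata(csv_text):
    """Parse metadata.csv text into row dicts, splitting the tone order into a list.

    Hypothetical helper: assumes a header row with the documented column names.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        suffix = PurePosixPath(row["file_name"]).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            raise ValueError(f"unexpected audio format: {row['file_name']}")
        # "happy;sad;surprised;angry" -> ["happy", "sad", "surprised", "angry"]
        tones = [t.strip() for t in row["emotional_tones"].split(";") if t.strip()]
        unknown = set(tones) - KNOWN_TONES
        if unknown:
            raise ValueError(f"unknown tones {sorted(unknown)} in {row['file_name']}")
        row["emotional_tones"] = tones
        # Assumption: audio files live directly under audio/.
        row["audio_path"] = str(PurePosixPath("audio") / row["file_name"])
        rows.append(row)
    return rows


# Hypothetical sample rows; the real file's contents are not shown in the card.
SAMPLE_CSV = """file_name,transcription,emotional_tones,age_group,gender,region
audio_001.wav,Last Monday was perfect...,happy;sad;surprised;angry,18–30,female,North India
audio_002.mp3,Last Monday was perfect...,happy;sad;surprised;angry,31–45,male,South India
"""

rows = parse_metadata(SAMPLE_CSV)
region_counts = Counter(row["region"] for row in rows)
print(rows[0]["audio_path"], rows[0]["emotional_tones"])
print(dict(region_counts))
```

With the real dataset, the same function could be fed the contents of `metadata.csv` read as UTF-8 text. Keeping the tone order as a list preserves the sequence in which each speaker shifts tone, which matters for segment-level emotion labelling.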

Provider:

maas

Created:

2025-08-01
