Natural-ASR-Samples
Source: ModelScope community (魔搭社区) · Updated 2025-12-05 · Indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/Kratos-AI/Natural-ASR-Samples
Resource description:
# Natural ASR Speech Samples
This dataset contains natural two-person conversations recorded in Hinglish, Tamil, and Bengali, paired with high-quality human-generated transcriptions. The recordings capture spontaneous, real-life dialogue, including pauses, fillers, overlaps, and informal phrasing, which makes the dataset well suited to building robust Automatic Speech Recognition (ASR) systems. We can also support all Indic languages and accents across India, as well as Arabic, Korean, Vietnamese, and Portuguese.
---
## Dataset Features
- Natural, unscripted two-speaker conversations
- Hinglish, Tamil, and Bengali multilingual coverage
- Varied speaking speeds, tones, and regional accents
- Clean and accurate human transcriptions
- Includes pauses, interruptions, and conversational flow
- Suitable for research and commercial ASR development with attribution
---
## Intended Uses
### ✅ Direct Use
- Training multilingual and code-mixed ASR models
- Benchmarking conversational ASR performance
- Hinglish language modeling
- Accent-robust ASR system development
- Dialogue understanding and speech-to-text tasks
- Evaluation of spontaneous-speech ASR accuracy
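For the benchmarking and evaluation uses listed above, a common metric is word error rate (WER): the word-level edit distance between the human transcript and the ASR output, divided by the number of reference words. The following is a minimal sketch, not part of the dataset or its tooling. The function name `word_error_rate` and the example sentences are hypothetical, and it assumes both transcripts are plain whitespace-separated text that has already been normalized (casing, punctuation, filler handling) in whatever way your evaluation requires.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Both arguments are assumed to be whitespace-tokenized, pre-normalized text.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical code-mixed example: one substituted word out of six.
    human = "haan main kal aa raha hoon"
    asr = "haan main kal aa rahi hoon"
    print(f"WER: {word_error_rate(human, asr):.3f}")
```

Because the recordings keep fillers and overlapping speech, WER on this kind of data can be sensitive to how fillers are normalized, so it helps to report the normalization alongside the score.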
### ❌ Out-of-Scope Use
- Speaker or biometric identification
- Psychological, emotion, or behavior profiling
- Medical or clinical speech analysis
- Commercial deployment without CC BY 4.0 credit
- Real-time mission-critical ASR applications
---
## Considerations and Limitations
- ❗ Dataset size is limited (<1,000 samples) and may not include all dialects
- 🎧 Contains fillers, hesitations, and overlapping conversation (true natural speech)
- 🗣️ Accent diversity exists but is not fully representative of all regions
- 🔄 Future versions will add more speakers, languages, and recording environments
---
## License
**CC BY 4.0** — Free to use, modify, distribute, and publish with attribution.
---
## Contact
For dataset collaboration, contribution, or citation details, contact:
- arunabh@kgen.io
Provider:
maas
Created:
2025-11-27



