arabic-audio-dataset
Source: ModelScope community (updated 2026-04-27, indexed 2025-11-03)
Download link:
https://modelscope.cn/datasets/Kratos-AI/arabic-audio-dataset
Dataset description:
# Arabic Audio Dataset
*This dataset contains high-quality (“A-grade”) data. It has been carefully curated, cleaned, and verified to ensure accuracy, completeness, and consistency, making it suitable for high-stakes or production-grade model training.*
## Contact
For queries or collaborations related to this dataset, contact:
- support@humynlabs.ai
## Supported Tasks
- **Task Categories**: Speech Emotion Recognition (SER)
- **Supported Tasks**:
  - Emotion classification from speech
  - Audio signal processing for affective computing
  - Speaker demographic analysis
  - Cross-cultural emotion recognition research
  - Voice synthesis with emotional expression (secondary use)
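As a rough illustration of the audio signal processing step that typically precedes emotion classification, the sketch below computes two simple frame-level features, RMS energy and zero-crossing rate. It runs on a synthetic sine tone rather than on clips from this dataset, and the 16 kHz sample rate, 25 ms frame length, and 10 ms hop are assumptions, since the card does not document the audio format.

```python
import math

def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop_len)]

def rms_energy(frame):
    """Root-mean-square amplitude of one frame (a loudness proxy)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_features(samples, frame_len=400, hop_len=160):
    """Return one (rms, zcr) tuple per frame."""
    frames = frame_signal(samples, frame_len, hop_len)
    return [(rms_energy(f), zero_crossing_rate(f)) for f in frames]

# Synthetic stand-in for a speech clip: 1 second of a 220 Hz tone.
SAMPLE_RATE = 16000  # assumed; not stated in the dataset card
tone = [0.5 * math.sin(2 * math.pi * 220 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
features = extract_features(tone)
print(len(features), features[0])
```

In practice, SER pipelines usually rely on richer representations such as mel-spectrograms or MFCCs; these two features are shown only because they need no third-party libraries.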
## Languages
- **Primary Language**: Arabic
## Dataset Creation
### Curation Rationale
This dataset was created to advance Arabic speech emotion recognition research by providing labeled emotional speech samples across different demographic groups.
### Source Data
- **Contributors**: Arabic native speakers across different age groups and genders
### Other Known Limitations
- **Size**: The dataset is relatively small, which may limit model generalization
- **Audio Quality**: Recording conditions vary across samples, which may affect model performance
- **Regional Dialects**: The dataset may not represent all regional varieties of Arabic
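With a small dataset, one common precaution is to keep each speaker's clips entirely in either the training or the test set, so that reported accuracy reflects generalization to unseen voices. The sketch below shows one way to do this. The record fields (`path`, `speaker_id`, `emotion`) and the emotion labels are hypothetical, since the card does not document the dataset's schema or label set.

```python
import random
from collections import defaultdict

def speaker_disjoint_split(records, test_fraction=0.2, seed=0):
    """Split records so that no speaker appears in both train and test."""
    by_speaker = defaultdict(list)
    for rec in records:
        by_speaker[rec["speaker_id"]].append(rec)

    # Shuffle speakers (not clips) with a fixed seed for reproducibility.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    target = test_fraction * len(records)
    train, test = [], []
    for spk in speakers:
        # Fill the test set with whole speakers until it reaches the target size.
        bucket = test if len(test) < target else train
        bucket.extend(by_speaker[spk])
    return train, test

# Hypothetical manifest: 5 speakers with 4 clips each.
records = [
    {"path": f"clip_{s}_{i}.wav", "speaker_id": f"spk{s}", "emotion": emo}
    for s in range(5)
    for i, emo in enumerate(["happy", "sad", "angry", "neutral"])
]
train, test = speaker_disjoint_split(records, test_fraction=0.2, seed=0)
print(len(train), len(test))
```

Because whole speakers are assigned at once, the actual test share can overshoot `test_fraction` when speakers contribute unequal numbers of clips.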
## Intended Uses
### ✅ Direct Use
- Training and benchmarking Speech Emotion Recognition models for Arabic
- Research in cross-cultural emotion recognition
- Development of Arabic-language affective computing applications
- Academic research in computational linguistics and psychology
### ❌ Out-of-Scope Use
- Real-time production systems without additional validation
- Clinical or diagnostic applications for mental health
- Commercial use without proper attribution
- Surveillance or privacy-invasive applications
## License
CC BY 4.0
## Additional Contacts
The Chinese-language version of this card lists the following contacts:
- anoushka@kgen.io
- abhishek.vadapalli@kgen.io
Provider: maas
Created: 2025-08-30



