Egyptian_People_Speaking_Video_Dataset
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/Kratos-AI/Egyptian_People_Speaking_Video_Dataset
Resource description:
# Egyptian People Speaking Video Dataset
*This dataset contains high-quality video recordings of Egyptian people speaking on a range of topics. It is curated for AI research in speech recognition, multimodal analysis, topic understanding, and spoken-language modeling.*
## Contact
For queries or collaborations related to this dataset, contact:
- anoushka@kgen.io
- abhishek.vadapalli@kgen.io
## Supported Tasks
- **Task Categories**:
  - Video Classification
  - Speech-to-Text (Automatic Speech Recognition)
  - Multimodal Understanding
- **Supported Tasks**:
  - Transcription of spoken Egyptian Arabic or bilingual (Arabic-English) speech
  - Topic classification based on speech content
  - Sentiment and emotion analysis from spoken video
  - Speaker diarization and identity-agnostic speaker recognition
  - Multimodal analysis combining audio, video, and gestures
  - Research in human-computer interaction and AI communication models
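
For the transcription task listed above, word error rate (WER) is a common way to compare a model's output against a reference transcript. The card does not specify an evaluation protocol, so the following is only a minimal sketch: the function name `word_error_rate` is illustrative, tokens are assumed to be whitespace-separated, and no Arabic-specific normalization (diacritics, letter variants) is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("a b c", "a x c"))
```

For code-switched Arabic-English clips, the tokenization and normalization choices can change the score noticeably, so they are worth reporting alongside any WER figure.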
## Languages
- **Primary Language**: Egyptian Arabic
- **Secondary Presence**: English (if code-switching occurs)
## Dataset Creation
### Curation Rationale
The dataset was created to support AI systems in understanding natural spoken Egyptian Arabic, enabling speech-to-text, topic modeling, and multimodal interaction research. It is intended for educational, accessibility, and research purposes.
### Source Data
- **Contributors**: Volunteers and participants recorded speaking on assigned topics
- **Collection Process**: Videos were recorded in controlled or natural environments. Personal identifiers unrelated to the speech task were removed or anonymized. All participants gave explicit consent for research use.
### Other Known Limitations
- **Bias**: Urban dialects and younger speakers may be overrepresented
- **Environmental Noise**: Background noise may affect speech clarity
- **Topic Coverage**: Selected topics may not fully reflect all cultural or regional contexts in Egypt
## Intended Uses
### ✅ Direct Use
- Training Automatic Speech Recognition (ASR) systems for Egyptian Arabic
- Topic classification and sentiment analysis research
- Multimodal AI studies combining video, audio, and gestures
- Human-computer interaction and accessibility applications
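
ASR training typically starts from audio rather than video, so a first step is often to extract a mono audio track from each clip. The sketch below only builds an `ffmpeg` command for that purpose. The card does not describe the file layout or container format, so the paths, the `.mp4` extension, the 16 kHz sample rate, and the helper name `build_audio_extract_command` are all assumptions for illustration.

```python
from pathlib import Path


def build_audio_extract_command(video_path, out_dir, sample_rate=16000):
    """Return an ffmpeg argument list that writes a mono WAV next to out_dir.

    The paths are hypothetical; adjust them to the actual dataset layout.
    """
    video = Path(video_path)
    wav = Path(out_dir) / (video.stem + ".wav")
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video),         # input video clip
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single audio channel
        "-ar", str(sample_rate),  # resample the audio
        str(wav),
    ]


if __name__ == "__main__":
    cmd = build_audio_extract_command("clips/speaker_001.mp4", "audio")
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg` is installed and the output directory exists.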
### ❌ Out-of-Scope Use
- Surveillance or identification of individuals without consent
- Commercial exploitation without ethical clearance
- Any use violating participant privacy or recording agreements
## License
CC BY 4.0
Provider:
maas
Created:
2025-10-15



