
Egyptian_People_Speaking_Video_Dataset

ModelScope Community. Updated: 2025-12-05. Indexed: 2025-12-06.
Download link:
https://modelscope.cn/datasets/Kratos-AI/Egyptian_People_Speaking_Video_Dataset
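The link above can also be used programmatically. The following is a minimal sketch, not an official loader: it assumes the `modelscope` Python package is installed, that the namespace and dataset name are the last two segments of the URL path, and that the dataset can be loaded without naming a subset or split. The helper names `split_dataset_url` and `load_dataset` are hypothetical and exist only for this illustration.

```python
from urllib.parse import urlparse

DATASET_URL = (
    "https://modelscope.cn/datasets/"
    "Kratos-AI/Egyptian_People_Speaking_Video_Dataset"
)


def split_dataset_url(url):
    """Return (namespace, dataset_name) from a ModelScope dataset URL.

    Assumes the path has the form /datasets/<namespace>/<name>.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a ModelScope dataset URL: {url!r}")
    return parts[1], parts[2]


def load_dataset(url=DATASET_URL):
    """Load the dataset through the ModelScope SDK (needs network access).

    The import is deferred so the URL helper works without the SDK.
    Subset and split names are not given on this page, so none are passed;
    adjust the call if the hosted dataset requires them.
    """
    from modelscope.msdatasets import MsDataset

    namespace, name = split_dataset_url(url)
    return MsDataset.load(name, namespace=namespace)
```

Calling `load_dataset()` downloads data, so check the CC BY 4.0 terms and the out-of-scope uses in the description before running it.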
Resource description:
# Egyptian People Speaking Video Dataset

*This dataset contains high-quality video recordings of Egyptian people speaking on a range of topics. It is curated for AI research in speech recognition, multimodal analysis, topic understanding, and spoken-language modeling.*

## Contact

For queries or collaborations related to this dataset, contact:

- anoushka@kgen.io
- abhishek.vadapalli@kgen.io

## Supported Tasks

- **Task Categories**:
  - Video Classification
  - Speech-to-Text (Automatic Speech Recognition)
  - Multimodal Understanding
- **Supported Tasks**:
  - Transcription of spoken Egyptian Arabic or bilingual (Arabic-English) speech
  - Topic classification based on speech content
  - Sentiment and emotion analysis from spoken video
  - Speaker diarization and identity-agnostic speaker recognition
  - Multimodal analysis combining audio, video, and gestures
  - Research in human-computer interaction and AI communication models

## Languages

- **Primary Language**: Egyptian Arabic
- **Secondary Presence**: English (if code-switching occurs)

## Dataset Creation

### Curation Rationale

The dataset was created to support AI systems in understanding natural spoken Egyptian Arabic, enabling speech-to-text, topic modeling, and multimodal interaction research. It is intended for educational, accessibility, and research purposes.

### Source Data

- **Contributors**: Volunteers and participants recorded speaking on assigned topics
- **Collection Process**: Videos were recorded in controlled or natural environments. Personal identifiers unrelated to the speech task were removed or anonymized. All participants gave explicit consent for research use.

### Other Known Limitations

- **Bias**: Urban dialects and younger speakers may be overrepresented
- **Environmental Noise**: Background noise may affect speech clarity
- **Topic Coverage**: Selected topics may not fully reflect all cultural or regional contexts in Egypt

## Intended Uses

### ✅ Direct Use

- Training Automatic Speech Recognition (ASR) systems for Egyptian Arabic
- Topic classification and sentiment analysis research
- Multimodal AI studies combining video, audio, and gestures
- Human-computer interaction and accessibility applications

### ❌ Out-of-Scope Use

- Surveillance or identification of individuals without consent
- Commercial exploitation without ethical clearance
- Any use violating participant privacy or recording agreements

## License

CC BY 4.0

Provider:
maas
Created:
2025-10-15