Egyptian_People_Speaking_Video_Dataset

Name: Egyptian_People_Speaking_Video_Dataset
Creator: maas
Published: 2025-12-05 16:54:47
License: (no description provided)

ModelScope community · Updated 2025-12-05 · Indexed 2025-12-06

Download link:

https://modelscope.cn/datasets/Kratos-AI/Egyptian_People_Speaking_Video_Dataset
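As a minimal sketch of how the link above might be used programmatically, the snippet below splits the URL into a namespace and dataset name and passes them to the ModelScope SDK's `MsDataset.load`. The helper names `parse_dataset_url` and `load_dataset` are hypothetical, and the `"train"` split is an assumption, since the listing does not describe the dataset's splits or file layout.

```python
from urllib.parse import urlparse


def parse_dataset_url(url: str) -> tuple[str, str]:
    """Split a ModelScope dataset URL into (namespace, dataset_name)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path shape: /datasets/<namespace>/<dataset_name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a ModelScope dataset URL: {url}")
    return parts[1], parts[2]


def load_dataset(url: str, split: str = "train"):
    """Load the dataset with the ModelScope SDK (needs `pip install modelscope`)."""
    # Imported lazily so the URL helper works without the SDK installed.
    from modelscope.msdatasets import MsDataset

    namespace, name = parse_dataset_url(url)
    # Assumption: a "train" split exists; the listing does not say.
    return MsDataset.load(name, namespace=namespace, split=split)


URL = "https://modelscope.cn/datasets/Kratos-AI/Egyptian_People_Speaking_Video_Dataset"
print(parse_dataset_url(URL))
```

Calling `load_dataset(URL)` needs network access and the `modelscope` package. If the repository stores raw video files rather than a tabular split, downloading the files directly from the dataset page may be the more practical route.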


Description:

# Egyptian People Speaking Video Dataset

*This dataset contains high-quality video recordings of Egyptian people speaking on a range of topics. It is curated for AI research in speech recognition, multimodal analysis, topic understanding, and spoken-language modeling.*

## Contact

For queries or collaborations related to this dataset, contact:

- anoushka@kgen.io
- abhishek.vadapalli@kgen.io

## Supported Tasks

- **Task Categories**:
  - Video Classification
  - Speech-to-Text (Automatic Speech Recognition)
  - Multimodal Understanding
- **Supported Tasks**:
  - Transcription of spoken Egyptian Arabic or bilingual (Arabic-English) speech
  - Topic classification based on speech content
  - Sentiment and emotion analysis from spoken video
  - Speaker diarization and identity-agnostic speaker recognition
  - Multimodal analysis combining audio, video, and gestures
  - Research in human-computer interaction and AI communication models

## Languages

- **Primary Language**: Egyptian Arabic
- **Secondary Presence**: English (if code-switching occurs)

## Dataset Creation

### Curation Rationale

The dataset was created to support AI systems in understanding natural spoken Egyptian Arabic, enabling speech-to-text, topic modeling, and multimodal interaction research. It is intended for educational, accessibility, and research purposes.

### Source Data

- **Contributors**: Volunteers and participants recorded speaking on assigned topics
- **Collection Process**: Videos were recorded in controlled or natural environments. Personal identifiers unrelated to the speech task were removed or anonymized. All participants gave explicit consent for research use.

### Other Known Limitations

- **Bias**: Urban dialects and younger speakers may be overrepresented
- **Environmental Noise**: Background noise may affect speech clarity
- **Topic Coverage**: Selected topics may not fully reflect all cultural or regional contexts in Egypt

## Intended Uses

### ✅ Direct Use

- Training Automatic Speech Recognition (ASR) systems for Egyptian Arabic
- Topic classification and sentiment analysis research
- Multimodal AI studies combining video, audio, and gestures
- Human-computer interaction and accessibility applications

### ❌ Out-of-Scope Use

- Surveillance or identification of individuals without consent
- Commercial exploitation without ethical clearance
- Any use violating participant privacy or recording agreements

## License

CC BY 4.0
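For the ASR training use listed under Direct Use, audio usually has to be separated from the video first. The sketch below builds an `ffmpeg` command that writes a mono WAV track. It is an illustration only: the file names are hypothetical, the card does not describe the video format, and the 16 kHz mono target is a common ASR convention rather than something the dataset specifies.

```python
import subprocess
from pathlib import Path


def ffmpeg_audio_command(video: Path, wav: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes a mono WAV track from a video file."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(video),          # input video
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample to the target rate (assumed 16 kHz)
        str(wav),
    ]


def extract_audio(video: Path, out_dir: Path) -> Path:
    """Run ffmpeg on one video and return the path of the WAV it produced."""
    out_dir.mkdir(parents=True, exist_ok=True)
    wav = out_dir / (video.stem + ".wav")
    subprocess.run(ffmpeg_audio_command(video, wav), check=True)
    return wav


# Hypothetical file names, shown only to illustrate the command that is built.
print(ffmpeg_audio_command(Path("clip_001.mp4"), Path("audio/clip_001.wav")))
```

`extract_audio` requires `ffmpeg` to be installed and on the `PATH`; building the command separately makes it easy to inspect or log before running it over the whole dataset.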


Provider: maas

Created: 2025-10-15
