DarDATAI/SouthAfrican_Accented_English_SpeechData

Source: Hugging Face. Updated 2025-12-19; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/DarDATAI/SouthAfrican_Accented_English_SpeechData
Description:
This dataset contains 1 hour of South African-accented English speech paired with human-annotated transcripts. It is a public sample that demonstrates the dataset structure, audio quality, and transcription style of the full commercial version. The dataset includes high-quality MP3 audio files and a CSV metadata file containing unique identifiers, audio file paths, and text transcripts. Transcripts were produced by combining Automatic Speech Recognition (ASR) output with human review, and they retain real-world noise and imperfections. The dataset is suitable for accent analysis, ASR experimentation, and commercial evaluation. Because of its small size and the transcription errors it contains, it is not intended for full-scale model training.
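The description says the CSV metadata file carries a unique identifier, an audio file path, and a transcript for each clip. The following is a minimal sketch of reading such a file with Python's standard library. The column names `id`, `audio_path`, and `transcript`, the sample rows, and the helper names are assumptions made for illustration; the listing does not state the real header, so adjust them to match the downloaded file.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical column names: the listing only says the CSV holds an
# identifier, an audio path, and a transcript, not what the header is.
ID_COL = "id"
PATH_COL = "audio_path"
TEXT_COL = "transcript"


@dataclass
class Utterance:
    uid: str
    audio_path: str
    transcript: str


def read_metadata(csv_file):
    """Parse metadata rows from an open text file into Utterance records."""
    reader = csv.DictReader(csv_file)
    missing = {ID_COL, PATH_COL, TEXT_COL} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"metadata is missing columns: {sorted(missing)}")
    return [
        Utterance(row[ID_COL], row[PATH_COL], row[TEXT_COL].strip())
        for row in reader
    ]


def find_duplicate_ids(utterances):
    """Return identifiers that occur more than once, sorted."""
    seen, dupes = set(), set()
    for utt in utterances:
        if utt.uid in seen:
            dupes.add(utt.uid)
        else:
            seen.add(utt.uid)
    return sorted(dupes)


# In-memory stand-in for the real metadata file (made-up rows).
sample = io.StringIO(
    "id,audio_path,transcript\n"
    "utt_0001,audio/utt_0001.mp3,hello there\n"
    "utt_0002,audio/utt_0002.mp3, how are you \n"
)
rows = read_metadata(sample)
duplicates = find_duplicate_ids(rows)
```

For the downloaded file, pass `open(path, newline="", encoding="utf-8")` in place of the in-memory sample. Checking the header and identifier uniqueness up front is a cheap sanity test before any accent analysis or ASR experiment.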
Provider:
DarDATAI