AbijahKaj/telephony-amd-dataset

Name: AbijahKaj/telephony-amd-dataset
Creator: AbijahKaj
Published: 2026-04-29 21:36:50
License: No description available

Source: Hugging Face. Updated 2026-04-29; indexed 2026-05-03.

Download link:

https://hf-mirror.com/datasets/AbijahKaj/telephony-amd-dataset


Overview:

The Telephony AMD (Answering Machine Detection) Dataset is a multilingual, 4-class telephony audio classification dataset for training streaming answering machine detection models. It combines real human speech (from PolyAI/MINDS14) with TTS-generated audio (from Microsoft Neural TTS / edge-tts) in English, French, Spanish, and German. The key design principle is that voicemail greetings are recorded by real humans and sound acoustically identical to live speech, so a model has to distinguish them by what is being said, not just by how it sounds. The four classes are human speech, voicemail, IVR systems, and answering machines, and each class is described with detailed semantic cues. Audio is post-processed to simulate telephony conditions. The README reports the total sample count, audio format, class distribution, and language distribution, and also covers usage instructions, training recommendations, scripts, and limitations.
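The overview mentions post-processing that simulates telephony conditions but does not say which steps are applied. As a rough illustration only, the sketch below shows one common ingredient of such pipelines: mu-law companding with 8-bit quantization, in the style of G.711. The function names and the choice of mu-law are assumptions made for this example and are not taken from the dataset's scripts.

```python
import math

MU = 255  # companding constant used by mu-law telephony codecs (assumed here)


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map a sample in [-1, 1] to its mu-law companded value in [-1, 1]."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize_8bit(y: float) -> int:
    """Quantize a companded value in [-1, 1] to an integer code in 0..255."""
    return int(round((y + 1.0) / 2.0 * 255))


def dequantize_8bit(q: int) -> float:
    """Map an integer code in 0..255 back to [-1, 1]."""
    return q / 255 * 2.0 - 1.0


def simulate_telephony(samples):
    """Hypothetical helper: pass samples through compress, quantize, expand."""
    return [
        mu_law_expand(dequantize_8bit(quantize_8bit(mu_law_compress(s))))
        for s in samples
    ]


if __name__ == "__main__":
    clean = [0.0, 0.01, 0.5, -0.5, 1.0]
    print(simulate_telephony(clean))
```

A full telephony simulation would typically also resample to a narrowband rate and band-limit the signal; those steps are left out here to keep the sketch dependency-free.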

Provider:

AbijahKaj
