
Customer Service Voice Data Stream | High-Emotion Audio Dataset for LLM & NLP Training (US Accents)

Source: Databricks. Indexed: 2026-02-13
Download link:
https://marketplace.databricks.com/details/6ca5361b-cb0e-4682-9d83-d7b73ac0eabb/WiserBrand-com_Customer-Service-Voice-Data-Stream-High-Emotion-Audio-Dataset-for-LLM-&-NLP-Training-(US-Accents)
Description:
We provide real-world customer support voice data. This dataset captures the "long tail" of human interactions: frustration, urgency, interruptions, and diverse acoustic conditions essential for building robust AI models.

Why Is This Data Unique?
Most datasets represent the "happy path" (calm, polite speech). Our data addresses the class imbalance problem by providing high-friction interactions.
High Emotional Variance: Authentic anger, sarcasm, and relief, all critical for training sentiment analysis and empathetic AI.
Real-World Acoustics: Background noise, phone line compression, and crosstalk (interruptions).
Diverse Demographics: 90% unique speakers featuring a wide range of US dialects and accents.

Dataset Specifications
Volume: We capture over 1,600 hours of new audio daily.
Format: .wav and .flac audio files paired with segmented, time-stamped transcriptions.
Metadata: Rich labeling including Call Reason (Intent), Location, OS, Duration, and Speaker Turns.
Privacy: All PII is redacted via automated and human-in-the-loop processes.

Perfect For Training
LLMs & NLP: Fine-tuning models on conversational logic and intent recognition.
ASR (Speech-to-Text): Improving accuracy on "messy" audio and diverse accents.
Voicebots: Teaching agents to handle objections and emotional escalations.
Customer Intelligence: Churn prediction and conflict resolution analytics.

Data Origin
100% sourced from real US consumers contacting customer support. Not synthetic, not scripted.
Provider:
WiserBrand.com