Premium Studio-Grade, Multi-Speaker English Conversational Audio Dataset with Stems + ...

Databricks, indexed 2025-11-19
Download link:
https://marketplace.databricks.com/details/68ed5233-d538-4eff-93d4-606970efb579/ACNetwork_Premium-Studio-Grade,-Multi-Speaker-English-Conversational-Audio-Dataset-with-Stems-+-
Description:
This premium studio-grade conversational audio dataset delivers over 34.5K hours of multi-speaker English content captured in professional sports and talk studio environments. Each file includes clean mixed-down audio plus individual isolated stems, diarized transcripts at word and utterance level, unique speaker IDs, and sentiment annotations, enabling precise speech modeling, diarization training, and emotion-aware LLMs. The conversations are dynamic and domain-specific, with 2+ speakers per episode, covering teams, markets, and sports genres across multiple US markets and capturing regional speech patterns, slang, and dialects/accents from 150+ unique voices. Data is fully rights-cleared and indemnified for AI training, ensuring compliance and commercial safety. New content is added continuously at an average rate of 500+ hours per month, keeping the dataset current and diverse for model refresh and fine-tuning use cases. Ideal for LLM developers and AI teams training speech-to-text, speaker embedding, sentiment, and multimodal retrieval models. Delivered securely via Google Drive or AWS S3, with audio and metadata in MP3, WAV, JSON, SRT, VTT, and TXT formats. Enterprise licensing only - contact ACNetwork for commercial terms or large sample access.
Provider:
ACNetwork