SAMPLE Conversational Audio Data | Multi-Speaker | 13 Languages | GPS-Verified | Global Coverage

Name: SAMPLE Conversational Audio Data | Multi-Speaker | 13 Languages | GPS-Verified | Global Coverage
Creator: Rwazi
License: No description available

Source: Databricks | Indexed: 2026-03-13

Download link:

https://marketplace.databricks.com/details/421f271c-e2f7-4554-9e18-b438f520542d/Rwazi_SAMPLE-Conversational-Audio-Data-Multi-Speaker-13-Languages-GPS-Verified-Global-Coverage

Description:

High-quality conversational audio dataset collected through a global network of 3M+ verified data collectors across 150+ countries. Each recording captures natural, unscripted 2-3 person conversations in real-world environments (homes, cafes, markets, offices), not studio conditions.

LANGUAGE COVERAGE:
US English: Regional accent diversity (South, Midwest, Northeast, West Coast, NYC, Pacific NW, Southwest)
Arabic: Egyptian, Gulf, Levantine, and Maghreb dialects across Egypt, Saudi Arabia, UAE, Jordan, Morocco
European: French, German, Spanish, Portuguese
Asian: Thai, Tagalog, Indonesian, Hindi
African: Swahili, Yoruba, Amharic

RECORDING SPECIFICATIONS:
Duration: 10-30 minutes per conversation
Format: WAV | 16-48 kHz sample rate
Speakers: 2-3 per recording
Type: Unscripted, natural conversations

METADATA PER RECORDING:
GPS-verified location (country, city, coordinates)
Recording environment classification
Device type and audio quality metrics (SNR)
Conversation type and topic tags (daily life, food, family, work, shopping, storytelling, community)
Per-speaker demographics: age range, gender, native language, education level

LICENSING & COMPLIANCE:
Explicit written consent from all participants
Full commercial usage rights
PII scrubbed from all metadata
Auditable consent chain

IDEAL FOR: ASR model training, speaker diarization, accent/dialect classification, conversational AI, emotion detection, language identification, and multilingual NLP.

Scalable to custom volume, language, and demographic specifications on request.
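
The recording specifications and per-recording metadata listed above could be modeled roughly as follows. This is only a sketch: the listing does not publish a schema, so every class, field, and function name here (Speaker, Recording, spec_violations, snr_db, and so on) is a hypothetical choice for illustration, not the provider's actual column layout. The check simply encodes the stated ranges (10-30 minutes, 16-48 kHz, 2-3 speakers) plus a basic coordinate sanity test.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Speaker:
    # Per-speaker demographics named in the listing (field names are assumed).
    age_range: str
    gender: str
    native_language: str
    education_level: str


@dataclass
class Recording:
    # Hypothetical record layout; the real dataset schema may differ.
    language: str
    duration_min: float
    sample_rate_hz: int
    country: str
    city: str
    latitude: float
    longitude: float
    environment: str
    snr_db: float
    topics: List[str] = field(default_factory=list)
    speakers: List[Speaker] = field(default_factory=list)


def spec_violations(rec: Recording) -> List[str]:
    """Return a list of ways a record falls outside the advertised specs."""
    problems = []
    if not 10 <= rec.duration_min <= 30:
        problems.append("duration outside 10-30 minutes")
    if not 16_000 <= rec.sample_rate_hz <= 48_000:
        problems.append("sample rate outside 16-48 kHz")
    if not 2 <= len(rec.speakers) <= 3:
        problems.append("speaker count outside 2-3")
    if not (-90 <= rec.latitude <= 90 and -180 <= rec.longitude <= 180):
        problems.append("coordinates out of range")
    return problems
```

A check like this might be run once over a sample delivery before training, so that records falling outside the advertised ranges are flagged rather than silently ingested.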

Provider:

Rwazi
