jml2026/conversational-speech-dataset

Name: jml2026/conversational-speech-dataset
Creator: jml2026
Published: 2026-03-27 21:16:21
License: No description provided

Hugging Face · Updated 2026-03-27 · Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/jml2026/conversational-speech-dataset

Description:

---
license: cc-by-nc-4.0
language:
- en
task_categories:
- automatic-speech-recognition
- audio-classification
tags:
- voice-ai
- speech-data
- conversational-ai
- real-world-audio
- crowdsourced
- multi-speaker
- meeting-transcription
pretty_name: "🎙️ Silencio Conversational Speech Dataset"
configs:
- config_name: conversational_english
  data_files:
  - split: conversations
    path: conversational_english/conversations/**
size_categories:
- n<1K
---

# 🎙️ Silencio Network: Conversational Speech Dataset

<p align="left">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/69162b50b89e7abe20de4b5a/LWhs4p2lPFcyiVsP0tluu.png" width="40%">
</p>

[![Website](https://img.shields.io/badge/Website-silencioai.com-blue?style=flat-square)](https://www.silencioai.com)
[![Contact](https://img.shields.io/badge/Contact-sofia@silencioai.com-green?style=flat-square)](mailto:sofia@silencioai.com)
[![Data Available](https://img.shields.io/badge/Full_Corpus-100,000+_hours-orange?style=flat-square)](mailto:sofia@silencioai.com)

---

## Overview

Sample conversational speech data from Silencio Network's crowdsourced voice AI platform. This dataset contains **multi-speaker meeting recordings** with word-level transcripts, speaker diarization, and rich demographic metadata.

Each row represents one participant in a meeting and includes **3 audio files**:

| Audio Column | Description | Format |
|-------------|-------------|--------|
| `file_name` (speaker audio) | Individual participant's isolated recording | WAV |
| `full_meeting_single_channel` | Full meeting mixed to single channel | MP3 |
| `full_meeting_multi_channel` | Full meeting with separate speaker channels | WAV |

Plus **word-level meeting transcripts** with speaker turns and timestamps.

## Dataset Summary

| Config | Language | Meetings | Participants | Total Audio |
|--------|----------|----------|--------------|-------------|
| `conversational_english` | English | 4 | 8 | 135.9 MB |

## 🚀 Quick Start

```python
from datasets import load_dataset

# Load conversational English samples
ds = load_dataset("jml2026/conversational-speech-dataset", "conversational_english")
conversations = ds['conversations']

for sample in conversations:
    speaker_audio = sample['audio']  # Individual speaker recording
    meeting_audio = sample['full_meeting_single_channel']  # Full meeting
    transcript = sample['meeting_transcript_text']  # Plain text transcript

    print(f"Speaker {sample['speaker_id']} ({sample['gender']}, {sample['dialect']})")
    print(f"  Words spoken: {sample['speaker_word_count']}")
    print(f"  Meeting duration: {sample['meeting_duration']}s")
```

## Schema

| Column | Type | Description |
|--------|------|-------------|
| `file_name` | Audio | Individual speaker's isolated audio recording |
| `meeting_id` | int | Unique meeting identifier |
| `speaker_id` | string | Deterministic UUID for the speaker |
| `gender` | string | Speaker gender |
| `ethnicity` | string | Speaker ethnicity |
| `occupation` | string | Speaker occupation |
| `birth_place` | string | Speaker birth place |
| `dialect` | string | Speaker dialect |
| `year_of_birth` | int | Speaker year of birth |
| `years_at_birth_place` | int | Years lived at birth place |
| `languages_data` | string | JSON array of languages spoken with proficiency levels |
| `language` | string | Meeting language |
| `meeting_duration` | int | Meeting duration in seconds |
| `meeting_transcript_word_count` | int | Total word count of the meeting transcript |
| `speaker_word_count` | int | Word count for this speaker |
| `full_meeting_single_channel` | Audio | Full meeting audio mixed to single channel |
| `full_meeting_multi_channel` | Audio | Full meeting audio with separate speaker channels |
| `meeting_transcript_json` | string | Full meeting transcript as JSON with word-level timestamps and speaker IDs |
| `meeting_transcript_text` | string | Plain text meeting transcript with speaker turns |

## About Silencio Network

Silencio Network operates a global platform with **500,000+ contributors** across **130+ countries**, collecting voice data through a mobile app. The full corpus exceeds **100,000 hours** of validated speech data.

For access to the complete dataset or custom data collection, contact **sofia@silencioai.com**.

## License

This sample dataset is released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). Commercial licensing is available upon request.
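
## Example: Grouping Participants by Meeting

Because each row describes one participant rather than one meeting, analyses at the meeting level need the rows regrouped by `meeting_id`. The following is a minimal sketch, not part of the original dataset card, that computes each speaker's share of the words spoken in a meeting from the `meeting_id`, `speaker_id`, and `speaker_word_count` columns listed in the schema. The helper name `speaker_shares` and the sample rows are hypothetical and chosen only for illustration; the shares are taken relative to the sum of the listed speakers' word counts, which may differ from `meeting_transcript_word_count`.

```python
from collections import defaultdict


def speaker_shares(rows):
    """Group participant rows by meeting and return each speaker's word share.

    `rows` is any iterable of dict-like records that expose the schema columns
    `meeting_id`, `speaker_id`, and `speaker_word_count`.
    """
    by_meeting = defaultdict(list)
    for row in rows:
        by_meeting[row["meeting_id"]].append(row)

    shares = {}
    for meeting_id, participants in by_meeting.items():
        # Denominator: words spoken by the participants present in `rows`.
        total = sum(p["speaker_word_count"] for p in participants)
        shares[meeting_id] = {
            p["speaker_id"]: (p["speaker_word_count"] / total if total else 0.0)
            for p in participants
        }
    return shares


# Made-up rows for illustration only; these values are NOT from the dataset.
example_rows = [
    {"meeting_id": 1, "speaker_id": "spk-a", "speaker_word_count": 300},
    {"meeting_id": 1, "speaker_id": "spk-b", "speaker_word_count": 100},
    {"meeting_id": 2, "speaker_id": "spk-c", "speaker_word_count": 50},
]

shares = speaker_shares(example_rows)
for meeting_id, per_speaker in shares.items():
    print(meeting_id, per_speaker)
```

The same function should accept the rows of the loaded `conversations` split, since each row behaves like a dictionary. Selecting only the metadata columns first is likely to be faster, because iterating over full rows also decodes the audio columns.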

Provider:

jml2026
