humyn-labs/Indic-High-Fidelity-MultiSpeaker-ASR
Source: Hugging Face. Updated 2026-03-14; indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/humyn-labs/Indic-High-Fidelity-MultiSpeaker-ASR
Resource description:
---
license: cc-by-4.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: language
    dtype: string
  - name: file_name
    dtype: string
  - name: audio
    dtype: audio
  - name: transcript_json
    dtype: string
  splits:
  - name: train
    num_bytes: 1065664379
    num_examples: 37
  download_size: 1064810251
  dataset_size: 1065664379
task_categories:
- automatic-speech-recognition
language:
- hi
- ml
- mr
- te
- ta
- bn
- kn
- bh
- as
- gu
- pa
tags:
- ASR
- Conversational-speech
- multi-speaker
- indic-languages
size_categories:
- n<1K
---
# Dataset Overview
This dataset contains high-quality multi-speaker conversational audio recordings curated for Automatic Speech Recognition (ASR) research across multiple Indic languages.
The dataset includes:
- Paired audio + timestamped transcripts
- Natural, non-scripted conversational speech
- Dual-speaker interactions
- Segment-level speaker annotations
- Regionally diverse accents
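As a minimal sketch of how the columns named in the card metadata (`language`, `file_name`, `audio`, `transcript_json`) might be accessed, the snippet below streams the `train` split with the Hugging Face `datasets` library and counts recordings per language. The helper names `load_train_split` and `count_by_language` are illustrative and not part of the dataset; the import is deferred so the counting helper can be used without the library installed.

```python
from collections import Counter
from typing import Any, Dict, Iterable

REPO_ID = "humyn-labs/Indic-High-Fidelity-MultiSpeaker-ASR"


def load_train_split(repo_id: str = REPO_ID):
    """Return the train split as a streaming (iterable) dataset.

    Hypothetical helper: needs `pip install datasets` and network access.
    """
    # Imported lazily so the rest of this module works without `datasets`.
    from datasets import load_dataset

    return load_dataset(repo_id, split="train", streaming=True)


def count_by_language(rows: Iterable[Dict[str, Any]]) -> Counter:
    """Count rows per value of the `language` column."""
    return Counter(row["language"] for row in rows)
```

Calling `count_by_language(load_train_split())` would iterate over the whole split. The card does not state whether the `language` column holds language names or codes, so the keys of the resulting counter should be inspected rather than assumed.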
# Audio Specifications
- Format: WAV (PCM 16-bit)
- Sampling Rate: 16 kHz
- Channel: Mono
- Speech Type: Natural conversational dialogue
- Recording Style: Dual-speaker spontaneous interaction
- Typical Duration: 10–30 minutes per recording
All audio files are normalized to ensure consistent duration reporting and playback compatibility.
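The stated audio format can be checked per file with Python's standard `wave` module. The sketch below reads a WAV header, reports the duration, and compares the header against the values listed above (16 kHz, mono, 16-bit PCM). `WavInfo`, `inspect_wav`, and `make_silence` are hypothetical names introduced for illustration; `make_silence` only builds a synthetic file so the check can be tried without the dataset.

```python
import io
import wave
from dataclasses import dataclass

# Values taken from the Audio Specifications list above.
EXPECTED_RATE_HZ = 16_000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM


@dataclass
class WavInfo:
    sample_rate: int
    channels: int
    sample_width: int
    duration_s: float

    def matches_spec(self) -> bool:
        """True if the header agrees with the documented format."""
        return (
            self.sample_rate == EXPECTED_RATE_HZ
            and self.channels == EXPECTED_CHANNELS
            and self.sample_width == EXPECTED_SAMPLE_WIDTH
        )


def inspect_wav(source) -> WavInfo:
    """Read header fields from a path string or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return WavInfo(
            sample_rate=rate,
            channels=wav.getnchannels(),
            sample_width=wav.getsampwidth(),
            duration_s=wav.getnframes() / rate,
        )


def make_silence(seconds: float, rate: int = EXPECTED_RATE_HZ) -> bytes:
    """Build an in-memory mono 16-bit WAV of silence (test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()
```

For example, `inspect_wav(io.BytesIO(make_silence(0.5))).matches_spec()` evaluates to `True`, while the same call on an 8 kHz file does not match.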
# Supported Languages
This dataset includes conversational speech recordings in:
- Assamese
- Odia
- Bengali
- Bhojpuri
- Chhattisgarhi
- Gujarati
- Haryanvi
- Hindi
- Punjabi
- Marathi
- Tamil
- Kannada
- Malayalam
- Telugu
The dataset preserves natural accent variation and conversational speech characteristics.
# Speaker Representation
- Dual-speaker conversational recordings
- Natural, spontaneous dialogue
- Regionally representative speakers
- Conversational turn-taking preserved
# Dataset Creation Methodology
## Data Collection
Speech data was collected from native speakers across multiple Indian regions to ensure:
- Accent diversity
- Natural conversational flow
- Real-world dialogue patterns
- Informal and semi-formal speech contexts
Topics include:
- Everyday life discussions
- Social interactions
- Business and finance
- Public affairs
- General conversational topics
## Transcription Process
- Manual transcription by native speakers
- Reviewed for linguistic accuracy
- Timestamp-level segmentation
- Speaker-labeled segments
- Preserves conversational fillers and natural pauses
Each transcript entry contains:
- start timestamp
- end timestamp
- speaker label
- text content
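The four fields above suggest a simple per-segment record. The sketch below parses a `transcript_json` string into typed segments and maps a segment to sample indices at the documented 16 kHz rate. The card does not publish the JSON schema, so the key names (`start`, `end`, `speaker`, `text`), the top-level array layout, and the use of seconds for timestamps are all assumptions that should be checked against a real row before use.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    start: float  # assumed to be seconds from the start of the recording
    end: float    # assumed to be seconds from the start of the recording
    speaker: str
    text: str


def parse_transcript(transcript_json: str) -> List[Segment]:
    """Parse a JSON array of segment objects.

    ASSUMPTION: keys are named start/end/speaker/text; the real schema
    may differ.
    """
    segments = []
    for entry in json.loads(transcript_json):
        seg = Segment(
            start=float(entry["start"]),
            end=float(entry["end"]),
            speaker=str(entry["speaker"]),
            text=entry["text"],
        )
        if seg.end < seg.start:
            raise ValueError(f"segment ends before it starts: {seg}")
        segments.append(seg)
    return segments


def sample_range(seg: Segment, sample_rate: int = 16_000) -> Tuple[int, int]:
    """Convert a segment's timestamps to (first, last-exclusive) sample indices."""
    return round(seg.start * sample_rate), round(seg.end * sample_rate)
```

With a decoded waveform array `samples`, a segment's audio would then be `samples[first:last]` for the pair returned by `sample_range`.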
# Intended Use
Designed for:
- Training and fine-tuning ASR models
- Conversational ASR benchmarking
- Speaker diarization research
- Speaker turn detection
- Multi-speaker modeling
- Academic and open research
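For the diarization and turn-detection uses listed above, speaker-labeled segments can be collapsed into reference turns. The sketch below merges consecutive segments from the same speaker and reports the times at which the speaker changes. It is a simple heuristic, not the dataset authors' method: it takes plain `(start, end, speaker)` tuples, assumes timestamps share one unit, and does not model overlapping speech. `Turn`, `merge_into_turns`, and `turn_change_times` are illustrative names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Turn:
    speaker: str
    start: float
    end: float


def merge_into_turns(segments: Iterable[Tuple[float, float, str]]) -> List[Turn]:
    """Merge adjacent same-speaker segments into single turns.

    Segments are sorted by start time first, so input order does not matter.
    """
    turns: List[Turn] = []
    for start, end, speaker in sorted(segments):
        if turns and turns[-1].speaker == speaker:
            # Same speaker continues: extend the current turn.
            last = turns[-1]
            turns[-1] = Turn(speaker, last.start, max(last.end, end))
        else:
            # Speaker changed (or first segment): open a new turn.
            turns.append(Turn(speaker, start, end))
    return turns


def turn_change_times(turns: List[Turn]) -> List[float]:
    """Start time of every turn after the first, i.e. each speaker change."""
    return [turn.start for turn in turns[1:]]
```

The change times could serve as reference labels when evaluating a turn-detection model, provided the transcript timestamps are accurate enough for the chosen tolerance.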
# Out-of-Scope Uses
This dataset is not intended for:
- Safety-critical or real-time production systems without additional validation
- Commercial deployment without attribution (CC BY 4.0 requires attribution)
- Medical, clinical, legal, or diagnostic applications
# License
Creative Commons Attribution 4.0 International (CC BY 4.0)
# 📬 Contact
For dataset-related queries, please contact support@humynlabs.ai.
Provider:
humyn-labs



