humyn-labs/Indic-High-Fidelity-SingleSpeaker-ASR
Source: Hugging Face | Updated: 2026-03-13 | Indexed: 2026-04-05
Download link:
https://hf-mirror.com/datasets/humyn-labs/Indic-High-Fidelity-SingleSpeaker-ASR
Resource description:
---
license: cc-by-4.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: file_name
    dtype: string
  - name: gender
    dtype: string
  - name: age
    dtype: string
  - name: language
    dtype: string
  - name: city
    dtype: string
  - name: transcript
    dtype: string
  splits:
  - name: train
    num_bytes: 21313122
    num_examples: 139
  download_size: 20014907
  dataset_size: 21313122
task_categories:
- audio-classification
language:
- hi
- te
- mr
- bn
- gu
- ta
tags:
- single-speaker
- speech
- natural-speech
- ai-research
- voice-analysis
- ASR
- INDIC-languages
size_categories:
- n<1K
---
# Dataset Overview
This dataset contains high-quality single-speaker conversational audio recordings curated for Automatic Speech Recognition (ASR) research across multiple Indic languages.
The dataset includes:
- Paired audio + transcripts
- Natural, non-scripted speech
- Single-speaker interactions
- Regionally diverse accents
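The card metadata declares a single `train` split with `audio`, `file_name`, `gender`, `age`, `language`, `city`, and `transcript` columns. A minimal sketch of loading it follows, assuming the Hugging Face `datasets` library is installed and the repository is reachable under the ID shown in the title. The helper names `load_train_split` and `count_by_language` are hypothetical, and the strings stored in the `language` column are not documented here, so the helper simply counts whatever values it finds.

```python
from collections import Counter
from typing import Iterable

# Repository ID taken from the card title.
REPO_ID = "humyn-labs/Indic-High-Fidelity-SingleSpeaker-ASR"


def load_train_split():
    """Load the `train` split declared in the card metadata."""
    # Imported lazily so count_by_language works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train")


def count_by_language(languages: Iterable[str]) -> Counter:
    """Count examples per value of the `language` column."""
    return Counter(languages)


# Example usage (requires network access):
# ds = load_train_split()
# print(count_by_language(ds["language"]))
```

Passing the `language` column alone, rather than iterating over full rows, avoids decoding the audio of every example just to read a text field.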
# Audio Specifications
- Format: WAV (PCM 16-bit)
- Sampling Rate: 16–24 kHz
- Channel: Mono
- Speech Type: Natural conversational dialogue
- Typical Duration: 10–30 minutes per recording
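The specifications above can be checked against a local file with Python's standard `wave` module. The sketch below is illustrative only: `WavCheck` and `inspect_wav` are hypothetical names, and the check covers the stated format (16-bit PCM, mono, 16–24 kHz), not the duration.

```python
import wave
from dataclasses import dataclass


@dataclass
class WavCheck:
    sample_rate_hz: int
    channels: int
    bits_per_sample: int
    duration_s: float

    @property
    def matches_card(self) -> bool:
        """True if the file is mono, 16-bit, and sampled at 16-24 kHz."""
        return (
            self.channels == 1
            and self.bits_per_sample == 16
            and 16_000 <= self.sample_rate_hz <= 24_000
        )


def inspect_wav(source) -> WavCheck:
    """Read header fields from a WAV path or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return WavCheck(
            sample_rate_hz=rate,
            channels=wav.getnchannels(),
            bits_per_sample=8 * wav.getsampwidth(),
            duration_s=wav.getnframes() / rate,
        )


# Example usage with a hypothetical local file:
# info = inspect_wav("recording.wav")
# print(info, info.matches_card)
```

The `wave` module reads uncompressed PCM only, which matches the format stated above; files in other encodings would need a different reader.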
# Supported Languages
This dataset includes conversational speech recordings in:
- Bengali
- Gujarati
- Hindi
- Marathi
- Punjabi
- Tamil
- Telugu
- Odia
- Urdu
The dataset preserves natural accent variation and conversational speech characteristics.
# Speaker Representation
- Single-speaker recordings
- Natural, spontaneous dialogue
- Regionally representative speakers
# Dataset Creation Methodology
## Data Collection
Speech data was collected from native speakers across multiple Indian regions to ensure:
- Accent diversity
- Natural conversational flow
- Real-world dialogue patterns
- Informal and semi-formal speech contexts
Topics include:
- Everyday life discussions
- Social interactions
- Business and finance
- Public affairs
- General conversational topics
## Transcription Process
- Manual transcription by native speakers
- Reviewed for linguistic accuracy
- Preserves conversational fillers and natural pauses
# Intended Use
Designed for:
- Training and fine-tuning ASR models
- Conversational ASR benchmarking
- Speaker gender detection
- Single-speaker modeling
- Academic and open research
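For the benchmarking use case, a common metric is word error rate (WER): the word-level edit distance between a reference transcript and a model's output, divided by the number of reference words. The card does not prescribe a metric, so the sketch below is one possible choice and not the authors' evaluation method. It assumes whitespace tokenization and no text normalization, and `word_error_rate` is a hypothetical helper name. Because the transcripts preserve conversational fillers, a model that omits fillers would be penalized under this definition.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Example usage with a row from the dataset and a hypothetical model output:
# wer = word_error_rate(row["transcript"], model_output)
```

WER can exceed 1.0 when the hypothesis contains many inserted words, so it is not a percentage bounded at 100.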
# Out-of-Scope Uses
This dataset is not intended for:
- Safety-critical or real-time production systems without additional validation
- Commercial deployment without attribution (CC BY 4.0 requires attribution)
- Medical, clinical, legal, or diagnostic applications
# License
Creative Commons Attribution 4.0 International (CC BY 4.0)
# Contact
For dataset-related queries, please contact: support@humynlabs.ai
Provider:
humyn-labs



