
humyn-labs/Indic-High-Fidelity-SingleSpeaker-ASR

Hugging Face, updated 2026-03-13, indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/humyn-labs/Indic-High-Fidelity-SingleSpeaker-ASR
Resource overview:
---
license: cc-by-4.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: file_name
    dtype: string
  - name: gender
    dtype: string
  - name: age
    dtype: string
  - name: language
    dtype: string
  - name: city
    dtype: string
  - name: transcript
    dtype: string
  splits:
  - name: train
    num_bytes: 21313122
    num_examples: 139
  download_size: 20014907
  dataset_size: 21313122
task_categories:
- audio-classification
language:
- hi
- te
- mr
- bn
- gu
- ta
tags:
- single-speaker
- speech
- natural-speech
- ai-research
- voice-analysis
- ASR
- INDIC-languages
size_categories:
- n<1K
---

# Dataset Overview

This dataset contains high-quality single-speaker conversational audio recordings curated for Automatic Speech Recognition (ASR) research across multiple Indic languages.

The dataset includes:

- Paired audio + transcripts
- Natural, non-scripted speech
- Single-speaker interactions
- Regionally diverse accents

# Audio Specifications

- Format: WAV (PCM 16-bit)
- Sampling Rate: 16-24 kHz
- Channel: Mono
- Speech Type: Natural conversational dialogue
- Typical Duration: 10–30 minutes per recording

# Supported Languages

This dataset includes conversational speech recordings in:

- Bengali
- Gujarati
- Hindi
- Marathi
- Punjabi
- Tamil
- Telugu
- Odia
- Urdu

The dataset preserves natural accent variation and conversational speech characteristics.
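The Audio Specifications above (16-bit PCM WAV, mono, 16-24 kHz) can be checked mechanically on a downloaded file. The following is a minimal sketch using only Python's standard `wave` module; the helper names `check_wav` and `make_silent_wav` are illustrative and not part of the dataset. Note that the card's metadata declares a 16000 Hz audio feature while the prose gives a 16-24 kHz range; the sketch assumes the wider prose range.

```python
import io
import wave

# Thresholds taken from the card's "Audio Specifications" section.
EXPECTED_CHANNELS = 1          # mono
EXPECTED_SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit PCM
MIN_RATE_HZ, MAX_RATE_HZ = 16_000, 24_000


def check_wav(source):
    """Return a list of problems found; an empty list means the file conforms.

    `source` is a path or a binary file-like object, as accepted by wave.open.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        rate = wav.getframerate()
        if not MIN_RATE_HZ <= rate <= MAX_RATE_HZ:
            problems.append(f"sample rate {rate} Hz is outside 16-24 kHz")
    return problems


def make_silent_wav(channels, sample_width, rate, seconds=1):
    """Build an in-memory WAV of silence, used here only to exercise check_wav."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (channels * sample_width * rate * seconds))
    buffer.seek(0)  # rewind so the buffer can be read back
    return buffer
```

Calling `check_wav(make_silent_wav(1, 2, 16_000))` exercises the conforming case, while a stereo 44.1 kHz buffer reports one problem per violated property. The same function accepts a file path for a WAV exported from the dataset.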
# Speaker Representation

- Single-speaker recordings
- Natural, spontaneous dialogue
- Regionally representative speakers

# Dataset Creation Methodology

## Data Collection

Speech data was collected from native speakers across multiple Indian regions to ensure:

- Accent diversity
- Natural conversational flow
- Real-world dialogue patterns
- Informal and semi-formal speech contexts

Topics include:

- Everyday life discussions
- Social interactions
- Business and finance
- Public affairs
- General conversational topics

# Transcription Process

- Manual transcription by native speakers
- Reviewed for linguistic accuracy
- Preserves conversational fillers and natural pauses

# Intended Use

Designed for:

- Training and fine-tuning ASR models
- Conversational ASR benchmarking
- Speaker gender detection
- Single-speaker modeling
- Academic and open research

# Out-of-Scope Uses

This dataset is not intended for:

- Safety-critical or real-time production systems without additional validation
- Commercial deployment without attribution (CC BY 4.0 required)
- Medical, clinical, legal, or diagnostic applications

# License

Creative Commons Attribution 4.0 International (CC BY 4.0)

# 📬 Contact

For dataset-related queries, please contact: support@humynlabs.ai
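The `dataset_info` metadata declares a single `train` split with 139 examples and the columns `audio`, `file_name`, `gender`, `age`, `language`, `city`, and `transcript`. The following is a minimal sketch of loading that split and counting rows per metadata value. It assumes the Hugging Face `datasets` library is installed and that the repository ID matches the one in the download link; `load_train_split` and `count_by` are illustrative helper names, not part of the dataset or the library.

```python
from collections import Counter

# Repository ID as it appears in the download link on this page.
REPO_ID = "humyn-labs/Indic-High-Fidelity-SingleSpeaker-ASR"


def load_train_split():
    """Fetch the train split from the Hub.

    Needs network access and `pip install datasets`; the import is lazy so
    that count_by below stays usable with the standard library alone.
    """
    from datasets import load_dataset
    return load_dataset(REPO_ID, split="train")


def count_by(rows, column):
    """Count rows per distinct value of a metadata column such as "language"."""
    return Counter(row[column] for row in rows)
```

For a metadata-only pass, `count_by(load_train_split().remove_columns("audio"), "language")` avoids decoding the audio while iterating. The card does not say how values in the `language` column are spelled (codes or full names), so it is worth inspecting the counts before filtering on a specific value.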
Provider:
humyn-labs