
humyn-labs/Indic-High-Fidelity-MultiSpeaker-ASR

Source: Hugging Face. Updated 2026-03-14. Indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/humyn-labs/Indic-High-Fidelity-MultiSpeaker-ASR
Resource description:
---
license: cc-by-4.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: language
    dtype: string
  - name: file_name
    dtype: string
  - name: audio
    dtype: audio
  - name: transcript_json
    dtype: string
  splits:
  - name: train
    num_bytes: 1065664379
    num_examples: 37
  download_size: 1064810251
  dataset_size: 1065664379
task_categories:
- automatic-speech-recognition
language:
- hi
- ml
- mr
- te
- ta
- bn
- kn
- bh
- as
- gu
- pa
tags:
- ASR
- Conversational-speech
- multi-speaker
- indic-languages
size_categories:
- n<1K
---

# Dataset Overview

This dataset contains high-quality multi-speaker conversational audio recordings curated for Automatic Speech Recognition (ASR) research across multiple Indic languages.

The dataset includes:

- Paired audio + timestamped transcripts
- Natural, non-scripted conversational speech
- Dual-speaker interactions
- Segment-level speaker annotations
- Regionally diverse accents

# Audio Specifications

- Format: WAV (PCM 16-bit)
- Sampling Rate: 16 kHz
- Channel: Mono
- Speech Type: Natural conversational dialogue
- Recording Style: Dual-speaker spontaneous interaction
- Typical Duration: 10–30 minutes per recording

All audio files are normalized to ensure consistent duration reporting and playback compatibility.

# Supported Languages

This dataset includes conversational speech recordings in:

- Assamese
- Odia
- Bengali
- Bhojpuri
- Chhattisgarhi
- Gujarati
- Haryanvi
- Hindi
- Punjabi
- Marathi
- Tamil
- Kannada
- Malayalam
- Telugu

The dataset preserves natural accent variation and conversational speech characteristics.
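
The audio specification above (WAV, PCM 16-bit, 16 kHz, mono) can be checked on any downloaded recording with the Python standard library. The following is a minimal sketch, not part of the dataset's own tooling: the function names `check_wav_spec` and `make_silent_wav` are hypothetical, and the silent clip is synthetic test data rather than dataset content.

```python
import io
import wave


def check_wav_spec(wav_bytes, expected_rate=16000, expected_channels=1,
                   expected_sample_width=2):
    """Compare a WAV payload against the stated spec.

    Defaults mirror the card: 16 kHz, mono, 16-bit (2 bytes per sample).
    Returns (ok, info), where info holds the values read from the header.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        info = {
            "sample_rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_seconds": wf.getnframes() / wf.getframerate(),
        }
    ok = (
        info["sample_rate"] == expected_rate
        and info["channels"] == expected_channels
        and info["sample_width_bytes"] == expected_sample_width
    )
    return ok, info


def make_silent_wav(seconds=1.0, rate=16000, channels=1, sample_width=2):
    """Build a silent PCM WAV in memory (synthetic data for the demo only)."""
    n_bytes = int(seconds * rate) * channels * sample_width
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(b"\x00" * n_bytes)
    return buf.getvalue()


if __name__ == "__main__":
    ok, info = check_wav_spec(make_silent_wav(seconds=2.0))
    print(ok, info)
```

For a real recording, the bytes of the file would be passed to `check_wav_spec` in place of the synthetic clip. How the `audio` column exposes raw bytes depends on the loading library and is not specified by the card.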
# Speaker Representation

- Dual-speaker conversational recordings
- Natural, spontaneous dialogue
- Regionally representative speakers
- Conversational turn-taking preserved

# Dataset Creation Methodology

## Data Collection

Speech data was collected from native speakers across multiple Indian regions to ensure:

- Accent diversity
- Natural conversational flow
- Real-world dialogue patterns
- Informal and semi-formal speech contexts

Topics include:

- Everyday life discussions
- Social interactions
- Business and finance
- Public affairs
- General conversational topics

# Transcription Process

- Manual transcription by native speakers
- Reviewed for linguistic accuracy
- Timestamp-level segmentation
- Speaker-labeled segments
- Preserves conversational fillers and natural pauses

Each transcript entry contains:

- start timestamp
- end timestamp
- speaker label
- text content

# Intended Use

Designed for:

- Training and fine-tuning ASR models
- Conversational ASR benchmarking
- Speaker diarization research
- Speaker turn detection
- Multi-speaker modeling
- Academic and open research

# Out-of-Scope Uses

This dataset is not intended for:

- Safety-critical or real-time production systems without additional validation
- Commercial deployment without attribution (CC BY 4.0 required)
- Medical, clinical, legal, or diagnostic applications

# License

Creative Commons Attribution 4.0 International (CC BY 4.0)

# 📬 Contact

For dataset-related queries, please contact: support@humynlabs.ai
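
The transcript entries described under Transcription Process (start timestamp, end timestamp, speaker label, text content) lend themselves to simple per-speaker statistics for diarization or turn-detection work. The sketch below rests on assumptions the card does not confirm: that `transcript_json` decodes to a list of objects, that the keys are named `start`, `end`, `speaker`, and `text`, and that timestamps are numbers in seconds. The sample data and the function name `summarize_transcript` are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical sample; key names and units are assumptions, not the
# dataset's documented schema.
SAMPLE_TRANSCRIPT_JSON = json.dumps([
    {"start": 0.0, "end": 2.5, "speaker": "Speaker 1", "text": "first segment"},
    {"start": 2.5, "end": 4.0, "speaker": "Speaker 2", "text": "second segment"},
    {"start": 4.0, "end": 5.0, "speaker": "Speaker 2", "text": "third segment"},
    {"start": 5.0, "end": 8.0, "speaker": "Speaker 1", "text": "fourth segment"},
])


def summarize_transcript(transcript_json):
    """Count segments and speaker turns, and total talk time per speaker."""
    segments = sorted(json.loads(transcript_json), key=lambda s: s["start"])
    talk_time = defaultdict(float)
    turns = 0
    previous_speaker = None
    for seg in segments:
        talk_time[seg["speaker"]] += seg["end"] - seg["start"]
        # A new turn starts whenever the speaker label changes.
        if seg["speaker"] != previous_speaker:
            turns += 1
            previous_speaker = seg["speaker"]
    return {
        "segments": len(segments),
        "turns": turns,
        "talk_time": dict(talk_time),
    }


if __name__ == "__main__":
    print(summarize_transcript(SAMPLE_TRANSCRIPT_JSON))
```

If the real transcripts use different key names or string-formatted timestamps, only the field access inside the loop would need to change.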
Provided by:
humyn-labs