five

dianavdavidson/indicvoices-hinglish-spontaneous

Source: Hugging Face. Updated: 2026-04-13. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/dianavdavidson/indicvoices-hinglish-spontaneous
Resource description:
---
dataset_info:
  features:
  - name: audio
    dtype: audio
  - name: text
    dtype: string
  - name: duration
    dtype: float64
  - name: lang
    dtype: string
  - name: samples
    dtype: int64
  - name: verbatim
    dtype: string
  - name: normalized
    dtype: string
  - name: speaker_id
    dtype: string
  - name: scenario
    dtype: string
  - name: task_name
    dtype: string
  - name: gender
    dtype: string
  - name: age_group
    dtype: string
  - name: job_type
    dtype: string
  - name: qualification
    dtype: string
  - name: area
    dtype: string
  - name: district
    dtype: string
  - name: state
    dtype: string
  - name: occupation
    dtype: string
  - name: verification_report
    dtype: string
  - name: unsanitized_verbatim
    dtype: string
  - name: unsanitized_normalized
    dtype: string
  - name: unsanitized_no_noise_inds
    dtype: string
  - name: unsanitized_no_latin
    dtype: string
  - name: hinglish_mixed_scripts
    dtype: string
  - name: hinglish_mixed_script_lowercase
    dtype: string
  - name: english_words
    dtype: string
  - name: count_english_words
    dtype: int64
  - name: count_dev_words_with_dupes
    dtype: int64
  - name: count_dev_words_no_dupes
    dtype: int64
  - name: ratio_hindi_words
    dtype: float64
  - name: ratio_english_words
    dtype: float64
  - name: ratio_english_words_range
    dtype: string
  - name: hindi_words
    dtype: string
  - name: unique_hindi_words
    sequence: string
  - name: unique_hindi_words_count
    dtype: int64
  - name: unique_english_words
    sequence: string
  - name: unique_english_words_count
    dtype: int64
  splits:
  - name: train
    num_bytes: 28403900600.673443
    num_examples: 272168
  - name: valid
    num_bytes: 7222807780.594915
    num_examples: 69213
  - name: test
    num_bytes: 390006987.0170343
    num_examples: 4391
  download_size: 35635929932
  dataset_size: 36016715368.28539
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: valid
    path: data/valid-*
  - split: test
    path: data/test-*
---
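The metadata above is enough to work out the split proportions and to sketch how a few rows might be inspected without downloading the full ~35.6 GB archive. The following is a minimal sketch, not an official loader: the split sizes are copied from the metadata, the helper names (`split_fractions`, `peek`) are hypothetical, and the streaming part assumes the Hugging Face `datasets` library is installed and that the repository is publicly reachable, which is not confirmed here.

```python
"""Sketch: summarize split sizes and stream a few text-only rows."""

# Split sizes copied verbatim from the dataset_info metadata.
SPLITS = {
    "train": {"num_examples": 272168, "num_bytes": 28403900600.673443},
    "valid": {"num_examples": 69213, "num_bytes": 7222807780.594915},
    "test": {"num_examples": 4391, "num_bytes": 390006987.0170343},
}

REPO_ID = "dianavdavidson/indicvoices-hinglish-spontaneous"


def split_fractions(splits=SPLITS):
    """Return each split's share of the total number of examples."""
    total = sum(s["num_examples"] for s in splits.values())
    return {name: s["num_examples"] / total for name, s in splits.items()}


def peek(split="test", n=3,
         columns=("text", "lang", "duration", "ratio_english_words")):
    """Stream the first n rows of a split, keeping only the given columns.

    Assumption: the repository is accessible and the column names match
    the metadata. The import is lazy so the summary helper above works
    even when `datasets` is not installed.
    """
    from datasets import load_dataset

    ds = load_dataset(REPO_ID, split=split, streaming=True)
    # Drop the audio column (and everything else not requested)
    # from the yielded rows.
    ds = ds.select_columns(list(columns))
    return list(ds.take(n))


if __name__ == "__main__":
    for name, frac in split_fractions().items():
        print(f"{name}: {SPLITS[name]['num_examples']} examples ({frac:.1%})")
```

Calling `peek("test")` would be the cheapest way to try the loader, since `test` is the smallest split; the meaning of derived columns such as `ratio_english_words` is not documented in the metadata and should be checked against real rows before relying on it.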
Provider:
dianavdavidson