calistacxy/imda-dataset

Hugging Face | Updated 2023-05-17 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/calistacxy/imda-dataset
Resource description:
dataset_info (60 configurations, fields taken from the dataset card metadata)

Features, identical for every configuration:

  • audio: dtype audio, sampling_rate: 16000
  • transcript: dtype string
  • mic: dtype string
  • audio_name: dtype string

Configurations prefixed CHANNEL0, CHANNEL1, or CHANNEL2 (45 configurations, identical statistics):

  • config_name: CHANNEL0FCHINESE, CHANNEL0FINDIAN, CHANNEL0FMALAY, CHANNEL0FOTHERS, CHANNEL0Fall, CHANNEL0MCHINESE, CHANNEL0MINDIAN, CHANNEL0MMALAY, CHANNEL0MOTHERS, CHANNEL0Mall, CHANNEL0allCHINESE, CHANNEL0allINDIAN, CHANNEL0allMALAY, CHANNEL0allOTHERS, CHANNEL0allall, CHANNEL1FCHINESE, CHANNEL1FINDIAN, CHANNEL1FMALAY, CHANNEL1FOTHERS, CHANNEL1Fall, CHANNEL1MCHINESE, CHANNEL1MINDIAN, CHANNEL1MMALAY, CHANNEL1MOTHERS, CHANNEL1Mall, CHANNEL1allCHINESE, CHANNEL1allINDIAN, CHANNEL1allMALAY, CHANNEL1allOTHERS, CHANNEL1allall, CHANNEL2FCHINESE, CHANNEL2FINDIAN, CHANNEL2FMALAY, CHANNEL2FOTHERS, CHANNEL2Fall, CHANNEL2MCHINESE, CHANNEL2MINDIAN, CHANNEL2MMALAY, CHANNEL2MOTHERS, CHANNEL2Mall, CHANNEL2allCHINESE, CHANNEL2allINDIAN, CHANNEL2allMALAY, CHANNEL2allOTHERS, CHANNEL2allall
  • train: num_bytes: 105139921, num_examples: 682
  • test: num_bytes: 103309694, num_examples: 693
  • download_size: 0
  • dataset_size: 208449615

Configurations prefixed all (15 configurations, identical statistics):

  • config_name: allFCHINESE, allFINDIAN, allFMALAY, allFOTHERS, allFall, allMCHINESE, allMINDIAN, allMMALAY, allMOTHERS, allMall, allallCHINESE, allallINDIAN, allallMALAY, allallOTHERS, allallall
  • train: num_bytes: 315419763, num_examples: 2046
  • test: num_bytes: 309929082, num_examples: 2079
  • download_size: 0
  • dataset_size: 625348845
Provider:
calistacxy
Summary of original information

Dataset overview

Dataset configuration names

  • CHANNEL0FCHINESE
  • CHANNEL0FINDIAN
  • CHANNEL0FMALAY
  • CHANNEL0FOTHERS
  • CHANNEL0Fall
  • CHANNEL0MCHINESE
  • CHANNEL0MINDIAN
  • CHANNEL0MMALAY
  • CHANNEL0MOTHERS
  • CHANNEL0Mall
  • CHANNEL0allCHINESE
  • CHANNEL0allINDIAN
  • CHANNEL0allMALAY
  • CHANNEL0allOTHERS
  • CHANNEL0allall
  • CHANNEL1FCHINESE
  • CHANNEL1FINDIAN
  • CHANNEL1FMALAY
  • CHANNEL1FOTHERS
  • CHANNEL1Fall
  • CHANNEL1MCHINESE
  • CHANNEL1MINDIAN
  • CHANNEL1MMALAY
  • CHANNEL1MOTHERS
  • CHANNEL1Mall
  • CHANNEL1allCHINESE
  • CHANNEL1allINDIAN
  • CHANNEL1allMALAY
  • CHANNEL1allOTHERS
  • CHANNEL1allall
  • CHANNEL2FCHINESE
  • CHANNEL2FINDIAN
  • CHANNEL2FMALAY
  • CHANNEL2FOTHERS
  • CHANNEL2Fall
  • CHANNEL2MCHINESE
  • CHANNEL2MINDIAN
  • CHANNEL2MMALAY
  • CHANNEL2MOTHERS
  • CHANNEL2Mall
  • CHANNEL2allCHINESE
  • CHANNEL2allINDIAN
  • CHANNEL2allMALAY
  • CHANNEL2allOTHERS
  • CHANNEL2allall
  • allFCHINESE
  • allFINDIAN
  • allFMALAY
  • allFOTHERS
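  • allFall
  • allMCHINESE
  • allMINDIAN
  • allMMALAY
  • allMOTHERS
  • allMall
  • allallCHINESE
  • allallINDIAN
  • allallMALAY
  • allallOTHERS
  • allallall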
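The configuration names follow a visible three-part pattern, so they can be generated rather than typed out. The sketch below is only an illustration of that pattern: the metadata does not say what each part denotes, so the constant names `PREFIXES`, `MIDDLES`, and `SUFFIXES` and the helper `config_names` are hypothetical and deliberately neutral.

```python
from itertools import product

# Name components as they appear in the configuration names of the metadata.
# Their meaning is not documented there, so the names below are neutral
# placeholders chosen for this sketch.
PREFIXES = ["CHANNEL0", "CHANNEL1", "CHANNEL2", "all"]
MIDDLES = ["F", "M", "all"]
SUFFIXES = ["CHINESE", "INDIAN", "MALAY", "OTHERS", "all"]


def config_names():
    """Return every configuration name, in the order used by the metadata."""
    # product() varies the last component fastest, which matches the
    # metadata order: ...FCHINESE, ...FINDIAN, ..., ...Fall, ...MCHINESE, ...
    return [p + m + s for p, m, s in product(PREFIXES, MIDDLES, SUFFIXES)]


names = config_names()
print(len(names))            # number of generated names
print(names[0], names[-1])   # first and last name
```

Four prefixes, three middle parts, and five suffixes give the 60 configurations declared in the dataset metadata.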

Dataset features

  • audio: sampling rate 16000
  • transcript: data type string
  • mic: data type string
  • audio_name: data type string
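As a rough sketch of how these features might be used, the code below checks one row against the listed schema and shows how a configuration could be requested with the Hugging Face `datasets` library. Several things here are assumptions: the helper names `check_row` and `load_split` are hypothetical, the example row is invented, the decoded audio is assumed to expose a `sampling_rate` key, and it is not confirmed that the repository can still be loaded this way.

```python
EXPECTED_SAMPLING_RATE = 16000
STRING_FIELDS = ("transcript", "mic", "audio_name")


def check_row(row):
    """Return a list of schema problems for one row; an empty list means OK."""
    problems = []
    for field in STRING_FIELDS:
        if not isinstance(row.get(field), str):
            problems.append(f"{field} is missing or not a string")
    audio = row.get("audio")
    if audio is None:
        problems.append("audio is missing")
    elif audio["sampling_rate"] != EXPECTED_SAMPLING_RATE:
        # Assumption: decoded audio is dict-like with a "sampling_rate" key.
        problems.append(f"unexpected sampling rate {audio['sampling_rate']}")
    return problems


def load_split(config_name, split="train"):
    """Load one configuration; needs the `datasets` package and network access."""
    # Imported lazily so that check_row can be used without the package.
    from datasets import load_dataset
    return load_dataset("calistacxy/imda-dataset", config_name, split=split)


# Hypothetical row, invented for illustration; not taken from the dataset.
example_row = {
    "audio": {"array": [0.0, 0.0], "sampling_rate": 16000},
    "transcript": "hello",
    "mic": "placeholder-mic",
    "audio_name": "example.wav",
}
print(check_row(example_row))
```

A call such as `load_split("CHANNEL0FCHINESE")` would then request the train split of that configuration, provided the repository is reachable.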

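Dataset splits

  • train:
    • Bytes: 105139921
    • Examples: 682
  • test:
    • Bytes: 103309694
    • Examples: 693

Dataset size

  • Download size: 0
  • Total dataset size: 208449615

These figures apply to each CHANNEL0, CHANNEL1, and CHANNEL2 configuration. Each configuration whose name starts with "all" reports 315419763 bytes and 2046 examples for train, 309929082 bytes and 2079 examples for test, and a total dataset size of 625348845.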
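The reported sizes are internally consistent, which a few lines of arithmetic can confirm. The sketch below copies the figures from the dataset metadata; the dictionary and function names are hypothetical, and the factor of three is an observation about the numbers, not a documented property of the dataset.

```python
# Split statistics copied from the dataset metadata.
CHANNEL_STATS = {
    "train_bytes": 105139921,
    "train_examples": 682,
    "test_bytes": 103309694,
    "test_examples": 693,
    "dataset_size": 208449615,
}
ALL_STATS = {
    "train_bytes": 315419763,
    "train_examples": 2046,
    "test_bytes": 309929082,
    "test_examples": 2079,
    "dataset_size": 625348845,
}


def splits_sum_to_total(stats):
    """True if train bytes plus test bytes equal the reported dataset size."""
    return stats["train_bytes"] + stats["test_bytes"] == stats["dataset_size"]


def is_multiple(big, small, factor):
    """True if every figure in `big` is `factor` times the one in `small`."""
    return all(big[key] == factor * small[key] for key in small)


print(splits_sum_to_total(CHANNEL_STATS))
print(splits_sum_to_total(ALL_STATS))
print(is_multiple(ALL_STATS, CHANNEL_STATS, 3))
```

All three checks hold: each total is the sum of its splits, and every figure for the configurations prefixed "all" is exactly three times the per-channel figure.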

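Summary statistics

  • The 45 CHANNEL0, CHANNEL1, and CHANNEL2 configurations share the same dataset size and split statistics and differ only in name; the 15 configurations prefixed "all" also match one another but report figures three times as large.
  • The audio sampling rate is 16000 in every configuration.
  • Every configuration has the same features: audio, transcript, microphone identifier (mic), and audio name (audio_name).
  • Within each of the two groups, the train and test splits have the same example counts and byte counts across configurations.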