
Menlo/raw-speech-whispervq-v2

Source: Hugging Face. Updated 2024-08-30; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/Menlo/raw-speech-whispervq-v2
Description:
---
dataset_info:
- config_name: dutch
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 3129087752
    num_examples: 374287
  download_size: 574575043
  dataset_size: 3129087752
- config_name: english
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 20134474501
    num_examples: 2420047
  download_size: 3786301162
  dataset_size: 20134474501
- config_name: french
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 2168482757
    num_examples: 258213
  download_size: 401737677
  dataset_size: 2168482757
- config_name: german
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 3957952347
    num_examples: 469942
  download_size: 701424594
  dataset_size: 3957952347
- config_name: italian
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 518527170
    num_examples: 62133
  download_size: 94846254
  dataset_size: 518527170
- config_name: polish
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 217204197
    num_examples: 26083
  download_size: 40512114
  dataset_size: 217204197
- config_name: portuguese
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 338470564
    num_examples: 39230
  download_size: 62422578
  dataset_size: 338470564
- config_name: spanish
  features:
  - name: source
    dtype: string
  - name: tokens
    sequence: int64
  - name: index
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 1846859432
    num_examples: 220701
  download_size: 346253619
  dataset_size: 1846859432
configs:
- config_name: dutch
  data_files:
  - split: train
    path: dutch/train-*
- config_name: english
  data_files:
  - split: train
    path: english/train-*
- config_name: french
  data_files:
  - split: train
    path: french/train-*
- config_name: german
  data_files:
  - split: train
    path: german/train-*
- config_name: italian
  data_files:
  - split: train
    path: italian/train-*
- config_name: polish
  data_files:
  - split: train
    path: polish/train-*
- config_name: portuguese
  data_files:
  - split: train
    path: portuguese/train-*
- config_name: spanish
  data_files:
  - split: train
    path: spanish/train-*
---
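
The metadata above describes eight language configs, each with a single train split and four columns (source, tokens, index, text). The following is a minimal sketch of how one config might be read and how the split sizes add up. It assumes the third-party `datasets` library is installed and that the repository is reachable under the ID shown in the title. The names `SPLITS`, `total_examples`, `download_gib`, and `peek_config` are hypothetical helpers introduced here for illustration; they are not part of the dataset card.

```python
from itertools import islice

# Repository ID as given in the page title.
REPO_ID = "Menlo/raw-speech-whispervq-v2"

# Copied from the dataset_info metadata: config -> (num_examples, download_size in bytes).
SPLITS = {
    "dutch": (374287, 574575043),
    "english": (2420047, 3786301162),
    "french": (258213, 401737677),
    "german": (469942, 701424594),
    "italian": (62133, 94846254),
    "polish": (26083, 40512114),
    "portuguese": (39230, 62422578),
    "spanish": (220701, 346253619),
}


def total_examples(splits=SPLITS):
    """Sum num_examples over every config."""
    return sum(num_examples for num_examples, _ in splits.values())


def download_gib(config, splits=SPLITS):
    """Compressed download size of one config, in GiB."""
    return splits[config][1] / 2**30


def peek_config(config, limit=3):
    """Return the first `limit` rows of one config's train split.

    Streaming is used so that the whole config is not downloaded first.
    """
    if config not in SPLITS:
        raise ValueError(f"unknown config {config!r}; expected one of {sorted(SPLITS)}")
    # Imported lazily so the size helpers above work without the library.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, config, split="train", streaming=True)
    return list(islice(stream, limit))
```

Calling `peek_config("polish")` would fetch a few rows from the smallest config; each row is expected to be a dict with the keys `source`, `tokens`, `index`, and `text`, matching the features listed in the metadata. `total_examples()` sums the listed `num_examples` values across all eight configs.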
Provider:
Menlo