nguyenvulebinh/libris-asr-alignment

Name: nguyenvulebinh/libris-asr-alignment
Creator: nguyenvulebinh
Published: 2024-01-04 09:24:28
License: no description available

Hugging Face; updated 2024-01-04; indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/nguyenvulebinh/libris-asr-alignment


Resource description:

---
dataset_info:
- config_name: default
  features:
  - name: id
    dtype: string
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: words
    sequence: string
  - name: word_start
    sequence: float64
  - name: word_end
    sequence: float64
  - name: entity_start
    sequence: int64
  - name: entity_end
    sequence: int64
  - name: entity_label
    sequence: string
  splits:
  - name: train
    num_bytes: 62881306.53508912
    num_examples: 282
  - name: valid
    num_bytes: 7162211.0760928225
    num_examples: 56
  download_size: 67766544
  dataset_size: 70043517.61118194
- config_name: libris
  features:
  - name: id
    dtype: string
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: words
    sequence: string
  - name: word_start
    sequence: float64
  - name: word_end
    sequence: float64
  - name: entity_start
    sequence: int64
  - name: entity_end
    sequence: int64
  - name: entity_label
    sequence: string
  splits:
  - name: train
    num_bytes: 62881306.53508912
    num_examples: 282
  - name: valid
    num_bytes: 7162211.0760928225
    num_examples: 56
  download_size: 203299632
  dataset_size: 70043517.61118194
- config_name: mustc
  features:
  - name: id
    dtype: string
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: words
    sequence: string
  - name: word_start
    sequence: float64
  - name: word_end
    sequence: float64
  - name: entity_start
    sequence: int64
  - name: entity_end
    sequence: int64
  - name: entity_label
    sequence: string
  splits:
  - name: train
    num_bytes: 55538132.852963656
    num_examples: 249
  - name: valid
    num_bytes: 2617438.3984375
    num_examples: 15
  download_size: 58416692
  dataset_size: 58155571.251401156
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: valid
    path: data/valid-*
- config_name: libris
  data_files:
  - split: train
    path: libris/train-*
  - split: valid
    path: libris/valid-*
- config_name: mustc
  data_files:
  - split: train
    path: mustc/train-*
  - split: valid
    path: mustc/valid-*
---

Provider:

nguyenvulebinh

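Summary of original information

Dataset overview

Dataset configurations

Default configuration (`default`)

Features:
- id: string
- text: string
- audio: audio, sampling rate 16000 Hz
- words: sequence of strings
- word_start: sequence of floats (float64)
- word_end: sequence of floats (float64)
- entity_start: sequence of integers (int64)
- entity_end: sequence of integers (int64)
- entity_label: sequence of strings
Splits:
- train: 62881306.53508912 bytes, 282 examples
- valid: 7162211.0760928225 bytes, 56 examples
Download size: 67766544 bytes
Dataset size: 70043517.61118194 bytes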
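
The `words`, `word_start`, and `word_end` features read as three parallel sequences, one entry per word. A minimal sketch of pairing them up follows. It assumes the timestamps are seconds from the start of the clip, which the card does not state; `AlignedWord`, `align_words`, and the sample record are hypothetical names and values made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AlignedWord:
    word: str
    start: float  # assumption: seconds from the start of the clip
    end: float    # assumption: seconds from the start of the clip

    @property
    def duration(self) -> float:
        return self.end - self.start


def align_words(record: Dict[str, list]) -> List[AlignedWord]:
    """Zip the three parallel sequences into one list of aligned words."""
    words = record["words"]
    starts = record["word_start"]
    ends = record["word_end"]
    if not (len(words) == len(starts) == len(ends)):
        raise ValueError("words, word_start and word_end must have equal length")
    return [AlignedWord(w, s, e) for w, s, e in zip(words, starts, ends)]


# Hypothetical record; the values are invented, not taken from the dataset.
example = {
    "words": ["hello", "world"],
    "word_start": [0.0, 0.5],
    "word_end": [0.4, 1.0],
}
aligned = align_words(example)
```

Raising on a length mismatch is a design choice: silently truncating with `zip` would hide a corrupted record.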

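Libris configuration (`libris`)

Features:
- id: string
- text: string
- audio: audio, sampling rate 16000 Hz
- words: sequence of strings
- word_start: sequence of floats (float64)
- word_end: sequence of floats (float64)
- entity_start: sequence of integers (int64)
- entity_end: sequence of integers (int64)
- entity_label: sequence of strings
Splits:
- train: 62881306.53508912 bytes, 282 examples
- valid: 7162211.0760928225 bytes, 56 examples
Download size: 203299632 bytes
Dataset size: 70043517.61118194 bytes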
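
The entity features (`entity_start`, `entity_end`, `entity_label`) also look like parallel sequences, one entry per entity. The card does not say whether the offsets are character positions in `text` or indices into `words`, so the sketch below treats them as opaque integers and only groups spans by label. `group_entities`, the sample record, and the label names `PER` and `LOC` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_entities(record: Dict[str, list]) -> Dict[str, List[Tuple[int, int]]]:
    """Group (start, end) offsets by entity label, preserving record order."""
    starts = record["entity_start"]
    ends = record["entity_end"]
    labels = record["entity_label"]
    if not (len(starts) == len(ends) == len(labels)):
        raise ValueError("entity_start, entity_end and entity_label must have equal length")
    grouped = defaultdict(list)
    for label, start, end in zip(labels, starts, ends):
        # The offsets stay uninterpreted: their unit is not documented in the card.
        grouped[label].append((start, end))
    return dict(grouped)


# Hypothetical record; offsets and labels are invented for illustration.
example = {
    "entity_start": [0, 5, 9],
    "entity_end": [2, 6, 11],
    "entity_label": ["PER", "LOC", "PER"],
}
entities = group_entities(example)
```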

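MuST-C configuration (`mustc`)

Features:
- id: string
- text: string
- audio: audio, sampling rate 16000 Hz
- words: sequence of strings
- word_start: sequence of floats (float64)
- word_end: sequence of floats (float64)
- entity_start: sequence of integers (int64)
- entity_end: sequence of integers (int64)
- entity_label: sequence of strings
Splits:
- train: 55538132.852963656 bytes, 249 examples
- valid: 2617438.3984375 bytes, 15 examples
Download size: 58416692 bytes
Dataset size: 58155571.251401156 bytes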
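
The split figures above can be cross-checked with a little arithmetic: the per-split byte counts should add up to the reported dataset size, and the example counts give the validation share. The sketch below uses only the numbers listed in this page; the `CONFIGS` layout and the `summarize` helper are hypothetical.

```python
import math

# (num_bytes, num_examples) per split, copied from the configurations above.
CONFIGS = {
    "default": {
        "splits": {"train": (62881306.53508912, 282), "valid": (7162211.0760928225, 56)},
        "dataset_size": 70043517.61118194,
    },
    "libris": {
        "splits": {"train": (62881306.53508912, 282), "valid": (7162211.0760928225, 56)},
        "dataset_size": 70043517.61118194,
    },
    "mustc": {
        "splits": {"train": (55538132.852963656, 249), "valid": (2617438.3984375, 15)},
        "dataset_size": 58155571.251401156,
    },
}


def summarize(config: dict) -> dict:
    """Derive totals and check that split sizes add up to dataset_size."""
    splits = config["splits"]
    total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
    total_examples = sum(count for _, count in splits.values())
    return {
        "total_examples": total_examples,
        "valid_fraction": splits["valid"][1] / total_examples,
        "bytes_per_example": total_bytes / total_examples,
        "size_matches": math.isclose(total_bytes, config["dataset_size"], rel_tol=1e-9),
    }


summaries = {name: summarize(cfg) for name, cfg in CONFIGS.items()}
```

Note that `default` and `libris` report identical splits and dataset size; only their download sizes differ.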

Data file paths

Default configuration (`default`)

train: data/train-*
valid: data/valid-*

Libris configuration (`libris`)

train: libris/train-*
valid: libris/valid-*

MuST-C configuration (`mustc`)

train: mustc/train-*
valid: mustc/valid-*
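
Each configuration maps a split name to a glob pattern. As a rough sketch, the patterns can be matched with Python's `fnmatch`, which only approximates how the Hugging Face Hub resolves them. The file names used with `split_for` are hypothetical, and `load_split` assumes the third-party `datasets` library is installed and that the repository is reachable on the Hub rather than through the mirror URL above.

```python
from fnmatch import fnmatch
from typing import Optional

# Split-to-pattern mapping, copied from the data file paths above.
DATA_FILES = {
    "default": {"train": "data/train-*", "valid": "data/valid-*"},
    "libris": {"train": "libris/train-*", "valid": "libris/valid-*"},
    "mustc": {"train": "mustc/train-*", "valid": "mustc/valid-*"},
}


def split_for(config: str, filename: str) -> Optional[str]:
    """Return the split whose pattern matches filename, or None if none does."""
    for split, pattern in DATA_FILES[config].items():
        if fnmatch(filename, pattern):
            return split
    return None


def load_split(config: str, split: str):
    """Load one split of one configuration (requires network access)."""
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset("nguyenvulebinh/libris-asr-alignment", config, split=split)
```

For example, `split_for("mustc", "mustc/train-00000-of-00001.parquet")` resolves to `"train"`, while a file under another configuration's directory resolves to `None`.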
