svenskpotatis/Youtube_nat_sci_it_tec_raw
Source: Hugging Face. Updated: 2024-06-23. Indexed: 2024-06-29.
Download link:
https://hf-mirror.com/datasets/svenskpotatis/Youtube_nat_sci_it_tec_raw
Description:
This dataset contains audio files paired with their text transcripts. It is divided into training, test, and validation splits containing 18171, 2272, and 2271 samples respectively. The training split is approximately 16001329422.793297 bytes (about 16 GB), the test split approximately 2009980567.4041076 bytes (about 2 GB), and the validation split approximately 2013805619.834595 bytes (about 2 GB). The download size is 19831297011 bytes (about 19.8 GB), and the total dataset size is 20025115610.031998 bytes (about 20 GB).
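As a quick sanity check on the figures in the description, the sketch below sums the per-split sample counts and byte sizes and derives each split's share of the data. The dictionary keys and the `summarize` helper are illustrative names, not part of the dataset; the numbers are copied from the description.

```python
# Split statistics as stated in the dataset description.
# The keys ("train", "test", "validation") are labels chosen for this sketch.
SPLITS = {
    "train": {"samples": 18171, "bytes": 16001329422.793297},
    "test": {"samples": 2272, "bytes": 2009980567.4041076},
    "validation": {"samples": 2271, "bytes": 2013805619.834595},
}


def summarize(splits):
    """Return total samples, total bytes, and per-split derived figures."""
    total_samples = sum(s["samples"] for s in splits.values())
    total_bytes = sum(s["bytes"] for s in splits.values())
    rows = {}
    for name, s in splits.items():
        rows[name] = {
            "share": s["samples"] / total_samples,            # fraction of all samples
            "gb": s["bytes"] / 1e9,                           # decimal gigabytes
            "mb_per_sample": s["bytes"] / s["samples"] / 1e6,  # average size per sample
        }
    return total_samples, total_bytes, rows


total_samples, total_bytes, rows = summarize(SPLITS)
print(f"total samples: {total_samples}, total size: {total_bytes / 1e9:.2f} GB")
for name, row in rows.items():
    print(f"{name}: {row['share']:.1%} of samples, {row['gb']:.2f} GB, "
          f"{row['mb_per_sample']:.2f} MB per sample on average")
```

The per-split byte sizes add up to the stated total dataset size, and the sample counts correspond to roughly an 80/10/10 split.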
Provider:
svenskpotatis
Summary of original information
Dataset overview
Data features
- Audio: data type is audio.
- Transcript text: data type is string.
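Dataset splits
- Training set:
  - Number of samples: 18171
  - Data size: about 16 GB
- Test set:
  - Number of samples: 2272
  - Data size: about 2 GB
- Validation set:
  - Number of samples: 2271
  - Data size: about 2 GB
Dataset size
- Download size: about 19.8 GB
- Total dataset size: about 20 GB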
Configuration
- Default configuration:
  - Training set path: data/train-*
  - Test set path: data/test-*
  - Validation set path: data/valid-*
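The path patterns in the default configuration can be passed to the Hugging Face `datasets` library. The following is a minimal sketch, assuming the `datasets` package is installed and the repository is reachable; the split name `valid` is inferred from the `data/valid-*` path prefix, and `stream_split` is a hypothetical helper name. Streaming is used so that the roughly 20 GB of data is not downloaded up front.

```python
REPO_ID = "svenskpotatis/Youtube_nat_sci_it_tec_raw"

# Glob patterns taken from the default configuration.
DATA_FILES = {
    "train": "data/train-*",
    "test": "data/test-*",
    "valid": "data/valid-*",  # split name "valid" is an assumption based on the path prefix
}


def stream_split(split="train"):
    """Return a streaming dataset for one split (hypothetical helper)."""
    if split not in DATA_FILES:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(DATA_FILES)}")
    # Imported lazily so the constants above are usable without the package.
    from datasets import load_dataset  # requires `pip install datasets`
    return load_dataset(REPO_ID, data_files=DATA_FILES, split=split, streaming=True)
```

Calling `next(iter(stream_split("test")))` should yield one record as a dictionary holding the audio and its transcript. The exact column names are not stated on this page, so inspect the record's keys before relying on them.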



