kennethzhang/all_nba_datasets
Source: Hugging Face. Updated 2025-02-13; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/kennethzhang/all_nba_datasets
Description:
This dataset contains transcribed audio segments intended primarily for fine-tuning automatic speech recognition (ASR) models such as OpenAI's Whisper. Each sample consists of an audio clip, its transcription, and the segment's start and end timestamps. The data is organized into two splits: a training set of 46 examples (~5.7 MB) and a validation set of 16 examples (~1.5 MB). It is particularly useful for tasks that require timestamped speech recognition.
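As a rough illustration of how the samples described above might be consumed, the following Python sketch loads the dataset by its repository ID and computes segment durations from the timestamps. The column names `start` and `end` are assumptions, since this listing does not state the actual schema, and `load_splits`, `segment_duration`, and `total_duration` are hypothetical helper names introduced only for this example. Loading requires the Hugging Face `datasets` package and network access.

```python
from typing import Any, Dict, Iterable

# Repository ID taken from the listing above.
DATASET_ID = "kennethzhang/all_nba_datasets"


def load_splits() -> Any:
    """Download all splits from the Hugging Face Hub.

    Needs the `datasets` package and network access. The import is done
    lazily so the pure-Python helpers below work without it.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


def segment_duration(
    sample: Dict[str, Any],
    start_key: str = "start",  # assumed column name, not confirmed by the listing
    end_key: str = "end",      # assumed column name, not confirmed by the listing
) -> float:
    """Return the length of one segment in the same unit as its timestamps."""
    start = float(sample[start_key])
    end = float(sample[end_key])
    if end < start:
        raise ValueError(f"end ({end}) precedes start ({start})")
    return end - start


def total_duration(samples: Iterable[Dict[str, Any]]) -> float:
    """Sum the durations of an iterable of samples."""
    return sum(segment_duration(s) for s in samples)
```

Calling `load_splits()` and printing the result shows the real split and column names, which should be checked before relying on the assumed `start` and `end` keys. If the columns are named differently, the correct names can be passed to `segment_duration` through `start_key` and `end_key`.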
Provided by:
kennethzhang



