doof-ferb/VietMed_unlabeled
Source: Hugging Face. Updated: 2024-07-06. Indexed: 2024-07-22.
Download link:
https://hf-mirror.com/datasets/doof-ferb/VietMed_unlabeled
Description:
The VietMed unlabeled set is a Vietnamese medical-domain speech dataset intended primarily for automatic speech recognition and text-to-speech tasks. It contains 230,516 samples totaling 966 hours of audio. Each sample consists of an audio file (audio type) and a metadata ID (string type). All 230,516 samples belong to a single training split. The download size is 51,899,577,807 bytes, and the reported dataset size is 57,670,081,699.38 bytes. The dataset has one configuration, named default, whose data files match the path data/train-*. Metadata such as domain, ICD-10 code, and accent is also provided.
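As a rough sanity check on the figures quoted above, the following minimal sketch derives the average clip length and converts the byte counts to GiB. The helper names are hypothetical. The `peek_first_sample` function assumes the Hugging Face `datasets` package is installed and that the repository can be streamed from the Hub; neither is stated in the listing.

```python
# Figures copied from the dataset description above.
NUM_SAMPLES = 230_516
TOTAL_HOURS = 966
DOWNLOAD_BYTES = 51_899_577_807
DATASET_BYTES = 57_670_081_699.38

GIB = 1024 ** 3  # bytes per gibibyte


def average_clip_seconds(total_hours, num_samples):
    """Mean audio duration per sample, in seconds."""
    return total_hours * 3600 / num_samples


def bytes_to_gib(num_bytes):
    """Convert a byte count to gibibytes."""
    return num_bytes / GIB


def peek_first_sample(repo_id="doof-ferb/VietMed_unlabeled"):
    """Stream one sample without downloading the whole dataset.

    Assumption: the `datasets` package is installed and the Hub is
    reachable. The import is local so the arithmetic helpers above
    work without it.
    """
    from datasets import load_dataset

    stream = load_dataset(repo_id, split="train", streaming=True)
    return next(iter(stream))


if __name__ == "__main__":
    avg = average_clip_seconds(TOTAL_HOURS, NUM_SAMPLES)
    print(f"average clip length: {avg:.2f} s")
    print(f"download size: {bytes_to_gib(DOWNLOAD_BYTES):.2f} GiB")
    print(f"dataset size: {bytes_to_gib(DATASET_BYTES):.2f} GiB")
```

Streaming is used in the sketch because the full download is tens of gigabytes. Calling `peek_first_sample()` needs network access and may also require an audio decoding backend, so the script does not call it by default.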
Provider:
doof-ferb



