Custom Common Voice 16.0 dataset using RVC 14min data
arXiv | Updated 2024-03-01 | Indexed 2024-06-21
Download link:
https://huggingface.co/Aniket-Tathe08/Custom Common Voice 16.0 dataset using RVC 14min data
Description:
The 'Custom Common Voice 16.0 dataset using RVC 14min data' dataset was created in a study led by Aniket Tathe of MES College of Engineering. It contains 14 minutes of personalized audio extracted from YouTube videos and processed with a Retrieval-Based Voice Conversion (RVC) model, for use in training Automatic Speech Recognition (ASR) models. Creating the dataset involved extracting the audio, converting it, and formatting it for model training. The dataset is intended mainly for automatic speech recognition and translation of personalized speech, particularly in low-resource languages such as Hindi, with the goal of improving recognition accuracy and translation fluency.
Provided by:
MES College of Engineering
Created:
2024-03-01



