wjustus01/dana-voice-dataset2
Source: Hugging Face. Updated 2025-04-01; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/wjustus01/dana-voice-dataset2
Description:
dana-voice-dataset2 is a TTS (text-to-speech) dataset prepared for fine-tuning with Unsloth. It contains 111 audio samples, all in a single voice named Dana, sampled at 24000 Hz with an average duration of 10.32 seconds. Transcripts average 43.1 tokens, and every transcript stays within 1852 tokens, a limit that already includes a 48-token safety margin. The dataset was pre-processed so that all transcripts are of a suitable length, which avoids the error "AssertionError: Padding is larger than block size" during fine-tuning with libraries like Unsloth. It is formatted for TTS models such as Orpheus. When fine-tuning with Unsloth, set the max_seq_length parameter to 1900 or less.
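The token figures in the description are consistent with a budget of 1900 - 48 = 1852, although the description does not spell out that derivation. The following is a minimal sketch of that arithmetic and of a length check one might run on other transcripts before fine-tuning. The helper names (fits_budget, filter_by_budget) are hypothetical, the token counts in the usage lines are made up for illustration, and treating the 1852 limit as inclusive is an assumption.

```python
# Numbers taken from the dataset description; helper names are hypothetical.
MAX_SEQ_LENGTH = 1900   # recommended upper bound for max_seq_length
SAFETY_MARGIN = 48      # token safety margin stated in the description
TOKEN_LIMIT = MAX_SEQ_LENGTH - SAFETY_MARGIN  # 1852, assumed derivation


def fits_budget(token_count: int, limit: int = TOKEN_LIMIT) -> bool:
    """Return True when a transcript's token count is within the limit.

    The limit is treated as inclusive here, which is an assumption.
    """
    return token_count <= limit


def filter_by_budget(token_counts):
    """Return the indices of transcripts whose token count fits the budget."""
    return [i for i, n in enumerate(token_counts) if fits_budget(n)]


# Approximate total audio implied by the stated statistics.
NUM_SAMPLES = 111
AVG_DURATION_S = 10.32
total_minutes = NUM_SAMPLES * AVG_DURATION_S / 60

# Illustrative token counts (not taken from the dataset).
kept = filter_by_budget([43, 1852, 1853])
print(TOKEN_LIMIT, kept, round(total_minutes, 1))
```

With these statistics the dataset amounts to roughly 19 minutes of audio, and the average transcript (43.1 tokens) sits far below the limit, so the check mainly matters when adding longer transcripts of your own.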
Provided by:
wjustus01



