dolly-vn/dolly-audio-1000h-vietnamese

Name: dolly-vn/dolly-audio-1000h-vietnamese
Creator: dolly-vn
Published: 2025-11-24 13:06:36
License: No description provided

Source: Hugging Face · Updated 2025-11-24 · Indexed 2025-12-20

Download link:

https://hf-mirror.com/datasets/dolly-vn/dolly-audio-1000h-vietnamese

Resource description:

---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: audio_filename
    dtype: string
  - name: text
    dtype: string
  - name: voice_id
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 165955597080
    num_examples: 664125
  download_size: 157800059320
  dataset_size: 165955597080
language:
- vi
tags:
- vietnamese
- synthetic
- audio
- tts
size_categories:
- 100K<n<1M
---

# *Dolly-Audio: Vietnamese Multi-Speaker High-Quality Speech Corpus*

## *Dataset Summary*

*Dolly-Audio* is a large-scale, high-quality Vietnamese speech corpus created by the *Dolly AI Team*. Inspired by Dolly, the world’s first cloned mammal, the project aims to advance research in Vietnamese speech synthesis, speech recognition, and voice modeling.

This release provides nearly *1,000 hours of professionally cleaned audio*, featuring *152 speakers* across different Vietnamese regions and speaking styles. Text transcripts span a wide variety of domains to ensure linguistic diversity and model robustness.

---

## *Key Features*

* ~1,000 hours of high-quality Vietnamese speech
* 152 multi-region speakers with diverse accents
* Cleaned, noise-free audio; no background music
* Sentence-level boundary trimming for natural prosody
* Rich transcript domains (news, entertainment, education, conversational, etc.)
* Estimated *near-zero WER (≈ 0%)* from manual sampling
* Suitable for TTS, ASR, voice cloning, and speech research

---

## *Intended Use*

The dataset is ideal for:

* Multi-speaker text-to-speech (TTS)
* Automatic speech recognition (ASR)
* Voice cloning and speaker adaptation
* Prosody modeling
* Linguistic and phonetic research

Commercial use is *not permitted* under the license.
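The figures in the card's metadata (664,125 examples, 165,955,597,080 bytes, 152 speakers, nearly 1,000 hours) can be combined into rough per-clip and per-speaker averages, which helps when budgeting storage and planning batching. The sketch below is only back-of-the-envelope arithmetic: it assumes a total of exactly 1,000 hours, which the card gives only as an approximate upper figure, and it says nothing about how clips are actually distributed across speakers.

```python
# Figures copied from the card's metadata block and summary.
NUM_EXAMPLES = 664_125
DATASET_SIZE_BYTES = 165_955_597_080
DOWNLOAD_SIZE_BYTES = 157_800_059_320
NUM_SPEAKERS = 152

# Assumption: the card says "nearly 1,000 hours"; 1,000 is used as a round figure.
APPROX_HOURS = 1_000


def corpus_stats() -> dict:
    """Derive rough averages from the published totals."""
    total_seconds = APPROX_HOURS * 3600
    return {
        "avg_clip_seconds": total_seconds / NUM_EXAMPLES,
        "avg_bytes_per_clip": DATASET_SIZE_BYTES / NUM_EXAMPLES,
        "avg_hours_per_speaker": APPROX_HOURS / NUM_SPEAKERS,
        "avg_clips_per_speaker": NUM_EXAMPLES / NUM_SPEAKERS,
        "dataset_size_gb": DATASET_SIZE_BYTES / 1e9,
        "download_size_gb": DOWNLOAD_SIZE_BYTES / 1e9,
    }


if __name__ == "__main__":
    for name, value in corpus_stats().items():
        print(f"{name}: {value:,.2f}")
```

Under that assumption, the averages work out to roughly 5.4 seconds and about 250 KB per clip, which is consistent with the sentence-level trimming described under Key Features.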
---

## *Usage Restrictions*

* Non-commercial *research use only*
* Redistribution must comply with *CC-BY-NC-SA-4.0*
* Users must verify dataset suitability for their research task
* Institutional email required for access approval

---

## *Citation*

If you use Dolly-Audio in your research, please credit the creators:

* *Nguyen Vinh Huy*: [nguyenvinhhuy@dtu.edu.vn](mailto:nguyenvinhhuy@dtu.edu.vn)
* *Nguyen Dinh Thuan*: [boyphuthien115@gmail.com](mailto:boyphuthien115@gmail.com)

    @dataset{dolly_audio_2025,
      title = {Dolly-Audio: Vietnamese Multi-Speaker High-Quality Speech Corpus},
      author = {Nguyen, Vinh Huy and Nguyen, Dinh Thuan},
      year = {2025},
      publisher = {Dolly AI Team},
      howpublished = {\url{https://huggingface.co/datasets/Dolly-AI/Dolly-Audio}},
      note = {Released under CC-BY-NC-SA-4.0. Research use only.}
    }

---

## *Contact*

For access requests or inquiries, please contact the maintainers via the emails above.
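The card's metadata declares a single `train` split with the columns `audio_filename`, `text`, `voice_id`, and an `audio` column stored with `decode: false`. A minimal sketch of reading such records with the Hugging Face `datasets` library follows. It assumes that access has already been approved and that you are authenticated with the Hub, and that each undecoded `audio` field arrives as a dictionary with `bytes` and `path` keys, which is how the library normally represents audio when decoding is disabled. The helper names `audio_payload` and `stream_examples` are hypothetical and not part of the dataset or the library.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

# Repository name as shown in the listing above.
REPO_ID = "dolly-vn/dolly-audio-1000h-vietnamese"


def audio_payload(example: Dict[str, Any]) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (raw_bytes, path) from an undecoded audio field.

    Assumes the field is a dict with "bytes" and "path" keys; either may be None.
    """
    audio = example.get("audio") or {}
    return audio.get("bytes"), audio.get("path")


def stream_examples(limit: int = 3) -> Iterator[Dict[str, Any]]:
    """Yield the first `limit` training examples without a full download."""
    # Imported lazily so audio_payload works without the library installed.
    from datasets import load_dataset

    ds = load_dataset(REPO_ID, split="train", streaming=True)
    for i, example in enumerate(ds):
        if i >= limit:
            break
        yield example


if __name__ == "__main__":
    for ex in stream_examples():
        raw, path = audio_payload(ex)
        size = len(raw) if raw is not None else 0
        print(ex["voice_id"], ex["audio_filename"], size, "bytes:", ex["text"][:60])
```

Streaming avoids fetching the full download of roughly 158 GB just to inspect a few records. The raw bytes still need to be decoded with an audio library of your choice, since the card does not state the container format or sample rate.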
