dolly-vn/dolly-audio-1000h-vietnamese
Source: Hugging Face. Updated 2025-11-24; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/dolly-vn/dolly-audio-1000h-vietnamese
Description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: audio_filename
    dtype: string
  - name: text
    dtype: string
  - name: voice_id
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 165955597080
    num_examples: 664125
  download_size: 157800059320
  dataset_size: 165955597080
language:
- vi
tags:
- vietnamese
- synthetic
- audio
- tts
size_categories:
- 100K<n<1M
---
# *Dolly-Audio: Vietnamese Multi-Speaker High-Quality Speech Corpus*
## *Dataset Summary*
*Dolly-Audio* is a large-scale, high-quality Vietnamese speech corpus created by the *Dolly AI Team*.
Inspired by Dolly the sheep, the first mammal cloned from an adult somatic cell, the project aims to advance research in Vietnamese speech synthesis, speech recognition, and voice modeling.
This release provides nearly *1,000 hours of professionally cleaned audio*, featuring *152 speakers* across different Vietnamese regions and speaking styles. Text transcripts span a wide variety of domains to ensure linguistic diversity and model robustness.
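As a rough sanity check on these figures, the sketch below combines the example count and byte totals from the card's metadata with the stated duration to estimate per-utterance averages. The 1,000-hour value is an assumption (the card says "nearly 1,000 hours"), and `corpus_stats` is a hypothetical helper, not part of any release tooling.

```python
# Figures taken from the dataset card's metadata block.
NUM_EXAMPLES = 664_125
DATASET_SIZE_BYTES = 165_955_597_080
DOWNLOAD_SIZE_BYTES = 157_800_059_320

# Assumption: the card says "nearly 1,000 hours"; 1,000 is used as a round figure.
APPROX_HOURS = 1_000


def corpus_stats(num_examples, dataset_bytes, hours):
    """Return rough per-utterance averages implied by the card's totals."""
    total_seconds = hours * 3600
    return {
        "avg_seconds_per_utterance": total_seconds / num_examples,
        "avg_bytes_per_utterance": dataset_bytes / num_examples,
        # Includes text columns and container overhead, so this is an
        # upper bound on the audio bitrate rather than an exact value.
        "avg_kbit_per_second": dataset_bytes * 8 / total_seconds / 1000,
    }


stats = corpus_stats(NUM_EXAMPLES, DATASET_SIZE_BYTES, APPROX_HOURS)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
print(f"download size: {DOWNLOAD_SIZE_BYTES / 1e9:.1f} GB")
```

Under that assumption the averages come out to roughly 5.4 seconds and about 250 kB per utterance, which is consistent with sentence-level clips; the download is close to 158 GB, so streaming or partial download may be preferable to a full copy.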
---
## *Key Features*
* ~1,000 hours of high-quality Vietnamese speech
* 152 multi-region speakers with diverse accents
* Cleaned, noise-free audio; no background music
* Sentence-level boundary trimming for natural prosody
* Rich transcript domains (news, entertainment, education, conversational, etc.)
* Estimated *near-zero word error rate (WER ≈ 0%)*, based on manual sampling
* Suitable for TTS, ASR, voice cloning, and speech research
---
## *Intended Use*
The dataset is ideal for:
* Multi-speaker text-to-speech (TTS)
* Automatic speech recognition (ASR)
* Voice cloning and speaker adaptation
* Prosody modeling
* Linguistic and phonetic research
Commercial use is *not permitted* under the license.
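For any of the uses above, rows first have to be read and their audio decoded. The metadata declares the `audio` column with `decode: false`, which in the `datasets` library means each row carries the raw file as a dict with `bytes` and `path` keys rather than a decoded waveform. The sketch below is one way to handle such rows, under several assumptions: the audio container is not stated in the card, so it is guessed from magic bytes; `bytes` is assumed to be populated; and `sniff_audio_format`, `wav_duration_seconds`, `describe_record`, and `stream_records` are hypothetical helper names.

```python
import io
import wave


def sniff_audio_format(data: bytes) -> str:
    """Guess the container from leading magic bytes (hypothetical helper)."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"fLaC":
        return "flac"
    if data[:4] == b"OggS":
        return "ogg"
    # MP3: an ID3 tag, or a frame sync (eleven set bits) at the start.
    if data[:3] == b"ID3" or (len(data) > 1 and data[0] == 0xFF and data[1] & 0xE0 == 0xE0):
        return "mp3"
    return "unknown"


def wav_duration_seconds(data: bytes) -> float:
    """Duration of a PCM WAV payload, using only the standard library."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def describe_record(record: dict) -> dict:
    """Summarize one row whose `audio` field is left undecoded."""
    payload = record["audio"]["bytes"]  # assumption: bytes are embedded
    fmt = sniff_audio_format(payload)
    return {
        "voice_id": record["voice_id"],
        "text": record["text"],
        "format": fmt,
        # Other containers need a third-party decoder, so they are skipped here.
        "seconds": wav_duration_seconds(payload) if fmt == "wav" else None,
    }


def stream_records(repo_id="dolly-vn/dolly-audio-1000h-vietnamese"):
    """Yield rows lazily; needs `pip install datasets` and approved access."""
    # Imported here so the helpers above stay standard-library only.
    from datasets import load_dataset

    yield from load_dataset(repo_id, split="train", streaming=True)


# Example (requires network access and an approved account):
#   for row in stream_records():
#       print(describe_record(row))
#       break
```

Streaming avoids downloading the full corpus before inspecting it. Because access requires approval, a configured Hugging Face login is likely needed before `stream_records` will return data.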
---
## *Usage Restrictions*
* Non-commercial *research use only*
* Redistribution must comply with *CC-BY-NC-SA-4.0*
* Users must verify dataset suitability for their research task
* Institutional email required for access approval
---
## *Citation*
If you use Dolly-Audio in your research, please credit the creators:
*Nguyen Vinh Huy* — [nguyenvinhhuy@dtu.edu.vn](mailto:nguyenvinhhuy@dtu.edu.vn)
*Nguyen Dinh Thuan* — [boyphuthien115@gmail.com](mailto:boyphuthien115@gmail.com)
@dataset{dolly_audio_2025,
  title = {Dolly-Audio: Vietnamese Multi-Speaker High-Quality Speech Corpus},
  author = {Nguyen, Vinh Huy and Nguyen, Dinh Thuan},
  year = {2025},
  publisher = {Dolly AI Team},
  howpublished = {\url{https://huggingface.co/datasets/Dolly-AI/Dolly-Audio}},
  note = {Released under CC-BY-NC-SA-4.0. Research use only.}
}
---
## *Contact*
For access requests or inquiries, please contact the maintainers via the emails above.
Provider:
dolly-vn



