
voices-with-captions

ModelScope community · Updated 2025-12-05 · Indexed 2025-11-15
Download link:
https://modelscope.cn/datasets/laion/voices-with-captions
Description:
# 🗣️ Synthetic Voice Description Dataset for Voice Conversion

## Overview

This dataset contains **3,180 synthetically generated voice samples**, each paired with a **caption** that roughly describes how the voice sounds. Captions typically include the **age**, **gender**, and **accent** of the voice, and sometimes **general vocal qualities** (e.g., "old woman, Irish accent").

All voices are **synthetically generated** and do not represent any real person. This makes the dataset well suited to training and evaluating **voice conversion models** without infringing on personhood or identity rights.

---

## 📁 Dataset Contents

- A **CSV file** containing pairs of:
  - `description`: A short text caption describing the voice
  - `matched_filename`: The name of the corresponding `.wav` file

**Example rows from the CSV:**

```csv
description,matched_filename
"old woman, Irish accent",00399-000.wav
"old man, valley girl accent",01074-001.wav
"young man, Scottish accent",18297-001.wav
"middle-aged man, Chinese accent",00952-002.wav
```

- A **TAR archive** that contains:
  - The `voice-captions.csv` file (as described above)
  - All 3,180 corresponding `.wav` audio files

---

## 🎯 Purpose

The dataset is designed to serve as a set of **target files** for voice conversion tasks. It can be used to:

- Convert existing voice data into synthetic voices with known, described properties

---

## ⚙️ Google Colab for Seed-VC Fine-Tuning

A community-developed fine-tuning notebook for **Seed-VC** (a voice conversion model) is available on Google Colab:

🔗 **[Seed-VC Fine-Tune Notebook](https://colab.research.google.com/drive/1HeJgMIRpEMd87z5oAcfBfS8_YRLvrwr9?usp=sharing)**

This notebook allows you to take real or synthetic voice data and convert it into one of the target voices.
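As a rough illustration of the CSV layout described under Dataset Contents, the sketch below parses the two documented columns (`description`, `matched_filename`) and looks up target voices by keyword. The sample rows are copied from the dataset card. The helper names `load_captions` and `find_voices` are hypothetical and not part of the dataset or any library.

```python
import csv
import io

# Sample rows copied from the dataset card; the real file is voice-captions.csv.
SAMPLE_CSV = '''description,matched_filename
"old woman, Irish accent",00399-000.wav
"old man, valley girl accent",01074-001.wav
"young man, Scottish accent",18297-001.wav
"middle-aged man, Chinese accent",00952-002.wav
'''


def load_captions(fileobj):
    """Return a dict mapping each .wav filename to its caption (hypothetical helper)."""
    reader = csv.DictReader(fileobj)
    return {row["matched_filename"]: row["description"] for row in reader}


def find_voices(captions, keyword):
    """Return sorted filenames whose caption contains keyword, case-insensitively."""
    needle = keyword.lower()
    return sorted(name for name, text in captions.items() if needle in text.lower())


captions = load_captions(io.StringIO(SAMPLE_CSV))
print(find_voices(captions, "scottish"))
```

To run this against the extracted archive, pass an open file handle instead of the in-memory sample, for example `open("voice-captions.csv", newline="", encoding="utf-8")`. The UTF-8 encoding is an assumption. Plain substring matching is deliberately simple: a search for "man" would also match "woman", so a real filter would likely need stricter matching.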

Provider:
maas
Created:
2025-10-14