
thu-spmi/librispeech-phoneme-labels

Source: Hugging Face. Updated 2025-12-23; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/thu-spmi/librispeech-phoneme-labels
Description:
---
license: apache-2.0
task_categories:
- automatic-speech-recognition
- translation
language:
- en
tags:
- lecture
pretty_name: d
size_categories:
- 10M<n<100M
---

# LibriSpeech IPA Phoneme Labels

This repository provides **IPA-based phoneme annotations and lexicon** for the LibriSpeech dataset.

All phoneme labels are converted from **CMU Pronouncing Dictionary (CMU-Dict)** phonemes into **IPA symbols** using deterministic rules, with the help of the following toolkit:

- https://pypi.org/project/pinyin-to-ipa

The data is intended for **phoneme-based ASR**, **P2G/G2P research**, **phoneme CTC / AED models**, and **cross-lingual phoneme experiments**.

## Dataset Structure

- `train-clean-100-phoneme`
- `train-clean-360-phoneme`
- `train-other-500-phoneme`
- `dev-clean-phoneme`
- `dev-other-phoneme`
- `test-clean-phoneme`
- `test-other-phoneme`
- `lexicon.txt`
- `phone_list`
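
The description says the labels come from a deterministic CMU-Dict-to-IPA conversion but does not give the rule table. The following is a minimal sketch of what such a rule-based conversion can look like, assuming the conventional ARPAbet-to-IPA correspondence and assuming stress digits are simply dropped. The table and the function name `arpabet_to_ipa` are illustrative and may differ from the mapping actually used to build this dataset (for example, in how unstressed `AH0` or `ER` are rendered).

```python
# Illustrative ARPAbet -> IPA table (an assumption, not the dataset's own rules).
ARPABET_TO_IPA = {
    # vowels
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "EH": "ɛ",
    "ER": "ɝ", "IH": "ɪ", "IY": "i", "UH": "ʊ", "UW": "u",
    # diphthongs
    "AW": "aʊ", "AY": "aɪ", "EY": "eɪ", "OW": "oʊ", "OY": "ɔɪ",
    # consonants
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}


def arpabet_to_ipa(phones):
    """Map a list of CMU-Dict (ARPAbet) phones to IPA symbols.

    Stress digits (0, 1, 2) on vowels are stripped before lookup, so the
    mapping is deterministic and context-free.
    """
    ipa = []
    for phone in phones:
        base = phone.rstrip("012")  # "OW1" -> "OW"
        if base not in ARPABET_TO_IPA:
            raise KeyError(f"unknown ARPAbet phone: {phone!r}")
        ipa.append(ARPABET_TO_IPA[base])
    return ipa


# CMU-Dict entry for "hello": HH AH0 L OW1
print(arpabet_to_ipa("HH AH0 L OW1".split()))
```

Because every phone maps to exactly one IPA string regardless of context, the same table can be applied both to the lexicon entries and to the per-utterance label sequences.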

Provider:
thu-spmi