LJ Speech - Aligned IPA transcriptions
Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/7356907
Description:
Files:
grids.zip
contains TextGrids for all audio files, each with three tiers: words, phonemes and transcription
words contains the aligned, normalized English words
phonemes contains IPA pronunciations: the words were transcribed using the CMU dictionary and aligned with the Montreal Forced Aligner, then the pronunciations were mapped from ARPAbet to IPA and duration marks were applied (without punctuation)
transcription contains the unaligned phonemes, including punctuation and word boundary labels (SIL0)
preview.png
preview of the first TextGrid opened in Praat
words-vocabulary.txt
contains all words from tier words
phonemes-vocabulary.txt
contains all phonemes from tier phonemes
transcription-vocabulary.txt
contains all phonemes/punctuation from tier transcription
phonemes-durations.pdf
contains the plotted phoneme duration distribution of tier phonemes
phonemes-durations-simple.pdf
contains the plotted phoneme duration distribution of tier phonemes if all duration markers are ignored
pronunciations.dict
contains the pronunciations of each word, including punctuation, together with their weights (occurrence)
script.sh
contains the script to reproduce all results
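The three tiers in a TextGrid from grids.zip could be read along the following lines. This is only a minimal sketch: it assumes the files use Praat's long text format with interval tiers and that the text has already been decoded (the encoding of the files is not stated here). The names read_tiers and SAMPLE are hypothetical, and the sample content is made up for illustration rather than taken from the dataset.

```python
import re

# One interval in Praat's long text format: xmin, xmax, then a quoted label.
INTERVAL_RE = re.compile(
    r'xmin\s*=\s*([\d.]+)\s*'
    r'xmax\s*=\s*([\d.]+)\s*'
    r'text\s*=\s*"((?:[^"]|"")*)"'
)
TIER_NAME_RE = re.compile(r'name\s*=\s*"([^"]*)"')


def read_tiers(textgrid_text):
    """Return {tier_name: [(start, end, label), ...]} for interval tiers."""
    tiers = {}
    # Each tier starts with 'item [n]:'; the chunk before the first is the header.
    for chunk in re.split(r'item\s*\[\d+\]\s*:', textgrid_text)[1:]:
        name = TIER_NAME_RE.search(chunk)
        if name is None:
            continue
        tiers[name.group(1)] = [
            # Praat escapes a double quote inside a label by doubling it.
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
    return tiers


# Made-up two-tier example, not actual dataset content.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "the"
    item [2]:
        class = "IntervalTier"
        name = "phonemes"
        xmin = 0
        xmax = 0.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.2
            text = "ð"
        intervals [2]:
            xmin = 0.2
            xmax = 0.5
            text = "ə"
'''

tiers = read_tiers(SAMPLE)
print(tiers["words"])
print(tiers["phonemes"])
```

For real work, an established TextGrid library is likely a safer choice than a regular expression, since it would also cover the short text format and point tiers.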
Phoneme duration marker:
˘ -> [0, 20) percentile
ˑ -> [80, 90) percentile
ː -> [90, inf) percentile
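The percentile ranges above can be sketched as a small lookup. This is an illustration, not the dataset's own script: it assumes that phonemes in the [20, 80) range stay unmarked, and it leaves open whether the percentiles are computed per phoneme or over all phonemes, which the description does not say. The function names are hypothetical.

```python
import statistics

SHORT, HALF_LONG, LONG = "˘", "ˑ", "ː"


def duration_cut_points(durations):
    """Return the 20th, 80th and 90th percentiles of the reference durations."""
    # quantiles(n=100) yields 99 cut points; index k - 1 holds the k-th percentile.
    cuts = statistics.quantiles(durations, n=100, method="inclusive")
    return cuts[19], cuts[79], cuts[89]


def add_duration_marker(phoneme, duration, cut_points):
    """Append the marker whose percentile range contains the duration."""
    p20, p80, p90 = cut_points
    if duration < p20:
        return phoneme + SHORT
    if duration < p80:
        return phoneme  # assumption: [20, 80) carries no marker
    if duration < p90:
        return phoneme + HALF_LONG
    return phoneme + LONG


def strip_duration_marker(label):
    """Drop trailing duration markers, as in the 'simple' duration plot."""
    return label.rstrip(SHORT + HALF_LONG + LONG)


# Made-up reference durations (0..100), chosen so the cut points are round numbers.
cuts = duration_cut_points(list(range(101)))
print(cuts)
print(add_duration_marker("ə", 95, cuts))
```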
Silence marker:
SIL0 -> no silence
SIL1 -> [0, 33.33) percentile
SIL2 -> [33.33, 66.66) percentile
SIL3 -> [66.66, inf) percentile
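The silence labels follow the same idea with tercile boundaries. The sketch below is one possible reading: it assumes "no silence" means a pause of zero length and that the terciles are taken over all non-zero pauses, neither of which is stated in the description. The names silence_label and pauses_ms are hypothetical, and the pause values are made up.

```python
import statistics
from bisect import bisect_right


def silence_label(duration, reference_pauses):
    """Map a pause duration to SIL0..SIL3 using tercile cut points."""
    if duration <= 0:
        return "SIL0"  # assumption: zero-length pause means no silence
    # quantiles(n=3) yields the two boundaries at 33.33% and 66.66%.
    boundaries = statistics.quantiles(reference_pauses, n=3, method="inclusive")
    # bisect_right counts the boundaries <= duration, so each lower bound is inclusive.
    return "SIL" + str(1 + bisect_right(boundaries, duration))


# Made-up pause durations in milliseconds.
pauses_ms = [10, 20, 30, 40, 50, 60, 70]
for pause in (0, 20, 30, 40, 50, 65):
    print(pause, silence_label(pause, pauses_ms))
```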
Created:
2024-07-15



