phonemetransformers/IPA-CHILDES
Source: Hugging Face. Updated 2025-04-08; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/phonemetransformers/IPA-CHILDES
Description:
The IPA-CHILDES dataset contains utterances downloaded from the CHILDES child language database, pre-processed and converted into a phonemic representation. It covers multiple languages, each phonemized with a specific language code. Many columns from the original CHILDES data are retained, and several key columns are added to support language model training: the pre-processed orthographic transcription, the phonemic transcription, the character-split utterance, and an indicator of whether the utterance was spoken by a child. The data is sorted by child age, so the ordering can be used to limit the training data. Each language comes with a detailed description listing its data sources, number of speakers, number of utterances, number of words, number of phonemes, and the percentage of child utterances.
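The age ordering and the child-utterance indicator lend themselves to simple filtering before training. The sketch below illustrates the idea on a few toy rows held in memory; it does not load the real dataset. The keys `phonemes`, `is_child`, and `child_age_months`, the age unit, and the sample values are all hypothetical stand-ins, so check the dataset's own column names before adapting it.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, bool, float]]

# Toy rows standing in for the real data. Keys and values are hypothetical;
# the rows are assumed to be sorted by child age in ascending order.
ROWS: List[Row] = [
    {"phonemes": "h ə l oʊ", "is_child": False, "child_age_months": 12.0},
    {"phonemes": "b ɔ l", "is_child": True, "child_age_months": 18.0},
    {"phonemes": "w ɛ ɹ", "is_child": False, "child_age_months": 30.0},
    {"phonemes": "d ɔ ɡ i", "is_child": True, "child_age_months": 36.0},
]


def limit_by_age(rows: List[Row], max_age_months: float) -> List[Row]:
    """Keep rows up to a child age cutoff.

    Because the rows are assumed sorted by age, iteration can stop at the
    first row past the cutoff.
    """
    kept: List[Row] = []
    for row in rows:
        if row["child_age_months"] > max_age_months:
            break
        kept.append(row)
    return kept


def exclude_child_utterances(rows: List[Row]) -> List[Row]:
    """Drop utterances flagged as spoken by a child."""
    return [row for row in rows if not row["is_child"]]


def percent_child(rows: List[Row]) -> float:
    """Percentage of rows flagged as child utterances (0.0 for no rows)."""
    if not rows:
        return 0.0
    return 100.0 * sum(1 for row in rows if row["is_child"]) / len(rows)


if __name__ == "__main__":
    early = limit_by_age(ROWS, max_age_months=24.0)
    print(len(early), len(exclude_child_utterances(early)), percent_child(ROWS))
```

Stopping at the first row past the cutoff only works if the data really is sorted by age; on unsorted data, a plain filter over all rows would be the safer choice.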
Provider:
phonemetransformers



