phonemetransformers/IPA-CHILDES

Source: Hugging Face. Updated 2025-04-08. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/phonemetransformers/IPA-CHILDES
Description:
The IPA-CHILDES dataset contains utterances downloaded from the CHILDES child language database, pre-processed and converted into a phonemic representation. It retains many columns from the original CHILDES database and adds several key columns to support language model training, including the pre-processed orthographic transcription, the phonemic transcription, the character-split utterance, and an indicator of whether the utterance was spoken by a child. The dataset covers multiple languages, each phonemized with a specific language code. It is sorted by child age, so the training data can be limited by age. Each language has a detailed description that includes the data sources, number of speakers, number of utterances, number of words, number of phonemes, and the percentage of child utterances.
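
As a rough illustration of how the child indicator and the child-age column might be used to limit training data, here is a minimal sketch on toy rows. The column names (`ipa_transcription`, `is_child`, `target_child_age`), the age unit (months), and the row values are assumptions made for illustration only; check the dataset's actual schema before relying on them.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Toy rows standing in for dataset records. The column names and values
# are hypothetical placeholders, not taken from the real dataset.
ROWS: List[Row] = [
    {"ipa_transcription": "m ɑ m ɑ", "is_child": True, "target_child_age": 12.0},
    {"ipa_transcription": "h ə l oʊ", "is_child": False, "target_child_age": 18.0},
    {"ipa_transcription": "b ɔ l", "is_child": True, "target_child_age": 30.0},
    {"ipa_transcription": "g ʊ d", "is_child": False, "target_child_age": 36.0},
]


def limit_by_age(rows: List[Row], max_age_months: float) -> List[Row]:
    """Keep rows whose (assumed) child age is known and at most max_age_months."""
    return [
        r for r in rows
        if r["target_child_age"] is not None
        and r["target_child_age"] <= max_age_months
    ]


def child_fraction(rows: List[Row]) -> float:
    """Fraction of rows flagged as spoken by a child (0.0 for an empty list)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r["is_child"]) / len(rows)


if __name__ == "__main__":
    young = limit_by_age(ROWS, max_age_months=24)
    print(len(young), child_fraction(ROWS))
```

Because the dataset is described as sorted by child age, an age cutoff like this would select a contiguous prefix of the rows; the filter above does not depend on that ordering.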
Provider:
phonemetransformers