
CALLHOME Mandarin Chinese Lexicon

Source: DataCite Commons. Updated 2021-07-01. Indexed 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC96L15
Description:
The CALLHOME Mandarin Chinese collection includes a lexical component. The CALLHOME Mandarin Lexicon consists of 44,405 words and contains separate information fields with phonological, morphological and frequency information for each word.

The LDC Mandarin lexicon covers 98% of the word tokens occurring in the 20 LDC Mandarin CALLHOME devtest transcripts (ten minutes of conversation each).

Orthographic Chinese characters are GB-encoded and simplified in the Mainland style. Each headword is also given in tone pinyin with strictly lexical tone, i.e. not reflecting phonetic/phonological processes.

A sample page from the lexicon is available at desc/addenda/LDC96L15_eg.gif. The transcripts and documentation (LDC96T16, http://catalog.ldc.upenn.edu/LDC96T16) are available separately, as is a corpus of telephone speech (LDC96S34, http://catalog.ldc.upenn.edu/LDC96S34).
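
As a rough illustration of the token coverage figure quoted in the description, the sketch below computes the share of running tokens that appear in a lexicon. The function name `token_coverage` and the toy data are hypothetical; the actual file layout of the lexicon is not described here, so no parsing is attempted. It assumes the GB encoding mentioned in the description can be read with Python's built-in `gb2312` codec.

```python
from typing import Iterable, Set


def token_coverage(tokens: Iterable[str], lexicon: Set[str]) -> float:
    """Return the fraction of running tokens that appear in the lexicon."""
    total = 0
    covered = 0
    for token in tokens:
        total += 1
        if token in lexicon:
            covered += 1
    if total == 0:
        raise ValueError("no tokens supplied")
    return covered / total


# Toy data (hypothetical, not taken from the corpus).
# Headwords are round-tripped through GB bytes to mimic reading GB-encoded text.
gb_headwords = ["你好".encode("gb2312"), "电话".encode("gb2312")]
toy_lexicon = {raw.decode("gb2312") for raw in gb_headwords}

# Four running tokens; the last one is missing from the toy lexicon.
toy_tokens = ["你好", "电话", "你好", "喂"]

coverage = token_coverage(toy_tokens, toy_lexicon)
print(f"{coverage:.0%}")  # 3 of 4 tokens are covered
```

Coverage here is counted over tokens (every occurrence), not over distinct word types, which matches the "token coverage" wording in the description.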
Provider:
Linguistic Data Consortium
Created:
2020-11-30