
SemTransCNC

DataCite Commons. Updated 2021-07-01; indexed 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2020T12
Resource description:
<h3>Introduction</h3><br> <p>SemTransCNC was developed by <a href="https://www.polyu.edu.hk/en/">The Hong Kong Polytechnic University</a>. It is a semantic transparency dataset of Chinese nominal compounds built through a series of crowd-based experiments.</p><br> <p>Nominal compounds were selected from the <a href="https://ckip.iis.sinica.edu.tw/project/sinicacorpus/">Sinica Corpus</a> and a modern Chinese lexicon. Crowd workers answered questionnaires that included demographic information and questions about the Chinese language. To assess the overall semantic transparency (OST) of the selected compounds, they answered the question: "How is the sum of the meanings of <em>A</em> and <em>B</em> similar to the meaning of <em>AB</em>?" To assess constituent semantic transparency (CST), they were asked to describe how similar the meaning of <em>A</em> alone is to its meaning in <em>AB</em>, and how similar the meaning of <em>B</em> alone is to its meaning in <em>AB</em>.</p><br> <h3>Data</h3><br> <p>SemTransCNC contains OST and CST data for 1,176 dimorphemic Chinese nominal compounds, all of which consist of free morphemes and have mid-range frequencies.</p><br> <p>The text data is presented as a UTF-8 encoded, comma-separated text file.</p><br> <h3>Samples</h3><br> <p>Please view this <a href="desc/addenda/LDC2020T12.csv">text sample (CSV)</a>.</p><br> <h3>Updates</h3><br> <p>None at this time.</p><br> Portions © 2020 The Hong Kong Polytechnic University, © 2020 Trustees of the University of Pennsylvania
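Because the data is described as a UTF-8 encoded, comma-separated text file, it can presumably be read with any standard CSV reader. The following is a minimal Python sketch under stated assumptions: it assumes the file has a header row, and the function names `load_rows` and `column_mean` are illustrative helpers, not part of the dataset or of any LDC tooling. The actual column names are not given in the description, so the code takes the column name as a parameter instead of guessing one.

```python
import csv
from statistics import mean


def load_rows(path):
    """Load a UTF-8, comma-separated file with a header row into a list of dicts.

    Assumption: the first line of the file holds the column names.
    """
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def column_mean(rows, column):
    """Average a numeric column, skipping rows where the cell is empty or absent.

    `column` is whatever header name the real file uses; none is assumed here.
    Returns None when no usable values are found.
    """
    values = [
        float(row[column])
        for row in rows
        if (row.get(column) or "").strip()
    ]
    return mean(values) if values else None
```

To use it, call `load_rows` with the local path of the downloaded file, inspect `rows[0].keys()` to see the real column names, and then pass one of the rating columns to `column_mean`.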
Provider:
Linguistic Data Consortium
Created:
2020-11-30