Ancient Chinese WordNet V1.0
Source: DataCite Commons. Updated 2026-03-10; indexed 2026-05-03.
Download link:
https://catalog.ldc.upenn.edu/LDC2026L03
Description:
<h3>Introduction</h3>
<p>Ancient Chinese WordNet <a href="../../../LDC2026L03">(LDC2026L03)</a> was developed by <a href="https://www.njnu.edu.cn/">Nanjing Normal University</a> and contains lexical and semantic information for Ancient Chinese vocabulary dating back to the Pre-Qin period (before 221 BCE). The WordNet comprises 38,781 word forms and 55,100 senses, each manually linked to a corresponding synset in <a href="https://wordnet.princeton.edu/">Princeton WordNet 1.6</a>.</p>
<p>The Ancient Chinese WordNet (ACWN) project began in 2012 with the goal of creating a structured lexical database to support linguistic research and natural language processing applications involving historical Chinese language materials. ACWN organizes vocabulary using WordNet's noun, verb, adjective, and adverb hierarchies and provides WordNet definitions, semantic relations, and categorization for each sense.</p>
<h3>Data</h3>
<p>Ancient Chinese WordNet contains 55,100 records, where each record represents a single Ancient Chinese lexical item mapped to one WordNet synset. It follows the WordNet 1.6 organizational structure, including 22 noun categories, 15 verb categories, and additional adjective and adverb categories.</p>
<p>Each entry includes the following fields:</p>
<ul>
<li>ID - The serial number of the ACWN entry</li>
<li>Word - Ancient Chinese word form</li>
<li>wn_offset - 8-digit WordNet 1.6 synset offset with trailing POS (n/v/a/s/r)</li>
<li>senseid - Sense number for this word form (ordinal among that word's senses)</li>
<li>pos - Part of speech: noun (n), verb (v), adjective (a) or adjective satellite (s), adverb (r)</li>
<li>wn_category - Numeric code for the WordNet 1.6 lexicographer file (category)</li>
<li>wn_synset - Synset headword(s) in WordNet 1.6</li>
<li>wn_definition - WordNet gloss for the synset</li>
<li>wn_similar to - Synset with similar meaning</li>
<li>wn_pertainym - Pertainym synset offset(s)</li>
<li>wn_attribute - Attribute synset offset(s)</li>
<li>wn_hypernym - Hypernym synset offset(s)</li>
<li>wn_hyponym - Hyponym synset offset(s)</li>
</ul>
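<p>As an illustration of the <code>wn_offset</code> field described above, the following minimal Python sketch splits a value into its numeric offset and POS letter. It assumes the value is exactly eight ASCII digits followed by one of n/v/a/s/r, as the field description states; the helper name <code>parse_wn_offset</code>, the <code>SynsetRef</code> type, and the sample value are hypothetical and not part of the release.</p>

```python
import re
from typing import NamedTuple

# Assumed layout, taken from the field description:
# eight ASCII digits followed by a single POS letter.
_WN_OFFSET = re.compile(r"([0-9]{8})([nvasr])")


class SynsetRef(NamedTuple):
    """Hypothetical container for a parsed wn_offset value."""
    offset: int  # numeric synset offset
    pos: str     # one of n, v, a, s, r


def parse_wn_offset(value: str) -> SynsetRef:
    """Split a wn_offset string into its offset and POS letter."""
    match = _WN_OFFSET.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unexpected wn_offset format: {value!r}")
    return SynsetRef(int(match.group(1)), match.group(2))


# Made-up value for illustration only; it is not taken from the data.
ref = parse_wn_offset("00012345n")
print(ref.offset, ref.pos)
```

<p>Rejecting malformed values with an error, rather than guessing, makes any deviation from the documented format visible when the file is first loaded.</p>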
<p>The data is presented in UTF-8 encoded CSV and XLSX formats.</p>
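<p>A minimal sketch of reading the CSV form with Python's standard <code>csv</code> module and grouping senses by word form is shown below. It assumes the column headers match the field names listed above (<code>Word</code>, <code>senseid</code>, <code>wn_offset</code>), which should be checked against the actual file; the function name <code>senses_by_word</code> and the sample rows are made up for illustration.</p>

```python
import csv
import io
from collections import defaultdict


def senses_by_word(csv_file):
    """Group (senseid, wn_offset) pairs by word form.

    Assumes header names Word, senseid, and wn_offset; adjust them
    if the released file uses different column headers.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["Word"]].append((int(row["senseid"]), row["wn_offset"]))
    # Order each word's senses by sense number.
    for senses in grouped.values():
        senses.sort()
    return dict(grouped)


# In-memory sample with invented rows; these values are not from the data.
SAMPLE = (
    "ID,Word,wn_offset,senseid,pos\n"
    "1,天,00000001n,2,n\n"
    "2,天,00000002n,1,n\n"
    "3,行,00000003v,1,v\n"
)

result = senses_by_word(io.StringIO(SAMPLE))
for word, senses in result.items():
    print(word, senses)
```

<p>For a file on disk, the same function can be passed a handle opened with <code>open(path, encoding="utf-8", newline="")</code>, where <code>path</code> is wherever the CSV was saved.</p>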
<h3>Updates</h3>
<p>No updates at this time.</p>
Provider:
Linguistic Data Consortium
Created:
2026-03-10



