
Ancient Chinese WordNet V1.0

Source: DataCite Commons. Updated: 2026-03-10. Indexed: 2026-05-03.
Download link:
https://catalog.ldc.upenn.edu/LDC2026L03
Resource description:
<h3>Introduction</h3>
<p>Ancient Chinese WordNet <a href="../../../LDC2026L03">(LDC2026L03)</a> was developed by <a href="https://www.njnu.edu.cn/">Nanjing Normal University</a> and contains lexical and semantic information for Ancient Chinese vocabulary dating back to the Pre-Qin period (before 221 BCE). The WordNet comprises 38,781 word forms and 55,100 senses, each manually linked to a corresponding synset in <a href="https://wordnet.princeton.edu/">Princeton WordNet 1.6</a>.</p>
<p>The Ancient Chinese WordNet (ACWN) project began in 2012 with the goal of creating a structured lexical database to support linguistic research and natural language processing applications involving historical Chinese language materials. ACWN organizes vocabulary using WordNet's noun, verb, adjective, and adverb hierarchies and provides WordNet definitions, semantic relations, and categorization for each sense.</p>
<h3>Data</h3>
<p>Ancient Chinese WordNet contains 55,100 records, where each record represents a single Ancient Chinese lexical item mapped to one WordNet synset. It follows the WordNet 1.6 organizational structure, including 22 noun categories, 15 verb categories, and additional adjective and adverb categories.</p>
<p>Each entry includes the following fields:</p>
<ul>
<li>ID - The serial number of the ACWN entry</li>
<li>Word - Ancient Chinese word form</li>
<li>wn_offset - 8-digit WordNet 1.6 synset offset with trailing POS (n/v/a/s/r)</li>
<li>senseid - Sense number for this word form (ordinal among that word's senses)</li>
<li>pos - Part of speech (noun (n), verb (v), adj (a/s), adv (r))</li>
<li>wn_category - Numeric code for the WordNet 1.6 lexicographer file (category)</li>
<li>wn_synset - Synset headword(s) in WordNet 1.6</li>
<li>wn_definition - WordNet gloss for the synset</li>
<li>wn_similar to - Synset with similar meaning</li>
<li>wn_pertainym - Pertainym synset offset(s)</li>
<li>wn_attribute - Attribute synset offset(s)</li>
<li>wn_hypernym - Hypernym synset offset(s)</li>
<li>wn_hyponym - Hyponym synset offset(s)</li>
</ul>
<p>The data is presented in UTF-8 encoded CSV and XLSX formats.</p>
<h3>Updates</h3>
<p>No updates at this time.</p>
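The field list above suggests how a record might be read programmatically. The following is a minimal sketch, not the distributor's tooling. It assumes the CSV header uses the field names exactly as listed, that `wn_offset` is always 8 digits followed by one POS letter, and that `senseid` is an integer. The helper names `parse_wn_offset` and `senses_by_word` are hypothetical, and the sample rows are invented placeholders, not real ACWN data.

```python
import csv
import io
import re
from collections import defaultdict

# Format described in the field list: 8 digits plus a trailing POS letter.
WN_OFFSET_RE = re.compile(r"(\d{8})([nvasr])")


def parse_wn_offset(value):
    """Split a wn_offset string into (offset, pos); raise ValueError if malformed."""
    match = WN_OFFSET_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unexpected wn_offset: {value!r}")
    return int(match.group(1)), match.group(2)


def senses_by_word(csv_file):
    """Group (senseid, offset, pos) tuples under each Word value."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        offset, pos = parse_wn_offset(row["wn_offset"])
        # Assumption: senseid is stored as a plain integer.
        grouped[row["Word"]].append((int(row["senseid"]), offset, pos))
    return dict(grouped)


# Invented placeholder rows (NOT real ACWN entries), limited to a few columns.
SAMPLE = """ID,Word,wn_offset,senseid,pos
1,W1,00000001n,1,n
2,W1,00000002v,2,v
3,W2,00000003a,1,a
"""

if __name__ == "__main__":
    print(senses_by_word(io.StringIO(SAMPLE)))
```

For the real file, the `io.StringIO(SAMPLE)` argument would be replaced by a file opened with `encoding="utf-8"` and `newline=""`, as the `csv` module documentation recommends. The actual file name is not given here.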
Provider:
Linguistic Data Consortium
Created:
2026-03-10