Ancient Chinese Corpus
Source: DataCite Commons | Updated 2021-07-01 | Indexed 2025-04-16
Download link:
https://catalog.ldc.upenn.edu/LDC2017T14
Resource description:
<h3>Introduction</h3><br>
<p>Ancient Chinese Corpus was developed at <a href="http://en.njnu.edu.cn/schools">Nanjing Normal University</a>. It contains word-segmented and part-of-speech tagged text from <em>Zuozhuan</em>, an ancient Chinese work believed to date from the Warring States Period (475-221 BC). <em>Zuozhuan</em> is a commentary on the <em>Chunqiu</em>, a history of the Chinese Spring and Autumn period (770-476 BC). This release is part of a continuing project to develop a large, part-of-speech tagged ancient Chinese corpus.</p><br>
<h3>Data</h3><br>
<p>Ancient Chinese Corpus consists of 180,000 Chinese characters and 195,000 segment units (including words and punctuation). The part-of-speech tag set was developed by Nanjing Normal University and contains 17 tags.</p><br>
<p>This release contains two text files totaling 268 paragraphs and 10,560 lines. Each line is one sentence, and paragraphs are separated by one empty line. Each word is tagged with its part of speech, and words are separated by a space.</p><br>
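<p>The layout described above (one sentence per line, an empty line between paragraphs, space-separated tagged words) can be read with a few lines of Python. The following is a minimal sketch, not an official loader. The description does not say how a word is joined to its tag, so the <code>/</code> delimiter is an assumption exposed as the <code>sep</code> parameter; the function name <code>parse_corpus</code>, the sample text, and the tag labels in it are hypothetical and are not taken from the corpus tag set.</p><br>

```python
from typing import List, Optional, Tuple

# A token is (word, tag); tag is None when no delimiter is found in the unit.
Token = Tuple[str, Optional[str]]


def parse_corpus(text: str, sep: str = "/") -> List[List[List[Token]]]:
    """Split corpus text into paragraphs -> sentences -> (word, tag) tokens.

    Assumptions (not confirmed by the corpus description):
    - each unit looks like word<sep>tag, with sep defaulting to "/";
    - the tag is whatever follows the LAST occurrence of sep in the unit.
    """
    paragraphs: List[List[List[Token]]] = []
    current: List[List[Token]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # An empty line closes the current paragraph, if there is one.
            if current:
                paragraphs.append(current)
                current = []
            continue
        sentence: List[Token] = []
        for unit in line.split():
            word, found, tag = unit.rpartition(sep)
            if found:
                sentence.append((word, tag))
            else:
                sentence.append((unit, None))
        current.append(sentence)
    if current:
        paragraphs.append(current)
    return paragraphs


# Hypothetical sample: the tags here are placeholders for illustration only.
SAMPLE = "元年/t 春/n ，/w 王/n 正月/t 。/w\n夏/n 五月/t 。/w\n\n秋/n 七月/t 。/w\n"

if __name__ == "__main__":
    parsed = parse_corpus(SAMPLE)
    n_sentences = sum(len(p) for p in parsed)
    n_units = sum(len(s) for p in parsed for s in p)
    print(len(parsed), "paragraphs,", n_sentences, "sentences,", n_units, "units")
```

<p>To apply this to a real file, read it with <code>open(path, encoding="utf-8")</code> and pass the contents to <code>parse_corpus</code>, adjusting <code>sep</code> after checking the linked sample.</p><br>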
<p>The files are UTF-8 plain text in traditional Chinese script.</p><br>
<h3>Samples</h3><br>
<p>Please view this <a href="desc/addenda/LDC2017T14.txt">sample</a>.</p><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2017 Xiaohe Chen, © 2017 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30



