English Chinese Translation Treebank v 1.0
Source: DataCite Commons · Updated 2021-07-01 · Indexed 2025-04-16
Download link:
https://catalog.ldc.upenn.edu/LDC2007T02
Resource description:
<h3>Description</h3><br>
<p>This release of English Chinese Translation Treebank v. 1.0 consists of 146,300 words in 325 files of individual news stories from Xinhua News Agency (corresponding to the Xinhua data in Chinese Treebank 5.0 <a href="../../../LDC2005T01">LDC2005T01</a>) that are translated into English, part-of-speech tagged and treebanked. The files were compressed using gzip.</p><br>
<p>The source files for the treebank annotation contain the final updated translation of these files. Translation errors that prevented complete treebank annotation have been corrected. This translation and annotation were completed in October 2004 and supersede any earlier translation.</p><br>
<p>This publication was compiled under <a href="http://www.nsf.gov/" rel="nofollow">National Science Foundation</a> Grant #IIS-0325646.</p><br>
<h3>Samples</h3><br>
<p>For an example of the data in this publication, please view this <a href="desc/addenda/LDC2007T02.jpg" rel="nofollow">sample</a>.</p><br>
Portions © 1994-1998 Xinhua News Agency, © 2004, 2007 Trustees of the University of Pennsylvania
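
The description above states that the files are gzip-compressed and part-of-speech tagged. As a minimal sketch of reading one such file, assuming Penn Treebank-style bracketed trees and UTF-8 text (neither is stated in the description), the Python below extracts (tag, word) pairs. The function name `read_leaves` is hypothetical and used only for illustration.

```python
import gzip
import re

# Assumption: trees are Penn Treebank-style bracketed text, for example
# (S (NP (NNP Xinhua)) (VP (VBD reported))). The description does not
# specify the file format, so adjust the pattern to the actual data.
# A leaf has the shape "(TAG word)" with no nested parentheses.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def read_leaves(path):
    """Return (tag, word) pairs from one gzip-compressed treebank file."""
    # Assumption: the decompressed text is UTF-8 encoded.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        text = handle.read()
    # Skip -NONE- leaves, which mark empty elements rather than real words.
    return [(tag, word) for tag, word in LEAF.findall(text) if tag != "-NONE-"]
```

Calling `len(read_leaves(path))` on a file gives a rough token count for that file. Whether it matches the published word count depends on how the release counts punctuation and empty elements.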
Provider:
Linguistic Data Consortium
Created:
2020-11-30
Collected Summary
Dataset Introduction

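Background and Challenges
Background Overview
English Chinese Translation Treebank v 1.0 is a bilingual treebank dataset containing English translations of Xinhua News Agency news stories: 146,300 words in 325 files, part-of-speech tagged and treebanked. The dataset is intended for machine translation and natural language processing research and, being built on news text, provides a structured resource for linguistic analysis.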
The summary above was collected and generated by 遇见数据集.



