
English Chinese Translation Treebank v 1.0

DataCite Commons | Updated: 2021-07-01 | Indexed: 2025-04-16
Download link:
https://catalog.ldc.upenn.edu/LDC2007T02
Resource description:
Description

This release of English Chinese Translation Treebank v. 1.0 consists of 146,300 words in 325 files of individual news stories from Xinhua News Agency (corresponding to the Xinhua data in Chinese Treebank 5.0, LDC2005T01) that are translated into English, part-of-speech tagged and treebanked. The files were compressed using gzip.

The source files for the treebank annotation contain the final updated translation of these files. Translation errors that prevented complete treebank annotation have been corrected. This translation and annotation were completed in October 2004 and supersede any earlier translation.

This publication was compiled under National Science Foundation (http://www.nsf.gov/) Grant #IIS-0325646.

Samples

For an example of the data in this publication, please view this sample (desc/addenda/LDC2007T02.jpg).

Portions © 1994-1998 Xinhua News Agency, © 2004, 2007 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30
Collected summary
Dataset introduction
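Background and challenges
Background overview
English Chinese Translation Treebank v 1.0 is a bilingual treebank dataset containing English translations of Xinhua News Agency news stories, totaling 146,300 words in 325 files, with part-of-speech tagging and treebank annotation. The dataset is designed for machine translation and natural language processing research; built on news text, it provides a structured resource for linguistic analysis.
The above content was collected and summarized by 遇见数据集.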