TRAD Chinese-French Parallel Text -- Blog
Source: DataCite Commons. Updated 2021-07-01; indexed 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2018T02
Resource description:
<h3>Introduction</h3><br>
<p>TRAD Chinese-French Parallel Text -- Blog was developed by <a href="http://elda.org/en/">ELDA</a> as part of the <a href="http://www.elra.info/en/projects/archived-projects/pea-trad/">PEA-TRAD project</a>. It contains French translations of a subset of approximately 10,000 Chinese words from GALE Phase 1 Chinese Blog Parallel Text (<a href="../../../LDC2008T06">LDC2008T06</a>).</p><br>
<p>The PEA-TRAD project (Translation as a Support for Document Analysis) was supported by the French Ministry of Defense (DGA). Its purpose was to develop speech-to-speech translation technology for multiple languages (e.g., Arabic, Chinese, Pashto) from a variety of domains. ELDA developed several corpora for this effort.</p><br>
<p>The Linguistic Data Consortium (LDC) has also released the following TRAD corpora:</p><br>
<ul><br>
<li>TRAD Arabic-French Parallel Text -- Newsgroup (<a href="../../../LDC2018T13">LDC2018T13</a>)</li><br>
<li>TRAD Chinese-French Parallel Text -- Broadcast News (<a href="../../../LDC2018T17">LDC2018T17</a>)</li><br>
<li>TRAD Arabic-French Parallel Text -- Newswire (<a href="../../../LDC2018T21">LDC2018T21</a>)</li><br>
</ul><br>
<h3>Data</h3><br>
<p>This release consists of 444 segments (translation units) from 17 documents. The source data is Chinese blog text collected and translated into English by LDC for the DARPA GALE (Global Autonomous Language Exploitation) program. Information about the ELDA translation team, translation guidelines and validation results is contained in the documentation accompanying this release.</p><br>
<p>The Chinese source file contains 15,809 characters and the French reference translation contains 11,769 words. The data is presented in two Unicode-encoded XML files along with an associated DTD.</p><br>
<h3>Samples</h3><br>
<p>Please view this <a href="desc/addenda/LDC2018T02.src.xml">source sample</a> and <a href="desc/addenda/LDC2018T02.ref.xml">reference sample</a>.</p><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2018 ELDA, © 2005-2007, 2008, 2018 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30



