DiscAlign for Penn and RST Discourse Treebanks
Source: DataCite Commons. Updated 2021-09-15; indexed 2024-07-13.
Download link:
https://catalog.ldc.upenn.edu/LDC2021T16
Resource description:
<h3>Introduction</h3><br>
<p>DiscAlign for Penn and RST Discourse Treebanks was developed by <a href="https://www.uni-saarland.de/en/home.html">Saarland University</a>. It consists of alignment information for the discourse annotations contained in <a href="../../../LDC2008T05">Penn Discourse Treebank Version 2.0 (LDC2008T05)</a> (PDTB 2.0) and <a href="../../../LDC2002T07">RST Discourse Treebank (LDC2002T07)</a> (RST-DT). PDTB 2.0 and RST-DT annotations overlap for 385 newspaper articles in sections 6, 11, 13, 19 and 23 of the Wall Street Journal corpus contained in <a href="../../../LDC95T7">Treebank-2 (LDC95T7)</a>. DiscAlign for Penn and RST Discourse Treebanks contains approximately 6,700 alignments between PDTB 2.0 and RST-DT relations.</p><br>
<p>DiscAlign for Penn and RST Discourse Treebanks is available at no cost to all licensees of PDTB 2.0 and RST-DT and appears in the download queues associated with these corpora as <em>DiscAlign_Penn_RST_DTB_LDC2021T16.zip</em>.</p><br>
<h3>Data</h3><br>
<p>The alignment table is presented as a single UTF-8-encoded CSV file in which each row represents a PDTB discourse relation that has been mapped to an RST relation from the RST-DT corpus. Table columns provide some basic information about the source relation extracted from PDTB, the target relation extracted from RST-DT, and the quality of the alignment between the two. See the included documentation for more details on the columns and the mapping procedure.</p><br>
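<p>As a minimal sketch of how such a table might be read, the Python below loads a UTF-8 CSV into one dictionary per row and tallies the values of a single column. The function names are hypothetical, and no column names are assumed: the actual column names are defined in the documentation included with the corpus, so the caller supplies the one to summarize.</p>

```python
import csv
from collections import Counter


def load_alignments(path):
    """Read a UTF-8 CSV file into a list of dicts, one per alignment row.

    The keys of each dict are taken from the file's header row; nothing
    about the actual DiscAlign column names is assumed here.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def summarize(rows, column):
    """Count how often each value occurs in the given column.

    `column` must be a header name present in the file; consult the
    corpus documentation for the real names.
    """
    return Counter(row[column] for row in rows)
```

<p>Calling <code>load_alignments</code> on the extracted CSV file and then <code>len()</code> on the result would give the number of aligned relation pairs in the file; <code>summarize</code> can then be pointed at whichever column the documentation identifies as holding, for example, the alignment quality.</p><br>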
<h3>Samples</h3><br>
<p>Please view this <a href="desc/addenda/LDC2021T16.csv">sample (TXT)</a>.</p><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2021 Saarland University, © 2021 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2021-09-03



