
ACE Time Normalization (TERN) 2004 English Training Data v 1.0

DataCite Commons, updated 2021-07-01, indexed 2025-04-16
Download link:
https://catalog.ldc.upenn.edu/LDC2005T07
Description:
<h3>Introduction</h3> <p>This file contains documentation on the ACE Time Normalization (TERN) 2004 English Training Data v 1.0, Linguistic Data Consortium (LDC) catalog number LDC2005T07 and ISBN 1-58563-331-3.</p> <p>This release contains the English training data prepared for the 2004 Time Expression Recognition and Normalization (TERN) Evaluation, sponsored by the Automatic Content Extraction (ACE) program. The evaluation was held in August 2004, followed by a workshop in September 2004. Evaluation participants received this data for training purposes, and it is now being released for general use.</p> <p>The annotation specifications for this corpus were developed under DARPA's Translingual Information Detection, Extraction and Summarization (TIDES) program, with continuing support from ACE.</p> <p>The purpose of this corpus and the TERN evaluation is to advance the state of the art in the automatic recognition and normalization of natural language temporal expressions. In most language contexts, such expressions are indexical. For example, with "Monday," "last week," or "three months starting October 1," one must know the narrative reference time in order to pinpoint the time interval conveyed by the expression. In addition, for data exchange purposes, the identified interval must be rendered according to an established standard, i.e., normalized. Accurate identification and normalization of temporal expressions is in turn essential for the temporal reasoning required by advanced NLP applications such as question answering, information extraction, and summarization.</p> <h3>Samples</h3> <p>Please examine this <a href="desc/addenda/LDC2005T07.txt" rel="nofollow">sample</a> to see an example of the corpus.</p> <h3>Updates</h3> <p>Additional information, updates, and bug fixes may be available in the LDC catalog entry for this corpus at <a href="http://catalog.ldc.upenn.edu/LDC2005T07" rel="nofollow">LDC2005T07</a>.
</p> <p>"The World" is a co-production of Public Radio International and the British Broadcasting Corporation and is produced at WGBH Boston.</p> <p>Portions © 1998 Los Angeles Times-Washington Post News Service, Inc., © 1998, 2000 American Broadcasting Corporation, © 1998, 2000 Cable News Network, LP, LLLP, © 1998, 2000 The Associated Press, © 1998, 2000 New York Times, © 1998, 2000 National Broadcasting Company, Inc., © 1998, 2000 Public Radio International, © 2005 Trustees of the University of Pennsylvania</p>
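The introduction's point that expressions such as "Monday," "last week," or "three months starting October 1" can only be pinned down relative to a reference time can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the corpus or its annotation specification: the function names are hypothetical, ISO 8601 strings stand in for the "established standard," a bare weekday is read as the most recent such day on or before the reference date, and an unspecified year is taken from the reference date.

```python
from datetime import date, timedelta

# Weekday names indexed to match date.weekday() (Monday == 0).
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(name: str, reference: date) -> str:
    """Resolve a bare weekday name to an ISO 8601 date.

    Assumption: the expression points at the most recent such day
    on or before the reference date. Real text may point forward.
    """
    target = WEEKDAYS.index(name.lower())
    days_back = (reference.weekday() - target) % 7
    return (reference - timedelta(days=days_back)).isoformat()


def resolve_last_week(reference: date) -> str:
    """Resolve "last week" to an ISO 8601 week string (YYYY-Www)."""
    iso_year, iso_week, _ = (reference - timedelta(weeks=1)).isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def resolve_months_starting(month: int, day: int, months: int,
                            reference: date) -> str:
    """Resolve "<months> months starting <month>/<day>" to an ISO 8601
    start/duration interval.

    Assumption: the missing year is the reference date's year.
    """
    start = date(reference.year, month, day)
    return f"{start.isoformat()}/P{months}M"


if __name__ == "__main__":
    # Hypothetical narrative reference time: Wednesday, 18 August 2004.
    ref = date(2004, 8, 18)
    print(resolve_weekday("Monday", ref))
    print(resolve_last_week(ref))
    print(resolve_months_starting(10, 1, 3, ref))
```

With the hypothetical reference date of 2004-08-18, the three calls yield `2004-08-16`, `2004-W33`, and `2004-10-01/P3M`. Changing the reference date changes every result, which is the sense in which the expressions are indexical.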
Provider:
Linguistic Data Consortium
Created:
2020-11-30