
The EventStatus Corpus

Source: DataCite Commons. Updated 2021-07-01. Indexed 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2017T09
Description:
<h3>Introduction</h3><br> <p>The EventStatus Corpus was developed by researchers at <a href="https://engineering.tamu.edu/cse">Texas A&amp;M University</a>, <a href="https://linguistics.stanford.edu/">Stanford University</a> and <a href="http://www.cs.utah.edu/">The University of Utah</a>. It consists of approximately 3,000 English and 1,500 Spanish news articles about civil unrest events, annotated with temporal tags.</p><br> <p>This corpus was designed to support the study of the temporal and aspectual properties of major events, that is, whether an event has already happened, is currently happening or may happen in the future. Because it focuses on a single domain (civil unrest events), it may be appropriate for tasks such as event extraction and temporal question answering.</p><br> <h3>Data</h3><br> <p>The relevant news articles were sourced from English Gigaword Fifth Edition (<a href="../../../LDC2011T07">LDC2011T07</a>) and Spanish Gigaword Third Edition (<a href="../../../LDC2011T12">LDC2011T12</a>). The civil unrest events include protests, demonstrations, marches and strikes. The data was annotated as PAST, ON-GOING or FUTURE and, within each of those categories, as PLANNED, ALERT or POSSIBLE.</p><br> <p>In addition to the annotated articles, the file lists used for tuning and testing in experiments are included. 10-fold cross-validation was performed, and the specific 10-fold splits of the test set are included as well.
All text is presented as plain text and encoded in UTF-8.</p><br> <h3>Samples</h3><br> <p>Please view this <a href="desc/addenda/LDC2017T09.txt">sample</a>.</p><br> <h3>Updates</h3><br> <p>None at this time.</p><br> Portions © 1994-2010 Agence France Presse, © 1993-2010 The Associated Press, © 1997-2010 Central News Agency (Taiwan), © 1994-1998, 2003-2009 Los Angeles Times-Washington Post News Service, Inc., © 1994-2010 New York Times, © 2010 The Washington Post News Service with Bloomberg News, © 1995-2010 Xinhua News Agency, © 2017 Ruihong Huang, © 2003, 2005, 2006, 2007, 2009, 2011, 2013, 2017 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30