LDC2017T10 Dataset

Name: LDC2017T10 Dataset
Creator: Papers with Code
License: No description provided

Indexed from paperswithcode.com on 2025-01-15

Download link:

https://paperswithcode.com/dataset/ldc2017t10

Resource description:

Abstract Meaning Representation (AMR) Annotation Release 2.0 was developed by the Linguistic Data Consortium (LDC), SDL/Language Weaver, Inc., the University of Colorado's Computational Language and Educational Research group and the Information Sciences Institute at the University of Southern California. It contains a sembank (semantic treebank) of over 39,260 English natural language sentences from broadcast conversations, newswire, weblogs and web discussion forums. AMR captures “who is doing what to whom” in a sentence. Each sentence is paired with a graph that represents its whole-sentence meaning in a tree-structure. AMR utilizes PropBank frames, non-core semantic roles, within-sentence coreference, named entity annotation, modality, negation, questions, quantities, and so on to represent the semantic structure of a sentence largely independent of its syntax.
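
To make the sentence-to-graph pairing concrete, the following is a minimal sketch using the textbook AMR for "The boy wants to go"; it is an illustration, not an entry copied from the LDC2017T10 release. The triple layout and the helper name `reentrant_variables` are assumptions introduced here. The example shows how within-sentence coreference appears as a variable with more than one incoming role edge, which is why the representation is a graph even though it is usually written in a tree-like bracketed form.

```python
from collections import Counter

# Textbook AMR for "The boy wants to go" in PENMAN notation
# (illustrative only; not taken from the LDC2017T10 files).
PENMAN = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))"

# The same graph written by hand as (source, role, target) triples.
# want-01 and go-01 are PropBank frames; :ARG0/:ARG1 are core roles.
TRIPLES = [
    ("w", ":instance", "want-01"),
    ("b", ":instance", "boy"),
    ("g", ":instance", "go-01"),
    ("w", ":ARG0", "b"),   # the boy is the wanter
    ("w", ":ARG1", "g"),   # the going is what is wanted
    ("g", ":ARG0", "b"),   # the boy is also the goer (coreference)
]


def reentrant_variables(triples):
    """Return variables that are the target of more than one role edge."""
    variables = {src for src, role, _ in triples if role == ":instance"}
    incoming = Counter(
        tgt
        for _, role, tgt in triples
        if role != ":instance" and tgt in variables
    )
    return sorted(var for var, count in incoming.items() if count > 1)


if __name__ == "__main__":
    print(reentrant_variables(TRIPLES))  # ['b']
```

Here `b` is reported because the boy fills the `:ARG0` role of both `want-01` and `go-01`; dropping the last triple would leave a plain tree with no reentrant variables.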

Provided by:

Papers with Code
