PANACEA English V-SUBCAT gold-standard for ENV domain
Source: DataCite Commons. Updated: 2025-07-01. Indexed: 2025-04-09.
Download link:
https://dataverse.csuc.cat/citation?persistentId=doi:10.34810/data371
Description:
This is a domain-specific gold-standard for English subcategorization frames (SCFs), in this case for the environment (ENV) domain. The gold-standard was developed manually from a set of 28 verbs, with 200 sentences for each verb. For each sentence, the SCFs present for the studied verb were manually annotated. The sentences were selected from crawled Web pages that were automatically detected to be in English and automatically classified as relevant to the ENV domain. Data collection took place in the summer of 2011. This gold-standard was created in the context of PANACEA (http://www.panacea-lr.eu), an EU-FP7 funded project under Grant Agreement 248064.
Provider:
CORA.Repositori de Dades de Recerca
Created:
2022-10-10



