Part-of-Speech Tagged Building Codes (PTBC)
Source: DataCite Commons. Updated 2025-12-18. Indexed 2025-04-16.
Download link:
https://purr.purdue.edu/publications/3246/1
Description:
<p>This dataset of Part-of-Speech (POS) tagged building codes contains 1,522 sentences from Chapters 5 and 10 of the 2015 International Building Code. It adopts the original version of the Penn Treebank tag set for the POS tags. It includes tagging results from five human annotators and seven machine taggers. For each word, it also provides the POS tag most commonly chosen by the machine taggers and the tag most commonly chosen by the human annotators. For detailed explanations of the meanings of the POS tags, see <em>Building a Large Annotated Corpus of English: The Penn Treebank</em> [1]. For an explanation of how this dataset was developed, see [2].</p>
<p>The authors would like to thank the National Science Foundation (NSF). This material is based on work supported by the NSF under Grant No. 1827733. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.</p>
<p><em>1. Marcus, M. P., Santorini, B., and Marcinkiewicz, M. A. (1993). &quot;Building a Large Annotated Corpus of English: The Penn Treebank.&quot; Computational Linguistics, 19(2), 313-330.</em></p>
<p><em>2. Xue, X., and Zhang, J. (2019). &quot;Evaluation of Seven Part-of-Speech Taggers in Tagging Building Codes: Identifying the Best Performing Tagger and Common Sources of Errors.&quot; Proc., ASCE Construction Research Congress, ASCE, Reston, VA, submitted.</em></p>
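<p>The "most commonly chosen POS tag" mentioned in the description can be read as a per-word majority vote across annotators or taggers. The following is a minimal sketch of that idea, not the authors' procedure: the function names, the toy sentence, the example tags, and the tie-breaking rule (the tag seen first wins) are all assumptions made for illustration, and the dataset's actual file layout is not reflected here.</p>

```python
from collections import Counter


def majority_tag(tags):
    """Return the most frequent tag; on a tie, the tag seen first wins.

    The tie-breaking rule is an assumption; the dataset description
    does not say how ties were resolved.
    """
    return Counter(tags).most_common(1)[0][0]


def consensus_tags(annotations):
    """Majority vote per token.

    `annotations` holds one tag list per annotator, all aligned to the
    same tokens (hypothetical in-memory layout).
    """
    return [majority_tag(column) for column in zip(*annotations)]


# Hypothetical example data, NOT taken from the dataset.
tokens = ["Doors", "shall", "swing"]
human_tags = [
    ["NNS", "MD", "VB"],  # annotator 1
    ["NNS", "MD", "VB"],  # annotator 2
    ["NNP", "MD", "VB"],  # annotator 3
    ["NNS", "MD", "NN"],  # annotator 4
    ["NNS", "MD", "VB"],  # annotator 5
]

print(list(zip(tokens, consensus_tags(human_tags))))
# [('Doors', 'NNS'), ('shall', 'MD'), ('swing', 'VB')]
```

<p>The same helper would apply unchanged to the seven machine taggers by passing seven tag lists instead of five.</p>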
Provider:
Purdue University Research Repository
Created:
2019-08-05



