
A harmonised testsuite for social media POS tagging (DE)

Source: DataCite Commons. Updated 2025-01-28; indexed 2025-04-17.
Download link:
https://heidata.uni-heidelberg.de/citation?persistentId=doi:10.11588/DATA/KXLMHN
Description:
<p>A harmonised POS test suite of German web data, computer-mediated communication (CMC) and Twitter microtext, annotated with word forms and STTS POS tags (plus some additional CMC-specific tags). The UD POS tags were derived automatically from the STTS POS tags. The data does not contain manually corrected lemma information. The original data comes from three sources: a Twitter dataset with 21,181 tokens, and two datasets from the Empirist shared task 2015: web data (12,718 tokens) and computer-mediated communication (10,505 tokens).</p>
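The automatic STTS-to-UD conversion mentioned in the description can be pictured as a table lookup. The following is a minimal sketch only: the mapping table is a small illustrative subset, not the conversion table used to build this dataset, and the names `STTS_TO_UPOS`, `stts_to_upos` and `convert` are hypothetical. Falling back to `X` for unlisted tags (for example the CMC-specific ones) is likewise an assumption.

```python
# Illustrative, partial STTS -> UD (UPOS) mapping.
# Assumption: this is NOT the dataset's actual conversion table.
STTS_TO_UPOS = {
    "NN": "NOUN",     # common noun
    "NE": "PROPN",    # proper noun
    "ADJA": "ADJ",    # attributive adjective
    "ADJD": "ADJ",    # predicative/adverbial adjective
    "ART": "DET",     # article
    "APPR": "ADP",    # preposition
    "ADV": "ADV",     # adverb
    "CARD": "NUM",    # cardinal number
    "ITJ": "INTJ",    # interjection
    "KON": "CCONJ",   # coordinating conjunction
    "KOUS": "SCONJ",  # subordinating conjunction
    "PPER": "PRON",   # personal pronoun
    "VVFIN": "VERB",  # finite full verb
    "VAFIN": "AUX",   # finite auxiliary
    "$.": "PUNCT",    # sentence-final punctuation
    "$,": "PUNCT",    # comma
    "$(": "PUNCT",    # other punctuation
}


def stts_to_upos(tag: str) -> str:
    """Return the UPOS tag for an STTS tag, or "X" if the tag is not listed."""
    # Assumption: unlisted tags (e.g. CMC-specific ones) fall back to "X".
    return STTS_TO_UPOS.get(tag, "X")


def convert(tagged):
    """Add a UPOS column to a list of (word form, STTS tag) pairs."""
    return [(form, stts, stts_to_upos(stts)) for form, stts in tagged]


if __name__ == "__main__":
    sentence = [("Das", "ART"), ("Wetter", "NN"), ("ist", "VAFIN"),
                ("schön", "ADJD"), (".", "$.")]
    for form, stts, upos in convert(sentence):
        print(f"{form}\t{stts}\t{upos}")
```

A real conversion would need to cover the full STTS inventory and decide explicitly how each CMC-specific tag is mapped; a plain dictionary keeps those decisions easy to inspect.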
Provider:
heiDATA
Created:
2020-03-26