Replication Data for: "Lost in Space: Geolocation in Event Data"
Source: DataONE | Updated: 2018-03-19 | Indexed: 2024-06-25
Download link:
https://search.dataone.org/view/sha256:46e87e2d38b66255fa757efbee2167a9e7d524458012177389fffef52a4f8b1c
Summary:
Improving geolocation accuracy in text data has long been a goal of automated text processing. We depart from the conventional method and introduce a two-stage supervised machine learning algorithm that classifies each location mention as either correct or incorrect. We extract contextual information from texts, namely N-gram patterns for location words, mention frequency, and the context of sentences containing location words. We then estimate model parameters on a training dataset and use the fitted model to predict whether a location word in the test dataset accurately represents the location of an event. We demonstrate these steps by constructing customized geolocation event data at the subnational level from news articles collected around the world. The results show that the proposed algorithm outperforms existing geocoders, even in a case added post hoc to test the algorithm's generality.
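The feature types named in the summary (N-gram patterns around location words, mention frequency, sentence context) can be illustrated with a minimal sketch. This is not the authors' replication code: the function name `location_features`, the `gazetteer` input, and the feature names are hypothetical, chosen only for illustration, and the classifier stage is omitted entirely.

```python
from collections import Counter


def location_features(sentences, gazetteer, n=2):
    """Build one feature dict per distinct location word in a document.

    sentences: list of sentences, each a list of lowercase tokens.
    gazetteer: set of known location words (hypothetical input).
    n: number of tokens kept on each side of a mention as context.
    """
    # Mention frequency: how often each location word occurs in the document.
    counts = Counter(
        tok for sent in sentences for tok in sent if tok in gazetteer
    )
    total = sum(counts.values())

    features = {}
    for sent_idx, sent in enumerate(sentences):
        for i, tok in enumerate(sent):
            if tok not in gazetteer:
                continue
            # Create the entry on first sight, so "first_sentence" records
            # the index of the earliest sentence mentioning this location.
            feat = features.setdefault(tok, {
                "frequency": counts[tok],
                "share": counts[tok] / total,
                "first_sentence": sent_idx,
                "left_ngrams": [],
                "right_ngrams": [],
            })
            # N-gram context: up to n tokens before and after the mention.
            feat["left_ngrams"].append(tuple(sent[max(0, i - n):i]))
            feat["right_ngrams"].append(tuple(sent[i + 1:i + 1 + n]))
    return features


# Toy document, invented for illustration only.
doc = [
    "protesters clashed with police in aleppo on friday".split(),
    "officials in damascus condemned the aleppo violence".split(),
]
feats = location_features(doc, gazetteer={"aleppo", "damascus"})
```

In a supervised setup like the one the summary describes, dictionaries of this kind would be vectorized and paired with labels marking whether each location is the true event location; how the authors actually encode and combine these features is not stated in this record.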
Created:
2023-11-22



