
Table 5 Replication Data

Source: DataCite Commons. Updated: 2025-10-31. Indexed: 2026-04-25.
Download link:
https://figshare.com/articles/dataset/Table_5_Replication_Data/30501746
Description:
We conducted a validation exercise with two goals: first, to assess the generalizability of our spatial ontology to new datasets, and second, to assess the quality of geocoded routes when compared against ground-truth reports of congestion. The results presented in this paper pertain to tweets from the DTP, a single user that posts mostly traffic-related tweets from a single geography, Delhi NCR. We therefore assess the generalizability of our spatial ontology beyond both a specific geography and a single user. We apply the ontology to extract spatial information from a New Jersey, USA dataset containing a mixture of traffic-related and unrelated tweets posted by multiple users (Dabiri 2018). We classify each tweet into a spatial configuration without relying on gazetteer lookups or route computations, since the goal here is only to assess whether the ontology adequately captures the spatial information present in the data. We hypothesize that traffic-related tweets tend to contain more spatial information than non-traffic-related tweets. We test this by assigning a spatial information score (SIS) to each configuration as follows.

Unclassified → 0 (noise, little or no recoverable spatial information).
Primitive configurations P, O, D → 0.5. These provide collocation or a single locational cue but do not specify spatial relationships, flow or movement.
Complex configurations MP, L, ML, PonL, P&D, O2D, O2MD, O2D@P, O2DonL → 1. These express direction, adjacency and/or parallel linear orientation and, for the O2D family, movement along a route.

We then define the spatial information index (SII) for a dataset with N tweets as:

$$\mathrm{SII} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{SIS}_i$$

where SIS_i is the score of the i-th tweet. The SII is thus the sum of the spatial information scores of all tweets in a dataset, normalized by the number of tweets.

According to our hypothesis, the SII should be higher for specialized datasets containing traffic-related tweets and lower for non-traffic-related tweets and for noisy datasets containing mixed tweets. The SII is thus intended to measure the sensitivity of our ontology to the spatial information intrinsically present in data. The results of the validation are shown in Table 5.
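The scoring scheme described above can be illustrated with a minimal Python sketch. It assumes each tweet has already been labeled with one configuration string. The function names, the set names, and the example labels are hypothetical and do not come from the dataset or the authors' code. Any label outside the listed primitive and complex configurations is treated as Unclassified.

```python
from typing import Iterable

# Configuration labels as listed in the description; grouping them into
# two sets is an illustrative choice, not the authors' implementation.
PRIMITIVE = {"P", "O", "D"}
COMPLEX = {"MP", "L", "ML", "PonL", "P&D", "O2D", "O2MD", "O2D@P", "O2DonL"}


def spatial_information_score(configuration: str) -> float:
    """Return the SIS for one tweet's configuration label."""
    if configuration in COMPLEX:
        return 1.0
    if configuration in PRIMITIVE:
        return 0.5
    # Assumption: anything else counts as Unclassified.
    return 0.0


def spatial_information_index(configurations: Iterable[str]) -> float:
    """Return the SII: the mean SIS over all tweets in a dataset."""
    scores = [spatial_information_score(c) for c in configurations]
    if not scores:
        raise ValueError("SII is undefined for an empty dataset")
    return sum(scores) / len(scores)


# Hypothetical labels for four tweets (not taken from the real data).
example_labels = ["O2D", "P", "Unclassified", "PonL"]
example_sii = spatial_information_index(example_labels)  # (1 + 0.5 + 0 + 1) / 4
```

With these four made-up labels the index is 0.625. A dataset made up only of complex configurations would score 1, and one made up only of unclassified tweets would score 0.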
Provider:
figshare
Created:
2025-10-31