
Synthetic mobile device data

Source: DataCite Commons. Updated 2025-05-01; indexed 2025-05-17.
Download link:
https://data.mendeley.com/datasets/cb2r6hv72b
Description:
This repository contains two synthetic mobile device datasets, one of GPS location records ("input_case1_v2.csv") and one of cellular location records ("input_case2_v2.csv"). Both datasets are stored as CSV files. Each CSV file has 12 data fields, which are explained in the "Dictionary.docx" file. The two datasets contain the location records of 582 individual mobile devices over one month. The GPS dataset ("input_case1_v2.csv") contains 668,939 location records, and the cellular dataset ("input_case2_v2.csv") contains 61,390 location records.

The two datasets were generated from two real-world data sources using the synthetic mobile data generation method developed by Chen et al. (2014). The first source is mobile app data, which comes from people using location-aware mobile apps. The mobile app data encompasses both GPS and cellular data, covers March 2019 in the central Puget Sound region, and includes 582 individual mobile device users. The second source is household travel survey data. It covers March 2017 in the central Puget Sound region and includes 582 survey respondents. The 582 mobile device users and the 582 survey respondents are randomly linked.

The visited locations in the household travel survey are treated as the ground-truth stays. Four kinds of information from the mobile app data are preserved in the synthetic location records: the number of location records, the anonymized user ID, the timestamp of each location record, and the location accuracy associated with each record. If the timestamp of a location record falls within the duration of a ground-truth stay, the record is associated with that stay. The latitudes and longitudes of the synthetic location records are generated so that their spatial distribution matches that of the mobile app data for a given user on a given day.
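The timestamp rule described above (a record belongs to a stay when its timestamp falls within that stay's duration) can be sketched as follows. This is only an illustration, not the authors' implementation: the `Stay` class, the `assign_stay` function, the use of numeric timestamps, and the inclusive interval bounds are all assumptions, and none of these names correspond to fields in the CSV files.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Stay:
    """Hypothetical container for one ground-truth stay."""
    stay_id: int
    start: float  # stay start time, e.g. seconds since some epoch (assumed unit)
    end: float    # stay end time, same unit as start


def assign_stay(timestamp: float, stays: Sequence[Stay]) -> Optional[int]:
    """Return the id of the first stay whose duration contains the timestamp.

    Returns None when the timestamp falls outside every stay, i.e. the
    record is not associated with any ground-truth stay. Both interval
    bounds are treated as inclusive, which is an assumption.
    """
    for stay in stays:
        if stay.start <= timestamp <= stay.end:
            return stay.stay_id
    return None


# Two stays with a gap between them, and three record timestamps.
stays = [Stay(1, 0.0, 3600.0), Stay(2, 5400.0, 9000.0)]
labels = [assign_stay(t, stays) for t in (1800.0, 4000.0, 5400.0)]
```

In this toy example, the second timestamp lies in the gap between the two stays, so that record would be handled as an unassociated record.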
The spatial distribution is measured by the distance and angle from a location record to its corresponding stay. Methods for inferring stays from the mobile data are described in Wang et al. (2019), which builds on the method of Chen et al. (2014). For synthetic location records not associated with any ground-truth stay, the locations are random deviates from points evenly distributed along the straight line connecting the previous and next stays, as described in Chen et al. (2014).
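The two geometric ideas in this paragraph, measuring a record's offset from its stay by distance and angle, and placing unassociated records near evenly spaced points between two stays, can be sketched roughly as below. Several things here are assumptions rather than details from the dataset or from Chen et al. (2014): coordinates are treated as planar (x, y) values in metres instead of latitude and longitude, the angle convention is counter-clockwise from the +x axis, the random deviate is Gaussian with a hypothetical `sigma` parameter, and the function names are invented for illustration.

```python
import math
import random
from typing import List, Tuple

# Planar (x, y) in metres; an assumed projected coordinate system.
Point = Tuple[float, float]


def distance_and_angle(record: Point, stay: Point) -> Tuple[float, float]:
    """Distance (m) and angle (radians, counter-clockwise from +x)
    of a record's offset relative to its stay."""
    dx = record[0] - stay[0]
    dy = record[1] - stay[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def points_between_stays(prev_stay: Point, next_stay: Point, n: int,
                         sigma: float, rng: random.Random) -> List[Point]:
    """Place n points evenly along the segment strictly between two stays,
    then perturb each one with Gaussian noise (assumed noise model)."""
    points = []
    for i in range(1, n + 1):
        f = i / (n + 1)  # fraction of the way from prev_stay to next_stay
        x = prev_stay[0] + f * (next_stay[0] - prev_stay[0])
        y = prev_stay[1] + f * (next_stay[1] - prev_stay[1])
        points.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
    return points


# A seeded generator keeps the sketch reproducible.
rng = random.Random(42)
travel_points = points_between_stays((0.0, 0.0), (1000.0, 0.0), n=4,
                                     sigma=25.0, rng=rng)
```

Setting `sigma` to 0 returns the unperturbed, evenly spaced points, which is a convenient way to check the interpolation on its own.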

Provider:
Mendeley
Created:
2021-05-04