
TNC and Taxi Processed Daily Trips Data with Exogenous Variables

Source: DataONE. Updated 2023-05-24. Indexed 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:94d7c44033188e43eed58ea629aeb44cee822f79e1a093b6237ef2a18980c8a6
Description:
The City of Chicago has published trip-level data for every TNC (transportation network company) trip since November 1, 2018. To the best of our knowledge, this dataset is the only one that includes trip fare variables. As of October 2022, when this paper was written, the dataset contained approximately 263 million trip records (rows) and 21 features (columns) for trips dated from November 1, 2018, through October 1, 2022. The features are Trip ID, Trip Start Timestamp (rounded to the nearest 15 minutes), Trip End Timestamp (rounded to the nearest 15 minutes), Trip Seconds, Trip Miles, Pickup Census Tract, Dropoff Census Tract, Pickup Community Area, Drop Off Community Area, Trip Fare, Tip, Additional Charges, Total Trip Fare, Shared Trip Authorized, Trips Pooled, Pickup Centroid Latitude, Pickup Centroid Longitude, Pickup Centroid Location, Dropoff Centroid Latitude, Dropoff Centroid Longitude, and Dropoff Centroid Location.

Because the full dataset is too large to process without a supercomputer, we drew a random sample of 2 million trips from November 2018 to June 2022 with valid pickup and drop-off area information. To explore the data, we extracted date information from the timestamps and created new variables, including each trip's average fare per mile (excluding tips and additional charges, which are mainly taxes). In dataset (1), the sampled TNC trips were processed and summarized into the average daily fare per mile (USD/mile), and exogenous variables that affect the price were added: holidays (Christmas, Thanksgiving, Independence Day, Easter, and New Year), gas prices, and climate (snow, precipitation, and average daily temperature).

The City of Chicago also publishes taxi trips from 2013 to the present. To protect privacy while still allowing aggregate analyses, the Taxi ID is consistent for any given taxi medallion number but does not reveal the number, and times are rounded to the nearest 15 minutes.
Due to the data reporting process, most but not all trips are reported. Taxicabs in Chicago, Illinois, are operated by private companies and licensed by the city. About seven thousand licensed cabs operate within the city limits. Because the full dataset is too large to process without a supercomputer, we drew a random sample of 2 million trips from November 2018 to June 2022 with valid pickup and drop-off area information. To explore the data, we extracted date information from the timestamps and created new variables, including each trip's average fare per mile (excluding tips and additional charges, which are mainly taxes). In dataset (2), the taxi trips were processed and summarized into the average daily fare per mile (USD/mile), and exogenous variables that affect the price were added: holidays (Christmas, Thanksgiving, Independence Day, Easter, and New Year), gas prices, and climate (snow, precipitation, and average daily temperature).
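
The processing described above (extract the date from the timestamp, compute each trip's fare per mile, average by day, then attach exogenous variables) can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline. The field names (`trip_start_timestamp`, `trip_miles`, `fare`), the helper functions, and the sample records are all hypothetical. Skipping zero-mile trips and taking the daily mean of per-trip ratios are assumptions, since the description does not state how those cases were handled. Only fixed-date holidays are flagged here; Thanksgiving and Easter move each year and would need a calendar lookup.

```python
from collections import defaultdict
from datetime import date, datetime

# Fixed-date holidays only (illustrative); movable holidays are not handled.
FIXED_HOLIDAYS = {
    (1, 1): "New Year",
    (7, 4): "Independence Day",
    (12, 25): "Christmas",
}


def fare_per_mile(trip):
    """Base fare divided by distance; tips and additional charges are excluded."""
    miles = trip["trip_miles"]
    if miles <= 0:
        return None  # assumption: trips without a usable distance are dropped
    return trip["fare"] / miles


def daily_average_fare_per_mile(trips):
    """Group trips by calendar date and average the per-trip fare per mile."""
    by_day = defaultdict(list)
    for trip in trips:
        rate = fare_per_mile(trip)
        if rate is None:
            continue
        day = datetime.fromisoformat(trip["trip_start_timestamp"]).date()
        by_day[day].append(rate)
    return {day: sum(rates) / len(rates) for day, rates in sorted(by_day.items())}


def add_exogenous(daily, gas_price_by_day):
    """Attach a holiday label and a gas price (None if missing) to each day."""
    rows = []
    for day, avg in daily.items():
        rows.append({
            "date": day,
            "avg_fare_per_mile": avg,
            "holiday": FIXED_HOLIDAYS.get((day.month, day.day), ""),
            "gas_price": gas_price_by_day.get(day),
        })
    return rows


# Made-up sample records, not taken from the Chicago data.
sample_trips = [
    {"trip_start_timestamp": "2019-07-04T10:15:00", "trip_miles": 2.0, "fare": 10.0},
    {"trip_start_timestamp": "2019-07-04T18:30:00", "trip_miles": 4.0, "fare": 12.0},
    {"trip_start_timestamp": "2019-07-05T09:00:00", "trip_miles": 0.0, "fare": 5.0},
    {"trip_start_timestamp": "2019-07-05T09:45:00", "trip_miles": 5.0, "fare": 10.0},
]
sample_gas = {date(2019, 7, 4): 2.9}

daily = daily_average_fare_per_mile(sample_trips)
table = add_exogenous(daily, sample_gas)
for row in table:
    print(row)
```

Climate variables (snow, precipitation, average daily temperature) could be joined the same way as the gas price, keyed by date. For data at the scale of the 2-million-trip sample, a dataframe library would likely be more practical than plain dictionaries.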
Created:
2023-11-08