Chicago taxi rides
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14537441
Description:
Taxi trips from 2019 to 2021, derived from a larger dataset published by the City of Chicago Data Portal.
This subset has been converted into a Parquet file for use in the _Thinking in Arrays_ tutorial presented at SciPy 2022, 2023, and 2024.
Tutorial abstract (2023)
Video of tutorial (2023)
Tutorial materials (2023) and specific lesson that uses this dataset
It has this nested schema:
>>> import awkward as ak
>>> a = ak.from_parquet("data/chicago-taxi.parquet")
>>> a.type.show()
7728 * var * {
    trip: {
        sec: ?float32,
        km: ?float32,
        begin: {
            lon: ?float64,
            lat: ?float64,
            time: ?datetime64[ms]
        },
        end: {
            lon: ?float64,
            lat: ?float64,
            time: ?datetime64[ms]
        },
        path: var * {
            londiff: float32,
            latdiff: float32
        }
    },
    payment: {
        fare: ?float32,
        tips: ?float32,
        total: ?float32,
        type: categorical[type=string]
    },
    company: categorical[type=string]
}
The dataset covers 7728 taxis, each with a variable number of trips. Each trip has a duration in seconds and a distance in kilometers, a begin and end longitude/latitude/time, and a variable-length path (relative to the begin position). Each trip also has payment information; the payment type can be "Unknown", "Cash", "Credit Card", "No Charge", "Mobile", "Prcard", "Dispute", "Pcard", or "Prepaid". Each taxi belongs to one of 65 companies.
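To make the nesting concrete, here is a minimal sketch in plain Python that mirrors part of the schema with lists and dicts, so it runs without the Parquet file. The sample values are invented for illustration, the `end` and `company` fields are omitted for brevity, and the helper names (`trips_per_taxi`, `speed_kmh`, `absolute_path`) are hypothetical. It assumes that each `londiff`/`latdiff` pair is an offset from the trip's begin point rather than a step from the previous path point, and it treats `None` as the missing value implied by the `?` option types.

```python
# Hypothetical sample mirroring the schema: a list of taxis, each a
# variable-length list of trip records. All values are invented.
taxis = [
    [  # taxi 0: two trips
        {
            "trip": {
                "sec": 600.0,
                "km": 5.0,
                "begin": {"lon": -87.63, "lat": 41.88},
                "path": [
                    {"londiff": 0.0, "latdiff": 0.0},
                    {"londiff": 0.01, "latdiff": 0.02},
                ],
            },
            "payment": {"fare": 12.5, "tips": 2.0, "total": 14.5, "type": "Credit Card"},
        },
        {
            # Fields marked "?" in the schema may be missing (None here).
            "trip": {
                "sec": None,
                "km": 3.0,
                "begin": {"lon": None, "lat": None},
                "path": [],
            },
            "payment": {"fare": 8.0, "tips": 0.0, "total": 8.0, "type": "Cash"},
        },
    ],
    [],  # taxi 1: no trips, which the "var" dimension allows
]


def trips_per_taxi(taxis):
    """Length of each taxi's variable-length trip list."""
    return [len(trips) for trips in taxis]


def speed_kmh(trip):
    """Average speed in km/h, or None if duration or distance is missing."""
    sec, km = trip["sec"], trip["km"]
    if sec is None or km is None or sec <= 0:
        return None
    return km * 3600.0 / sec


def absolute_path(trip):
    """Path as (lon, lat) pairs, assuming offsets are relative to the begin point."""
    begin = trip["begin"]
    if begin["lon"] is None or begin["lat"] is None:
        return []
    return [
        (begin["lon"] + step["londiff"], begin["lat"] + step["latdiff"])
        for step in trip["path"]
    ]


counts = trips_per_taxi(taxis)
first_speed = speed_kmh(taxis[0][0]["trip"])
first_path = absolute_path(taxis[0][0]["trip"])
```

With the real file loaded through Awkward Array, the same quantities would normally be computed as whole-array expressions rather than Python loops; the loops here only spell out what the nesting means.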
This dataset was previously published at the following URL: https://pivarski-princeton.s3.us-east-1.amazonaws.com/chicago-taxi.parquet
Created:
2024-12-20



