DannHiroaki/Geolife-Spatial-Join-0.15B
Source: Hugging Face · Updated 2026-01-25 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/DannHiroaki/Geolife-Spatial-Join-0.15B
Description:
---
pretty_name: GeoLife Spatial Join Benchmark
annotations_creators:
- machine-generated
language:
- en
license: other
license_name: microsoft-research-license-agreement
license_details: >-
  Derived from Microsoft GeoLife Trajectories 1.3. Users must comply with the
  original Microsoft Research License Agreement (non-commercial use).
task_categories:
- tabular-regression
tags:
- geospatial
- trajectory
- benchmark
- spatiotemporal
- spatial-join
size_categories:
- 100M<n<1B
configs:
- config_name: 3d_level1
  data_files: dims=3/level=1/*.parquet
  description: "3D (x,y,t) trajectories with Level 1 encounter threshold (20m, 60s)"
- config_name: 3d_level2
  data_files: dims=3/level=2/*.parquet
  description: "3D (x,y,t) trajectories with Level 2 encounter threshold (50m, 300s)"
- config_name: 3d_level3
  data_files: dims=3/level=3/*.parquet
  description: "3D (x,y,t) trajectories with Level 3 encounter threshold (200m, 1200s)"
- config_name: 4d_level1
  data_files: dims=4/level=1/*.parquet
  description: "4D (x,y,z,t) trajectories with Level 1 encounter threshold. Valid altitude only."
- config_name: 4d_level2
  data_files: dims=4/level=2/*.parquet
  description: "4D (x,y,z,t) trajectories with Level 2 encounter threshold. Valid altitude only."
- config_name: 4d_level3
  data_files: dims=4/level=3/*.parquet
  description: "4D (x,y,z,t) trajectories with Level 3 encounter threshold. Valid altitude only."
- config_name: dictionary
  data_files: dict/trajectories.parquet
  description: "Metadata mapping table for original trajectories (traj_id to source file)."
---
# Introduction
**GeoLife-Spatial-Join-149M** is a high-dimensional **rectangle–rectangle intersection join** benchmark derived from the **GeoLife GPS Trajectories (v1.3)** dataset. It converts trajectory points into **axis-aligned bounding boxes (AABBs) / hyper-rectangles** so that “encounter within spatial and temporal thresholds” can be evaluated via a pure **AABB intersection join** (closed-interval semantics).
This repository contains **~149M rectangle records** in total, organized into **six groups**:
- **3D**: `(x, y, t)` for `level = 1 / 2 / 3`
- **4D**: `(x, y, z, t)` for `level = 1 / 2 / 3` (only points with valid altitude)
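As a rough illustration of what "AABB intersection join (closed-interval semantics)" means, the predicate below treats every axis as a closed interval, so two boxes that merely touch on a face still count as a match. The function name `aabb_intersects` and the corner-tuple representation are assumptions made for this sketch; they are not taken from the reference builder.

```python
from typing import Sequence


def aabb_intersects(
    lo_a: Sequence[int],
    hi_a: Sequence[int],
    lo_b: Sequence[int],
    hi_b: Sequence[int],
) -> bool:
    """Closed-interval AABB test: True if the boxes overlap or touch on every axis.

    Each box is given by its lower and upper corner, e.g. (x, y, t) for 3D
    or (x, y, z, t) for 4D. The representation is hypothetical.
    """
    if not (len(lo_a) == len(hi_a) == len(lo_b) == len(hi_b)):
        raise ValueError("all corner tuples must have the same dimensionality")
    # Boxes intersect iff their intervals overlap on every axis; <= keeps the
    # endpoints inclusive (closed intervals).
    return all(
        la <= hb and lb <= ha
        for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)
    )


# Touching on the x face (both boxes include x = 10) counts as intersecting.
touching = aabb_intersects((0, 0, 0), (10, 10, 10), (10, 5, 5), (20, 15, 15))
# A gap of one unit on x means no intersection.
separated = aabb_intersects((0, 0, 0), (10, 10, 10), (11, 0, 0), (20, 10, 10))
```

The same predicate works unchanged for the 3D and 4D groups because it iterates over whatever number of axes the corner tuples carry.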
#### **Units & encoding**
- `x, y, z`: centimeters (integer), projected to **EPSG:3857 (Web Mercator)**
- `t`: Unix epoch milliseconds (integer)
#### **Encounter thresholds (by level)**
- Level 1: Δd = 20 m, Δt = 60 s
- Level 2: Δd = 50 m, Δt = 300 s
- Level 3: Δd = 200 m, Δt = 1200 s
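One plausible way to turn these thresholds into boxes is to expand each point by half the threshold on every axis, so that two boxes intersect exactly when the points differ by at most Δd on each spatial axis and at most Δt in time. This half-extent construction is an assumption made for illustration; the card does not state how the builder derives its boxes, and the names `THRESHOLDS` and `point_to_box` are hypothetical. Note that a box test bounds the per-axis distance, not the Euclidean distance.

```python
# Thresholds from the card: level -> (distance in meters, time in seconds).
THRESHOLDS = {1: (20, 60), 2: (50, 300), 3: (200, 1200)}


def point_to_box(x_cm: int, y_cm: int, t_ms: int, level: int):
    """Expand a 3D point into an AABB using half-threshold extents.

    ASSUMPTION: half-extent expansion; the actual builder may differ.
    Coordinates are centimeters and epoch milliseconds, as in the card.
    """
    d_m, t_s = THRESHOLDS[level]
    half_d_cm = d_m * 100 // 2    # meters -> centimeters, then halve
    half_t_ms = t_s * 1000 // 2   # seconds -> milliseconds, then halve
    lo = (x_cm - half_d_cm, y_cm - half_d_cm, t_ms - half_t_ms)
    hi = (x_cm + half_d_cm, y_cm + half_d_cm, t_ms + half_t_ms)
    return lo, hi


# Level 1 gives half-extents of 1000 cm and 30000 ms around the point.
lo, hi = point_to_box(0, 0, 0, level=1)
```

Under this assumption, two Level 1 points exactly 20 m apart on `x` produce boxes that touch at a single `x` value, which closed-interval semantics still reports as an encounter.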
#### **Repository layout**
- `dims=3/level={1,2,3}/part-*.parquet`
- `dims=4/level={1,2,3}/part-*.parquet`
- `dict/trajectories.parquet`: `traj_id -> traj_src` mapping + per-trajectory stats
- `manifest.json`: build parameters + exact row counts + file list
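The shard directories follow a Hive-style `key=value` naming scheme, so the group a file belongs to can be recovered from its path alone. The sketch below parses such a path and maps it to the config names declared in the card's front matter; the helper names and the sample file name `part-00000.parquet` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches paths such as "dims=3/level=1/part-00000.parquet".
_SHARD_RE = re.compile(
    r"dims=(?P<dims>[34])/level=(?P<level>[123])/part-[^/]*\.parquet$"
)


def parse_shard_path(path: str) -> Optional[Tuple[int, int]]:
    """Return (dims, level) for a shard path, or None for non-shard files."""
    match = _SHARD_RE.search(path.replace("\\", "/"))
    if match is None:
        return None
    return int(match["dims"]), int(match["level"])


def config_name(dims: int, level: int) -> str:
    """Config name as listed in the front matter, e.g. '3d_level1'."""
    return f"{dims}d_level{level}"


parsed = parse_shard_path("dims=3/level=1/part-00000.parquet")
```

Files outside the shard directories, such as `dict/trajectories.parquet` or `manifest.json`, return `None` and can be handled separately.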
Dataset construction details and the reference builder are available at: https://github.com/DANNHIROAKI/Geolife-Spatial-Join-0.15B-Builder
# Example
#### Installation
```bash
pip install -U huggingface_hub
```
#### Download the Entire Dataset
```bash
hf download DannHiroaki/Geolife-Spatial-Join-0.15B \
  --repo-type dataset \
  --local-dir ./Geolife-Spatial-Join-0.15B
```
#### Download Specific Shards
```bash
hf download DannHiroaki/Geolife-Spatial-Join-0.15B \
  --repo-type dataset \
  --local-dir ./Geolife-Spatial-Join-0.15B \
  --include "manifest.json" \
  --include "dict/trajectories.parquet" \
  --include "dims=3/level=1/*.parquet"
```
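The same selective download can likely be done from Python with `huggingface_hub.snapshot_download`, whose `allow_patterns` argument plays the role of the CLI's `--include` flags. This is a minimal sketch: the helper names `subset_patterns` and `download_subset` are hypothetical, and the repo id is the one shown in the download link at the top of this page.

```python
from pathlib import Path
from typing import List

# Repo id as it appears in the download link at the top of this page.
REPO_ID = "DannHiroaki/Geolife-Spatial-Join-0.15B"


def subset_patterns(dims: int, level: int) -> List[str]:
    """Glob patterns for one shard group plus the manifest and dictionary."""
    return [
        "manifest.json",
        "dict/trajectories.parquet",
        f"dims={dims}/level={level}/*.parquet",
    ]


def download_subset(dims: int, level: int, local_dir: str) -> Path:
    """Download one (dims, level) group; requires network access."""
    # Imported lazily so the pattern helper works without the package installed.
    from huggingface_hub import snapshot_download

    return Path(
        snapshot_download(
            repo_id=REPO_ID,
            repo_type="dataset",
            local_dir=local_dir,
            allow_patterns=subset_patterns(dims, level),
        )
    )


if __name__ == "__main__":
    print(subset_patterns(3, 1))  # inspect the patterns without downloading
```

Calling `download_subset(3, 1, "./geolife-3d-level1")` would fetch only the 3D Level 1 shards together with the two metadata files.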
#### Dry Run (Check size before downloading)
```bash
hf download DannHiroaki/Geolife-Spatial-Join-0.15B --repo-type dataset --dry-run
```
Provider:
DannHiroaki



