fsq-os-places
Source: ModelScope community. Updated: 2026-01-06. Indexed: 2024-12-07.
Download link:
https://modelscope.cn/datasets/AI-ModelScope/fsq-os-places
Description:
Foursquare OS Places is now a gated dataset on Hugging Face. Read more about why we are making this change here: https://medium.com/@foursquare/evolving-fsq-os-places-fa7a3f5197cd
# Access FSQ OS Places
With Foursquare’s Open Source Places, you can access free data to accelerate geospatial innovation and insights. View the [Places OS Data Schemas](https://docs.foursquare.com/data-products/docs/places-os-data-schema) for a full list of available attributes.
## Prerequisites
Spark is the recommended way to access Foursquare's Open Source Places data. The steps below show how to load the Places data into Spark from Hugging Face.
- For Spark 3, you can use the `read_parquet` helper function from the [HF Spark documentation](https://huggingface.co/docs/hub/datasets-spark). It provides an easy API to load a Spark Dataframe from Hugging Face, without having to download the full dataset locally:
```python
places = read_parquet("hf://datasets/foursquare/fsq-os-places/release/dt=2025-12-16/places/parquet/*.parquet")
```
- For Spark 4, there will be an official Hugging Face Spark data source available.
Alternatively, you can download the following files to your local disk or cluster:
- Parquet Files:
  - **Places** - [release/dt=2025-12-16/places/parquet](https://huggingface.co/datasets/foursquare/fsq-os-places/tree/main/release/dt%3D2025-11-18/places/parquet)
  - **Categories** - [release/dt=2025-12-16/categories/parquet](https://huggingface.co/datasets/foursquare/fsq-os-places/tree/main/release/dt%3D2025-11-18/categories/parquet)
Hugging Face provides the following [download options](https://huggingface.co/docs/hub/datasets-downloading).
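As one possible way to fetch only a single release to local disk, here is a minimal sketch built on `snapshot_download` from the `huggingface_hub` package. The helper names `release_pattern` and `download_release` are hypothetical, the path layout is taken from the release paths listed above, and because the dataset is gated you will likely need to accept its terms on Hugging Face and be authenticated before the download succeeds.

```python
from datetime import date

REPO_ID = "foursquare/fsq-os-places"


def release_pattern(release: date, table: str) -> str:
    """Build the glob for one table ("places" or "categories") of one release.

    Assumes the layout release/dt=YYYY-MM-DD/<table>/parquet/ shown above.
    """
    return f"release/dt={release.isoformat()}/{table}/parquet/*.parquet"


def download_release(release: date, table: str, local_dir: str) -> str:
    """Download only the Parquet files matching one release and table.

    Hypothetical helper: requires `pip install huggingface_hub` and, since
    the dataset is gated, a logged-in account that has been granted access.
    """
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=[release_pattern(release, table)],
        local_dir=local_dir,
    )


# Example call (not executed here because it needs network access):
# download_release(date(2025, 12, 16), "places", "./fsq-os-places")
```

Restricting the download with `allow_patterns` avoids pulling every historical release in the repository.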
## Example Queries
The following examples show how to query FSQ Open Source Places with Spark SQL; they may need minor syntax changes to run on Athena:
- Filter [Categories](https://docs.foursquare.com/data-products/docs/categories#places-open-source--propremium-flat-file) by the parent level
- Filter out [non-commercial venues](#non-commercial-categories-table)
- Find open and recently active POI
### Filter by Parent Level Category
**SparkSQL**
```sql SparkSQL
WITH places_exploded_categories AS (
-- Unnest categories array
SELECT fsq_place_id,
name,
explode(fsq_category_ids) as fsq_category_id
FROM places
),
distinct_places AS (
SELECT
DISTINCT(fsq_place_id) -- Get distinct ids to reduce duplicates from explode function
FROM places_exploded_categories p
JOIN categories c -- Join to categories to filter on Level 2 Category
ON p.fsq_category_id = c.category_id
WHERE c.level2_category_id = '4d4b7105d754a06374d81259' -- Restaurants
)
SELECT * FROM places
WHERE fsq_place_id IN (SELECT fsq_place_id FROM distinct_places)
```
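To make the explode, join, and de-duplicate steps concrete, here is a minimal pure-Python sketch of the same logic on in-memory rows. It is not Spark code: the toy records, the `cat-*` and `l2-*` IDs, and the `places_in_level2` helper are hypothetical, and only the column names and the Restaurants ID come from the query above.

```python
# Illustrative only: mirrors the SparkSQL query on plain Python dicts.
RESTAURANTS_L2 = "4d4b7105d754a06374d81259"  # Level 2 ID from the query above

# Hypothetical toy rows; the "cat-*" and "l2-*" IDs are made up.
places = [
    {"fsq_place_id": "p1", "name": "Noodle Bar",
     "fsq_category_ids": ["cat-ramen", "cat-bar"]},
    {"fsq_place_id": "p2", "name": "City Park",
     "fsq_category_ids": ["cat-park"]},
    {"fsq_place_id": "p3", "name": "Unlabeled",
     "fsq_category_ids": None},
]
categories = [
    {"category_id": "cat-ramen", "level2_category_id": RESTAURANTS_L2},
    {"category_id": "cat-bar", "level2_category_id": "l2-bars"},
    {"category_id": "cat-park", "level2_category_id": "l2-parks"},
]


def places_in_level2(places, categories, level2_id):
    """Return places with at least one category under the given Level 2 ID."""
    # Category IDs whose Level 2 ancestor matches (the JOIN + WHERE step).
    wanted = {
        c["category_id"]
        for c in categories
        if c["level2_category_id"] == level2_id
    }
    # Building a set of place IDs plays the role of explode + DISTINCT.
    matching_ids = {
        p["fsq_place_id"]
        for p in places
        for cat_id in (p["fsq_category_ids"] or [])
        if cat_id in wanted
    }
    # Final SELECT ... WHERE fsq_place_id IN (...).
    return [p for p in places if p["fsq_place_id"] in matching_ids]


restaurants = places_in_level2(places, categories, RESTAURANTS_L2)
print([p["name"] for p in restaurants])
```

A place with several matching categories still appears once, which is the reason the SQL version selects distinct IDs before the final lookup.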
### Filter out Non-Commercial Categories
**SparkSQL**
```sql SparkSQL
SELECT * FROM places
WHERE arrays_overlap(fsq_category_ids, array('4bf58dd8d48988d1f0931735', -- Airport Gate
'62d587aeda6648532de2b88c', -- Beer Festival
'4bf58dd8d48988d12b951735', -- Bus Line
'52f2ab2ebcbc57f1066b8b3b', -- Christmas Market
'50aa9e094b90af0d42d5de0d', -- City
'5267e4d9e4b0ec79466e48c6', -- Conference
'5267e4d9e4b0ec79466e48c9', -- Convention
'530e33ccbcbc57f1066bbff7', -- Country
'5345731ebcbc57f1066c39b2', -- County
'63be6904847c3692a84b9bb7', -- Entertainment Event
'4d4b7105d754a06373d81259', -- Event
'5267e4d9e4b0ec79466e48c7', -- Festival
'4bf58dd8d48988d132951735', -- Hotel Pool
'52f2ab2ebcbc57f1066b8b4c', -- Intersection
'50aaa4314b90af0d42d5de10', -- Island
'58daa1558bbb0b01f18ec1fa', -- Line
'63be6904847c3692a84b9bb8', -- Marketplace
'4f2a23984b9023bd5841ed2c', -- Moving Target
'5267e4d9e4b0ec79466e48d1', -- Music Festival
'4f2a25ac4b909258e854f55f', -- Neighborhood
'5267e4d9e4b0ec79466e48c8', -- Other Event
'52741d85e4b0d5d1e3c6a6d9', -- Parade
'4bf58dd8d48988d1f7931735', -- Plane
'4f4531504b9074f6e4fb0102', -- Platform
'4cae28ecbf23941eb1190695', -- Polling Place
'4bf58dd8d48988d1f9931735', -- Road
'5bae9231bedf3950379f89c5', -- Sporting Event
'530e33ccbcbc57f1066bbff8', -- State
'530e33ccbcbc57f1066bbfe4', -- States and Municipalities
'52f2ab2ebcbc57f1066b8b54', -- Stoop Sale
'5267e4d8e4b0ec79466e48c5', -- Street Fair
'53e0feef498e5aac066fd8a9', -- Street Food Gathering
'4bf58dd8d48988d130951735', -- Taxi
'530e33ccbcbc57f1066bbff3', -- Town
'5bae9231bedf3950379f89c3', -- Trade Fair
'4bf58dd8d48988d12a951735', -- Train
'52e81612bcbc57f1066b7a24', -- Tree
'530e33ccbcbc57f1066bbff9' -- Village
)) = false
```
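The filter above keeps a place only when none of its category IDs appears in the non-commercial list. A minimal pure-Python sketch of that disjointness test follows; it is not Spark code, the ID set is abbreviated to three entries from the list, and the `is_commercial` helper, its `keep_uncategorized` flag, and the toy rows are hypothetical. Spark's `arrays_overlap` should return NULL when the input array is NULL, in which case the `= false` comparison drops places without categories; the flag makes that choice explicit, but verify the behavior on your Spark version.

```python
# Abbreviated for the example; the full list is in the query above.
NON_COMMERCIAL_IDS = frozenset({
    "4bf58dd8d48988d1f0931735",  # Airport Gate
    "4bf58dd8d48988d1f9931735",  # Road
    "52e81612bcbc57f1066b7a24",  # Tree
})


def is_commercial(category_ids, keep_uncategorized=False):
    """True when no category ID is in the non-commercial set.

    Places with no category array are kept only if keep_uncategorized is
    set (hypothetical flag; the default mirrors dropping NULL rows).
    """
    if category_ids is None:
        return keep_uncategorized
    return NON_COMMERCIAL_IDS.isdisjoint(category_ids)


# Hypothetical toy rows; "cat-cafe" is a made-up ID.
places = [
    {"name": "Corner Cafe", "fsq_category_ids": ["cat-cafe"]},
    {"name": "Gate B12", "fsq_category_ids": ["4bf58dd8d48988d1f0931735"]},
    {"name": "Old Oak", "fsq_category_ids": ["cat-cafe", "52e81612bcbc57f1066b7a24"]},
    {"name": "Unlabeled", "fsq_category_ids": None},
]

commercial = [p["name"] for p in places if is_commercial(p["fsq_category_ids"])]
print(commercial)
```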
### Find Open and Recently Active POI
**SparkSQL**
```sql SparkSQL
SELECT * FROM places p
WHERE p.date_closed IS NULL
AND p.date_refreshed >= DATE_SUB(current_date(), 365);
```
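The same open-and-recent rule can be expressed in plain Python, which may help when checking a handful of rows by hand. This is a sketch rather than Spark code: it assumes `date_closed` and `date_refreshed` arrive as ISO `YYYY-MM-DD` strings (or `None`), and the `is_open_and_recent` helper and the toy rows are hypothetical.

```python
from datetime import date, timedelta


def is_open_and_recent(place, today, window_days=365):
    """True when the place has no close date and was refreshed in the window."""
    if place["date_closed"] is not None:
        return False  # mirrors date_closed IS NULL
    refreshed = place["date_refreshed"]
    if refreshed is None:
        return False  # a missing refresh date cannot satisfy the comparison
    cutoff = today - timedelta(days=window_days)  # mirrors DATE_SUB(..., 365)
    return date.fromisoformat(refreshed) >= cutoff


# Hypothetical toy rows.
places = [
    {"name": "A", "date_closed": None, "date_refreshed": "2025-06-01"},
    {"name": "B", "date_closed": None, "date_refreshed": "2024-01-01"},
    {"name": "C", "date_closed": "2025-12-15", "date_refreshed": "2025-12-01"},
]

today = date(2026, 1, 6)  # fixed so the example is reproducible
active = [p["name"] for p in places if is_open_and_recent(p, today)]
print(active)
```

Passing `today` explicitly, instead of calling `date.today()` inside the helper, keeps the result stable across runs.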
## Licensing information
The dataset is available under the [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).
## Appendix
### Non-Commercial Categories Table
| Category Name | Category ID |
| :------------------------ | :----------------------- |
| Airport Gate | 4bf58dd8d48988d1f0931735 |
| Beer Festival | 62d587aeda6648532de2b88c |
| Bus Line | 4bf58dd8d48988d12b951735 |
| Christmas Market | 52f2ab2ebcbc57f1066b8b3b |
| City | 50aa9e094b90af0d42d5de0d |
| Conference | 5267e4d9e4b0ec79466e48c6 |
| Convention | 5267e4d9e4b0ec79466e48c9 |
| Country | 530e33ccbcbc57f1066bbff7 |
| County | 5345731ebcbc57f1066c39b2 |
| Entertainment Event | 63be6904847c3692a84b9bb7 |
| Event | 4d4b7105d754a06373d81259 |
| Festival | 5267e4d9e4b0ec79466e48c7 |
| Hotel Pool | 4bf58dd8d48988d132951735 |
| Intersection | 52f2ab2ebcbc57f1066b8b4c |
| Island | 50aaa4314b90af0d42d5de10 |
| Line | 58daa1558bbb0b01f18ec1fa |
| Marketplace | 63be6904847c3692a84b9bb8 |
| Moving Target | 4f2a23984b9023bd5841ed2c |
| Music Festival | 5267e4d9e4b0ec79466e48d1 |
| Neighborhood | 4f2a25ac4b909258e854f55f |
| Other Event | 5267e4d9e4b0ec79466e48c8 |
| Parade | 52741d85e4b0d5d1e3c6a6d9 |
| Plane | 4bf58dd8d48988d1f7931735 |
| Platform | 4f4531504b9074f6e4fb0102 |
| Polling Place | 4cae28ecbf23941eb1190695 |
| Road | 4bf58dd8d48988d1f9931735 |
| Sporting Event | 5bae9231bedf3950379f89c5 |
| State | 530e33ccbcbc57f1066bbff8 |
| States and Municipalities | 530e33ccbcbc57f1066bbfe4 |
| Stoop Sale | 52f2ab2ebcbc57f1066b8b54 |
| Street Fair | 5267e4d8e4b0ec79466e48c5 |
| Street Food Gathering | 53e0feef498e5aac066fd8a9 |
| Taxi | 4bf58dd8d48988d130951735 |
| Town | 530e33ccbcbc57f1066bbff3 |
| Trade Fair | 5bae9231bedf3950379f89c3 |
| Train | 4bf58dd8d48988d12a951735 |
| Tree | 52e81612bcbc57f1066b7a24 |
| Village | 530e33ccbcbc57f1066bbff9 |
Provider: maas
Created: 2024-12-06
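Summary
Dataset Introduction

Background and Challenges
Background Overview
This dataset is Foursquare's Open Source Places, which provides free geospatial data covering places and categories, stored in Parquet format and intended to accelerate geospatial innovation and insights. Spark is the recommended way to access and query it, and example queries are provided, such as filtering by category and finding active places. The dataset is released under the Apache 2.0 license but is now a gated dataset, so access restrictions apply.
The above content was collected and summarized by 遇见数据集.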



