MapReader_railspace_London_imago_mundi_2025
Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/14522925
Description:
MapReader Outputs
Railspace patches and text for London maps, inferred with the MapReader software using the https://huggingface.co/Livingwithmachines/mr_resnest101e_finetuned_OS_6inch_2nd_ed_railspace model on Ordnance Survey 6-inch-to-1-mile 2nd edition map sheets from the National Library of Scotland.
How we created the London polygon
Our London polygon was defined as a circle with a 20-mile radius around a point in central London:
```python
import geopandas as gpd
from shapely.geometry import Point

# Load the point as a GeoPandas GeoDataFrame for easy CRS conversion
london = gpd.GeoDataFrame(
    data=["London"],
    columns=["name"],
    geometry=[Point(-0.1275, 51.507222)],  # (longitude, latitude)
    crs="EPSG:4326",
)
# Convert to British National Grid (EPSG:27700); units are meters
london.to_crs("EPSG:27700", inplace=True)
# Buffer 20 miles (32186 meters) around the London centroid
london.geometry = london.geometry.buffer(32186)
```
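The 32186 figure in the buffer call follows from the mile-to-meter conversion. A quick check, assuming the international mile of exactly 1609.344 meters; the constant and variable names here are illustrative and not part of the original code:

```python
import math

METERS_PER_MILE = 1609.344  # international mile, exact by definition

radius_miles = 20
radius_m = radius_miles * METERS_PER_MILE  # about 32186.88 meters
buffer_m = int(radius_m)  # truncated to whole meters, as in the buffer call

# Area of an ideal circle with that radius, in square kilometers.
# A GeoPandas buffer is a polygon approximation of a circle,
# so the area of the real polygon is slightly smaller.
area_km2 = math.pi * buffer_m**2 / 1e6

print(buffer_m, round(area_km2))
```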
Files:
100meter_patch_df.csv - data for 586,275 patches
100meter_parent_df.csv - metadata for 329 maps
railspace_predictions_patch_df.csv - 586,275 patches classified as either "no" or "railspace": 556,721 "no" and 29,554 "railspace"
post_processed_railspace_predictions_patch_df.csv - 586,275 patches classified as either "no" or "railspace": 556,880 "no" and 29,395 "railspace"
geo_predictions_deduplicated_point.csv - georeferenced text spotting predictions for all maps, with polygons simplified to points. This file contains point data only.
geo_predictions_deduplicated_centroid.csv - georeferenced text spotting predictions for all maps, with polygons simplified to points. This file contains both polygon and point data for the text spotting predictions but loads the points as geometry by default; to change this, set the geometry to the `polygon` column.
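The label counts listed for the two prediction files can be re-derived by tallying the label column. The following is a minimal sketch using only the Python standard library. It assumes a comma-delimited file, and the column names `predicted_label` and `image_id` and the sample rows are assumptions for illustration, so check the actual CSV header before relying on them:

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="predicted_label"):
    """Tally the values of one label column in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Hypothetical three-row sample standing in for the real file.
sample = io.StringIO(
    "image_id,predicted_label\n"
    "patch-0,no\n"
    "patch-1,railspace\n"
    "patch-2,no\n"
)
counts = count_labels(sample)
print(counts["no"], counts["railspace"])

# The documented per-label counts add up to the documented patch total.
assert 556_721 + 29_554 == 586_275  # railspace_predictions_patch_df.csv
assert 556_880 + 29_395 == 586_275  # post_processed_railspace_predictions_patch_df.csv
```

For the real data, pass an open file such as `open("railspace_predictions_patch_df.csv", newline="")` in place of the sample.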
Note: new columns have been added to the post-processed dataframe with the updated label and label index: "new_predicted_label" and "new_pred". This post-processing was done using MapReader's context-based post-processing tool (see [here](https://mapreader.readthedocs.io/en/latest/using-mapreader/step-by-step-guide/5-post-process.html#context-post-processing)). We used the default confidence threshold of 0.7.
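Comparing the original and post-processed label columns shows which patches the context-based step changed. In net terms, the "railspace" count drops by 29,554 - 29,395 = 159, though the number of changed rows could be larger if some patches also moved from "no" to "railspace". The sketch below counts changed rows; `new_predicted_label` is the column named in the note, while `predicted_label`, `image_id`, and the sample rows are assumptions for illustration:

```python
import csv
import io


def count_relabelled(csv_file, old="predicted_label", new="new_predicted_label"):
    """Count rows whose post-processed label differs from the original one."""
    reader = csv.DictReader(csv_file)
    return sum(1 for row in reader if row[old] != row[new])


# Hypothetical sample: one "railspace" patch relabelled as "no".
sample = io.StringIO(
    "image_id,predicted_label,new_predicted_label\n"
    "patch-0,no,no\n"
    "patch-1,railspace,no\n"
    "patch-2,railspace,railspace\n"
)
changed = count_relabelled(sample)
print(changed)
```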
Created:
2025-03-28



