1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate
收藏NIAID Data Ecosystem2026-03-12 收录
下载链接:
https://zenodo.org/record/5148692
下载链接
链接失效反馈官方服务:
资源简介:
1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate.
How to use
from pathlib import Path
import requests
import pandas as pd
url = ("https://raw.githubusercontent.com/"
"epogrebnyak/ru-cities/main/assets/towns.csv")
# save file locally
p = Path("towns.csv")
if not p.exists():
content = requests.get(url).text
p.write_text(content, encoding="utf-8")
# read as dataframe
df = pd.read_csv("towns.csv")
print(df.sample(5))
Files:
towns.csv - city information
regions.csv - list of Russian Federation regions
alt_city_names.json - alternative city names
Сolumns (towns.csv):
Basic info:
city - city name (several cities have alternative names marked in alt_city_names.json)
population - city population, thousand people, Rosstat estimate as of 1.1.2020
lat,lon - city geographic coordinates
Region:
region_name - subnational region (oblast, republic, krai or AO)
region_iso_code - ISO 3166 code, eg RU-VLD
federal_district, eg Центральный
City codes:
okato
oktmo
fias_id
kladr_id
Data sources
City list and city population collected from Rosstat publication Регионы России. Основные социально-экономические показатели городов and parsed from publication Microsoft Word files.
City list corresponds to this Wikipedia article.
Alternative dataset is wiki-based Dadata city dataset (no population data).
Comments
City groups
Ханты-Мансийский and Ямало-Ненецкий autonomous regions excluded to avoid duplication as parts of Тюменская область.
Several notable towns are classified as administrative part of larger cities (Сестрорецк is a municpality at Saint-Petersburg, Щербинка part of Moscow). They are not and not reported in this dataset.
By individual city
Белоозерский not found in Rosstat publication, but should be considered a city as of 1.1.2020
Alternative city names
We suppressed letter "ё" city columns in towns.csv - we have Орел, but not Орёл. This affected:
Белоозёрский
Королёв
Ликино-Дулёво
Озёры
Щёлково
Орёл
Дмитриев and Дмитриев-Льговский are the same city.
assets/alt_city_names.json contains these names.
Tests
poetry install
poetry run python -m pytest
How to replicate dataset
1. Base dataset
Run:
download data stro rar/get.sh
convert Саратовская область.doc to docx
run make.py
Creates:
_towns.csv
assets/regions.csv
2. API calls
Note: do not attempt if you do not have to - this runs a while and loads third-party API access.
You have the resulting files in repo, so probably does not need to these scripts.
Run:
cd geocoding
run coord_dadata.py (needs token)
run coord_osm.py
Creates:
coord_dadata.csv
coord_osm.csv
3. Merge data
Run:
run merge.py
Creates:
assets/towns.csv
创建时间:
2021-08-06



