
1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate

Indexed from NIAID Data Ecosystem on 2026-03-12
Download link:
https://zenodo.org/record/5148692
Description:
1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate.

How to use

    from pathlib import Path

    import requests
    import pandas as pd

    url = ("https://raw.githubusercontent.com/"
           "epogrebnyak/ru-cities/main/assets/towns.csv")

    # save file locally
    p = Path("towns.csv")
    if not p.exists():
        content = requests.get(url).text
        p.write_text(content, encoding="utf-8")

    # read as dataframe
    df = pd.read_csv("towns.csv")
    print(df.sample(5))

Files:

    towns.csv - city information
    regions.csv - list of Russian Federation regions
    alt_city_names.json - alternative city names

Columns (towns.csv):

Basic info:

    city - city name (several cities have alternative names, listed in alt_city_names.json)
    population - city population, thousand people, Rosstat estimate as of 1.1.2020
    lat, lon - city geographic coordinates

Region:

    region_name - subnational region (oblast, republic, krai or AO)
    region_iso_code - ISO 3166 code, e.g. RU-VLD
    federal_district, e.g. Центральный

City codes:

    okato
    oktmo
    fias_id
    kladr_id

Data sources

The city list and city population were collected from the Rosstat publication Регионы России. Основные социально-экономические показатели городов and parsed from the publication's Microsoft Word files. The city list corresponds to a Wikipedia article. An alternative dataset is the wiki-based Dadata city dataset (no population data).

Comments

City groups

Ханты-Мансийский and Ямало-Ненецкий autonomous regions are excluded to avoid duplication, since they are parts of Тюменская область.

Several notable towns are classified as administrative parts of larger cities (Сестрорецк is a municipality of Saint Petersburg, Щербинка is part of Moscow). They are not reported in this dataset.

By individual city

Белоозерский was not found in the Rosstat publication, but should be considered a city as of 1.1.2020.

Alternative city names

We replaced the letter "ё" with "е" in the city column of towns.csv: we have Орел, not Орёл. This affected:

    Белоозёрский
    Королёв
    Ликино-Дулёво
    Озёры
    Щёлково
    Орёл

Дмитриев and Дмитриев-Льговский are the same city.

assets/alt_city_names.json contains these names.

Tests

    poetry install
    poetry run python -m pytest

How to replicate dataset

1. Base dataset

Run:

    download the source data using rar/get.sh
    convert Саратовская область.doc to docx
    run make.py

Creates:

    _towns.csv
    assets/regions.csv

2. API calls

Note: do not attempt this step unless you have to. It takes a while to run and puts load on third-party APIs. The resulting files are already in the repo, so you probably do not need to run these scripts.

Run:

    cd geocoding
    run coord_dadata.py (needs token)
    run coord_osm.py

Creates:

    coord_dadata.csv
    coord_osm.csv

3. Merge data

Run:

    run merge.py

Creates:

    assets/towns.csv
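
Because the city column stores names without "ё" (see the note on alternative city names above), a lookup by the usual spelling, such as Орёл, will not match unless the query is normalized first. The following is a minimal sketch of such a lookup using only the standard library. The helper names normalize_city_name, find_city and load_towns are illustrative and not part of the repository, and the sketch assumes towns.csv has already been saved locally as in "How to use".

```python
import csv
from pathlib import Path


def normalize_city_name(name: str) -> str:
    """Replace "ё" with "е" to match the spelling used in the city column."""
    return name.replace("ё", "е").replace("Ё", "Е")


def find_city(rows, name):
    """Return every row whose "city" value equals the normalized query.

    `rows` is any iterable of dicts with a "city" key, for example the
    output of load_towns() below. A list is returned because this sketch
    does not assume that city names are unique.
    """
    target = normalize_city_name(name.strip())
    return [row for row in rows if row["city"] == target]


def load_towns(path="towns.csv"):
    """Read a local copy of towns.csv into a list of dicts (hypothetical helper)."""
    with Path(path).open(encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))
```

With a local copy of the file, find_city(load_towns(), "Орёл") would be expected to return the row stored as Орел. Note that csv.DictReader yields every value as a string, so population (given in thousand people) needs an explicit float() conversion before any arithmetic.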
Created:
2021-08-06