nltk-data-hub/gazetteers
Hugging Face · Updated 2026-04-28 · Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/nltk-data-hub/gazetteers
Description:
NLTK Gazetteers (Extended) is a dataset of geographic and demographic word lists, extended from the original NLTK gazetteers corpus. It is organized into multiple configs, each covering one geographic or demographic category, such as country names, ISO country codes, nationality adjectives, Canadian provinces, and US states and cities. The extension adds world cities, UN M49 region codes, ISO 3166-2 country subdivisions, territory names in multiple languages, Wikidata country information, and OSM capital cities. Each config stores one entry per row, with one or more fields such as name, country code, population, latitude, and longitude. The dataset can be accessed through Hugging Face's datasets library or through NLTK.
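The two access routes mentioned above might look roughly like the sketch below. It is not taken from the dataset card: the config name passed to `load_config`, the `"train"` split, and the `country_code` field name are all assumptions that should be checked against the actual configs. The NLTK helper reads the original `gazetteers` corpus that ships with NLTK; whether the extended lists are exposed through the same reader is not stated here.

```python
from typing import Dict, Iterable, List, Optional

# Repository id taken from the download link on this page.
REPO_ID = "nltk-data-hub/gazetteers"


def load_config(config_name: str, split: str = "train"):
    """Load one config through the Hugging Face datasets library.

    config_name is hypothetical here; the "train" split is an assumption.
    Requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # imported lazily so the module loads without it

    return load_dataset(REPO_ID, config_name, split=split)


def load_original_nltk_words(fileid: Optional[str] = None) -> List[str]:
    """Read word lists from the original NLTK gazetteers corpus.

    With fileid=None, all files in the corpus are read.
    Requires `pip install nltk` and network access for the first download.
    """
    import nltk
    from nltk.corpus import gazetteers

    nltk.download("gazetteers", quiet=True)  # fetch the corpus if missing
    return list(gazetteers.words(fileid))


def rows_in_country(
    rows: Iterable[Dict], code: str, field: str = "country_code"
) -> List[Dict]:
    """Keep rows whose `field` equals `code` (exact match).

    The default field name is an assumption about the schema.
    """
    return [row for row in rows if row.get(field) == code]
```

A row-filtering helper such as `rows_in_country` works on any iterable of dict-like rows, so it can be applied to a loaded config once the real field names are known.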
Provider:
nltk-data-hub



