Geography
Source: Databricks. Indexed: 2024-05-09
Download link:
https://marketplace.databricks.com/details/99c894ba-6bd1-4af5-a5b6-378b677f2b61/John-Snow-Labs_Geography
Resource description:
**Overview**
This data package contains geography related datasets which include various codes, for example, airport codes, country and continent codes, IMO IMDG classification codes, dialing codes, currency codes, container codes, language codes, package codes, US states and territories codes. It also contains data on population figures and geolocation of cities and countries.
**Description**
This data package is useful when data on various codes and common abbreviations used to represent the states and territories of the United States, their spatial relations, population figures and major cities of the world and their geolocation is required.
In addition to the above data, this data package also contains detailed information on the following:
- Airport codes, i.e., the International Air Transport Association (IATA) airport code, a three-letter code used in passenger reservation, ticketing and baggage-handling systems, and the International Civil Aviation Organization (ICAO) airport code.
- A list of countries by continent, including continent codes and country codes.
- All IETF (Internet Engineering Task Force) language tags of the official resource indicated by Library of Congress Unicode. The default language for all the codes in the dataset is "unicode-cldr".
- ISO 4217 currency codes. ISO 4217 is the International Organization for Standardization (ISO) standard for currency codes.
- A coded list of ISO 6346 shipping containers, used in international trade and electronic shipping messages.
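
The IETF language tags mentioned above follow a hyphen-separated subtag structure. The sketch below is a minimal illustration of splitting a tag into language, script and region subtags. It is not part of the data package: the function name `parse_language_tag` and the `LanguageTag` type are hypothetical, and the pattern covers only the common `language[-script][-region]` shape, ignoring variants, extensions and private-use subtags.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern: language[-script][-region].
# Assumption: variants, extensions and private-use subtags are out of scope.
_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


class LanguageTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_language_tag(tag: str) -> Optional[LanguageTag]:
    """Return case-normalized subtags, or None if the tag does not fit the simplified pattern."""
    match = _TAG.fullmatch(tag.strip())
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    return LanguageTag(
        language=match.group("language").lower(),      # conventionally lowercase
        script=script.title() if script else None,     # conventionally title case
        region=region.upper() if region else None,     # conventionally uppercase
    )


if __name__ == "__main__":
    for sample in ["en-us", "zh-hant-tw", "es-419", "english"]:
        print(sample, "->", parse_language_tag(sample))
```

A tag that does not fit the simplified shape returns `None` rather than raising, so the function can be used to filter a column of tags before further processing.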
**Benefits**
- Useful for researchers working in the field of geography and world statistics.
- Comprehensive list of international codes, including continent codes, country codes and city codes.
- Population figures and estimates available for analysis in normalized form.
**License Information**
The use of John Snow Labs datasets is free for personal and research purposes. For commercial use, please subscribe to the [Data Library](https://www.johnsnowlabs.com/marketplace/) on the John Snow Labs website. The subscription allows you to use all John Snow Labs datasets and data packages for commercial purposes.
**Included Datasets**
- [Airport Codes](https://www.johnsnowlabs.com/marketplace/airport-codes)
- The Airport Codes dataset includes the International Air Transport Association (IATA) airport code, a three-letter code used in passenger reservation, ticketing and baggage-handling systems, and the International Civil Aviation Organization (ICAO) airport code, a four-letter code used by air traffic control (ATC) systems and for airports that do not have an IATA airport code.
- [All Countries Latitude Longitude](https://www.johnsnowlabs.com/marketplace/all-countries-latitude-longitude)
- This dataset provides country code, postal code, latitude, longitude, as well as names of state, county/province, community etc. for all countries where the data is available.
- [Classification of Countries by Income Economies](https://www.johnsnowlabs.com/marketplace/classification-of-countries-by-income-economies)
- This dataset lists the World Bank member countries and all other economies with populations of more than 30,000, and classifies them as low-, middle- or high-income economies.
- [Countries Geographical Territory Containment](https://www.johnsnowlabs.com/marketplace/countries-geographical-territory-containment)
- This dataset contains a list of geographical or continental regions and, for each region, the codes of the territories/countries it contains.
- [IMO IMDG Classification Codes](https://www.johnsnowlabs.com/marketplace/imo-imdg-classification-codes)
- The International Maritime Dangerous Goods (IMDG) Code of the International Maritime Organization (IMO) was developed as a uniform international code for the transport of dangerous goods by sea covering such matters as packing, container traffic and stowage, with particular reference to the segregation of incompatible substances.
- [ISO 3166 Country Codes ITU Dialing Codes ISO 4217 Currency Codes](https://www.johnsnowlabs.com/marketplace/iso-3166-country-codes-itu-dialing-codes-iso-4217-currency-codes)
- This dataset contains comprehensive data on International Organization for Standardization (ISO) 3166 country codes in English, International Telecommunication Union (ITU) dialing codes, ISO currency codes and other related geographic information for countries around the world. The list consists of 251 countries and islands.
- [ISO 6346 Container Codes](https://www.johnsnowlabs.com/marketplace/iso-6346-container-codes)
- This dataset consists of a coded list of ISO 6346 shipping containers, used in international trade and electronic shipping messages.
- [Major Cities of The World](https://www.johnsnowlabs.com/marketplace/major-cities-of-the-world)
- This dataset lists cities with more than 15,000 inhabitants. Each city is associated with its country and subcountry to reduce ambiguity. The subcountry can be the name of a state (e.g., in the United Kingdom or the United States of America) or the major administrative division (e.g., "region" in France).
- [SMDG Master Terminal Facilities List](https://www.johnsnowlabs.com/marketplace/smdg-master-terminal-facilities-list)
- This Codes List has been created and will be maintained by the Secretariat of SMDG (User Group for Shipping Lines and Container Terminals) with the purpose of harmonizing the codes between EDI (Electronic Data Interchange) partners. The lists can be downloaded by all interested parties and used for the benefit of standardization worldwide. The members of SMDG and the SMDG secretariat cannot be held responsible for the correctness and completeness of the lists.
- [Spatial Relations Between Countries and Geographical Standards](https://www.johnsnowlabs.com/marketplace/spatial-relations-between-countries-and-geographical-standards)
- This dataset contains a list of two-letter codes in English for different countries around the world and their geographical information. The list consists of ISO 3166-1-alpha-2 code elements for 250 countries, neighboring countries, geographical location and Universal Transverse Mercator (UTM) Grid information.
- [UN-CEFACT Package Codes](https://www.johnsnowlabs.com/marketplace/un-cefact-package-codes)
- This dataset is a coded representation of package type names used in international trade, revision 8, Annex V and Annex VI (UN/ECE CEFACT Trade Facilitation Recommendation No. 21).
- [US State New Jersey Municipalities With Geoname IDs](https://www.johnsnowlabs.com/marketplace/us-state-new-jersey-municipalities-with-geoname-ids)
- This dataset includes data about the 565 municipalities of the US state of New Jersey, including the GeoName ID for each municipality. The data describes each municipality's population size, county, type, form of government, and details of its establishment and any changes of type or name.
- [US States and Territories](https://www.johnsnowlabs.com/marketplace/us-states-and-territories)
- This dataset contains various codes and common abbreviations used to represent the states and territories of the United States.
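
The Airport Codes entry above distinguishes three-letter IATA codes from four-letter ICAO codes. As a minimal sketch of how that distinction could be applied when cleaning a column of mixed codes, the hypothetical helper below classifies a string by shape only. It does not check whether the code is actually assigned to an airport, which would require a lookup against the dataset itself.

```python
import re

# Shape rules taken from the dataset description: IATA = 3 letters, ICAO = 4 letters.
IATA_PATTERN = re.compile(r"[A-Z]{3}")
ICAO_PATTERN = re.compile(r"[A-Z]{4}")


def classify_airport_code(code: str) -> str:
    """Return 'IATA', 'ICAO' or 'unknown' based on the code's shape only."""
    normalized = code.strip().upper()
    if IATA_PATTERN.fullmatch(normalized):
        return "IATA"
    if ICAO_PATTERN.fullmatch(normalized):
        return "ICAO"
    return "unknown"


if __name__ == "__main__":
    for sample in ["JFK", "kjfk", " LHR ", "12", "ABCDE"]:
        print(repr(sample), "->", classify_airport_code(sample))
```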
**Data Engineering Overview**
**We deliver high-quality data**
- Each dataset goes through 3 levels of quality review
  - 2 manual reviews are done by domain experts
  - Then, an automated set of 60+ validations ensures that every datum matches its metadata and defined constraints
- Data is normalized into one unified type system
  - All dates, units, codes and currencies look the same
  - All null values are normalized to the same value
  - All dataset and field names are SQL and Hive compliant
- Data and Metadata
  - Data is available in both CSV and Apache Parquet formats, optimized for high read performance on distributed Hadoop, Spark & MPP clusters
  - Metadata is provided in the open Frictionless Data standard, and every field is normalized & validated
- Data Updates
  - Data updates support replace-on-update: outdated foreign keys are deprecated, not deleted
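
The points above mention validating every datum against metadata and defined constraints, with metadata in the Frictionless Data standard. The sketch below illustrates the general idea with a hand-written check against a descriptor shaped like a Frictionless Table Schema. It is not John Snow Labs' validation pipeline: the schema, the field names `country_code` and `population`, the sample rows and the function `validate_row` are all hypothetical, and only the `required`, `pattern` and `minimum` constraints are handled.

```python
import re
from typing import Any, Dict, List

# Hypothetical descriptor in the shape of a Frictionless Table Schema.
# Field names and constraints are illustrative, not taken from the package.
SCHEMA: Dict[str, Any] = {
    "missingValues": [""],  # raw strings that should be treated as null
    "fields": [
        {"name": "country_code", "type": "string",
         "constraints": {"required": True, "pattern": "[A-Z]{2}"}},
        {"name": "population", "type": "integer",
         "constraints": {"minimum": 0}},
    ],
}


def validate_row(row: Dict[str, str], schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable constraint violations for one CSV row."""
    errors: List[str] = []
    missing = set(schema.get("missingValues", [""]))
    for field in schema["fields"]:
        name = field["name"]
        constraints = field.get("constraints", {})
        raw = row.get(name)
        # Null handling: every configured missing-value token counts as null.
        if raw is None or raw in missing:
            if constraints.get("required"):
                errors.append(f"{name}: required value is missing")
            continue
        value: Any = raw
        if field.get("type") == "integer":
            try:
                value = int(raw)
            except ValueError:
                errors.append(f"{name}: {raw!r} is not an integer")
                continue
        pattern = constraints.get("pattern")
        if pattern is not None and re.fullmatch(pattern, raw) is None:
            errors.append(f"{name}: {raw!r} does not match pattern {pattern}")
        minimum = constraints.get("minimum")
        if minimum is not None and value < minimum:
            errors.append(f"{name}: {value} is below the minimum {minimum}")
    return errors


if __name__ == "__main__":
    # Placeholder rows for illustration only; the values are not real data.
    print(validate_row({"country_code": "FR", "population": "12"}, SCHEMA))
    print(validate_row({"country_code": "", "population": "-5"}, SCHEMA))
```

Collecting all violations per row, rather than stopping at the first, makes it easier to report every problem in a single pass over a file.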
**Our data is curated and enriched by domain experts**
Each dataset is manually curated by our team of doctors, pharmacists, public health & medical billing experts:
- Field names, descriptions, and normalized values are chosen by people who actually understand their meaning
- Healthcare & life science experts add categories, search keywords, descriptions and more to each dataset
- Both manual and automated data enrichment supported for clinical codes, providers, drugs, and geo-locations
- The data is always kept up to date – even when the source requires manual effort to get updates
- Support for data subscribers is provided directly by the domain experts who curated the data sets
- Every data source’s license is manually verified to allow for royalty-free commercial use and redistribution.
**Need Help?**
If you have questions about our products, contact us at [info@johnsnowlabs.com](mailto:info@johnsnowlabs.com).
Provider:
John Snow Labs



