Enriched Tourism Dataset London (POIs)
Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2025-04-16.
Download link:
https://data.mendeley.com/datasets/gw9hjn4v65
Description:
This dataset contains the London subset of the Tourpedia dataset, specifically focusing on points of interest (POIs) categorized as attractions (available at http://tour-pedia.org/download/london-attraction.csv). The original dataset comprises 20,727 entries that encompass a variety of attractions across London, providing details on several attributes for each POI. These attributes include a unique identifier, POI name, category, location information (address), latitude, longitude, specific details, and user-generated reviews. The review fields contain textual feedback from users, aggregated from platforms such as Google Places, Foursquare, and Facebook, offering qualitative insight into each location.
However, due to the initial dataset's high proportion of incomplete or inconsistently structured entries, a rigorous cleaning process was implemented. This process entailed the removal of erroneous and incomplete data points, ultimately refining the dataset to 2,341 entries that meet criteria for quality and structural coherence. These selected entries were subjected to further validation to ensure data integrity, enabling a more accurate representation of London's attractions.
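The description does not state the exact cleaning criteria, so the following is only a minimal sketch of what such a completeness filter might look like. The field names (`id`, `name`, `category`, `address`, `lat`, `lng`) and the coordinate-range check are assumptions made for illustration, not the authors' actual procedure.

```python
# Hypothetical column names; the real CSV header may differ.
REQUIRED_FIELDS = ("id", "name", "category", "address", "lat", "lng")

def is_complete(row, required=REQUIRED_FIELDS):
    """Return True if every required field is non-blank and the
    coordinates parse as a valid latitude/longitude pair."""
    for field in required:
        if not (row.get(field) or "").strip():
            return False
    try:
        lat = float(row["lat"])
        lng = float(row["lng"])
    except ValueError:
        return False
    return -90 <= lat <= 90 and -180 <= lng <= 180

# Made-up rows, used only to exercise the filter.
rows = [
    {"id": "1", "name": "Example POI A", "category": "attraction",
     "address": "1 Example Street", "lat": "51.5", "lng": "-0.12"},
    {"id": "2", "name": "", "category": "attraction",
     "address": "2 Example Road", "lat": "51.5", "lng": "-0.10"},
    {"id": "3", "name": "Example POI C", "category": "attraction",
     "address": "3 Example Lane", "lat": "not-a-number", "lng": "-0.11"},
]
cleaned = [row for row in rows if is_complete(row)]
```

Only the first made-up row survives here: the second has a blank name and the third has an unparseable latitude.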
- London.csv
It contains columns including a unique identifier, POI name, category, location information (address), latitude, longitude, specific details, and user-generated reviews. The reviews were retrieved from Google Places, Foursquare, and Facebook and pre-processed into several formats: all words, only nouns, nouns + verbs, nouns + adjectives, and nouns + verbs + adjectives.
- London_annotated.csv
It contains the ground truth for London.csv: manual human annotations that categorise each POI into 12 predefined categories.
It has the following columns:
* POI name
* POI's address
* One column for each of the above categories. 1 means that the POI belongs to the category while blank indicates that it does not.
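
The column layout above can be read with a short sketch like the one below, which turns the `1`/blank encoding into a list of category names per POI. The header names (`name`, `address`, `Museum`, `Park`) and the sample rows are placeholders invented for illustration; the real file's headers and its 12 category names are not given in this description.

```python
import csv
import io

def load_annotations(csv_file, name_col="name", address_col="address"):
    """Read an annotation CSV and list, for each POI, the category
    columns marked with 1. Every column other than the name and
    address columns is treated as a category column."""
    records = []
    for row in csv.DictReader(csv_file):
        categories = [
            col for col, value in row.items()
            if col is not None
            and col not in (name_col, address_col)
            and (value or "").strip() == "1"
        ]
        records.append({
            "name": row[name_col],
            "address": row[address_col],
            "categories": categories,
        })
    return records

# Made-up sample with two hypothetical categories instead of 12.
sample = io.StringIO(
    "name,address,Museum,Park\n"
    "Example POI A,1 Example Street,1,\n"
    "Example POI B,2 Example Road,,1\n"
    "Example POI C,3 Example Lane,1,1\n"
)
annotations = load_annotations(sample)
```

To use it on the real file, one would pass an open file handle, for example `open("London_annotated.csv", newline="", encoding="utf-8")`, and adjust `name_col` and `address_col` to the actual header names.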
Provider:
Mendeley Data
Created:
2024-10-30



