
Musaed1/kudu-google-maps-reviews

Hugging Face | Updated 2025-12-01 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/Musaed1/kudu-google-maps-reviews
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
language:
- ar
- en
tags:
- reviews
- restaurants
- saudi-arabia
- google-maps
- sentiment
- sentiment-analysis
- arabic
- kudu
size_categories:
- 100K<n<1M
---

# Kudu Restaurant Reviews - Google Maps Dataset

## Dataset Description

This dataset contains **282,761 customer reviews** from **364 Kudu restaurant locations** across Saudi Arabia, scraped from Google Maps. Kudu is one of the largest fast-food chains in Saudi Arabia, and this dataset provides customer feedback including overall ratings, review text (primarily in Arabic), and detailed aspect-based ratings.

**Privacy Notice:** This dataset has been cleaned to remove personally identifiable information (PII), including reviewer names, IDs, and profile URLs, to protect user privacy.

### Dataset Summary

- **Total Reviews:** 282,761
- **Total Locations:** 364
- **Unique Reviewers:** 223,626
- **Languages:** Primarily Arabic (ar), with owner responses in both Arabic and English
- **Date Range:** December 2011 to September 2025 (13.8 years)
- **Average Rating:** 3.79/5.0

## Dataset Structure

### Data Fields

| Field | Type | Description |
|-------|------|-------------|
| `title` | string | Restaurant name in Arabic |
| `reviewerNumberOfReviews` | int | Total reviews by this reviewer (anonymized) |
| `isLocalGuide` | boolean | Whether reviewer is a Google Local Guide |
| `text` | string | Review text (mostly Arabic, can be null) |
| `textTranslated` | string | Translated review text (if available) |
| `publishedAtDate` | datetime | ISO 8601 timestamp of review |
| `likesCount` | int | Number of likes on the review |
| `reviewId` | string | Unique Google review ID |
| `reviewUrl` | string | Direct URL to the review |
| `reviewOrigin` | string | Origin platform (Google) |
| `stars` | int | Overall rating (1-5) |
| `rating` | int or null | Kept only for compatibility with the original source |
| `responseFromOwnerDate` | datetime | When owner responded (if applicable) |
| `responseFromOwnerText` | string | Owner's response to review |
| `reviewImageUrls` | list | URLs of images attached to review |
| `reviewContext` | dict | Additional context metadata |
| `reviewDetailedRating` | dict | Aspect ratings: Food (الطعام), Service (الخدمة), Atmosphere (الأجواء) |
| `visitedIn` | string | When the reviewer visited (if specified) |
| `originalLanguage` | string | Original language of review |
| `translatedLanguage` | string | Target language if translated |
| `place_id` | string | Unique Google Place ID |
| `place_name` | string | Location name (in English) |
| `place_url` | string | Google Maps URL for the location |

**Note:** Privacy-sensitive fields (reviewer names, IDs, profile URLs, and photos) have been removed from this dataset.

### Detailed Ratings

Many reviews include aspect-based ratings (1-5 scale) for:

- **Food (الطعام):** Food quality
- **Service (الخدمة):** Service quality
- **Atmosphere (الأجواء):** Restaurant atmosphere/ambiance

## Usage

### Loading the Dataset

```python
from datasets import load_dataset

# Load the dataset from the Hugging Face Hub
dataset = load_dataset("Musaed1/kudu-google-maps-reviews")

# Convert the train split to a pandas DataFrame
df = dataset['train'].to_pandas()
print(f"Total reviews: {len(df):,}")
print(f"Average rating: {df['stars'].mean():.2f}")
```

### Using Parquet directly

```python
import pandas as pd

# Reading hf:// paths requires the huggingface_hub package to be installed
df = pd.read_parquet(
    "hf://datasets/Musaed1/kudu-google-maps-reviews/kudu_reviews_cleaned.parquet"
)
```

## Dataset Statistics

### Rating Distribution

- **5 stars:** 141,454 (50.0%)
- **4 stars:** 45,316 (16.0%)
- **3 stars:** 35,590 (12.6%)
- **2 stars:** 14,665 (5.2%)
- **1 star:** 45,736 (16.2%)

### Key Metrics

- **Reviews with text:** 50.4% (142,401 reviews)
- **Owner response rate:** 35.3% (99,948 responses)
- **Reviews from Local Guides:** High percentage
- **Average reviews per location:** 781
- **Reviews per reviewer:** 1.26 (median: 1)

### Aspect Ratings (Average)

- **Food:** 4.3/5.0
- **Service:** 4.2/5.0
- **Atmosphere:** 4.1/5.0

## Use Cases

This dataset is suitable for:

1. **Sentiment Analysis:** Arabic sentiment classification on restaurant reviews
2. **Aspect-Based Sentiment Analysis:** Understanding sentiment about specific aspects (food, service, atmosphere)
3. **Rating Prediction:** Predicting overall ratings from review text
4. **Arabic NLP:** Training and evaluating Arabic language models
5. **Business Intelligence:** Understanding customer preferences and complaints
6. **Multilingual Analysis:** Comparing Arabic reviews with English owner responses
7. **Time Series Analysis:** Tracking rating trends and review patterns over time
8. **Customer Service Analysis:** Studying owner response patterns and effectiveness

## Data Collection

- **Source:** Google Maps Reviews
- **Collection Method:** Apify Google Maps Scraper API
- **Collection Date:** Sep 2025
- **Geographic Scope:** Saudi Arabia only

## Privacy & Ethics

This dataset has been processed to protect user privacy:

### Removed Information

- **Reviewer IDs:** Google user IDs have been removed
- **Reviewer Names:** Personal names have been removed
- **Profile URLs:** Links to reviewer profiles have been removed
- **Profile Photos:** URLs to reviewer photos have been removed
- **Redundant Fields:** Human-readable date field (publishAt) removed

### Retained Information

- **Review Content:** The actual review text and ratings remain intact for analysis
- **Aggregated Metrics:** Number of reviews per reviewer (for analyzing reviewer activity patterns)
- **Local Guide Status:** Whether reviewer is a Local Guide (for quality analysis)
- **Place Information:** Restaurant locations and identifiers remain for location-based analysis

## Limitations

- Reviews are user-generated and may contain biases
- Some reviews lack text content (rating only)
- Text is primarily in Arabic with some mixed content
- Owner responses may follow templates
- Data represents a snapshot in time (not continuously updated)

## License

This dataset is released under the **apache-2.0** license.

## Citation

If you use this dataset in your research, please cite:

```bibtex
@dataset{kudu_reviews_2025,
  title={Kudu Restaurant Reviews Dataset},
  author={Musaed Albedhani},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/datasets/Musaed1/kudu-google-maps-reviews}
}
```

## Acknowledgments

- Data sourced from Google Maps public reviews
- Scraped using Apify platform
- Restaurant chain: Kudu (Saudi Arabia)

## Contact

For questions, issues, or collaboration requests, please reach out at: **[owdm.ai@gmail.com](mailto:owdm.ai@gmail.com)**
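
## Example: Preparing Rows for Sentiment Tasks

For the sentiment and aspect-based use cases listed in the card, a row usually needs a coarse label and one column per aspect. The following is a minimal sketch in plain Python, not the dataset author's method. It assumes that `reviewDetailedRating` is keyed by the Arabic aspect names shown in the field table, and that 1-2 stars count as negative, 3 as neutral, and 4-5 as positive; both are assumptions to adjust after inspecting the real data. The names `ASPECT_KEYS`, `sentiment_label`, and `flatten_review`, and the sample rows, are hypothetical.

```python
from collections import Counter

# Assumed mapping from the Arabic aspect names in the card to English keys.
ASPECT_KEYS = {"الطعام": "food", "الخدمة": "service", "الأجواء": "atmosphere"}


def sentiment_label(stars):
    """Map a 1-5 star rating to a coarse label (thresholds are an assumption)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def flatten_review(row):
    """Return a flat record with a sentiment label and one key per aspect."""
    detailed = row.get("reviewDetailedRating") or {}  # field may be null or empty
    flat = {
        "text": row.get("text"),  # may be None for rating-only reviews
        "stars": row["stars"],
        "label": sentiment_label(row["stars"]),
    }
    for arabic_name, english_name in ASPECT_KEYS.items():
        flat[english_name] = detailed.get(arabic_name)  # None when not rated
    return flat


# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    {"text": "example review A", "stars": 5,
     "reviewDetailedRating": {"الطعام": 5, "الخدمة": 4, "الأجواء": 4}},
    {"text": None, "stars": 1, "reviewDetailedRating": None},
    {"text": "example review B", "stars": 3,
     "reviewDetailedRating": {"الطعام": 3}},
]

flat_rows = [flatten_review(r) for r in sample_rows]
label_counts = Counter(r["label"] for r in flat_rows)
print(flat_rows[0])
print(dict(label_counts))
```

The same function could be applied to real rows, for example by iterating over `dataset['train']`, once the actual keys of `reviewDetailedRating` have been confirmed. Missing aspects are kept as `None` rather than filled in, so that rating-only reviews are not mistaken for low scores.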
Provided by:
Musaed1