Musaed1/kudu-google-maps-reviews
Source: Hugging Face · Updated 2025-12-01 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/Musaed1/kudu-google-maps-reviews
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
language:
- ar
- en
tags:
- reviews
- restaurants
- saudi-arabia
- google-maps
- sentiment
- sentiment-analysis
- arabic
- kudu
size_categories:
- 100K<n<1M
---
# Kudu Restaurant Reviews - Google Maps Dataset
## Dataset Description
This dataset contains **282,761 customer reviews** from **364 Kudu restaurant locations** across Saudi Arabia, scraped from Google Maps. Kudu is one of the largest fast-food chains in Saudi Arabia. Each record carries an overall star rating and, where the reviewer provided them, review text (primarily in Arabic) and aspect-based ratings for food, service, and atmosphere.
**Privacy Notice:** This dataset has been cleaned to remove personally identifiable information (PII) including reviewer names, IDs, and profile URLs to protect user privacy.
### Dataset Summary
- **Total Reviews:** 282,761
- **Total Locations:** 364
- **Unique Reviewers:** 223,626
- **Languages:** Primarily Arabic (ar), with owner responses in both Arabic and English
- **Date Range:** December 2011 to September 2025 (13.8 years)
- **Average Rating:** 3.79/5.0
## Dataset Structure
### Data Fields
| Field | Type | Description |
|-------|------|-------------|
| `title` | string | Restaurant name in Arabic |
| `reviewerNumberOfReviews` | int | Total reviews by this reviewer (anonymized) |
| `isLocalGuide` | boolean | Whether reviewer is a Google Local Guide |
| `text` | string | Review text (mostly Arabic, can be null) |
| `textTranslated` | string | Translated review text (if available) |
| `publishedAtDate` | datetime | ISO 8601 timestamp of review |
| `likesCount` | int | Number of likes on the review |
| `reviewId` | string | Unique Google review ID |
| `reviewUrl` | string | Direct URL to the review |
| `reviewOrigin` | string | Origin platform (Google) |
| `stars` | int | Overall rating (1-5) |
| `rating` | int or null | Kept only for compatibility with the original source |
| `responseFromOwnerDate` | datetime | When owner responded (if applicable) |
| `responseFromOwnerText` | string | Owner's response to review |
| `reviewImageUrls` | list | URLs of images attached to review |
| `reviewContext` | dict | Additional context metadata |
| `reviewDetailedRating` | dict | Aspect ratings: Food (الطعام), Service (الخدمة), Atmosphere (الأجواء) |
| `visitedIn` | string | When the reviewer visited (if specified) |
| `originalLanguage` | string | Original language of review |
| `translatedLanguage` | string | Target language if translated |
| `place_id` | string | Unique Google Place ID |
| `place_name` | string | Location name (in English) |
| `place_url` | string | Google Maps URL for the location |
**Note:** Privacy-sensitive fields (reviewer names, IDs, profile URLs, and photos) have been removed from this dataset.
### Detailed Ratings
Many reviews include aspect-based ratings (1-5 scale) for:
- **Food (الطعام):** Food quality
- **Service (الخدمة):** Service quality
- **Atmosphere (الأجواء):** Restaurant atmosphere/ambiance
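The `reviewDetailedRating` dict is easier to analyze once it is flattened into one value per aspect. The following is a minimal sketch, assuming the dict is keyed by the Arabic labels listed above; the card does not confirm the exact key strings, and `ASPECT_KEYS`, `flatten_aspects`, and the sample rows are illustrative names and toy data, not part of the dataset.

```python
# Assumption: aspect ratings are keyed by the Arabic labels shown in the card.
ASPECT_KEYS = {
    "الطعام": "food",
    "الخدمة": "service",
    "الأجواء": "atmosphere",
}


def flatten_aspects(detailed):
    """Map an aspect-rating dict to English names; missing aspects become None."""
    detailed = detailed or {}  # rating-only reviews may carry no dict at all
    return {english: detailed.get(arabic) for arabic, english in ASPECT_KEYS.items()}


# Toy records, not taken from the dataset.
rows = [
    {"stars": 5, "reviewDetailedRating": {"الطعام": 5, "الخدمة": 4, "الأجواء": 4}},
    {"stars": 2, "reviewDetailedRating": {"الطعام": 2}},
    {"stars": 4, "reviewDetailedRating": None},
]

flat = [flatten_aspects(r["reviewDetailedRating"]) for r in rows]
print(flat[1])
```

Returning `None` for absent aspects keeps partially rated reviews in the table instead of silently dropping them.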
## Usage
### Loading the Dataset
```python
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("Musaed1/kudu-google-maps-reviews")
# Access the data
df = dataset['train'].to_pandas()
print(f"Total reviews: {len(df):,}")
print(f"Average rating: {df['stars'].mean():.2f}")
```
### Using Parquet directly
```python
import pandas as pd
# Reading an hf:// path requires the huggingface_hub package to be installed
df = pd.read_parquet(
    "hf://datasets/Musaed1/kudu-google-maps-reviews/kudu_reviews_cleaned.parquet"
)
```
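Once the records are loaded, the ISO 8601 `publishedAtDate` field can be used to group reviews by month. The sketch below uses only the standard library and toy records; the exact timestamp format (a trailing `Z` for UTC) is an assumption, and `month_key` is an illustrative helper name.

```python
from collections import defaultdict
from datetime import datetime


def month_key(timestamp):
    """Return 'YYYY-MM' for an ISO 8601 timestamp; a trailing 'Z' is read as UTC."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.strftime("%Y-%m")


# Toy records, not taken from the dataset.
reviews = [
    {"publishedAtDate": "2025-08-03T10:15:00.000Z", "stars": 5},
    {"publishedAtDate": "2025-08-21T18:40:00.000Z", "stars": 2},
    {"publishedAtDate": "2025-09-01T09:00:00.000Z", "stars": 4},
]

# Collect star ratings per month, then average each bucket.
by_month = defaultdict(list)
for review in reviews:
    by_month[month_key(review["publishedAtDate"])].append(review["stars"])

monthly_mean = {month: sum(v) / len(v) for month, v in sorted(by_month.items())}
print(monthly_mean)
```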
## Dataset Statistics
### Rating Distribution
- **5 stars:** 141,454 (50.0%)
- **4 stars:** 45,316 (16.0%)
- **3 stars:** 35,590 (12.6%)
- **2 stars:** 14,665 (5.2%)
- **1 star:** 45,736 (16.2%)
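As a consistency check, the counts above can be combined into the total review count, the percentage shares, and the average rating reported in the summary. This is plain arithmetic on the published figures, not a query against the dataset.

```python
# Star counts copied from the rating distribution above.
counts = {5: 141_454, 4: 45_316, 3: 35_590, 2: 14_665, 1: 45_736}

total = sum(counts.values())
# Weighted mean: each star value weighted by its review count.
mean_stars = sum(stars * n for stars, n in counts.items()) / total
# Percentage share of each star level, rounded to one decimal place.
shares = {stars: round(100 * n / total, 1) for stars, n in counts.items()}

print(total)                 # 282761
print(round(mean_stars, 2))  # 3.79
print(shares)
```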
### Key Metrics
- **Reviews with text:** 50.4% (142,401 reviews)
- **Owner response rate:** 35.3% (99,948 responses)
- **Reviews from Local Guides:** A high share (exact figure not reported; it can be computed from `isLocalGuide`)
- **Average reviews per location:** About 777 (282,761 reviews / 364 locations)
- **Reviews per reviewer:** 1.26 (median: 1)
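Coverage metrics such as the share of reviews with text or with an owner response amount to counting non-empty fields. A minimal sketch on toy records follows; it assumes missing values appear as `None` or empty strings, and `share` is an illustrative helper name.

```python
def share(records, field):
    """Fraction of records whose `field` holds a non-empty value."""
    if not records:
        return 0.0
    return sum(1 for r in records if r.get(field)) / len(records)


# Toy records, not taken from the dataset.
records = [
    {"text": "الطعام لذيذ", "responseFromOwnerText": "شكراً لك"},
    {"text": None, "responseFromOwnerText": None},
    {"text": "الخدمة بطيئة", "responseFromOwnerText": None},
    {"text": "", "responseFromOwnerText": None},
]

text_share = share(records, "text")
response_share = share(records, "responseFromOwnerText")
print(f"{text_share:.1%} with text, {response_share:.1%} with owner response")
```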
### Aspect Ratings (Average)
- **Food:** 4.3/5.0
- **Service:** 4.2/5.0
- **Atmosphere:** 4.1/5.0
## Use Cases
This dataset is suitable for:
1. **Sentiment Analysis:** Arabic sentiment classification on restaurant reviews
2. **Aspect-Based Sentiment Analysis:** Understanding sentiment about specific aspects (food, service, atmosphere)
3. **Rating Prediction:** Predicting overall ratings from review text
4. **Arabic NLP:** Training and evaluating Arabic language models
5. **Business Intelligence:** Understanding customer preferences and complaints
6. **Multilingual Analysis:** Comparing Arabic reviews with English owner responses
7. **Time Series Analysis:** Tracking rating trends and review patterns over time
8. **Customer Service Analysis:** Studying owner response patterns and effectiveness
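For the sentiment-classification use case, one common starting point is to derive coarse labels from the `stars` field. The mapping below (1-2 negative, 3 neutral, 4-5 positive) is an assumption chosen for illustration, not a labeling scheme defined by the dataset, and the records are toy examples.

```python
def stars_to_label(stars):
    """Map a 1-5 star rating to a coarse sentiment label (assumed thresholds)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Toy records, not taken from the dataset.
reviews = [
    {"text": "الطعام لذيذ", "stars": 5},
    {"text": None, "stars": 1},  # rating-only review, no text to classify
    {"text": "الخدمة بطيئة", "stars": 2},
    {"text": "عادي", "stars": 3},
]

# Keep only reviews that have text, pairing each with its derived label.
labeled = [(r["text"], stars_to_label(r["stars"])) for r in reviews if r["text"]]
print(labeled)
```

Star-derived labels are noisy, since some reviewers write critical text under a high rating, so a manually checked sample is advisable before training on them.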
## Data Collection
- **Source:** Google Maps Reviews
- **Collection Method:** Apify Google Maps Scraper API
- **Collection Date:** Sep 2025
- **Geographic Scope:** Saudi Arabia only
## Privacy & Ethics
This dataset has been carefully processed to protect user privacy:
### Removed Information
- **Reviewer IDs:** Google user IDs have been removed
- **Reviewer Names:** Personal names have been removed
- **Profile URLs:** Links to reviewer profiles have been removed
- **Profile Photos:** URLs to reviewer photos have been removed
- **Redundant Fields:** Human-readable date field (publishAt) removed
### Retained Information
- **Review Content:** The actual review text and ratings remain intact for analysis
- **Aggregated Metrics:** Number of reviews per reviewer (for analyzing reviewer activity patterns)
- **Local Guide Status:** Whether reviewer is a Local Guide (for quality analysis)
- **Place Information:** Restaurant locations and identifiers remain for location-based analysis
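The cleaning described above amounts to dropping a fixed set of fields from each raw record. The sketch below illustrates the idea; apart from `publishAt`, the removed field names are hypothetical, since the card lists the kinds of information removed but not the original column names, and this is not the author's actual cleaning script.

```python
# Hypothetical source field names; only `publishAt` is named in the card.
PII_FIELDS = {"name", "reviewerId", "reviewerUrl", "reviewerPhotoUrl"}
REDUNDANT_FIELDS = {"publishAt"}


def strip_private_fields(record):
    """Return a copy of `record` without PII or redundant fields."""
    dropped = PII_FIELDS | REDUNDANT_FIELDS
    return {key: value for key, value in record.items() if key not in dropped}


# Toy record with made-up values.
raw = {
    "name": "Example Reviewer",
    "reviewerId": "0000",
    "reviewerUrl": "https://example.com/profile",
    "reviewerPhotoUrl": "https://example.com/photo.jpg",
    "publishAt": "2 months ago",
    "stars": 4,
    "isLocalGuide": True,
    "reviewerNumberOfReviews": 12,
}

clean = strip_private_fields(raw)
print(sorted(clean))
```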
## Limitations
- Reviews are user-generated and may contain biases
- Some reviews lack text content (rating only)
- Text is primarily in Arabic with some mixed content
- Owner responses may follow templates
- Data represents a snapshot in time (not continuously updated)
## License
This dataset is released under the **Apache-2.0** license.
## Citation
If you use this dataset in your research, please cite:
```bibtex
@dataset{kudu_reviews_2025,
title={Kudu Restaurant Reviews Dataset},
author={Musaed Albedhani},
year={2025},
publisher={Hugging Face},
url={https://huggingface.co/datasets/Musaed1/kudu-google-maps-reviews}
}
```
## Acknowledgments
- Data sourced from Google Maps public reviews
- Scraped using Apify platform
- Restaurant chain: Kudu (Saudi Arabia)
## Contact
For questions, issues, or collaboration requests, please reach out at:
**[owdm.ai@gmail.com](mailto:owdm.ai@gmail.com)**
Provider:
Musaed1



