almador2002/us-airbnb-eda-2023
Hugging Face | updated 2026-04-10 | indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/almador2002/us-airbnb-eda-2023
<video src="https://huggingface.co/datasets/almador2002/us-airbnb-eda-2023/resolve/main/video.mp4" controls="controls" style="max-width: 720px;"></video>
# U.S. Airbnb Open Data 2023 — EDA Project
**Course:** Data Science
**Dataset Source:** [Kaggle - US Airbnb Open Data](https://www.kaggle.com/datasets/kritikseth/us-airbnb-open-data)
---
## Dataset Overview
This dataset contains over 230,000 Airbnb listings from 27 major U.S. cities in 2023.
Each listing includes information about price, room type, location, number of reviews, availability, and host details.
**Main Goal:** Explore what factors influence Airbnb listing prices across different U.S. cities.
---
## Data Cleaning Decisions
- Dropped `neighbourhood_group`: it is missing in 58% of rows, so it carries little usable information
- Filled missing `reviews_per_month` values with 0 for listings with no reviews
- Filled missing `last_review` values with "No Review" for listings with no reviews
- Removed listings with `price = 0` (invalid)
- Capped prices at the 99th percentile ($1,803) to reduce outlier impact
- Removed listings with `minimum_nights > 365`
- **Final dataset: 229,632 rows**
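
The cleaning decisions above can be sketched in pandas roughly as follows. This is a minimal sketch, not the notebook's actual code: the function name `clean_listings` is hypothetical, the steps run in the order they are listed (the source does not confirm that order), and the 99th-percentile cap is recomputed from the data rather than hard-coded as $1,803.

```python
import pandas as pd


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the listed cleaning decisions to a raw listings frame (sketch)."""
    out = df.copy()

    # Drop the mostly-empty column; errors="ignore" keeps this safe to re-run.
    out = out.drop(columns=["neighbourhood_group"], errors="ignore")

    # Listings with no reviews: 0 reviews per month, placeholder for last_review.
    out["reviews_per_month"] = out["reviews_per_month"].fillna(0)
    out["last_review"] = out["last_review"].fillna("No Review")

    # Remove invalid zero-price listings.
    out = out[out["price"] > 0].copy()

    # Cap prices at the 99th percentile of the remaining listings.
    cap = out["price"].quantile(0.99)
    out["price"] = out["price"].clip(upper=cap)

    # Remove listings that require a stay longer than one year.
    out = out[out["minimum_nights"] <= 365].copy()

    return out.reset_index(drop=True)
```

Assuming the raw file loads with `pd.read_csv("AB_US_2023.csv")`, passing the result to `clean_listings` would apply all six steps. The exact final row count depends on the step order assumed here.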
---
## Research Questions & Findings
### Q1: Which cities have the highest median listing prices?

San Francisco and Boston are the most expensive markets, with a median price above $200 per night.
Columbus and Portland are significantly more affordable.
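
A ranking like this can be produced with a single group-by. The sketch below assumes the city label lives in a column named `city`, which the text above does not confirm. The helper name `median_price_by_city` is hypothetical.

```python
import pandas as pd


def median_price_by_city(df: pd.DataFrame, city_col: str = "city") -> pd.Series:
    """Median nightly price per city, most expensive first (sketch)."""
    # city_col is an assumed column name; adjust it to match the CSV header.
    return df.groupby(city_col)["price"].median().sort_values(ascending=False)
```

The median is used rather than the mean because a few very expensive listings would otherwise distort the comparison between cities.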
---
### Q2: How does room type affect price?

Entire homes make up over 72% of all listings and are significantly more expensive than private or shared rooms.
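
The share of each room type and its typical price can be computed together. The following is a sketch under the assumption that the column is named `room_type`; the helper name `room_type_summary` is hypothetical.

```python
import pandas as pd


def room_type_summary(df: pd.DataFrame, room_col: str = "room_type") -> pd.DataFrame:
    """Listing count, share of all listings, and median price per room type."""
    # room_col is an assumed column name, not confirmed by the README text.
    summary = df.groupby(room_col)["price"].agg(
        listings="size",        # number of rows in each group
        median_price="median",  # typical nightly price in each group
    )
    summary["share_pct"] = 100 * summary["listings"] / summary["listings"].sum()
    return summary.sort_values("share_pct", ascending=False)
```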
---
### Q3: What does the price distribution look like?

Prices are right-skewed — most listings are under $300, but a long tail of premium listings pulls the mean above the median.
No single feature strongly drives price on its own.
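
Right skew can be checked numerically by comparing the mean with the median and looking at the sample skewness. The sketch below is one way to do that, not necessarily what the notebook does; `price_skew_summary` is a hypothetical helper and the $300 threshold is taken from the sentence above.

```python
import pandas as pd


def price_skew_summary(prices: pd.Series) -> dict:
    """Summary statistics that indicate right skew in a price column (sketch)."""
    return {
        "mean": float(prices.mean()),
        "median": float(prices.median()),
        # Positive skewness means a longer tail on the high-price side.
        "skewness": float(prices.skew()),
        # Fraction of listings priced under $300 per night.
        "share_under_300": float((prices < 300).mean()),
    }
```

A mean above the median together with positive skewness is the pattern described in the finding.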
---
### Q4: Do hosts with more listings charge more?

Yes. Hosts with 20+ listings consistently charge higher prices, likely due to more professional management and better-located properties.
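
One way to make this comparison is to split hosts at the 20-listing threshold and compare median prices. The sketch assumes the per-host listing count is stored in a column named `calculated_host_listings_count`, which the text above does not confirm. The helper name and the group labels are hypothetical.

```python
import pandas as pd


def median_price_by_host_size(
    df: pd.DataFrame,
    count_col: str = "calculated_host_listings_count",  # assumed column name
    threshold: int = 20,
) -> pd.Series:
    """Median price for hosts at or above the threshold vs. below it (sketch)."""
    is_large = df[count_col] >= threshold
    labels = is_large.map(
        {True: f"{threshold}+ listings", False: f"under {threshold} listings"}
    ).rename("host_size")
    return df.groupby(labels)["price"].median()
```

A difference in medians shows association only; it does not by itself confirm the suggested explanation of professional management or better locations.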
---
### Q5: Which cities have the most listings?

New York City and Los Angeles dominate in total listings. Availability varies across cities, which hints at different market dynamics.
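
Listing counts and availability per city can be summarised in one pass. This sketch assumes columns named `city` and `availability_365`; both names are assumptions, and `city_market_summary` is a hypothetical helper.

```python
import pandas as pd


def city_market_summary(
    df: pd.DataFrame,
    city_col: str = "city",                      # assumed column name
    availability_col: str = "availability_365",  # assumed column name
) -> pd.DataFrame:
    """Listing count and median availability per city, largest market first."""
    summary = df.groupby(city_col).agg(
        listings=(availability_col, "size"),
        median_availability=(availability_col, "median"),
    )
    return summary.sort_values("listings", ascending=False)
```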
---
### Q6: Is there a relationship between reviews and price?

There is no strong relationship. If anything, expensive listings tend to have fewer reviews, likely because they are booked less frequently.
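
Because prices are heavily skewed, a rank-based correlation is a reasonable way to quantify this. The sketch below computes a Spearman correlation as the Pearson correlation of the ranks; it is an illustrative choice, not necessarily the method used in the notebook. The column name `number_of_reviews` is an assumption and the helper name is hypothetical.

```python
import pandas as pd


def spearman_reviews_vs_price(
    df: pd.DataFrame,
    reviews_col: str = "number_of_reviews",  # assumed column name
) -> float:
    """Spearman rank correlation between price and review count (sketch)."""
    # Spearman's rho equals Pearson's r computed on average ranks.
    price_rank = df["price"].rank()
    reviews_rank = df[reviews_col].rank()
    return float(price_rank.corr(reviews_rank))
```

A value close to 0 would match "no strong relationship", and a mildly negative value would match the observation that pricier listings have fewer reviews.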
---
## Key Takeaways
1. **City** is the strongest indicator of price level
2. **Room type** significantly affects price — entire homes cost much more
3. **Professional hosts** (20+ listings) charge more than individual hosts
4. **Price distribution** is right-skewed — most listings are affordable
5. **Reviews** are not a reliable indicator of price
---
## Files in this Repository
| File | Description |
|------|-------------|
| `AB_US_2023.csv` | The raw dataset |
| `notebook.ipynb` | Full EDA notebook |
| `video.mp4` | Presentation video |

Provided by: almador2002



