crawlfeeds/IKEA-Home-Decor-Furniture-Dataset
Hugging Face; updated 2026-04-09; indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/crawlfeeds/IKEA-Home-Decor-Furniture-Dataset
Resource description:
---
language:
- en
license: cc-by-nc-4.0
task_categories:
- text-classification
- text-generation
- feature-extraction
task_ids:
- multi-class-classification
pretty_name: IKEA Home Decor & Furniture Product Dataset
size_categories:
- n<1K
tags:
- ikea
- home-decor
- furniture
- interior-design
- ecommerce
- product-data
- llm-training
- ai-training
- multimodal
- recommendation-systems
- computer-vision
- fine-tuning
- nlp
configs:
- config_name: default
  data_files:
  - split: train
    path: crawlfeeds_ikea__limit-100000_category_1-home-decor_20260409_190938.csv
---
# IKEA Home Decor & Furniture Product Dataset
A rich, structured dataset of IKEA home decor and furniture products, featuring deep category taxonomies, full product descriptions, measurements, features, and image URLs. Ideal for training product recommendation models, interior design AI applications, multimodal models, and e-commerce search systems.
---
## Dataset Overview
| Field | Details |
|-------|---------|
| **Source** | IKEA (multi-country) |
| **Total Records** | 400+ |
| **Category Focus** | Home Decor, Furniture, Smart Home |
| **Language** | English |
| **Formats** | CSV / JSON |
| **Data Quality** | Validated and structured |
| **Provider** | [Crawl Feeds](https://crawlfeeds.com) |
---
## What Makes This Dataset Valuable
- **4-level category hierarchy**: category_1 through category_4 provide a granular product taxonomy, uncommon in open datasets, that supports fine-grained product classification models
- **Rich text fields**: description, summary, and features together provide dense product text suitable for LLM fine-tuning and semantic search training
- **Image URLs included**: the primary_image and additional_images fields make the dataset usable for multimodal, computer vision, and visual recommendation tasks
- **Structured measurements**: dimensional data for furniture and home products, useful for interior design AI and spatial planning applications
- **Multi-country coverage**: products come from IKEA stores in multiple countries, identified by the country field, which enables cross-market analysis
- **Package & variation data**: the packages, sub_products, and variations fields support complex product configuration modeling
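
The four category fields can be combined into a single label per product, which is a convenient target for hierarchical classification. The following is a minimal sketch, assuming each record is a plain dict and that a missing level arrives as None, a float NaN, or a blank string. `category_path` and the ` > ` separator are illustrative choices, not part of the dataset.

```python
from typing import Mapping

CATEGORY_FIELDS = ("category_1", "category_2", "category_3", "category_4")


def _is_missing(value: object) -> bool:
    # None, float NaN (NaN != NaN), and blank strings all count as missing.
    return value is None or value != value or not str(value).strip()


def category_path(record: Mapping[str, object], sep: str = " > ") -> str:
    """Join the category levels of one record into a single path string."""
    levels = []
    for field in CATEGORY_FIELDS:
        value = record.get(field)
        # Stop at the first missing level so the path never skips a level.
        if _is_missing(value):
            break
        levels.append(str(value).strip())
    return sep.join(levels)


# Category values taken from the sample record in this card; category_4 is absent.
example = {
    "category_1": "Storage & organisation",
    "category_2": "Bookcases & shelving units",
    "category_3": "Bookcases",
    "category_4": None,
}
print(category_path(example))
# prints: Storage & organisation > Bookcases & shelving units > Bookcases
```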
---
## Data Fields
| Field | Type | Coverage | Description |
|-------|------|----------|-------------|
| product_url | String | 100% | Direct product page URL |
| country | String | 100% | Country of the IKEA store |
| product_name | String | 100% | Full product name/title |
| product_id | String | 100% | IKEA internal product identifier |
| product_type | String | 100% | Product type classification |
| currency | String | 100% | Currency code |
| summary | String | 97.5% | Short product summary |
| item_number | String | 100% | IKEA item/article number |
| description | String | 100% | Full product description |
| price | Float | — | Price (available in full dataset) |
| average_rating | Float | — | Average rating (full dataset) |
| reviews_count | Int | — | Review count (full dataset) |
| ratings_breakdown | String | 79.7% | Rating distribution |
| category_1 | String | 100% | Top-level category |
| category_2 | String | 100% | Second-level category |
| category_3 | String | 100% | Third-level category |
| category_4 | String | 95.6% | Fourth-level category |
| breadcrumbs | String | 100% | Full navigation breadcrumb path |
| warranty_text | String | 51.9% | Warranty summary |
| warranty_description | String | 51.9% | Detailed warranty information |
| raw_product_details | String | 100% | Raw structured product attributes |
| additional_images | String | 100% | Additional product image URLs |
| primary_image | String | 99.7% | Main product image URL |
| packages | String | 100% | Package dimensions and weights |
| sub_products | String | 59% | Component/sub-product details |
| product_details | String | 98.2% | Structured product detail attributes |
| variations | String | — | Product variants (color, size) |
| measurements | String | 95.5% | Product dimensions |
| features | String | 97.3% | Key product features list |
| components | String | — | Included components |
| material_and_care | String | 99.6% | Materials used and care instructions |
| uniq_id | String | 100% | Unique record identifier |
| scraped_at | Date | 100% | Data collection timestamp |
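
The Coverage column can be recomputed on any download of the CSV. The sketch below assumes rows are dicts (for example from `csv.DictReader`) and that a missing value is None or a blank string, which may not match how the file actually encodes gaps. `field_coverage` is a hypothetical helper, and the rows are made up purely to show the calculation.

```python
from typing import Dict, Iterable, Mapping


def is_filled(value: object) -> bool:
    """Treat None and blank strings as missing; everything else counts as filled."""
    return value is not None and str(value).strip() != ""


def field_coverage(rows: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Percentage of rows (one decimal place) with a filled value, per field."""
    rows = list(rows)
    if not rows:
        return {}
    # Collect field names in first-seen order so the report is stable.
    fields = list(dict.fromkeys(name for row in rows for name in row))
    return {
        name: round(100.0 * sum(is_filled(row.get(name)) for row in rows) / len(rows), 1)
        for name in fields
    }


# Made-up rows, not real dataset records.
sample_rows = [
    {"product_name": "Item A", "summary": "Short text", "warranty_text": "10 years"},
    {"product_name": "Item B", "summary": "Short text", "warranty_text": ""},
    {"product_name": "Item C", "summary": None, "warranty_text": "10 years"},
    {"product_name": "Item D", "summary": "Short text", "warranty_text": ""},
]
print(field_coverage(sample_rows))
# prints: {'product_name': 100.0, 'summary': 75.0, 'warranty_text': 50.0}
```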
---
## Use Cases
### Natural Language Processing
- **Product description generation**: fine-tune LLMs to generate product copy in IKEA's style
- **Semantic product search**: train dense retrieval models on the product text fields
- **Zero-shot product classification**: use the 4-level category labels as a classification benchmark
- **Cross-market product NLP**: compare English product text across countries; the dataset language is English only, so cross-lingual work would need additional data
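
For description generation, each record can be turned into a prompt/completion pair before fine-tuning. This is one possible layout rather than a required format: the prompt wording, the `to_training_example` helper, and the choice of context fields are assumptions, and values are assumed to be strings or None.

```python
import json
from typing import Mapping, Optional

# (label shown in the prompt, dataset field) pairs used as optional context.
CONTEXT_FIELDS = (("Type", "product_type"), ("Summary", "summary"), ("Features", "features"))


def to_training_example(record: Mapping[str, Optional[str]]) -> Optional[dict]:
    """Build a prompt/completion pair, or return None if name or description is empty."""
    name = (record.get("product_name") or "").strip()
    description = (record.get("description") or "").strip()
    if not name or not description:
        return None
    lines = [f"Product: {name}"]
    for label, field in CONTEXT_FIELDS:
        value = (record.get(field) or "").strip()
        if value:  # include a context line only when the field has content
            lines.append(f"{label}: {value}")
    lines.append("Write a product description.")
    return {"prompt": "\n".join(lines), "completion": description}


# Values copied from the sample record in this card.
record = {
    "product_name": "BILLY Bookcase",
    "product_type": "Bookcase",
    "summary": "Adjustable shelves can be arranged according to your needs.",
    "description": "You can customise your storage as needed with the adjustable shelves...",
}
print(json.dumps(to_training_example(record), indent=2))
```

Writing one `json.dumps` result per line gives a JSONL file, which many fine-tuning tools accept, though the exact key names they expect vary.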
### Recommendation Systems
- **Collaborative filtering**: use the category taxonomy and product attributes as item side features; the dataset has no user interaction fields, so interactions must come from another source
- **Content-based filtering**: descriptions and features enable similarity-based recommendations
- **Bundle recommendations**: the sub_products and components fields reveal natural product pairings
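
Content-based filtering can be illustrated with a plain bag-of-words cosine similarity over product text. This is a deliberately simple stand-in using only the standard library; a real system would more likely use TF-IDF weights or learned embeddings. The product IDs and texts below are invented for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_id: str, texts: Dict[str, str], top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank all other products by text similarity to the query product."""
    vectors = {pid: Counter(tokenize(text)) for pid, text in texts.items()}
    query = vectors[query_id]
    scored = [(pid, cosine(query, vec)) for pid, vec in vectors.items() if pid != query_id]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Invented product texts, keyed by hypothetical IDs.
texts = {
    "a": "white bookcase with adjustable shelves",
    "b": "black bookcase with adjustable shelves",
    "c": "table lamp with LED bulb",
}
for pid, score in most_similar("a", texts):
    print(pid, round(score, 2))
```

In practice the text per product would be a concatenation of fields such as description, summary, and features.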
### Computer Vision & Multimodal AI
- **Product image classification** — train visual models using image URLs and category labels
- **Visual search systems** — pair product images with text descriptions for multimodal retrieval
- **Interior design AI** — combine measurements, images, and categories for room planning models
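
Room planning needs the measurements as numbers rather than text. The sketch below assumes the `Label: number unit` layout separated by commas, as in the sample record ("Width: 80 cm, Depth: 28 cm, Height: 202 cm"); other records may be formatted differently, and entries that do not fit the pattern are skipped. `parse_measurements` and `footprint_m2` are hypothetical helpers.

```python
import re
from typing import Dict, Optional, Tuple

# One "Label: number unit" entry, e.g. "Width: 80 cm". The unit is optional.
_ENTRY = re.compile(r"\s*(.+?)\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*([^\s\d]+)?\s*")


def parse_measurements(text: Optional[str]) -> Dict[str, Tuple[float, str]]:
    """Map each lowercased label to a (value, unit) pair."""
    parsed = {}
    for part in (text or "").split(","):
        match = _ENTRY.fullmatch(part)
        if match:  # skip entries that do not fit the assumed pattern
            label, number, unit = match.groups()
            parsed[label.lower()] = (float(number), unit or "")
    return parsed


def footprint_m2(measurements: Dict[str, Tuple[float, str]]) -> Optional[float]:
    """Floor area in square metres, or None unless width and depth are both in cm."""
    width = measurements.get("width")
    depth = measurements.get("depth")
    if not width or not depth or width[1] != "cm" or depth[1] != "cm":
        return None
    return width[0] * depth[0] / 10_000  # cm^2 to m^2


dims = parse_measurements("Width: 80 cm, Depth: 28 cm, Height: 202 cm")
print(dims)
# prints: {'width': (80.0, 'cm'), 'depth': (28.0, 'cm'), 'height': (202.0, 'cm')}
print(footprint_m2(dims))
# prints: 0.224
```

Returning None for unexpected units keeps the helper from silently mixing centimetres with inches if some countries report imperial sizes.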
### E-commerce & Retail Intelligence
- **Competitor price monitoring**: track IKEA pricing across countries (the price field is populated only in the full dataset)
- **Product taxonomy mapping**: map competitor catalogs to IKEA's category structure
- **Assortment analysis**: understand product range depth across home decor categories
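
Assortment depth can be approximated by counting products per category within each country. A minimal sketch, assuming rows are dicts with string or None values; `assortment_counts` is a hypothetical helper, and the rows are invented except for the category name taken from the sample record.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def assortment_counts(
    rows: Iterable[Mapping[str, Optional[str]]],
    level: str = "category_2",
    group_by: str = "country",
) -> Counter:
    """Count products per (group, category) pair, e.g. per (country, category_2)."""
    counts: Counter = Counter()
    for row in rows:
        # Fall back to placeholder labels so rows with gaps are still counted.
        group = (row.get(group_by) or "").strip() or "unknown"
        category = (row.get(level) or "").strip() or "uncategorised"
        counts[(group, category)] += 1
    return counts


# Invented rows for illustration.
rows = [
    {"country": "US", "category_2": "Bookcases & shelving units"},
    {"country": "US", "category_2": "Bookcases & shelving units"},
    {"country": "US", "category_2": "Lighting"},
    {"country": "GB", "category_2": "Bookcases & shelving units"},
    {"country": "GB", "category_2": None},
]
for (country, category), n in assortment_counts(rows).most_common():
    print(country, category, n)
```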
---
## Sample Data
```json
{
  "product_name": "BILLY Bookcase",
  "product_type": "Bookcase",
  "category_1": "Storage & organisation",
  "category_2": "Bookcases & shelving units",
  "category_3": "Bookcases",
  "summary": "Adjustable shelves can be arranged according to your needs.",
  "description": "You can customise your storage as needed with the adjustable shelves...",
  "measurements": "Width: 80 cm, Depth: 28 cm, Height: 202 cm",
  "material_and_care": "Particleboard, Fibreboard, Paper foil",
  "primary_image": "https://www.ikea.com/...",
  "country": "US",
  "currency": "USD"
}
```
---
## Loading the Dataset
```python
from datasets import load_dataset
dataset = load_dataset("crawlfeeds/IKEA-Home-Decor-Furniture-Dataset")
df = dataset["train"].to_pandas()
# Filter by category
home_decor = df[df["category_1"] == "Home décor"]
# Get all products with measurements
with_measurements = df[df["measurements"].notna()]
# Text fields for LLM training
text_data = df[["product_name", "description", "features", "summary"]].dropna()
```
---
## Full Dataset & Custom Data
This is a **sample dataset**. The full Crawl Feeds IKEA dataset contains:
- Tens of thousands of records across all IKEA product categories
- Complete price data across multiple countries
- Ratings and reviews data
- Weekly and monthly refresh options
- Custom subsets by category, country, or product type
**Get the full dataset:** [crawlfeeds.com](https://crawlfeeds.com)
**Request custom IKEA data:** [https://crawlfeeds.com/contact_us](https://crawlfeeds.com/contact_us)
---
## Related Datasets by Crawl Feeds
- [Trustpilot Reviews Dataset – 20K Sample](https://huggingface.co/datasets/crawlfeeds/Trustpilot-Reviews-Dataset-20K-Sample)
- [Walmart Reviews Dataset](https://huggingface.co/datasets/crawlfeeds/walmart-reviews-dataset)
- [Medium Articles Corpus](https://huggingface.co/datasets/crawlfeeds/Medium-Articles-Corpus)
- [Fox News Headlines Dataset](https://huggingface.co/datasets/crawlfeeds/Curated-Fox-News-Headlines-and-Full-Text)
- Home Depot Smart Home Dataset *(coming soon)*
- Airbnb Reviews Dataset *(coming soon)*
---
## License
This dataset is made available under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). It is intended for research and non-commercial use. For commercial licensing, please contact [crawlfeeds.com/contact](https://crawlfeeds.com/contact).
---
## Citation
If you use this dataset in your research or project, please cite:
```bibtex
@dataset{crawlfeeds_ikea_homedecor_2025,
  author = {Crawl Feeds},
  title = {IKEA Home Decor \& Furniture Product Dataset},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/datasets/crawlfeeds/IKEA-Home-Decor-Furniture-Dataset}
}
```
---
*Data collected and maintained by [Crawl Feeds](https://crawlfeeds.com) — structured web data for AI, analytics, and business intelligence.*
Provider:
crawlfeeds



