SAMPLE - Luxury Pens Price Analysis
Source: Databricks. Indexed: 2026-01-16
Download link:
https://marketplace.databricks.com/details/c607f560-a1f9-4b28-b60f-67e348a3b980/AIDC-Inc-_SAMPLE---Luxury-Pens-Price-Analysis
Resource description:
**Overview**
A clean, analysis-ready dataset of 3,000 anonymized high-end writing instrument listings covering major luxury brands, item conditions, global seller locations, and declared shipping destinations. The dataset is formatted for immediate use in BI analysis, pricing models, catalog enrichment, and NLP-driven rarity discovery.
Built for analysts and data scientists, the collection supports price benchmarking, assortment optimization, anomaly detection, and trend/seasonality modeling. Titles and listing metadata are preserved for NLP feature extraction, while PII has been removed so the dataset is ready for responsible analytics at scale.
**Provenance**
Compiled by Maths with Kanchana LLC on Aug 17, 2025. The dataset originates from publicly accessible e-commerce marketplace listings collected and aggregated by the compiler. Data collection followed ethical sourcing practices: only publicly visible listing data were captured, direct personal identifiers were removed or anonymized, and standard de-duplication and normalization were applied. The sample release is available under the CC BY 4.0 International license; full dataset purchase is available under the AIDC commercial purchase license.
**Use Cases**
This dataset is valuable for a variety of research and analytical applications, including:
- **Price Benchmarking & Margin Modeling**: Use Spark/SQL and Delta Lake to build brand- and condition-level pricing models to inform margin planning.
- **Assortment Optimization**: Optimize premium inventory mixes with clustering and demand-signal analytics for catalog and merchandising teams.
- **NLP Rarity & Edition Discovery**: Train NLP models to detect limited editions and rarity cues from listing titles and descriptions for valuation and catalog enrichment.
- **Geo-Logistics Strategy**: Analyze itemLocation and shipsTo coverage to guide regional shipping, warehousing, and market-entry decisions.
- **Anomaly Detection & Data Quality**: Deploy streaming or batch pipelines to flag mispriced or inconsistent listings using threshold and model-based detectors.
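As a rough illustration of the price benchmarking and threshold-based anomaly detection use cases, the sketch below groups listings by `brand` and `condition`, takes the median `price` per group, and flags listings that sit far from it. It uses plain Python rather than Spark/SQL to stay self-contained. The sample rows, the function names `benchmark_prices` and `flag_outliers`, and the `ratio=3.0` threshold are all assumptions made for illustration; none of them come from the dataset or its publisher.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows for illustration only; not taken from the dataset.
listings = [
    {"brand": "BrandA", "condition": "New", "price": 500.0},
    {"brand": "BrandA", "condition": "New", "price": 520.0},
    {"brand": "BrandA", "condition": "New", "price": 480.0},
    {"brand": "BrandA", "condition": "New", "price": 45.0},
    {"brand": "BrandB", "condition": "Used", "price": 200.0},
    {"brand": "BrandB", "condition": "Used", "price": 210.0},
]


def benchmark_prices(rows):
    """Return the median price for each (brand, condition) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["brand"], row["condition"])].append(row["price"])
    return {key: median(prices) for key, prices in groups.items()}


def flag_outliers(rows, benchmarks, ratio=3.0):
    """Flag rows priced more than `ratio` times above or below their group median.

    The ratio is an arbitrary illustrative threshold, not a recommended value.
    """
    flagged = []
    for row in rows:
        ref = benchmarks[(row["brand"], row["condition"])]
        if ref > 0 and (row["price"] > ref * ratio or row["price"] < ref / ratio):
            flagged.append(row)
    return flagged


benchmarks = benchmark_prices(listings)
outliers = flag_outliers(listings, benchmarks)
```

The median is used instead of the mean so that a single mispriced listing does not drag the benchmark toward itself. A model-based detector would replace `flag_outliers` while keeping the same grouping step.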
**Column Dictionary**
- **brand**: string — Declared manufacturer / marque of the pen.
- **type**: string — Instrument subtype (fountain, rollerball, ballpoint) when provided; sparse in some rows.
- **categoryinfo**: string — Primary marketplace taxonomy path/label for the item.
- **categoryinfo_additional**: string — Optional secondary taxonomy tags when present.
- **title**: string — Seller’s listing title with model/series and edition cues (good for NLP features).
- **price**: decimal — Item price captured at collection time, in USD.
- **lastUpdated**: timestamp — Timestamp of when this listing was last observed (May 04, 2017 – Aug 16, 2025).
- **seller**: string — Anonymized seller identifier (no direct personal data).
- **condition**: string — Seller-stated state (e.g., New, Used, Unspecified).
- **itemLocation**: string — Seller-reported city / region / country of the item.
- **shipsTo**: string — Destinations/regions the seller ships to (multi-value strings; consider normalizing to arrays or flags).
- **excludesShipping**: string — Countries/regions explicitly excluded from shipping (multi-value strings).
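The column notes suggest normalizing the multi-value `shipsTo` and `excludesShipping` strings to arrays or flags. A minimal sketch of that step follows. The listing does not state which delimiter the strings use, so the comma separator here is an assumption to verify against the actual data; the helper names `split_multi_value` and `to_flags` and the sample row are likewise hypothetical.

```python
def split_multi_value(value, sep=","):
    """Split a multi-value string into a list of trimmed, non-empty entries.

    The default separator is an assumption; check the real delimiter first.
    """
    if not value:
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]


def to_flags(values, regions):
    """Map each region of interest to True/False depending on membership."""
    present = set(values)
    return {region: region in present for region in regions}


# Hypothetical row for illustration only; not taken from the dataset.
row = {"shipsTo": "United States, Canada , Japan", "excludesShipping": ""}

ships_to = split_multi_value(row["shipsTo"])
excluded = split_multi_value(row["excludesShipping"])
flags = to_flags(ships_to, ["Canada", "Germany"])
```

Arrays suit exploratory queries such as counting destinations per listing, while boolean flags are easier to feed into pricing or logistics models as features.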
Provider:
AIDC, Inc.



