
SAMPLE - Luxury Pens Price Analysis

Source: Databricks. Indexed 2026-01-16.
Download link:
https://marketplace.databricks.com/details/c607f560-a1f9-4b28-b60f-67e348a3b980/AIDC-Inc-_SAMPLE---Luxury-Pens-Price-Analysis
Description:
**Overview**

A clean, analysis-ready dataset of 3,000 anonymized high-end writing instrument listings covering major luxury brands, item conditions, global seller locations, and declared shipping destinations. The dataset is formatted for immediate use in BI analysis, pricing models, catalog enrichment, and NLP-driven rarity discovery. Built for analysts and data scientists, the collection supports price benchmarking, assortment optimization, anomaly detection, and trend/seasonality modeling. Titles and listing metadata are preserved for NLP feature extraction while PII has been removed so the dataset is ready for responsible analytics at scale.

**Provenance**

Compiled by Maths with Kanchana LLC on Aug 17, 2025. The dataset originates from publicly accessible e-commerce marketplace listings collected and aggregated by the compiler. Data collection followed ethical sourcing practices: only publicly visible listing data were captured, direct personal identifiers were removed or anonymized, and standard de-duplication and normalization were applied. The sample release is available under the CC BY 4.0 International license; full dataset purchase is available under the AIDC commercial purchase license.

**Use Cases**

This dataset is valuable for a variety of research and analytical applications, including:

- **Price Benchmarking & Margin Modeling**: Use Spark/SQL and Delta Lake to build brand- and condition-level pricing models to inform margin planning.
- **Assortment Optimization**: Optimize premium inventory mixes with clustering and demand-signal analytics for catalog and merchandising teams.
- **NLP Rarity & Edition Discovery**: Train NLP models to detect limited editions and rarity cues from listing titles and descriptions for valuation and catalog enrichment.
- **Geo-Logistics Strategy**: Analyze itemLocation and shipsTo coverage to guide regional shipping, warehousing, and market-entry decisions.
- **Anomaly Detection & Data Quality**: Deploy streaming or batch pipelines to flag mispriced or inconsistent listings using threshold and model-based detectors.

**Column Dictionary**

- **brand** (string): Declared manufacturer / marque of the pen.
- **type** (string): Instrument subtype (fountain, rollerball, ballpoint) when provided; sparse in some rows.
- **categoryinfo** (string): Primary marketplace taxonomy path/label for the item.
- **categoryinfo_additional** (string): Optional secondary taxonomy tags when present.
- **title** (string): Seller’s listing title with model/series and edition cues (good for NLP features).
- **price** (decimal): Captured item price at collection time. Confirmed unit: USD.
- **lastUpdated** (timestamp): Timestamp of when this listing was last observed (May 04, 2017 to Aug 16, 2025).
- **seller** (string): Anonymized seller identifier (no direct personal data).
- **condition** (string): Seller-stated state (e.g., New, Used, Unspecified).
- **itemLocation** (string): Seller-reported city / region / country of the item.
- **shipsTo** (string): Destinations/regions the seller ships to (multi-value strings; consider normalizing to arrays or flags).
- **excludesShipping** (string): Countries/regions explicitly excluded from shipping (multi-value strings).
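
As a rough illustration of how the columns above might be used, the following minimal sketch normalizes a multi-value `shipsTo` string into a list, computes a median price per brand and condition, and flags listings priced far from that median. It uses plain Python rather than Spark/SQL to stay self-contained. The sample rows, the comma delimiter for `shipsTo`, the helper names, and the 3x threshold are all assumptions made for illustration; none of them come from the dataset or its documentation.

```python
from collections import defaultdict
from decimal import Decimal
from statistics import median

# Hypothetical rows using column names from the dictionary above.
# The values are invented for illustration, not taken from the dataset.
rows = [
    {"brand": "BrandA", "condition": "New", "price": Decimal("500.00"),
     "shipsTo": "United States, Canada"},
    {"brand": "BrandA", "condition": "New", "price": Decimal("520.00"),
     "shipsTo": "Worldwide"},
    {"brand": "BrandA", "condition": "New", "price": Decimal("5000.00"),
     "shipsTo": "United States"},
    {"brand": "BrandA", "condition": "Used", "price": Decimal("300.00"),
     "shipsTo": ""},
    {"brand": "BrandB", "condition": "Used", "price": Decimal("120.00"),
     "shipsTo": "Japan, Germany , France"},
]


def split_multi_value(value, sep=","):
    """Split a multi-value string into trimmed, non-empty entries.

    The separator is an assumption; check the real data before relying on it.
    """
    return [part.strip() for part in value.split(sep) if part.strip()]


def median_price_by_group(rows):
    """Return the median price keyed by (brand, condition)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["brand"], row["condition"])].append(row["price"])
    return {key: median(prices) for key, prices in groups.items()}


def flag_outliers(rows, benchmarks, ratio=Decimal("3")):
    """Flag rows priced above ratio x the group median or below median / ratio."""
    flagged = []
    for row in rows:
        bench = benchmarks[(row["brand"], row["condition"])]
        if row["price"] > bench * ratio or row["price"] * ratio < bench:
            flagged.append(row)
    return flagged


benchmarks = median_price_by_group(rows)
flagged = flag_outliers(rows, benchmarks)
destinations = [split_multi_value(row["shipsTo"]) for row in rows]
```

A fixed ratio against the group median is only one simple threshold detector. On the real data, groups with very few listings would likely need a minimum-count guard or a model-based detector instead.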
Provider:
AIDC, Inc.