Data and Codes for "Analyzing Price Efficiency Using Machine Learning Generated Price Indices: the Case of the Chilean Used Car Market"
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/jxby8pkww5
Description:
This dataset contains all the data and code needed to replicate the main findings of the article, which examines how new car import prices affect the valuation of used vehicles in Chile's secondary car market. The study uses event study and difference-in-differences (DiD) methodologies to evaluate price efficiency.
The replication package includes:
- Used car price index dataset: filtered and pre-processed to include key vehicle attributes (model, year, mileage, transmission, fuel type, seller type, region, etc.).
- New car import dataset: built from official customs records covering shipment dates, CIF prices, and units imported per model and version.
- R and Python scripts for estimation and analysis:
  - "Unit Root Test.R": tests for stationarity using the Im, Pesaran, and Shin (IPS) panel unit root test.
  - "Event Study CARS.py": performs event study estimation using cumulative average abnormal returns (CAARs), including subsample analyses by vintage and vehicle segment.
  - "DiD Event Studies.R": estimates difference-in-differences (DiD) regressions with staggered treatment timing and fixed effects, and calculates CAARs from fitted values.
Each code script includes comments and references to the corresponding tables and figures in the paper.
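For readers unfamiliar with the CAAR statistic named above, the following is a minimal sketch of how it can be computed from price index returns. It is not the code in "Event Study CARS.py". It assumes a constant-mean-return benchmark (expected return is the average return over a pre-event estimation window), and the function names, window lengths, and toy data are all hypothetical choices made for illustration.

```python
import math
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def log_returns(index_levels: Sequence[float]) -> List[float]:
    """Log returns r_t = ln(P_t / P_{t-1}); the result is one element shorter."""
    return [math.log(b / a) for a, b in zip(index_levels, index_levels[1:])]


def caar(
    returns_by_series: Dict[str, Sequence[float]],
    events: Sequence[Tuple[str, int]],
    est_len: int = 12,
    pre: int = 2,
    post: int = 2,
) -> List[Tuple[int, float, float]]:
    """Return (offset, AAR, CAAR) for each offset in [-pre, +post].

    Assumption: the benchmark is the constant-mean-return model, i.e. the
    expected return of a series is its mean return over the `est_len`
    periods that end just before the event window starts.
    """
    offsets = range(-pre, post + 1)
    abnormal = {k: [] for k in offsets}
    for series_id, t0 in events:
        r = returns_by_series[series_id]
        est_start = t0 - pre - est_len
        # Skip events whose estimation or event window falls outside the data.
        if est_start < 0 or t0 + post >= len(r):
            continue
        expected = mean(r[est_start : t0 - pre])
        for k in offsets:
            abnormal[k].append(r[t0 + k] - expected)

    result = []
    running = 0.0
    for k in offsets:
        if not abnormal[k]:
            raise ValueError("no event has a complete window")
        aar = mean(abnormal[k])  # average abnormal return across events
        running += aar           # cumulate over the event window
        result.append((k, aar, running))
    return result


if __name__ == "__main__":
    # Toy data: two hypothetical index series with a flat 1% return,
    # each with one jump at its own event date.
    series_a = [0.01] * 20
    series_a[15] = 0.05
    series_b = [0.01] * 20
    series_b[16] = 0.03
    table = caar(
        {"A": series_a, "B": series_b},
        [("A", 15), ("B", 16)],
        est_len=10,
    )
    for offset, aar, cum in table:
        print(f"t={offset:+d}  AAR={aar:.4f}  CAAR={cum:.4f}")
```

Subsample analyses such as those by vintage or vehicle segment would amount to calling the same function on a filtered list of events.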
Key findings that can be replicated using these files include:
- Evidence of prompt and statistically significant price responses in the used car market following increases in new import prices.
- Stronger responses among newer and high-end used cars.
- These responses occur before the public release of import data, suggesting high informational efficiency.
- The results are robust across different methodological approaches and sample partitions.
This replication package allows for the independent verification of results. All datasets are anonymized and formatted for reproducibility. The code is compatible with R 4.2+ and Python 3.8+ environments.
Created:
2025-07-15



