BentoUniAcc/Global_Economic_Indicators
Source: Hugging Face · Updated: 2026-04-10 · Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/BentoUniAcc/Global_Economic_Indicators
Resource description:
---
language:
- en
---
# Global Economic Indicators Analysis (2010–2023)
**Author:** Ben Topaz
**Data Source:** Kaggle - [World Bank Global Economic Indicators](https://www.kaggle.com/datasets/tanishksharma9905/global-economic-indicators-20102025)
Note: The Python notebook is included in the repository files.
---
## Explanation Video
YouTube link: <https://youtu.be/m7d4ZOhI5aU>
<video src="https://cdn-uploads.huggingface.co/production/uploads/69d8c774af594a45bf54cc48/qnKZVInENRJyxlQD1V5TG.mp4" controls="controls" style="max-width: 720px;"></video>
---
## Project Overview
This project explores global economic trends over a 14-year period. By cleaning and analyzing data from 217 countries and territories, the study tests established economic theories regarding the relationships between inflation, unemployment, and GDP growth.
---
## Technical Decisions & Data Cleaning
To ensure a reliable correlation analysis, the following data cleaning steps were implemented:
* **Handling Missing Data:**
    * **Column Removal:** Dropped indicators with more than 45% missing data (e.g., Public Debt, Tax Revenue, Government Revenue/Expense, Real Interest Rates).
    * **Row Removal:** Removed the years 2024 and 2025 entirely. Nearly 100% of the key metrics were missing for these years, so they could not support any meaningful analysis.
* **Country Name Spelling and Formatting:**
    * Corrected country names for compatibility with Plotly mapping tools (e.g., removing punctuation from names, changing "Russian Federation" to "Russia" and "Viet Nam" to "Vietnam").
* **Outlier Management:**
    * Identified and removed 79 countries/territories that acted as extreme outliers (e.g., Venezuela, Zimbabwe, and Sudan due to hyperinflation; Libya due to war-driven growth spikes). This was necessary so that global trends would not be skewed by localized instability.
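
The missing-data and naming steps above could be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`country_name`, `year`, `GDP Growth`, `Public Debt`), the toy values, and the helper `clean_indicators` are assumptions made for the example. Only the 45% threshold, the dropped years, and the two renames come from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical column names and toy values; the real dataset's schema may differ.
raw = pd.DataFrame({
    "country_name": ["Russian Federation", "Russian Federation",
                     "Viet Nam", "Viet Nam", "Viet Nam", "Viet Nam"],
    "year": [2022, 2023, 2022, 2023, 2024, 2025],
    "GDP Growth": [1.0, 2.0, 3.0, 4.0, np.nan, np.nan],
    "Public Debt": [np.nan, np.nan, np.nan, np.nan, np.nan, 40.0],
})

RENAMES = {"Russian Federation": "Russia", "Viet Nam": "Vietnam"}


def clean_indicators(df, max_missing=0.45, drop_years=(2024, 2025)):
    """Drop sparse columns and unreliable years, then normalise country names."""
    # Share of NaN values per column, measured on the full table.
    missing_share = df.isna().mean()
    keep_cols = missing_share[missing_share <= max_missing].index
    out = df.loc[:, keep_cols]
    # Remove the years in which almost every key metric is missing.
    out = out[~out["year"].isin(drop_years)].copy()
    # Map names to the spellings expected by the mapping tool.
    out["country_name"] = out["country_name"].replace(RENAMES)
    return out.reset_index(drop=True)


cleaned = clean_indicators(raw)
print(cleaned)
```

In this toy frame `Public Debt` is dropped because five of its six values are missing, while `GDP Growth` survives because only a third of its values are missing.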
---
## Questions
The analysis was driven by three primary questions:
1. **Growth vs. Inflation:** Is there a correlation between inflation (CPI) and annual GDP growth?
2. **Growth vs. Unemployment:** Is there a relationship between unemployment and economic growth?
3. **Inflation vs. Unemployment:** Does the data support the economic theory of an inverse relationship between inflation and unemployment?
The main purpose of the analysis was to test whether a change in one of these factors is associated with a change in the other, as classical economic theory suggests. The correlation coefficient indicates how closely the data follow the theory.
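
The text does not name the coefficient used, so the sketch below assumes the Pearson correlation coefficient, which is the default in most data-analysis libraries. It shows how the value is computed from two series. The function name `pearson_r` and the sample values are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centred products and centred squares (the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only, not taken from the dataset.
inflation = [1.0, 2.0, 3.0, 4.0]
unemployment = [8.0, 6.0, 4.0, 2.0]
print(pearson_r(inflation, unemployment))
```

The result always lies between -1 and 1: values near zero mean the two series barely move together, and the sign gives the direction of the relationship.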
---
## Visualizations & Analysis
### Missing Values Grouped by Year
A histogram of null (NaN) counts per year was used to check whether certain years were unreliable. Its findings led to the removal of 2024-2025 from the dataframe.
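
The counts behind such a chart could be computed along these lines. This is a sketch under assumptions: the column names and toy values are invented for illustration, and the notebook may compute the counts differently.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical column names; values are illustrative only.
df = pd.DataFrame({
    "year": [2023, 2023, 2024, 2024, 2025, 2025],
    "GDP Growth": [2.0, 1.5, np.nan, np.nan, np.nan, np.nan],
    "Unemployment": [5.0, np.nan, np.nan, 6.0, np.nan, np.nan],
})

indicator_cols = ["GDP Growth", "Unemployment"]

# Count NaN cells per row, then total them within each year.
nulls_per_year = df[indicator_cols].isna().sum(axis=1).groupby(df["year"]).sum()

# Express the count as a share of all indicator cells recorded for that year.
cells_per_year = df.groupby("year").size() * len(indicator_cols)
missing_share = nulls_per_year / cells_per_year
print(missing_share)
```

Plotting `nulls_per_year` as bars gives one bar per year, which makes years with near-total gaps easy to spot.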

### Global Heatmap of Missing Values
A global heat map was used to show the number of missing values per country and to identify regions completely absent from the dataframe.
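
The values feeding such a map, and the list of absent regions, might be derived as in the sketch below. The column names, the toy values, and the reference list `expected` are all hypothetical; the notebook's own approach is not described in the text.

```python
import numpy as np
import pandas as pd

# Toy frame; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "country_name": ["Chile", "Chile", "Kenya", "Kenya"],
    "GDP Growth": [2.0, np.nan, np.nan, np.nan],
    "Unemployment": [8.0, 8.5, 5.0, np.nan],
})

# Hypothetical reference list of the countries the map is expected to cover.
expected = ["Chile", "Kenya", "Nauru"]

# Total NaN cells per country across the indicator columns.
missing_per_country = (
    df.drop(columns="country_name").isna().sum(axis=1).groupby(df["country_name"]).sum()
)

# Reindexing against the reference list leaves NaN for countries with no rows at all.
coverage = missing_per_country.reindex(expected)
absent = coverage[coverage.isna()].index.tolist()
print(coverage)
print(absent)
```

Keeping "no rows at all" separate from "rows with missing values" matters here, because a country with zero rows would otherwise look like a country with zero gaps.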

### Economic Indicator Distribution Box Plots
Box plots revealed a massive range of values even after initial cleaning. This led to the decision to filter countries based on outlier Mean, Max, and Min values across all four key metrics.
#### Before outlier cleaning

#### After outlier cleaning (after the mean, min, and max analysis shown below)

### Mean, Max and Min Economic Indicator Histograms and Global Heat Maps
For each of the four indicators, a global heatmap and a histogram highlighting the outliers were generated. These visualizations formed the basis for the distribution analysis that follows each indicator, and for the removal of outlier countries that follows the analyses.
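
The text does not state the exact rule used to decide which countries count as outliers. The sketch below assumes the conventional box-plot fence of 1.5 × IQR, applied to each country's mean, max, and min for one indicator. The column names, the toy values, and the helper `iqr_outliers` are hypothetical.

```python
import pandas as pd

# Toy values with hypothetical column names; "C" mimics a hyperinflation episode.
df = pd.DataFrame({
    "country_name": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3 + ["E"] * 3,
    "Inflation": [2.0, 3.0, 2.5, 1.0, 1.5, 2.0, 40.0, 900.0, 150.0,
                  3.0, 4.0, 3.5, 2.0, 2.5, 3.0],
})


def iqr_outliers(values, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# One row per country with the three summary statistics named in the text.
stats = df.groupby("country_name")["Inflation"].agg(["mean", "max", "min"])

# A country is flagged if any of its statistics is an outlier among all countries.
flagged = stats.apply(iqr_outliers).any(axis=1)
outlier_countries = flagged[flagged].index.tolist()
filtered = df[~df["country_name"].isin(outlier_countries)]
print(outlier_countries)
```

Repeating this for each indicator and taking the union of flagged countries would give a removal list. Whether the notebook used this rule, a manual threshold, or visual inspection is not stated.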





Note: More images can be found in the files and in the Python notebook (also attached in the files).
### Correlation Scatter Plots
Regression lines were applied to scatter plots to visualize the strength of relationships between indicators.
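
A regression line of this kind is an ordinary least-squares fit of one indicator on another. The sketch below uses NumPy with made-up values; the variable names are illustrative, and the notebook may rely on a plotting library's built-in trendline instead.

```python
import numpy as np

# Illustrative values only; not taken from the dataset.
unemployment = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
gdp_growth = np.array([4.1, 3.4, 3.1, 2.2, 1.9])

# Degree-1 least-squares fit: returns the slope first, then the intercept.
slope, intercept = np.polyfit(unemployment, gdp_growth, 1)

# Pearson r for the same pair; its sign always matches the slope's sign.
r = np.corrcoef(unemployment, gdp_growth)[0, 1]

# Points along the fitted line, ready to overlay on the scatter plot.
line_x = np.linspace(unemployment.min(), unemployment.max(), 50)
line_y = slope * line_x + intercept
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
```

The slope describes how steep the line is, while the correlation coefficient describes how tightly the points cluster around it. A steep line through widely scattered points can still have a coefficient close to zero.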




---
## Insights & Answers
* **Inflation & GDP Growth (0.14):** There is a weak positive correlation between these indicators. While they tend to move in the same direction, inflation alone is not a strong or reliable predictor of economic growth. The correlation coefficients for both measurement types of inflation were very similar, which suggests that the data used was reliable.
* **Unemployment & GDP Growth (-0.22):** This is the strongest relationship identified in the study, although it is still weak in absolute terms. The inverse correlation suggests that economic growth is more sensitive to unemployment than to inflation. The direction of the relationship is consistent with economic theory.
* **Inflation & Unemployment (-0.075):** The analysis found a very weak relationship between these variables, raising questions about the economic theory that links high unemployment with disinflation and even deflation.
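
One way to gauge how weak these relationships are is to square each coefficient. Assuming they are Pearson coefficients (the text does not say), the square is the share of variance in one indicator that a straight-line fit on the other would explain. A short worked calculation with the three reported values:

```python
# Correlation coefficients reported in the text above.
reported = {
    "Inflation vs GDP growth": 0.14,
    "Unemployment vs GDP growth": -0.22,
    "Inflation vs unemployment": -0.075,
}

# r squared is the share of variance a straight-line fit on one variable
# explains in the other.
explained = {pair: r ** 2 for pair, r in reported.items()}

for pair, share in explained.items():
    print(f"{pair}: r = {reported[pair]:+.3f}, r^2 = {share:.2%}")
```

The shares come to roughly 2%, 5%, and 0.6%, so even the strongest pairing leaves about 95% of the variation unexplained by a linear relationship.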
---
## Conclusion
The analysis concludes that while modern economic indicators generally follow the direction of classical economic theory, the correlations are significantly weaker than the theory suggests. Moderating factors not captured in this analysis likely account for the weak correlation coefficients.
Provider:
BentoUniAcc
Collected Summary
Dataset Introduction

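Construction
In macroeconomic research, building a representative global dataset requires a rigorous data-processing pipeline. The Global_Economic_Indicators dataset is derived from the World Bank's public economic indicators and covers records for 217 countries and territories from 2010 to 2023. To ensure reliable analysis, construction involved systematic data cleaning: indicator columns with more than 45% missing values, such as public debt and tax revenue, were dropped; the years 2024 and 2025, for which data were almost entirely missing, were removed; country-name spellings were standardized to suit geographic visualization tools; and 79 outlier countries affected by extreme economic volatility or localized conflict were identified by statistical methods and removed, so that the sample reflects broad global trends.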
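Usage
The dataset mainly serves empirical research and teaching demonstrations in macroeconomics. Users can load the cleaned data directly with the accompanying Python code and run cross-country or cross-year comparisons. Typical applications include computing correlation coefficients between key economic indicators, drawing scatter plots with regression lines to show relationships between variables, and generating global heat maps to explore spatial distribution patterns. Researchers can build on this to test theoretical hypotheses about the relationships among growth, inflation, and unemployment, or use it as baseline data and introduce additional moderating variables for more complex econometric modeling.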
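Background and Challenges
Background Overview
The Global_Economic_Indicators dataset was built by Ben Topaz from the World Bank's global economic indicator data and published in 2026. It aims to explore the macroeconomic dynamics of 217 countries and territories from 2010 to 2023. The dataset focuses on testing the relationships among inflation, unemployment, and GDP growth posited by classical economic theory, and through systematic data cleaning and visual analysis it provides a cross-regional, cross-period empirical basis for macroeconomic research. The work deepens understanding of global economic trends, offers data-driven insight for policymaking and academic discussion, and is described as having significant influence at the intersection of economics and data science.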
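Current Challenges
The dataset addresses the challenge of analyzing relationships between macroeconomic indicators. Its core aim is to verify the theoretical relationships among inflation, unemployment, and economic growth, but the analysis shows these relationships to be weak, which highlights the complexity of economic phenomena and the limits of theoretical models. Construction faced several data-quality challenges: handling indicators with high missing rates such as public debt and tax revenue, dropping the years 2024 and 2025 whose data were almost entirely missing, correcting country-name spellings to suit geographic mapping tools, and removing 79 countries and territories whose economic data were extremely abnormal because of hyperinflation or war. These steps are intended to keep localized instability from distorting the results so that global trends are captured accurately.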
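Common Scenarios
Classic Use Cases
In macroeconomic research, the Global_Economic_Indicators dataset is often used to test the validity of classical economic theories. Using its economic indicators for 217 countries and territories from 2010 to 2023, researchers apply correlation analysis and visualization to examine the dynamic relationships among inflation, unemployment, and GDP growth. This classic use case provides an empirical basis for testing theory and reveals how different economic variables interact at a global scale.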
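Academic Problems Addressed
The dataset addresses several academic questions in macroeconomic research about verifying relationships between core indicators. By systematically cleaning missing data and outliers and applying statistical methods to the correlations among inflation, unemployment, and economic growth, the study can objectively assess how well classical theories such as the Phillips curve apply in the contemporary global economy. Its value lies in providing cross-country, long-horizon empirical evidence that helps revise or refine existing economic models and moves macroeconomics toward greater precision and realism.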
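Practical Applications
In practice, the Global_Economic_Indicators dataset offers key decision support to policymakers, international organizations, and financial institutions. Based on analysis of the dataset, users can assess the economic stability of different countries or regions, forecast macroeconomic trends, and design more targeted fiscal or monetary policies. For example, identifying the negative correlation between economic growth and unemployment can inform stimulus programs that promote employment, supporting global economic governance and the Sustainable Development Goals in practice.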
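Recent Research on the Dataset
Latest Research Directions
In macroeconomic analysis, the Global_Economic_Indicators dataset is being used to explore how well traditional economic theory applies in a globalized setting. Frontier research focuses on using the dataset to test the dynamic relationships among inflation, unemployment, and GDP growth, with particular attention to how extreme economic events such as hyperinflation or war shocks disturb global trends. Current events such as geopolitical conflict and the post-pandemic recovery have prompted scholars to reassess the strength of these correlations and to introduce machine-learning methods to identify nonlinear patterns and potential moderating variables in the data. Such research deepens understanding of macroeconomic stability mechanisms, provides evidence-based reference for policymaking, and underlines the key role of high-quality, cleaned cross-country data in testing classical theories.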
The above content was collected and summarized by 遇见数据集.



