eliel2003/student-burnout-analysis2026
Source: Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/eliel2003/student-burnout-analysis2026
---
language:
- en
license: mit
size_categories:
- 1M+ (Sampled to 2,000)
task_categories:
- tabular-regression
tags:
- mental-health
- economics
- psychology
- student-burnout
pretty_name: Socio-Economic Drivers of Student Burnout
---
# 🔥 Predicting Academic Burnout: A Multivariate Analysis of Student Stressors
> *Exploring how financial pressure, family expectations, and social support shape burnout in university students.*
---
## Project Overview & Data Walkthrough
<video src="https://huggingface.co/datasets/eliel2003/student-burnout-analysis2026/resolve/main/%D7%A1%D7%A8%D7%98.mp4" controls="controls" style="max-width: 720px;"></video>
---
## 📋 Abstract
Academic burnout is an increasingly recognized phenomenon with far-reaching consequences for student wellbeing and performance. This study investigates the relationship between **external environmental stressors** — specifically financial stress and family expectations — and **academic burnout levels** among university students, while examining the moderating role of social support.
The analysis draws from a large-scale synthetic dataset of **1,000,000 student records** (20 features each). To keep computation manageable while retaining adequate statistical power, a **reproducible simple random sample of n = 2,000 records** (`random_state=42`) was extracted for this exploratory data analysis (EDA). The findings offer insight into the interplay of stressors and protective factors in academic environments.
---
## 🔬 Hypotheses
This study is guided by two primary hypotheses derived from the stress-buffering model in psychosocial research:
| # | Hypothesis | Direction |
|---|-----------|-----------|
| **H1** | Higher levels of financial stress and family expectations are **positively correlated** with increased academic burnout. | Stressor → Burnout ↑ |
| **H2** | Social support acts as a **moderating buffer**: the burnout-increasing effect of environmental stressors will be significantly weaker for students with high social support. | Support ↑ → Stressor effect ↓ |
---
## 📦 The Dataset
| Property | Value |
|----------|-------|
| **Source** | [Kaggle – Student Mental Health and Burnout Dataset](https://www.kaggle.com/datasets/sharmajicoder/student-mental-health-and-burnout) |
| **Total Records** | 1,000,000 |
| **Features** | 20 |
| **Sample Used** | 2,000 (random, `random_state=42`) |
| **Target Variable** | `academic_burnout_level` (composite of Stress, Anxiety & Depression scores) |
> **⚠️ Dataset Note:** Based on structural characteristics — zero missing values across 1M records, perfectly uniform feature distributions, and absence of real-world noise — this dataset is assessed to be **synthetically generated**. While this enables clean, reproducible analysis, findings should be interpreted with caution and may not directly generalize to real-world student populations.
### Feature Categories
| Category | Features |
|----------|----------|
| **Demographics** | `age`, `gender`, `academic_year` |
| **Lifestyle** | `study_hours_per_day`, `sleep_hours`, `physical_activity`, `screen_time` |
| **Psychological** | `stress_level`, `anxiety_score`, `depression_score`, `exam_pressure` |
| **Environmental** | `financial_stress`, `family_expectation`, `social_support` |
| **Academic** | `academic_performance` |
| **Target** | `academic_burnout_level` |
---
## ⚙️ Methodology
### 1. Data Loading & Sampling
The full dataset was downloaded via `kagglehub`, and a reproducible random sample of 2,000 rows was extracted (`random_state=42`). The target variable was renamed from `burnout_score` to `academic_burnout_level` for semantic clarity.
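
The sampling and renaming step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper name `sample_and_rename` and the toy frame are assumptions, and in practice the frame would be read from the CSV fetched with `kagglehub.dataset_download(...)`, whose file name is not stated here.

```python
import numpy as np
import pandas as pd


def sample_and_rename(df: pd.DataFrame, n: int = 2000, seed: int = 42) -> pd.DataFrame:
    """Draw a reproducible simple random sample and rename the target column."""
    sample = df.sample(n=n, random_state=seed)
    renamed = sample.rename(columns={"burnout_score": "academic_burnout_level"})
    return renamed.reset_index(drop=True)


# Toy stand-in for the 1M-row frame (values are made up for illustration only).
rng = np.random.default_rng(0)
full_df = pd.DataFrame({
    "financial_stress": rng.integers(0, 11, size=10_000),
    "burnout_score": rng.normal(2.0, 1.0, size=10_000),
})

sample_df = sample_and_rename(full_df)
```

Fixing `random_state` means that rerunning the cell returns the same 2,000 rows, which keeps every downstream figure reproducible.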
### 2. Data Cleaning
A systematic quality assessment was performed:
- **Missing values:** None detected across all 20 features (confirmed via `df.isnull().sum()`).
- **Duplicate rows:** Zero duplicates found (confirmed via `df.duplicated().sum()`).
- **Data types:** All features confirmed to be in expected formats; no parsing or type conversion required.
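
The three checks above can be bundled into one small report. The sketch below uses the same `isnull().sum()` and `duplicated().sum()` calls named in the list; the function name `quality_report` and the toy frame are illustrative assumptions.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Summarize missing values, duplicate rows, and column dtypes."""
    return {
        "missing_per_column": df.isnull().sum().to_dict(),
        "total_missing": int(df.isnull().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }


# Tiny illustrative frame with one missing value and one duplicated row.
toy = pd.DataFrame({
    "sleep_hours": [7.0, 6.5, None, 7.0],
    "stress_level": [3, 5, 4, 3],
})
report = quality_report(toy)
```

On the sampled burnout data, the text above implies that `total_missing` and `duplicate_rows` would both be zero.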
### 3. Outlier Detection & Decision
Box plots were generated for the four key research variables (`financial_stress`, `family_expectation`, `social_support`, and `academic_burnout_level`). Extreme values were observed, particularly in `academic_burnout_level` (high end) and `family_expectation` (low end), but **all outliers were retained**.
**Justification:** These values represent legitimate extreme experiences within the student population. Removing high-burnout cases would systematically bias the analysis against the very phenomenon under study.
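
For readers who want counts rather than a visual impression, the points a box plot draws beyond its whiskers can be flagged with the standard 1.5 × IQR rule. This is a sketch under that assumption; the source does not say which numeric rule, if any, accompanied the box plots, and the sample values are made up.

```python
import pandas as pd


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag points more than k * IQR outside the quartiles (box-plot whisker rule)."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


burnout = pd.Series([1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 9.0], name="academic_burnout_level")
mask = iqr_outlier_mask(burnout)
n_flagged = int(mask.sum())  # outliers are counted here, not dropped
```

Keeping the mask separate from the data matches the decision above: extreme cases are identified and reported but stay in the analysis.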
### 4. Feature Engineering
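To enable group-level comparisons and multivariate visualization, three categorical bin variables were engineered, corresponding to the groupings used in Figures 4 and 6: a financial stress level, a social support level, and an age group.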
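
One way to build such bins is `pandas.cut`. The sketch below is illustrative only: the output column names, the three-level Low/Medium/High labels, the equal-width cut points, and the age boundaries are all assumptions, since the original bin definitions are not given.

```python
import numpy as np
import pandas as pd

LEVELS = ["Low", "Medium", "High"]


def add_bins(df: pd.DataFrame) -> pd.DataFrame:
    """Add three categorical bin columns (names and cut points are assumptions)."""
    out = df.copy()
    # Three equal-width bins spanning each variable's observed range.
    out["financial_stress_bin"] = pd.cut(out["financial_stress"], bins=3, labels=LEVELS)
    out["social_support_bin"] = pd.cut(out["social_support"], bins=3, labels=LEVELS)
    # Right-closed intervals: (-inf, 20], (20, 23], (23, inf).
    out["age_group"] = pd.cut(
        out["age"], bins=[-np.inf, 20, 23, np.inf], labels=["<=20", "21-23", ">23"]
    )
    return out


toy = pd.DataFrame({
    "financial_stress": [0, 5, 10, 2],
    "social_support": [10, 5, 0, 8],
    "age": [18, 21, 25, 20],
})
binned = add_bins(toy)
```

Explicit edges, as used for age, make the group boundaries auditable; equal-width bins depend on the observed range and will shift if the sample changes.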
---
## 📊 Key Visualizations & Insights
### Figure 1 — Outlier Detection: Box Plots

> The box plots reveal that `academic_burnout_level` exhibits the most pronounced outliers, with a tail of students experiencing extreme burnout. `family_expectation` shows a minority cluster near zero, suggesting a subset of students reporting minimal family pressure. The interquartile ranges for all four variables are well-contained, indicating that the distribution is not pathologically skewed for the majority of the sample.
---
### Figure 2 — Feature Distributions: Histograms

> The distribution of `financial_stress` is approximately bell-shaped with a slight right skew, indicating that moderate financial pressure is most common while a minority of students experience severe financial hardship. `social_support` mirrors this pattern inversely, with most students reporting moderate support levels. `academic_burnout_level` follows a roughly normal distribution, centered around a mid-range value, confirming that the target variable captures meaningful variation across the sample.
---
### Figure 3 — Relationship Analysis: Scatter Plots

> The scatter plot of `financial_stress` vs. `academic_burnout_level` reveals a positive, though diffuse, linear trend — as financial stress increases, burnout tends to rise, consistent with H1. The plot of `social_support` vs. burnout demonstrates the opposite pattern, with higher support associated with reduced burnout, providing initial visual support for H2. The high dispersion in both plots underscores that burnout is a multi-determined outcome not fully explained by any single variable.
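
The "positive, though diffuse, linear trend" can be quantified by fitting a least-squares line through the scatter. A minimal sketch, assuming a plain first-degree `numpy.polyfit`; the simulated data and its coefficients are invented for illustration and are not the dataset's values.

```python
import numpy as np


def linear_trend(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    slope, intercept = np.polyfit(
        np.asarray(x, dtype=float), np.asarray(y, dtype=float), deg=1
    )
    return float(slope), float(intercept)


# Simulated stand-in: a weak positive trend buried in noise.
rng = np.random.default_rng(42)
stress = rng.uniform(0, 10, size=500)
burnout = 1.0 + 0.15 * stress + rng.normal(0, 1.0, size=500)

slope, intercept = linear_trend(stress, burnout)
```

A positive slope with widely scattered points is exactly the pattern described: the direction is clear, but a single predictor leaves most of the variation unexplained.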
---
### Figure 4 — Research Questions: Bar Chart Panel

**Q1 — Does financial stress increase burnout?**
> Mean burnout levels rise monotonically from **1.12** (Low stress) to **2.66** (High stress), more than doubling across the financial stress spectrum. This constitutes strong empirical support for **H1**.
**Q2 — Does social support reduce burnout?**
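> Mean burnout decreases from **2.45** (Low support) to **1.19** (High support) as social support increases, an inverse relationship of comparable magnitude to Q1. This main effect is consistent with **H2**, although the buffering claim itself concerns how support changes the effect of stressors (see Figure 6).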
**Q3 — Which factor correlates most strongly with burnout?**
> `financial_stress` shows the highest positive correlation with burnout (*r* = 0.32), followed by `family_expectation` (*r* = 0.23). `social_support` yields the strongest negative correlation (*r* = −0.23), consistent with its role as a protective factor rather than an additional stressor.
**Q4 — Does burnout vary across age groups?**
> Burnout levels remain relatively stable across age groups (**1.64** for <20, **1.83** for 21–23, **1.84** for >23), suggesting that age is not a primary driver of burnout in this dataset.
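
Each bar in this panel is a group mean, which a `groupby` reproduces directly. In the sketch below the column name `financial_stress_bin` and the toy values are hypothetical; only the final ratio uses figures reported above (1.12 and 2.66).

```python
import pandas as pd


def mean_by_group(df: pd.DataFrame, group_col: str,
                  target: str = "academic_burnout_level") -> pd.Series:
    """Mean of the target within each level of a grouping column."""
    return df.groupby(group_col, observed=True)[target].mean().round(2)


toy = pd.DataFrame({
    "financial_stress_bin": ["Low", "Low", "High", "High"],
    "academic_burnout_level": [1.0, 1.2, 2.5, 2.9],
})
means = mean_by_group(toy, "financial_stress_bin")

# Ratio of the reported High-stress mean to the reported Low-stress mean.
reported_ratio = 2.66 / 1.12
```

The reported ratio is about 2.4, which is what "more than doubling" refers to in Q1.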
---
### Figure 5 — Correlation Heatmap

> The heatmap confirms that `academic_burnout_level` is most strongly correlated with `financial_stress` (*r* = 0.32) and negatively with `social_support` (*r* = −0.23). Notably, `financial_stress` and `social_support` exhibit a low inter-correlation (*r* ≈ −0.02), indicating these are largely **independent constructs** — students with high financial stress are not systematically less likely to have social support, which strengthens the validity of treating them as separate predictors.
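
The values in the heatmap are pairwise Pearson correlations, which `DataFrame.corr` computes. The sketch below ranks predictors by their correlation with the target; the simulated data and its coefficients are assumptions chosen only to mimic the sign pattern reported above.

```python
import numpy as np
import pandas as pd


def correlations_with_target(df: pd.DataFrame,
                             target: str = "academic_burnout_level") -> pd.Series:
    """Pearson r of each numeric column with the target, largest |r| first."""
    r = df.corr(numeric_only=True)[target].drop(target)
    return r.reindex(r.abs().sort_values(ascending=False).index)


# Simulated stand-in with two independent predictors.
rng = np.random.default_rng(7)
n = 2000
fin = rng.normal(size=n)
sup = rng.normal(size=n)
toy = pd.DataFrame({
    "financial_stress": fin,
    "social_support": sup,
    "academic_burnout_level": 0.5 * fin - 0.3 * sup + rng.normal(size=n),
})
r = correlations_with_target(toy)
```

Passing the full `toy.corr(numeric_only=True)` matrix to `seaborn.heatmap(..., annot=True)` would render a figure of the same kind as Figure 5.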
---
### Figure 6 — Multivariate Analysis: The Buffering Effect ⭐

> This chart presents the **key finding** of the study. Among students with **high financial stress**, those with **low social support** report an average burnout of **3.83**, while those with **high social support** report only **1.80**, a reduction of about **53%**. This sharp attenuation of the stress-burnout relationship at high support levels is the study's main evidence for **H2**: social support appears to function as a psychological buffer against environmental stressors.
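
The comparison behind this figure is a two-way table of mean burnout by stress level and support level. The sketch below assumes hypothetical bin columns `financial_stress_bin` and `social_support_bin` and made-up toy values; only `reported_reduction` uses the figures quoted above (3.83 and 1.80).

```python
import pandas as pd


def buffering_table(df: pd.DataFrame,
                    stress_col: str = "financial_stress_bin",
                    support_col: str = "social_support_bin",
                    target: str = "academic_burnout_level") -> pd.DataFrame:
    """Mean burnout for every (stress level, support level) cell."""
    return df.pivot_table(index=stress_col, columns=support_col,
                          values=target, aggfunc="mean")


def relative_reduction(low_support_mean: float, high_support_mean: float) -> float:
    """Fractional drop in mean burnout when moving from low to high support."""
    return (low_support_mean - high_support_mean) / low_support_mean


toy = pd.DataFrame({
    "financial_stress_bin": ["High", "High", "High", "High", "Low", "Low"],
    "social_support_bin": ["Low", "Low", "High", "High", "Low", "High"],
    "academic_burnout_level": [3.8, 4.0, 1.7, 1.9, 1.3, 1.1],
})
table = buffering_table(toy)

# Stress effect (High minus Low stress) within each support level.
stress_effect = table.loc["High"] - table.loc["Low"]

reported_reduction = relative_reduction(3.83, 1.80)
```

Buffering shows up as a smaller `stress_effect` in the high-support column than in the low-support column; the reported values give a reduction of roughly 0.53 within the high-stress group.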
---
## 🏁 Final Conclusions
### Hypothesis Outcomes
| Hypothesis | Status | Evidence |
|-----------|--------|----------|
| **H1:** Financial stress & family expectations → higher burnout | ✅ **Confirmed** | Mean burnout more than doubles from the Low to the High financial-stress group (1.12 → 2.66); *r* = 0.32 for financial stress and *r* = 0.23 for family expectation |
| **H2:** Social support buffers the stress-burnout relationship | ✅ **Confirmed** | 50%+ reduction in burnout among high-stress students with high support |
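
A moderation hypothesis such as H2 is often checked more formally with an interaction term in a linear model. The sketch below is not part of the original analysis: it fits `burnout ~ stress + support + stress × support` by ordinary least squares with `numpy.linalg.lstsq`, on simulated data whose coefficients are invented for illustration.

```python
import numpy as np


def fit_moderation(stress, support, burnout) -> dict:
    """OLS fit of burnout = b0 + b1*stress + b2*support + b3*(stress*support)."""
    stress = np.asarray(stress, dtype=float)
    support = np.asarray(support, dtype=float)
    burnout = np.asarray(burnout, dtype=float)
    X = np.column_stack([np.ones_like(stress), stress, support, stress * support])
    coef, *_ = np.linalg.lstsq(X, burnout, rcond=None)
    names = ["intercept", "stress", "support", "interaction"]
    return dict(zip(names, (float(c) for c in coef)))


# Simulated stand-in in which support weakens the stress effect.
rng = np.random.default_rng(1)
n = 2000
stress = rng.uniform(0, 10, size=n)
support = rng.uniform(0, 10, size=n)
burnout = (1.0 + 0.40 * stress - 0.05 * support
           - 0.03 * stress * support + rng.normal(0, 0.5, size=n))

coef = fit_moderation(stress, support, burnout)
```

A negative `interaction` coefficient means the slope of burnout on stress shrinks as support rises, which is the pattern H2 predicts.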
### Main Takeaway
The central finding of this analysis is that **social support is not merely a correlate of lower burnout: it also appears to moderate the effect of financial stress.** In this sample, students under high financial pressure who report high social support average roughly half the burnout of those who report low support (1.80 vs. 3.83), so high financial pressure does not by itself imply high burnout.
From a policy standpoint, this suggests that university interventions targeting burnout should focus not only on reducing stressors (e.g., financial aid, managing family expectations) but critically on **strengthening social support systems** — peer programs, counseling access, and community-building initiatives — particularly for students in high-pressure financial circumstances.
---
## 🛠️ Technical Stack
```
Language : Python 3.12
Data : pandas, numpy
Visualization: matplotlib, seaborn
Dataset Hub : kagglehub
Environment : Google Colab
```
| Library | Version | Purpose |
|---------|---------|---------|
| `pandas` | ≥ 2.0 | Data loading, cleaning, feature engineering |
| `seaborn` | ≥ 0.13 | Statistical visualizations (heatmap, barplots) |
| `matplotlib` | ≥ 3.7 | Plot rendering and layout management |
| `kagglehub` | ≥ 1.0 | Dataset download from Kaggle |
| `numpy` | ≥ 1.24 | Numerical operations |
---
## 📁 Repository Contents
```
📦 student-burnout-eda/
├── 📓 Assignment_1_EDA_Dataset-4.ipynb # Full analysis notebook
├── 📄 README.md # This file
└── 🎥 presentation.mp4 # 2–3 min walkthrough video
```
---
*Analysis conducted as part of a Data Science coursework assignment — March 2026.*
Provider:
eliel2003



