eliel2003/student-burnout-analysis2026

Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/eliel2003/student-burnout-analysis2026
Dataset description:
---
language:
- en
license: mit
size_categories:
- 1M+ (Sampled to 2,000)
task_categories:
- tabular-regression
tags:
- mental-health
- economics
- psychology
- student-burnout
pretty_name: Socio-Economic Drivers of Student Burnout
---

# 🔥 Predicting Academic Burnout: A Multivariate Analysis of Student Stressors

> *Exploring how financial pressure, family expectations, and social support shape burnout in university students.*

---

## Project Overview & Data Walkthrough

<video src="https://huggingface.co/datasets/eliel2003/student-burnout-analysis2026/resolve/main/%D7%A1%D7%A8%D7%98.mp4" controls="controls" style="max-width: 720px;"></video>

---

## 📋 Abstract

Academic burnout is an increasingly recognized phenomenon with far-reaching consequences for student wellbeing and performance. This study investigates the relationship between **external environmental stressors** (specifically financial stress and family expectations) and **academic burnout levels** among university students, while examining the moderating role of social support.

The analysis draws from a large-scale synthetic dataset of **1,000,000 student records** (20 features each). To keep the analysis computationally light, a **reproducible random sample of n = 2,000 records** (`random_state=42`) was extracted for this exploratory data analysis (EDA). The findings provide empirically grounded insights into the interplay of stressors and protective factors in academic environments.

---

## 🔬 Hypotheses

This study is guided by two primary hypotheses derived from the stress-buffering model in psychosocial research:

| # | Hypothesis | Direction |
|---|-----------|-----------|
| **H1** | Higher levels of financial stress and family expectations are **positively correlated** with academic burnout. | Stressor → Burnout ↑ |
| **H2** | Social support acts as a **moderating buffer**; the negative impact of environmental stressors on burnout will be significantly weaker for students with high social support. | Support → Burnout ↓ |

---

## 📦 The Dataset

| Property | Value |
|----------|-------|
| **Source** | [Kaggle – Student Mental Health and Burnout Dataset](https://www.kaggle.com/datasets/sharmajicoder/student-mental-health-and-burnout) |
| **Total Records** | 1,000,000 |
| **Features** | 20 |
| **Sample Used** | 2,000 (random, `random_state=42`) |
| **Target Variable** | `academic_burnout_level` (composite of Stress, Anxiety & Depression scores) |

> **⚠️ Dataset Note:** Based on structural characteristics (zero missing values across 1M records, perfectly uniform feature distributions, and absence of real-world noise), this dataset is assessed to be **synthetically generated**. While this enables clean, reproducible analysis, findings should be interpreted with caution and may not directly generalize to real-world student populations.

### Feature Categories

| Category | Features |
|----------|----------|
| **Demographics** | `age`, `gender`, `academic_year` |
| **Lifestyle** | `study_hours_per_day`, `sleep_hours`, `physical_activity`, `screen_time` |
| **Psychological** | `stress_level`, `anxiety_score`, `depression_score`, `exam_pressure` |
| **Environmental** | `financial_stress`, `family_expectation`, `social_support` |
| **Academic** | `academic_performance` |
| **Target** | `academic_burnout_level` |

---

## ⚙️ Methodology

### 1. Data Loading & Sampling

The full dataset was downloaded via `kagglehub` and a reproducible random sample of 2,000 rows was extracted. The target variable was renamed from `burnout_score` to `academic_burnout_level` for semantic clarity.

### 2. Data Cleaning

A systematic quality assessment was performed:

- **Missing values:** None detected across all 20 features (confirmed via `df.isnull().sum()`).
- **Duplicate rows:** Zero duplicates found (confirmed via `df.duplicated().sum()`).
- **Data types:** All features confirmed to be in expected formats; no parsing or type conversion required.

### 3. Outlier Detection & Decision

Box plots were generated for the four key research variables. While extreme values were observed, particularly in `academic_burnout_level` (high end) and `family_expectation` (low end), **all outliers were retained**.

**Justification:** These values represent legitimate extreme experiences within the student population. Removing high-burnout cases would systematically bias the analysis against the very phenomenon under study.

### 4. Feature Engineering

To enable group-level comparisons and multivariate visualization, three categorical bin variables were engineered. Judging by the groupings used in Figures 4 and 6, these cover financial stress level, social support level, and age group.

---

## 📊 Key Visualizations & Insights

### Figure 1: Outlier Detection (Box Plots)

![Outlier detection box plots](1)

> The box plots reveal that `academic_burnout_level` exhibits the most pronounced outliers, with a tail of students experiencing extreme burnout. `family_expectation` shows a minority cluster near zero, suggesting a subset of students reporting minimal family pressure. The interquartile ranges for all four variables are well-contained, indicating that the distributions are not pathologically skewed for the majority of the sample.

---

### Figure 2: Feature Distributions (Histograms)

![Feature distribution histograms](2)

> The distribution of `financial_stress` is approximately bell-shaped with a slight right skew, indicating that moderate financial pressure is most common while a minority of students experience severe financial hardship. `social_support` mirrors this pattern inversely, with most students reporting moderate support levels. `academic_burnout_level` follows a roughly normal distribution, centered around a mid-range value, confirming that the target variable captures meaningful variation across the sample.

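
The sampling, renaming, cleaning checks, and binning described in the Methodology section can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: a small synthetic stand-in frame replaces the `kagglehub` download so the snippet runs offline, and the value ranges, bin names (`financial_stress_bin`, `social_support_bin`, `age_group`), and cut points are all assumptions. Only the column names, the sample size, `random_state=42`, and the two quality checks come from the README.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Kaggle file so the sketch runs offline.
# Column names follow the README; the value ranges are assumptions.
rng = np.random.default_rng(0)
n_full = 10_000
full = pd.DataFrame({
    "age": rng.integers(17, 30, size=n_full),
    "financial_stress": rng.uniform(0, 10, size=n_full),
    "social_support": rng.uniform(0, 10, size=n_full),
    "burnout_score": rng.uniform(0, 5, size=n_full),
})

# Step 1: reproducible random sample, then rename the target column.
df = (
    full.sample(n=2_000, random_state=42)
        .rename(columns={"burnout_score": "academic_burnout_level"})
        .reset_index(drop=True)
)

# Step 2: the two quality checks named in the README.
missing_per_column = df.isnull().sum()
duplicate_rows = int(df.duplicated().sum())

# Step 4: categorical bins (names and cut points are hypothetical).
levels = ["Low", "Medium", "High"]
df["financial_stress_bin"] = pd.cut(df["financial_stress"], bins=3, labels=levels)
df["social_support_bin"] = pd.cut(df["social_support"], bins=3, labels=levels)
df["age_group"] = pd.cut(
    df["age"], bins=[0, 20, 23, 120], labels=["<=20", "21-23", ">23"]
)

# Group-level comparison of the kind plotted in the bar charts.
mean_burnout_by_stress = df.groupby("financial_stress_bin", observed=True)[
    "academic_burnout_level"
].mean()
```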
---

### Figure 3: Relationship Analysis (Scatter Plots)

![Relationship scatter plots](3)

> The scatter plot of `financial_stress` vs. `academic_burnout_level` reveals a positive, though diffuse, linear trend: as financial stress increases, burnout tends to rise, consistent with H1. The plot of `social_support` vs. burnout demonstrates the opposite pattern, with higher support associated with reduced burnout, providing initial visual support for H2. The high dispersion in both plots underscores that burnout is a multi-determined outcome not fully explained by any single variable.

---

### Figure 4: Research Questions (Bar Chart Panel)

![Research questions bar chart panel](4)

**Q1: Does financial stress increase burnout?**

> Mean burnout levels rise monotonically from **1.12** (Low stress) to **2.66** (High stress), more than doubling across the financial stress spectrum. This constitutes strong empirical support for **H1**.

**Q2: Does social support reduce burnout?**

> Mean burnout decreases from **2.45** (Low support) to **1.19** (High support) as social support increases, an inverse relationship of comparable magnitude to Q1. This provides initial support for **H2**.

**Q3: Which factor correlates most strongly with burnout?**

> `financial_stress` shows the highest positive correlation with burnout (*r* = 0.32), followed by `family_expectation` (*r* = 0.23). `social_support` yields the strongest negative correlation (*r* = −0.23), confirming its role as a protective factor rather than an additional stressor.

**Q4: Does burnout vary across age groups?**

> Burnout levels remain relatively stable across age groups (**1.64** for <20, **1.83** for 21–23, **1.84** for >23), suggesting that age is not a primary driver of burnout in this dataset.

---

### Figure 5: Correlation Heatmap

![Correlation heatmap](5.1)

> The heatmap confirms that `academic_burnout_level` is most strongly correlated with `financial_stress` (*r* = 0.32) and negatively with `social_support` (*r* = −0.23). Notably, `financial_stress` and `social_support` exhibit a low inter-correlation (*r* ≈ −0.02), indicating these are largely **independent constructs**: students with high financial stress are not systematically less likely to have social support, which strengthens the validity of treating them as separate predictors.

---

### Figure 6: Multivariate Analysis (The Buffering Effect) ⭐

![Buffering effect chart](6)

> This chart constitutes the **key finding** of the study. Among students with **high financial stress**, those with **low social support** report an average burnout of **3.83**, while those with **high social support** report only **1.80**, a reduction of more than **50%**. This dramatic attenuation of the stress-burnout relationship at high support levels provides compelling support for **H2**: social support functions as a genuine psychological buffer against environmental stressors.

---

## 🏁 Final Conclusions

### Hypothesis Outcomes

| Hypothesis | Status | Evidence |
|-----------|--------|----------|
| **H1:** Financial stress & family expectations → higher burnout | ✅ **Confirmed** | Mean burnout more than doubles (1.12 → 2.66) from the Low to the High stress group; *r* = 0.32 for financial stress, *r* = 0.23 for family expectation |
| **H2:** Social support buffers the stress-burnout relationship | ✅ **Confirmed** | More than 50% lower mean burnout (3.83 → 1.80) among high-stress students with high support |

### Main Takeaway

The central finding of this analysis is that **social support is not merely a correlate of lower burnout; it actively moderates the damage caused by financial stress.** A student experiencing high financial pressure is not destined for high burnout: in this sample, high-stress students with high social support report roughly half the mean burnout of those with low support.

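
The buffering comparison above can be sketched in three parts: the arithmetic behind the "more than 50%" figure, a two-way table of mean burnout by stress and support group, and a regression with an interaction term, which is one common way to check moderation beyond comparing group means. Only the two means (3.83 and 1.80) come from the README. The data below are a synthetic stand-in with a buffering effect deliberately built in, and the bin names and coefficients are assumptions, so the output illustrates the technique rather than reproducing the study's results.

```python
import numpy as np
import pandas as pd

# (a) Relative reduction implied by the two means reported for Figure 6.
low_support_mean = 3.83
high_support_mean = 1.80
relative_reduction = (low_support_mean - high_support_mean) / low_support_mean
# (3.83 - 1.80) / 3.83 is roughly 0.53, i.e. about a 53% reduction.

# (b) Two-way table on synthetic stand-in data (NOT the study's data).
rng = np.random.default_rng(42)
n = 2_000
stress = rng.uniform(0, 10, size=n)
support = rng.uniform(0, 10, size=n)
# Assumed data-generating process: stress raises burnout, and the slope
# shrinks as support grows (the interaction term carries the buffering).
burnout = 0.5 * stress - 0.04 * stress * support + rng.normal(0, 0.5, size=n)

df = pd.DataFrame({
    "financial_stress": stress,
    "social_support": support,
    "academic_burnout_level": burnout,
})
levels = ["Low", "Medium", "High"]
df["stress_bin"] = pd.cut(df["financial_stress"], bins=3, labels=levels)
df["support_bin"] = pd.cut(df["social_support"], bins=3, labels=levels)

# Rows: stress group; columns: support group; cells: mean burnout.
buffer_table = df.pivot_table(
    index="stress_bin",
    columns="support_bin",
    values="academic_burnout_level",
    aggfunc="mean",
    observed=True,
)

# (c) Moderation check: ordinary least squares with an interaction term.
X = np.column_stack([np.ones(n), stress, support, stress * support])
coef, *_ = np.linalg.lstsq(X, burnout, rcond=None)
interaction_coef = coef[3]  # negative => support weakens the stress slope
```

In this sketch, `buffer_table.loc["High", "Low"]` versus `buffer_table.loc["High", "High"]` mirrors the Figure 6 comparison, while a negative `interaction_coef` is what a buffering effect would look like in a regression on continuous scores.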
From a policy standpoint, this suggests that university interventions targeting burnout should focus not only on reducing stressors (e.g., financial aid, managing family expectations) but critically on **strengthening social support systems** (peer programs, counseling access, and community-building initiatives), particularly for students in high-pressure financial circumstances.

---

## 🛠️ Technical Stack

```
Language     : Python 3.12
Data         : pandas, numpy
Visualization: matplotlib, seaborn
Dataset Hub  : kagglehub
Environment  : Google Colab
```

| Library | Version | Purpose |
|---------|---------|---------|
| `pandas` | ≥ 2.0 | Data loading, cleaning, feature engineering |
| `seaborn` | ≥ 0.13 | Statistical visualizations (heatmap, barplots) |
| `matplotlib` | ≥ 3.7 | Plot rendering and layout management |
| `kagglehub` | ≥ 1.0 | Dataset download from Kaggle |
| `numpy` | ≥ 1.24 | Numerical operations |

---

## 📁 Repository Contents

```
📦 student-burnout-eda/
├── 📓 Assignment_1_EDA_Dataset-4.ipynb   # Full analysis notebook
├── 📄 README.md                          # This file
└── 🎥 presentation.mp4                   # 2–3 min walkthrough video
```

---

*Analysis conducted as part of a Data Science coursework assignment, March 2026.*
Provided by:
eliel2003