
University Student Academic Development and Risk Early-Warning Dataset

Alibaba Cloud Tianchi, updated 2026-06-09, indexed 2026-05-02
Download link:
https://tianchi.aliyun.com/dataset/225644
Resource description:
1. Dataset Overview
This dataset contains comprehensive academic records for four cohorts of university students who enrolled between 2022 and 2025; as of April 2026 they span the freshman through senior years. It goes beyond exam scores, combining college entrance examination (gaokao) results, on-campus academic performance, daily behavior and habits, mental health indicators, and economic and regional background. Its core value is that it exposes potential relationships between student background and academic outcomes and ties them directly to the key prediction target: the course-failure risk level.

2. Core Features and Dimensions
The dataset contains 49 dimensions, grouped into five core sections.

Profile & Background
Basic identity: student ID, gender, grade, enrollment year.
College and major: college (e.g., Information Engineering, Civil Engineering, School of Liberal Arts), major name, discipline type (science / engineering / humanities / comprehensive), class information.
Origin background: region of origin (East China, North China, etc.), origin type (urban / county or town / rural), only-child status, household income level, financially disadvantaged student status, college entrance examination score.

Academic Performance
Course scores: Mathematics, English, Politics, basic major courses, core major courses.
Composite indicators: previous-semester GPA, current GPA, class rank, rank fluctuation, credits earned, current credit completion rate.
Awards and sanctions: whether CET-4/6 was passed, whether a scholarship was received, whether there is a history of academic warnings.

Behavior & Habits
Learning engagement: perceived course difficulty, teacher rating, attendance rate, assignment completion rate, classroom participation, online learning time, number of library visits.
Daily routine: mobile phone usage time, exercise frequency, average sleep duration, spending regularity, sleep regularity.
Tutoring: whether the student has received tutoring (private tutoring or supplementary classes).

Psychological & Social
Psychological indicators: psychological assessment score, stress level, social interaction frequency.

Risk Label
Core label: course-failure risk level (low risk, medium risk, high risk).
Detail indicators: total number of courses failed to date, number of courses failed in the current semester.

3. Data Timeliness and Application Context
Time span: records cover 2022 through April 2026.
Current status: as of April 2026, the 2022 cohort is in the second semester of its senior year (graduating class) and the 2025 cohort is in its freshman year.
Application scenarios:
Academic early-warning model training: use data from the freshman through junior years to predict the course-failure risk level.
Educational data mining: analyze how family background (e.g., financially disadvantaged status, region of origin) affects academic results (GPA, rank).
Student profiling: study the correlation between mobile phone usage time, sleep regularity, and the scholarship award rate.
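The cohort-to-year mapping described under "Current status" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the academic year is assumed to start in September, the field names `student_id` and `enrollment_year` are hypothetical (the actual column names in the download are not given here), and the sample records are made up. The split function reflects one possible reading of the early-warning scenario, namely keeping students currently in years 1 to 3 for training.

```python
from datetime import date


def year_of_study(enrollment_year: int, as_of: date, term_start_month: int = 9) -> int:
    """Return 1 for freshman through 4 for senior, as of the given date.

    Assumption: the academic year starts in `term_start_month` (September).
    """
    # Before the start month, the running academic year began last calendar year.
    if as_of.month >= term_start_month:
        academic_year_start = as_of.year
    else:
        academic_year_start = as_of.year - 1
    return academic_year_start - enrollment_year + 1


def split_for_early_warning(records, as_of, max_training_year=3):
    """Separate students in years 1..max_training_year from later-year students."""
    train, holdout = [], []
    for record in records:
        year = year_of_study(record["enrollment_year"], as_of)
        (train if year <= max_training_year else holdout).append(record)
    return train, holdout


# Snapshot date taken from the description (April 2026).
SNAPSHOT = date(2026, 4, 1)

# Made-up records with hypothetical field names, for illustration only.
students = [
    {"student_id": "S001", "enrollment_year": 2022},
    {"student_id": "S002", "enrollment_year": 2023},
    {"student_id": "S003", "enrollment_year": 2024},
    {"student_id": "S004", "enrollment_year": 2025},
]

train, holdout = split_for_early_warning(students, SNAPSHOT)
```

With the snapshot above, the 2022 cohort maps to year 4 and the 2025 cohort to year 1, which matches the status stated in the description.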
Provider:
Alibaba Cloud Tianchi
Created:
2026-05-01
Collected Summary
Dataset Introduction
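Background and Challenges
Background Overview
This dataset brings together multi-dimensional academic records for university students who enrolled between 2022 and 2025, covering 49 features across five sections: student background, academic performance, behavior and habits, psychological and social indicators, and risk labels. The data spans 2022 through April 2026 and is suited to academic early-warning model training, educational data mining, and student profiling.
The content above was collected and summarized by 遇见数据集.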
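As a rough illustration of the educational data mining scenario, the sketch below compares mean GPA across origin types using only the Python standard library. The column names `source_region_type` and `current_gpa` are hypothetical stand-ins for whatever the downloaded file actually uses, and the inline rows are synthetic placeholders, not values from the dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Synthetic rows for illustration only; column names are hypothetical.
SAMPLE_CSV = """source_region_type,current_gpa
urban,3.2
rural,3.0
urban,3.6
county_town,2.8
rural,3.4
"""


def mean_gpa_by_group(rows, group_col, gpa_col):
    """Average the GPA column within each value of the grouping column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(float(row[gpa_col]))
    return {name: mean(values) for name, values in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = mean_gpa_by_group(rows, "source_region_type", "current_gpa")
```

For the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle. A difference in group means is only descriptive and does not by itself show that background causes the difference.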