Students suspicious behaviors detection dataset for AI-powered online exam proctoring
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/39xs8th543
Description:
Our research hypothesizes that student cheating during online exams can be accurately detected through multimodal analysis of visual behavioral cues captured via standard webcams. By combining facial movements, hand gestures, gaze tracking, head pose, and phone interaction data, AI-based proctoring systems can identify dishonest behavior. To validate this, we developed this dataset, specifically designed to support the training, testing, and benchmarking of machine learning models for automated and scalable online exam proctoring.
What the Data Shows
The dataset consists of 5,500 structured records, each representing a snapshot of a student’s behavior during an online exam. Each record includes 38 attributes extracted using computer vision techniques and classified into two categories [see Table 1]:
• Cheating behavior (label = 1)
• Non-cheating behavior (label = 0)
The class distribution is nearly balanced, with 2,619 cheating (47.6%) and 2,881 non-cheating (52.4%) instances, making it suitable for supervised binary classification tasks. The recorded features fall under the following categories:
• Face Detection: Captures face presence, count, bounding box, and key landmarks.
• Hand Tracking: Records hand count, positions, and object interaction status.
• Head Pose Estimation: Includes pitch, yaw, and roll angles indicating head orientation.
• Mobile Phone Detection: Indicates phone presence, location, and detection confidence.
• Eye Gaze Tracking: Tracks gaze direction, screen focus, gaze points, and pupil positions.
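The class counts quoted above can be turned into a quick sanity check. The sketch below uses only the two counts stated in the description; the function name `class_balance` is introduced here for illustration. It also derives the accuracy of a trivial classifier that always predicts the majority class, which is a useful floor when judging any model trained on this data.

```python
# Class counts as stated in the dataset description.
CHEATING = 2619      # label = 1
NON_CHEATING = 2881  # label = 0


def class_balance(pos: int, neg: int) -> dict:
    """Return class shares and the accuracy of always predicting the majority class."""
    total = pos + neg
    return {
        "total": total,
        "cheating_share": pos / total,
        "non_cheating_share": neg / total,
        "majority_baseline_accuracy": max(pos, neg) / total,
    }


balance = class_balance(CHEATING, NON_CHEATING)
print(balance["total"])                                # 5500
print(f"{balance['cheating_share']:.1%}")              # 47.6%
print(f"{balance['majority_baseline_accuracy']:.1%}")  # 52.4%
```

A model that reports accuracy close to 52.4% is therefore doing little better than always answering "non-cheating".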
How the Data Was Gathered
Data were collected in a controlled, simulated online exam environment using a standard webcam, and the features were extracted from the video frames by a set of computer vision modules. The system used:
• MediaPipe for real-time face and hand tracking.
• OpenCV for image processing and frame analysis.
• Custom models for gaze estimation, head pose, and mobile phone detection.
Notable Findings
Machine learning models such as Random Forest and XGBoost achieved high precision and recall on this dataset. Notably:
• Hand-object interactions and phone presence are key indicators of cheating.
• Head pose deviations and off-screen gaze also suggest suspicious behavior.
• Combining multiple behavioral cues enhances detection accuracy over single-modality approaches.
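For readers benchmarking their own models, the precision and recall mentioned above can be computed as in the minimal sketch below, using the dataset's label convention (1 = cheating, 0 = non-cheating). The labels in `y_true` and `y_pred` are toy values made up for illustration; they are not taken from the dataset and do not reproduce any reported result.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the positive (cheating) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example only: 8 hypothetical records, 4 of them labeled as cheating.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```

In a proctoring setting the two metrics carry different costs: low precision means honest students are flagged, while low recall means cheating goes unnoticed.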
How the Data Can Be Interpreted and Used
This dataset is designed for researchers, developers, and educators aiming to:
• Build AI-powered online proctoring systems
• Develop behavior recognition models for academic monitoring
• Benchmark cheating detection techniques in machine learning and computer vision
• Explore the ethical implications of surveillance technologies in education
Each record is fully anonymized, containing no raw images or personal identifiers, making it safe for public research use. The structured numerical format ensures compatibility with various machine learning libraries and tools.
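As a rough illustration of working with the structured format, the sketch below reads records with Python's standard `csv` module and separates the features from the label. It assumes the data is available as CSV with a column named `label`; the file layout, the column names (`head_yaw`, `phone_present`, `gaze_on_screen`), and the three sample rows are all hypothetical stand-ins, since the description does not list the 38 attribute names.

```python
import csv
import io

# Toy stand-in for the downloaded file. Column names and values are
# hypothetical; the real dataset has 38 attributes per record.
SAMPLE_CSV = """head_yaw,phone_present,gaze_on_screen,label
-3.2,0,1,0
27.5,1,0,1
4.1,0,1,0
"""


def load_records(handle, label_column="label"):
    """Split each row into a dict of numeric features and an integer label."""
    features, labels = [], []
    for row in csv.DictReader(handle):
        labels.append(int(row.pop(label_column)))
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


features, labels = load_records(io.StringIO(SAMPLE_CSV))
print(len(features), labels)  # 3 [0, 1, 0]
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle, and the column names adjusted to match the actual header.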
Created:
2025-07-08



