
Independent edge evaluation for directed acyclic graphs in probabilistic sequence modeling

Source: Mendeley Data (indexed 2026-05-21)
Download link:
https://data.mendeley.com/datasets/sn5r9g45rr
Description:
This repository contains the official replication package and data artifacts for the Step-Level Diagnostic Engine (SLDE) framework and the Independent Edge Evaluation (IEE) methodology, so that the empirical results presented in the manuscript can be fully reproduced. The dataset consists of three primary components structured to support process-oriented assessment space evaluation.

The first component is the ASSISTments 2009-2010 Dataset (stored in skill_builder_data.csv). This file contains the publicly available benchmark raw sequential tracking data from the ASSISTments 2009-2010 skill-builder corpus. To isolate multi-step cognitive structures, the replication script applies strict preprocessing rules directly to this raw file. For main tasks (original=1), records with null skill names or non-binary correctness are discarded. For sub-step scaffold rows (original=0), records are retained and programmatically inherit latent dependency labels from the parent main task via the assistment_id. The final run filters the data to a restricted cohort of exactly M = 669 unique learners, structured so that no temporal data leaks into downstream sequence modeling.

The second component is the Prerequisite Graph Topology (stored in assistments_prereq_edges.sql). This file provides the explicit graph-theoretic formalization of the assessment space used to evaluate cross-skill transfer. The scripts parse it directly with regular expressions. It contains a mapping of 110 unique Knowledge Components (KCs), directed prerequisite edges representing cognitive transitions, and a set of 857 pre-enumerated valid pedagogical paths ordered longest-first, which act as the ground-truth Directed Acyclic Graph (DAG) templates.

The third component is the Complete Source Code Package (stored in IEE_DAG_Final_Submit.ipynb). This Python replication notebook contains the complete implementation pipeline. It includes environment setup that pins sympy==1.13.1 to resolve native compatibility conflicts with PyTorch 2.x optimizers. It provides the exact mathematical architecture of the proposed IEE-BKT model (implementing localized evidence vectors and K-step sequential Bayesian updates), along with re-implemented baselines: Standard BKT, LSTM-based Deep Knowledge Tracing (DKT), and Self-Attentive Knowledge Tracing (SAKT). Finally, it executes a strict student-level 10-fold cross-validation routine to control for intra-student tracking bias.

Researchers can use these files to replicate the evaluation metrics, including paired t-test distributions, RMSE improvements, and ROC/AUC plots in which IEE-BKT reaches a baseline performance of AUC = 0.5514.
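
The student-level cross-validation mentioned above can be illustrated with a minimal sketch. It is not taken from the notebook: the function names `student_level_folds` and `split_rows`, the fixed seed, and the toy rows are all hypothetical, and the notebook may assign folds differently. The point it shows is that every learner is placed in exactly one fold, so no learner's interactions appear in both the training and the test split.

```python
import random


def student_level_folds(student_ids, n_folds=10, seed=0):
    """Map each unique student to exactly one fold (hypothetical helper)."""
    unique = sorted(set(student_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Round-robin assignment keeps fold sizes within one student of each other.
    return {sid: i % n_folds for i, sid in enumerate(unique)}


def split_rows(rows, fold_of, test_fold):
    """Split (student_id, payload) rows so a student's rows stay together."""
    train, test = [], []
    for row in rows:
        (test if fold_of[row[0]] == test_fold else train).append(row)
    return train, test


# Toy data (an assumption): 669 learners with three interactions each.
rows = [(sid, step) for sid in range(669) for step in range(3)]
fold_of = student_level_folds([r[0] for r in rows])
train, test = split_rows(rows, fold_of, test_fold=0)
```

With 669 learners and 10 folds, this round-robin scheme yields nine folds of 67 learners and one fold of 66. Splitting by learner rather than by row is what prevents a model from being evaluated on a student whose earlier responses it has already seen.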
Created:
2026-05-18