five

Table_1_Validation of automated scoring for learning progression-aligned Next Generation Science Standards performance assessments.DOCX

收藏
NIAID Data Ecosystem2026-03-14 收录
下载链接:
https://figshare.com/articles/dataset/Table_1_Validation_of_automated_scoring_for_learning_progression-aligned_Next_Generation_Science_Standards_performance_assessments_DOCX/21722345
下载链接
链接失效反馈
官方服务:
资源简介:
IntroductionThe Framework for K-12 Science Education promotes supporting the development of knowledge application skills along previously validated learning progressions (LPs). Effective assessment of knowledge application requires LP-aligned constructed-response (CR) assessments. But these assessments are time-consuming and expensive to score and provide feedback for. As part of artificial intelligence, machine learning (ML) presents an invaluable tool for conducting validation studies and providing immediate feedback. To fully evaluate the validity of machine-based scores, it is important to investigate human-machine score consistency beyond observed scores. Importantly, no formal studies have explored the nature of disagreements between human and machine-assigned scores as related to LP levels. MethodsWe used quantitative and qualitative approaches to investigate the nature of disagreements among human and scores generated by two approaches to machine learning using a previously validated assessment instrument aligned to LP for scientific argumentation. ResultsWe applied quantitative approaches, including agreement measures, confirmatory factor analysis, and generalizability studies, to identify items that represent threats to validity for different machine scoring approaches. This analysis allowed us to determine specific elements of argumentation practice at each level of the LP that are associated with a higher percentage of misscores by each of the scoring approaches. We further used qualitative analysis of the items identified by quantitative methods to examine the consistency between the misscores, the scoring rubrics, and student responses. We found that rubrics that require interpretation by human coders and items which target more sophisticated argumentation practice present the greatest threats to the validity of machine scores. DiscussionWe use this information to construct a fine-grained validity argument for machine scores, which is an important piece because it provides insights for improving the design of LP-aligned assessments and artificial intelligence-enabled scoring of those assessments.
创建时间:
2022-12-14
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作