Pitfalls in quantifying exploration in reward-based motor learning: simulated datasets

Source: DataCite Commons. Updated: 2025-07-02. Indexed: 2025-04-09.
Download link:
https://dataverse.nl/citation?persistentId=doi:10.34894/ANJOPR
Description:
When learning a movement based on binary success information, one is more variable following failure than following success. Theoretically, the additional variability post failure might reflect exploration of possibilities to obtain success. When average behavior is changing (as in learning), variability can be estimated from differences between subsequent movements. Can one estimate exploration reliably from such trial-to-trial changes when studying reward-based motor learning? To answer this question, we tried to reconstruct the exploration underlying learning as described by four existing reward-based motor learning models. We simulated learning for various learner and task characteristics. If we simply determined the additional change post failure, estimates of exploration were sensitive to learner and task characteristics. We identified two pitfalls in quantifying exploration based on trial-to-trial changes. Firstly, performance-dependent feedback can cause correlated samples of motor noise and exploration on successful trials, which biases exploration estimates. Secondly, the trial relative to which trial-to-trial change is calculated may also contain exploration, which causes underestimation. As a solution, we developed the additional trial-to-trial change (ATTC) method. By moving the reference trial one trial back and subtracting trial-to-trial changes following specific sequences of trial outcomes, exploration can be estimated reliably for the three models that explore based on the outcome of only the previous trial. Since ATTC estimates are based on a selection of trial sequences, this method requires many trials. In conclusion, if exploration is a binary function of previous trial outcome, the ATTC method allows for a model-free quantification of exploration.
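
To make the "simple" estimate mentioned in the description concrete, the following is a minimal sketch of how the additional trial-to-trial change after failure might be computed on simulated data. It is not the authors' code and it is not the ATTC method, whose exact trial-sequence selection is not specified in this record. The learner (`simulate_learner`), its parameters (`sigma_motor`, `sigma_explore`, `target_halfwidth`) and the estimator name (`naive_exploration_variance`) are hypothetical and chosen only for illustration; the learner does not update its aim, so only motor noise and post-failure exploration contribute to variability.

```python
import numpy as np


def simulate_learner(n_trials, sigma_motor, sigma_explore, target_halfwidth, rng):
    """Hypothetical learner (an assumption, not one of the four models in the dataset).

    Every trial contains Gaussian motor noise. After a failed trial the learner
    adds Gaussian exploration. Feedback is binary: success if the movement
    lands within target_halfwidth of zero.
    """
    x = np.zeros(n_trials)
    success = np.zeros(n_trials, dtype=bool)
    prev_success = True  # assume no exploration on the first trial
    for t in range(n_trials):
        explore = 0.0 if prev_success else rng.normal(0.0, sigma_explore)
        x[t] = explore + rng.normal(0.0, sigma_motor)
        success[t] = abs(x[t]) <= target_halfwidth
        prev_success = success[t]
    return x, success


def naive_exploration_variance(x, success):
    """Variance of trial-to-trial change after failure minus after success.

    change[t] = x[t + 1] - x[t] is grouped by the outcome of trial t.
    This is the simple estimate that the description reports as sensitive
    to learner and task characteristics.
    """
    change = np.diff(x)
    prev_outcome = success[:-1]
    after_failure = change[~prev_outcome]
    after_success = change[prev_outcome]
    return np.var(after_failure) - np.var(after_success)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, success = simulate_learner(
        n_trials=5000, sigma_motor=1.0, sigma_explore=2.0,
        target_halfwidth=1.0, rng=rng,
    )
    estimate = naive_exploration_variance(x, success)
    print("naive estimate of exploration variance:", estimate)
    print("simulated exploration variance:", 2.0 ** 2)
```

Comparing the printed estimate with the simulated exploration variance for different values of `target_halfwidth` or `sigma_motor` is one way to see the sensitivity the description refers to; the size and direction of any discrepancy depend on the assumed learner.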

Provider:
DataverseNL
Created:
2021-06-24