On the strategic learning of signal associations
Source: DataONE. Updated 2022-03-25; indexed 2025-05-31.
Download link:
https://search.dataone.org/view/sha256:5e13388c0a6486233b4f7c64c677f25fdbb802270b1ab765b3ed14487cf56347
Description:
Signal detection theory (SDT) has been widely used to identify the optimal response of a receiver to a stimulus that could have been generated by more than one signaller type. While SDT assumes that the receiver adopts the optimal response at the outset, in reality receivers often have to learn how to respond. We therefore recast a simple signal detection problem as a multi-armed bandit (MAB) in which inexperienced receivers choose between accepting a signaller (gaining information and an uncertain payoff) and rejecting it (gaining no information but a certain payoff). An exact solution to this exploration-exploitation dilemma can be identified by solving the relevant dynamic programming equation (DPE). However, to evaluate how the problem is solved in practice, we conducted an experiment. Here humans (n = 135) were repeatedly presented with four readily discriminable signaller types, some of which were on average profitable, and others unprofitable, to accept in the long term. We then comp...
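The accept/reject dilemma described above can be illustrated with a small dynamic program. The sketch below is not the authors' model: it assumes, purely for illustration, that accepting a signaller type yields `+benefit` with an unknown probability and `-cost` otherwise, that the receiver holds a Beta(a, b) belief about that probability, and that rejecting yields a fixed `reject_payoff` and leaves the belief unchanged. The names `make_receiver`, `value`, and `should_accept` are hypothetical.

```python
from functools import lru_cache


def make_receiver(benefit, cost, reject_payoff=0.0):
    """Build a finite-horizon solver for one signaller type.

    Assumed (illustrative) payoff model: accepting pays +benefit with an
    unknown probability and -cost otherwise; rejecting pays reject_payoff.
    State: Beta(a, b) belief counts and t, the number of trials left.
    """

    def accept_value(a, b, t):
        p = a / (a + b)  # posterior mean probability that accepting pays off
        # Accepting reveals the outcome, so the belief is updated either way.
        return (p * (benefit + value(a + 1, b, t - 1))
                + (1 - p) * (-cost + value(a, b + 1, t - 1)))

    def reject_value(a, b, t):
        # Rejecting gives a certain payoff and no information.
        return reject_payoff + value(a, b, t - 1)

    @lru_cache(maxsize=None)
    def value(a, b, t):
        """Expected total payoff under the optimal policy."""
        if t == 0:
            return 0.0
        return max(accept_value(a, b, t), reject_value(a, b, t))

    def should_accept(a, b, t):
        return accept_value(a, b, t) > reject_value(a, b, t)

    return value, should_accept


# Uniform prior Beta(1, 1); accepting is unprofitable in expectation at first.
value, should_accept = make_receiver(benefit=1.0, cost=1.2)
print(should_accept(1, 1, 1))  # False: with one trial left, sampling cannot pay off
print(should_accept(1, 1, 2))  # True: the information gained is worth the risk
```

Under these assumptions, a receiver with a single trial left rejects, because accepting has a negative expected payoff. With two trials left it accepts, since a good first outcome can be exploited on the second trial. This is the exploration-exploitation trade-off in its smallest form.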
Created:
2025-05-16



