Data for Gradient boosted decision trees reveal nuances of auditory discrimination behavior

Source: rdr.ucl.ac.uk. Updated: 2024-03-22. Indexed: 2025-03-23.
Download link:
https://rdr.ucl.ac.uk/articles/dataset/Data_for_Gradient_boosted_decision_trees_reveal_nuances_of_auditory_discrimination_behavior/25386565/1
Resource summary:
Raw data for the article: Gradient boosted decision trees reveal nuances of auditory discrimination behavior (PLOS Computational Biology).

This data repository contains the csv files produced by extracting the raw MATLAB metadata files into pandas (Python) dataframes (helper function author: Jules Lebert). The csv files can be loaded back into dataframe objects with pandas before the subsampling steps are run (as documented in the paper, we used subsampling to keep the numbers of F0-roved and control F0 trials roughly equal).

Link to the GitHub repository for running the models on this data: https://github.com/carlacodes/boostmodels

A full description of each variable in the dataframes can be found in data_description_instructions_for_datasets_plos_bio.pdf.

Abstract: Animal psychophysics can generate rich behavioral datasets, often comprising many thousands of trials for an individual subject. Gradient-boosted models are a promising machine learning approach for analyzing such data, partly because of the tools that allow users to gain insight into how the model makes predictions. We trained ferrets to report a target word's presence, timing, and lateralization within a stream of consecutively presented non-target words. To assess the animals' ability to generalize across pitch, we manipulated the fundamental frequency (F0) of the speech stimuli across trials, and to assess the contribution of pitch to streaming, we roved the F0 from word token to word token. We then implemented gradient-boosted regression and decision trees on the trial outcome and reaction time data to understand the behavioral factors behind the ferrets' decision-making. We visualized model contributions using SHAP feature importance and partial dependency plots.

While ferrets could accurately perform the task across all pitch-shifted conditions, our models reveal subtle effects of shifting F0 on performance, with within-trial pitch shifting elevating false alarms and extending reaction times. Our models identified a subset of non-target words to which the animals commonly false alarmed. Follow-up analysis demonstrated that the spectrotemporal similarity of target and non-target words, rather than similarity in duration or amplitude waveform, was the strongest predictor of the likelihood of false alarming. Finally, we compared the results with those obtained with traditional mixed effects models, revealing equivalent or better performance for the gradient-boosted models over these approaches.
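The load-then-subsample workflow mentioned in the description can be sketched as follows. This is only an illustration, not the authors' procedure: the column names (`trial`, `f0_roved`, `reaction_time`) and the inline csv text are hypothetical stand-ins, and the real variable names are documented in data_description_instructions_for_datasets_plos_bio.pdf. The sketch downsamples every condition to the size of the smallest one, which is one simple way to equalize trial counts; the paper's own subsampling may differ.

```python
import io

import pandas as pd


def balance_trials(df: pd.DataFrame, condition_col: str, seed: int = 0) -> pd.DataFrame:
    """Randomly downsample each condition to the size of the smallest one."""
    n_min = int(df[condition_col].value_counts().min())
    return (
        df.groupby(condition_col, group_keys=False)
        .sample(n=n_min, random_state=seed)
        .reset_index(drop=True)
    )


# Hypothetical stand-in for one of the repository's csv files:
# five control trials (f0_roved = 0) and three F0-roved trials (f0_roved = 1).
csv_text = """trial,f0_roved,reaction_time
1,0,0.41
2,0,0.52
3,1,0.63
4,0,0.47
5,1,0.71
6,0,0.39
7,1,0.58
8,0,0.44
"""

# With a real file this would be pd.read_csv("<path to csv>").
trials = pd.read_csv(io.StringIO(csv_text))
balanced = balance_trials(trials, "f0_roved")
print(balanced["f0_roved"].value_counts())
```

Fixing `seed` makes the subsample reproducible, so the same balanced set of trials is used each time the models are rerun.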

Provider:
rdr.ucl.ac.uk