Visual and audiovisual speech perception associated with increased functional connectivity between sensory and motor regions

Source: OpenNeuro. Updated 2021-07-05; indexed 2026-03-14.
Download link:
https://openneuro.org/datasets/ds003717
Project description
===================

In the current study we examined visual and audiovisual processing of single words in adult participants. Words were presented in quiet for auditory-only, visual-only, and audiovisual stimuli. Audiovisual words were also presented in background noise at signal-to-noise ratios (SNRs) of +5, 0, -5, and -10 dB. Supplemental materials (including stimuli, analysis scripts, extracted data, and figures) can be found at <https://osf.io/qxcu8/>.

## Materials

Seven lists of 50 words were created. The stimuli were recordings of a female actor’s head and shoulders speaking single words. The talker sat in front of a neutral background and spoke words along with the carrier phrase “Say the word _______” into the camera. The actor was instructed to allow her mouth to relax to a slightly open and neutral position before each target word was spoken. The edited versions of the recordings used in the experiment did not include the carrier phrase and were each 1.5 seconds long.

Recordings were made with a Canon Elura 85 digital video camera connected via an IEEE 1394 connection to a Dell Precision 670 for capture and digital storage. Digital capture and editing were done using Adobe Premiere Elements. The original capture format for the video was uncompressed .avi. The final versions used in the study were compressed as high-quality .wmv files. Audio was leveled using Adobe Audition so that each word had the same RMS amplitude. Conditions that included background noise used RMS-leveled six-talker babble that was mixed into the final version of the file.

The 350 recordings used in the study were selected from a corpus of 970 recordings of high-frequency words originally selected from the 40,481 words listed in the English Lexicon Project (Balota et al., 2007).
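The RMS leveling and SNR mixing described above can be illustrated with a short sketch. This is not the authors' procedure (they used Adobe Audition); it is a minimal pure-Python illustration that assumes SNR is defined on RMS amplitude, so that SNR in dB equals 20·log10(RMS of speech / RMS of noise). The function names, the synthetic sine "word", the Gaussian stand-in for babble, and the target RMS of 0.1 are all hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def scale_to_rms(samples, target_rms):
    """Apply a single gain so the signal has the requested RMS amplitude."""
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise to the requested SNR relative to the speech, then add.

    Assumes SNR (dB) = 20 * log10(rms(speech) / rms(noise)).
    Returns the mixture and the scaled noise (for verification).
    """
    noise_target_rms = rms(speech) / (10 ** (snr_db / 20))
    scaled_noise = scale_to_rms(noise, noise_target_rms)
    mixed = [s + n for s, n in zip(speech, scaled_noise)]
    return mixed, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS of the two components."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Synthetic stand-ins (hypothetical): a 220 Hz tone as the "word" and
# Gaussian noise in place of six-talker babble, one second at 16 kHz.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Level the "word" to a common RMS, as would be done for every recording.
speech = scale_to_rms(speech, 0.1)

for snr in (5, 0, -5, -10):
    mixed, scaled = mix_at_snr(speech, noise, snr)
    print(snr, round(measured_snr_db(speech, scaled), 6))
```

Scaling the noise rather than the speech keeps every word at the same RMS level across conditions, which matches the description of leveled target words with babble added on top.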
The words presented in the lipreading (V-only) or audiovisual (AV) conditions at varying signal-to-noise ratios (SNRs) were selected from the larger corpus based on V-only behavioral performance on each word from 149 participants (22–90 years old) who were tested using the entire corpus. The words selected ranged from 10%–93% correct in the lipreading-only behavioral tests. They were distributed among the six conditions that included visual information (AV in Quiet, AV +5 SNR, AV 0 SNR, AV -5 SNR, AV -10 SNR, and V-only) so that the conditions would be equivalent for lipreading difficulty. The words used in the A-only condition were selected from the remaining words.

## Participants

We collected data from 65 adults aged 18–34 years. Of these, we excluded fMRI data from 5 participants. The remaining 60 participants ranged in age from 18–34 years (M = 22.42, SD = 3.24). All were right-handed. Where available, pure-tone thresholds are given for the left and right ear (L_250 = threshold for left ear at 250 Hz in dB, and so on).

## Procedure

All participants completed a behavioral lipreading assessment and an MRI safety screening, and provided consent before being tested in the fMRI scanner. They were positioned in the scanner with insert earphones in place and a viewing mirror placed above the eyes to see a two-sided projection screen located at the head side of the scanner. Those who wore glasses were provided scanner-friendly lenses that fit their prescription. Participants were also given a response box that they held in a comfortable position on their torso during testing. Each sequence included trials with recordings of audio-only, visual-only, or audiovisual speech stimuli, or printed text, presented via an image projected on the screen that was visible to the participant through the viewing mirror. A camera positioned at the entrance to the scanner bore was used to monitor participant movement.
A well-being check and short conversation occurred between sequences and, if needed, participants were reminded to stay alert and asked to try to reduce movement during the session.

Seven sequences were presented during the session. Each one lasted approximately 5.5 minutes. The first six sequences contained 98 trials each. The stimuli were presented in blocks of five experimental trials plus two null trials for each condition. In total there were 14 blocks, giving 70 experimental trials plus 28 null trials. All trials included 800 ms of quiet without a visual presentation before the stimulus began. During the null trials participants were presented with a fixation cross for 1.5 seconds instead of the audiovisual presentation. The A-only condition was still in .wmv format but did not include visual stimuli; instead, a black screen was presented where video was presented in the other experimental conditions. The blocks were quasi-randomized so that no two blocks from the same condition were presented one after the other and two null trials never occurred one after another.

To keep attention high, half of the experimental trials required a response from the participant. For these trials a set of two dots appeared on the screen after the audiovisual/audio presentation. The right-side dot was green and the left-side dot was red. The participant was instructed to use the right-hand button on the response box to indicate “yes”, they were confident that they had identified the previous word, and the left-hand button if they felt they had not identified the previous word correctly.

After the initial six runs, a final run of 60 trials was presented that included only orthographic words on the screen. The items were the same 50 words used for the behavioral V-only assessment. Each word stayed on the screen for 2.5 seconds. The words were followed by two green dots that appeared for 2.5 seconds.
Participants were asked to say aloud the word that was presented during the period that the dots were on the screen. Ten null trials were randomly distributed throughout the sequence. Each null trial lasted 2.5 seconds and included a fixation cross on the screen.

## MRI data acquisition

MRI images were acquired on a Siemens Prisma 3T scanner using a 32-channel head coil. Structural images were acquired using a T1-weighted MPRAGE sequence (details) with a voxel size of 1 x 1 x 1 mm. Functional images were acquired using a multiband sequence (Feinberg et al., 2010) with an acceleration factor of 8. Each volume took 0.770 s to acquire. We used a sparse imaging paradigm (Edmister et al., 1999; Hall et al., 1999) with a repetition time of 3.07 s, leaving 2.3 s of silence on each trial. We presented words during this silent period and, on repeat blocks, instructed participants to speak during a silent period to minimize the influence of head motion on the data.
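The sparse-imaging timing and the trial counts for the first six sequences can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers quoted in this description; the constant names are hypothetical, and durations are held as integer milliseconds to avoid floating-point rounding.

```python
# Values taken from the description above, in integer milliseconds.
TR_MS = 3070                  # repetition time of the sparse paradigm
ACQUISITION_MS = 770          # time to acquire one volume
PRE_STIMULUS_QUIET_MS = 800   # quiet period before each stimulus
WORD_DURATION_MS = 1500       # length of each edited recording

# Block structure of each of the first six sequences.
BLOCKS_PER_RUN = 14
EXPERIMENTAL_PER_BLOCK = 5
NULL_PER_BLOCK = 2

# Silence left in each repetition once the volume has been acquired.
silent_gap_ms = TR_MS - ACQUISITION_MS

# Combined duration of the pre-stimulus quiet and one spoken word.
trial_content_ms = PRE_STIMULUS_QUIET_MS + WORD_DURATION_MS

experimental_trials = BLOCKS_PER_RUN * EXPERIMENTAL_PER_BLOCK
null_trials = BLOCKS_PER_RUN * NULL_PER_BLOCK
total_trials = experimental_trials + null_trials

print("silent gap (ms):", silent_gap_ms)
print("quiet + word (ms):", trial_content_ms)
print("experimental / null / total trials:",
      experimental_trials, null_trials, total_trials)
```

The 2.3 s silent gap follows directly from the repetition time minus the acquisition time, and the 800 ms of quiet plus a 1.5 s word add up to the same 2.3 s. The block arithmetic reproduces the 70 experimental and 28 null trials that make up the 98 trials per sequence.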
Created: 2021-07-05