
Preference of voice and performance style: Data of the listening test and acoustic analysis

Source: DataCite Commons. Updated: 2024-01-19. Indexed: 2024-08-19.
Download link:
https://figshare.com/articles/dataset/Preference_of_voice_and_performance_style_Data_of_the_listening_test_and_acoustic_analysis/25028036
Description:
This study sought to find out which style of radio advertisement performance listeners consider likable and which acoustic features differentiate likable performances from unlikable ones. The same speakers presented a gender-neutral pretend-advertisement in two styles: calm and energetic. Listeners rated the likability of the performances. The results showed that listener likability scores were consistent and did not depend on listener gender. The listeners overwhelmingly preferred advertisements presented in a calm style, regardless of the performer or the performer's age or gender. For each advertisement, 88 parameters of the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS) were calculated. Most of these significantly differentiated likable and unlikable performances. Likable performances were characterised by lower pitch, faster articulation rate, a quieter voice with no abrupt changes in loudness, and a breathy voice. The study showed the importance of determining which performance style listeners prefer, as the voice of the performer is directly affected by the performance style.
Listeners might like a voice in one style, but not the other.

List of eGeMAPS parameter abbreviations

A_1,2,3 = difference of log amplitude of first, second and third harmonic to f_0 amplitude
A_F1,F2,F3 = difference of log amplitude of first, second and third formant to f_0 amplitude
alpha ratio = ratio of the summed energy from 50–1000 Hz and 1–5 kHz
B_F1,2,3 = bandwidth of first, second and third formant
f_0 = logarithmic fundamental frequency on a semitone frequency scale, starting at 27.5 Hz (semitone 0)
F_1,2,3 = centre frequency of first, second and third formant
Hammarberg index = ratio of the strongest energy peak in the 0–2 kHz region to the strongest peak in the 2–5 kHz region
harmonic difference H_1–H_2 = ratio of energy of the first f_0 harmonic (H_1) to the energy of the second f_0 harmonic (H_2)
harmonic difference H_1–A_3 = ratio of energy of the first f_0 harmonic (H_1) to the energy of the highest harmonic in the third formant range (A_3)
HNR = harmonics-to-noise ratio
LEq = equivalent sound level, computed by converting the average of the per-frame RMS energies to a logarithmic (dB) scale
MFCC_1,2,3,4 = first, second, third and fourth Mel-frequency cepstral coefficient
loudness = estimate of perceived signal intensity from an auditory spectrum
pctl = percentile
pctlrg = range of the 20th to 80th percentile
SD_norm = standard deviation normalised by the arithmetic mean (coefficient of variation)
shimmer = difference of the peak amplitudes of consecutive f_0 periods
spectral flux = difference of the spectra of two consecutive frames
spectral slope 0–500 Hz or 500–1500 Hz = linear regression slope of the logarithmic power spectrum for 0–500 Hz or 500–1500 Hz region
UVR = unvoiced regions
VR = voiced regions
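Three of the definitions above (the semitone scale for f0, the normalised standard deviation, and the 20th to 80th percentile range) are simple formulas, and a short sketch may make them easier to check against the data files. The Python below is illustrative only and is not the code used to produce the dataset. The function names are hypothetical, and two details are assumptions: the population form of the standard deviation, and linear interpolation between ranks for percentiles. The toolkit that computed the eGeMAPS values may differ on both.

```python
import math
import statistics


def hz_to_semitones(f_hz, ref_hz=27.5):
    """Convert a frequency in Hz to semitones above ref_hz.

    With ref_hz = 27.5, a frequency of 27.5 Hz maps to semitone 0,
    matching the f0 definition in the abbreviation list.
    """
    return 12.0 * math.log2(f_hz / ref_hz)


def sd_norm(values):
    """Standard deviation divided by the arithmetic mean.

    Assumption: population standard deviation (divide by N).
    """
    return statistics.pstdev(values) / statistics.mean(values)


def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks.

    Assumption: this interpolation rule is one common choice among several.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile() needs at least one value")
    pos = (len(xs) - 1) * p / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def pctlrg(values, low=20, high=80):
    """Range between the 20th and 80th percentile of a contour."""
    return percentile(values, high) - percentile(values, low)
```

For example, `hz_to_semitones(55.0)` is one octave above the reference, so it evaluates to 12 semitones. A contour in Hz would be converted to semitones before computing `pctlrg` if the result is meant to be comparable with the f_0 range values described above.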
Provider:
figshare
Created:
2024-01-19