
PB2007 French acoustic-articulatory speech database

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6390597
Description:
PB2007 acoustic-articulatory speech dataset

Badin, P., Bailly, G., Ben Youssef, A., Elisei, F., Savariaux, C., Hueber, T.
Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France

LICENSE:
========
This dataset is made available under the Creative Commons Attribution Share-Alike (CC-BY-SA) license.

CREDITS - ATTRIBUTION:
======================
If you use this dataset, please cite one of the following studies (all of them use this dataset):

- Ben Youssef, A., Badin, P., Bailly, G. & Heracleous, P. (2009). Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. In Interspeech 2009, pp. 2255-2258. Brighton, UK.
- Ben Youssef, A., Badin, P. & Bailly, G. (2010). Can tongue be recovered from face? The answer of data-driven statistical models. In Interspeech 2010 (11th Annual Conference of the International Speech Communication Association) (T. Kobayashi, K. Hirose & S. Nakamura, editors), pp. 2002-2005. Makuhari, Japan.
- Hueber, T., Bailly, G., Badin, P. & Elisei, F. (2013). Speaker Adaptation of an Acoustic-Articulatory Inversion Model using Cascaded Gaussian Mixture Regressions. In Proceedings of Interspeech, pp. 2753-2757. Lyon, France.

DATA FILES DESCRIPTION:
=======================
/_seq/:
    Electromagnetic articulography (EMA) data, recorded at 100 Hz
    Sensors:
        PAR01 : LT_x (lower incisor, x coordinate)
        PAR02 : tip_x (tongue tip, x coordinate)
        PAR03 : mid_x (tongue dorsum, x coordinate)
        PAR04 : bck_x (tongue back, x coordinate)
        PAR05 : LL_vis_x (lower lips, x coordinate)
        PAR06 : UL_vis_x (upper lips, x coordinate)
        PAR07 : LT_z (lower incisor, z coordinate)
        PAR08 : tip_z (tongue tip, z coordinate)
        PAR09 : mid_z (tongue dorsum, z coordinate)
        PAR10 : bck_z (tongue back, z coordinate)
        PAR11 : LL_vis_z (lower lips, z coordinate)
        PAR12 : UL_vis_z (upper lips, z coordinate)
/_wav16:
    Subject audio signal, synchronized with the EMA data
    Format: PCM wav, 16 kHz, 16 bits
/_lab:
    Phonetic segmentation using the following symbol set:
    __ (long pause), _ (short pause), a, e^ (as in "lait"), e (as in "blé"), i, y (as in "voiture"), u (as in "loup"), o^ (as in "pomme"), x (as in "pneu"), x^ (as in "coeur"), a~ (as in "flan"), e~ (as in "in"), x~ (as in "un"), o~ (as in "mon"), p, t, k, f, s, s^ (as in "CHat"), b, d, g, v, z, z^ (as in "les Gens"), m, n, r, l, w, h, j, o, q (schwa)
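
The channel layout and sampling rates listed above can be captured in a few lines of Python. This is only a minimal sketch: the description does not specify the on-disk format of the files in /_seq/ or /_lab, so nothing here reads them. The helper names (`par_name`, `channel_index`, `sensor_xz`, `audio_span_for_frame`) are hypothetical, and the alignment assumes the EMA and audio streams start at the same instant, which the description implies ("synchronized") but does not state precisely.

```python
# Channel order as listed in the dataset description (PAR01 .. PAR12).
EMA_CHANNELS = [
    "LT_x", "tip_x", "mid_x", "bck_x", "LL_vis_x", "UL_vis_x",
    "LT_z", "tip_z", "mid_z", "bck_z", "LL_vis_z", "UL_vis_z",
]

EMA_RATE_HZ = 100       # EMA sampling rate from the description
AUDIO_RATE_HZ = 16_000  # audio sampling rate from the description
SAMPLES_PER_FRAME = AUDIO_RATE_HZ // EMA_RATE_HZ  # audio samples per EMA frame


def par_name(index):
    """Return the 'PARnn' label for a 0-based channel index."""
    return f"PAR{index + 1:02d}"


def channel_index(name):
    """Return the 0-based position of a channel such as 'tip_z'."""
    return EMA_CHANNELS.index(name)


def sensor_xz(frame, sensor):
    """Return (x, z) for one sensor from a 12-value EMA frame.

    `frame` is any indexable sequence ordered PAR01 .. PAR12;
    `sensor` is the name without the axis suffix, e.g. 'tip' or 'LL_vis'.
    """
    x = frame[channel_index(f"{sensor}_x")]
    z = frame[channel_index(f"{sensor}_z")]
    return x, z


def audio_span_for_frame(frame_idx):
    """Return the [start, stop) audio sample range for one EMA frame.

    Assumes both streams start at the same instant (an assumption,
    not something the description spells out).
    """
    start = frame_idx * SAMPLES_PER_FRAME
    return start, start + SAMPLES_PER_FRAME
```

With these rates, each EMA frame spans 160 audio samples, so acoustic features computed with a 10 ms hop line up one-to-one with the articulatory frames.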
Created:
2022-03-29