PB2007 French acoustic-articulatory speech database

Indexed from NIAID Data Ecosystem on 2026-03-13

Download link:

https://zenodo.org/record/6390597

Resource description:

PB2007 acoustic-articulatory speech dataset

Badin P., Bailly G., Ben Youssef A., Elisei F., Savariaux C., Hueber T.
Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France

LICENSE:
========
This dataset is made available under the Creative Commons Attribution Share-Alike (CC-BY-SA) license.

CREDITS - ATTRIBUTION:
======================
If using this dataset, please cite one of the following studies (all of them exploit this dataset):
- Ben Youssef, A., Badin, P., Bailly, G. & Heracleous, P. (2009). Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. In Interspeech 2009, pp. 2255-2258. Brighton, UK.
- Ben Youssef, A., Badin, P. & Bailly, G. (2010). Can tongue be recovered from face? The answer of data-driven statistical models. In Interspeech 2010 (11th Annual Conference of the International Speech Communication Association) (T. Kobayashi, K. Hirose & S. Nakamura, editors), pp. 2002-2005. Makuhari, Japan.
- Hueber, T., Bailly, G., Badin, P. & Elisei, F. (2013). Speaker Adaptation of an Acoustic-Articulatory Inversion Model using Cascaded Gaussian Mixture Regressions. In Interspeech 2013, pp. 2753-2757. Lyon, France.

DATA FILES DESCRIPTION:
=======================
/_seq/: Electromagnetic articulography (EMA) data, recorded at 100 Hz
Sensors:
PAR01 : LT_x (lower incisor, x coordinate)
PAR02 : tip_x (tongue tip, x coordinate)
PAR03 : mid_x (tongue dorsum, x coordinate)
PAR04 : bck_x (tongue back, x coordinate)
PAR05 : LL_vis_x (lower lip, x coordinate)
PAR06 : UL_vis_x (upper lip, x coordinate)
PAR07 : LT_z (lower incisor, z coordinate)
PAR08 : tip_z (tongue tip, z coordinate)
PAR09 : mid_z (tongue dorsum, z coordinate)
PAR10 : bck_z (tongue back, z coordinate)
PAR11 : LL_vis_z (lower lip, z coordinate)
PAR12 : UL_vis_z (upper lip, z coordinate)

/_wav16: subject audio signal, synchronized with the EMA data
Format: PCM wav, 16 kHz, 16 bits

/_lab: phonetic segmentation using the following symbol set:
Pauses: __ (long pause), _ (short pause)
Oral vowels: a, e^ (as in "lait"), e (as in "blé"), i, y (as in "voiture"), u (as in "loup"), o^ (as in "pomme"), x (as in "pneu"), x^ (as in "coeur"), o, q (schwa)
Nasal vowels: a~ (as in "flan"), e~ (as in "in"), x~ (as in "un"), o~ (as in "mon")
Consonants and glides: p, t, k, f, s, s^ (as in "CHat"), b, d, g, v, z, z^ (as in "les Gens"), m, n, r, l, w, h, j
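
The channel layout and sampling rates described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the on-disk format of the /_seq and /_lab files is not documented here, so file loading is left out, and a frame is assumed to be a sequence of 12 numbers in PAR01..PAR12 order. The names `EMA_CHANNELS`, `frame_to_coils`, and `frame_to_sample_range` are hypothetical helpers, not part of the dataset. The audio and EMA streams are assumed to start at the same instant, as the "synchronized" note suggests.

```python
# Channel order as listed in the dataset description (PAR01..PAR12).
EMA_CHANNELS = [
    "LT_x", "tip_x", "mid_x", "bck_x", "LL_vis_x", "UL_vis_x",
    "LT_z", "tip_z", "mid_z", "bck_z", "LL_vis_z", "UL_vis_z",
]

EMA_RATE_HZ = 100        # EMA sampling rate stated in the description
AUDIO_RATE_HZ = 16000    # audio sampling rate stated in the description
SAMPLES_PER_FRAME = AUDIO_RATE_HZ // EMA_RATE_HZ  # audio samples per EMA frame


def frame_to_coils(frame):
    """Group one 12-value EMA frame into {coil_name: (x, z)}.

    Assumes `frame` holds the values in PAR01..PAR12 order.
    """
    if len(frame) != len(EMA_CHANNELS):
        raise ValueError(f"expected {len(EMA_CHANNELS)} values, got {len(frame)}")
    values = dict(zip(EMA_CHANNELS, frame))
    coils = {}
    for name in EMA_CHANNELS[:6]:      # the six *_x channels name the coils
        coil = name[:-2]               # strip the trailing "_x"
        coils[coil] = (values[coil + "_x"], values[coil + "_z"])
    return coils


def frame_to_sample_range(frame_index):
    """Return the [start, stop) audio sample range covered by one EMA frame.

    Assumes both streams start at time zero.
    """
    start = frame_index * SAMPLES_PER_FRAME
    return start, start + SAMPLES_PER_FRAME
```

For example, `frame_to_sample_range(100)` gives `(16000, 16160)`, the audio samples for the EMA frame at the one-second mark. Grouping channels per coil keeps the x and z coordinates of each sensor together, which is convenient when plotting midsagittal trajectories.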

Created:

2022-03-29
