The Grid Audio-Visual Speech Corpus

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://zenodo.org/record/3625686


Description:

The Grid Corpus is a large multitalker audiovisual sentence corpus designed to support joint computational-behavioral studies in speech perception. In brief, the corpus consists of high-quality audio and video (facial) recordings of 1000 sentences spoken by each of 34 talkers (18 male, 16 female), for a total of 34,000 sentences. Sentences are of the form "put red at G9 now".

The dataset contains the following files:

audio_25k.zip: the utterances in WAV format at a 25 kHz sampling rate, in a separate directory per talker.

alignments.zip: word-level time alignments, again separated by talker.

s1.zip, s2.zip, etc.: .jpg videos for each talker. Note that, due to an oversight, no video for talker t21 is available.

The Grid Corpus is described in detail in the paper jasagrid.pdf, which is included in the dataset.
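As a quick sanity check after downloading, the per-talker layout described above can be verified by counting the audio files in each talker directory. The following is a minimal sketch, not part of the dataset: the function name `count_utterances` and the extraction path `audio_25k` are assumptions, and the code only relies on each .wav file sitting in a directory that belongs to one talker.

```python
from collections import Counter
from pathlib import Path


def count_utterances(audio_root):
    """Return {directory name: number of .wav files} for an extracted audio tree.

    Assumption: every .wav file lives directly inside its talker's directory,
    so the parent directory name identifies the talker.
    """
    root = Path(audio_root)
    counts = Counter()
    for wav_path in root.rglob("*.wav"):
        counts[wav_path.parent.name] += 1
    return dict(counts)


if __name__ == "__main__":
    # "audio_25k" is a hypothetical path to the extracted audio_25k.zip.
    for talker, n in sorted(count_utterances("audio_25k").items()):
        print(talker, n)
```

If the download is complete and the layout matches the description, this should list 34 talker directories with 1000 utterances each; a different result suggests an incomplete extraction or a different directory layout.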

Created:

2024-07-22
