
ATIS0 Read

Source: DataCite Commons. Updated 2021-07-01. Indexed 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC93S4B-2
Description:
LDC93S4A - Complete ATIS0 corpus (http://catalog.ldc.upenn.edu/LDC93S4A)
LDC93S4B - ATIS0 Pilot (http://catalog.ldc.upenn.edu/LDC93S4B)
LDC93S4B-2 - ATIS0 Read
LDC93S4B-3 - ATIS0 SD-Read (http://catalog.ldc.upenn.edu/LDC93S4B-3)

The ATIS0 Corpus totals six CD-ROMs: one with spontaneous data from 36 speakers; one with read versions of the data from 20 of those speakers, along with some adaptation material; and four with extensive speaker-dependent material from the ATIS domain, read by ten of the same speakers.

All ATIS speech data is recorded at a 16 kHz sample rate with 16-bit quantization, from two different microphones: a close-talking model (Sennheiser HMD414) and a desk-top model (Crown PCC-160).

The first disc (ATIS0 Pilot) contains spontaneous utterances elicited in a "Wizard-of-Oz" simulation, along with the relational database containing the travel information (excluding connecting flights). 36 speakers produced a total of 912 utterances.

The second disc (ATIS0 Read) contains "read" versions of the spontaneous utterances for 20 of the 36 speakers above, for a total of 478 productions. This is supplemented by a set of 40 "adaptation" sentences read by each of the 20 speakers.

The third through the sixth discs (ATIS0 SD-Read) contain "read" speech in the ATIS domain for ten of the speakers on the first disc. They read a total of 3,171 utterances, or approximately 317 utterances per speaker. This data was collected for the purpose of training speaker-dependent speech recognition systems for the ATIS0 domain. Two of these four discs contain the close-talking (Sennheiser) microphone data and the other two contain corresponding data for the desk-top (Crown PCC-160) microphone. Thus there are 6,342 waveform files on the four discs.

Portions © 1993 Trustees of the University of Pennsylvania
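The counts and recording parameters quoted in the description can be cross-checked with a little arithmetic. The sketch below is only illustrative: the variable names are made up for this example, and the per-second data rate assumes uncompressed, single-channel PCM per microphone, which the description does not state.

```python
# Figures quoted in the corpus description.
SAMPLE_RATE_HZ = 16_000               # 16 kHz sample rate
BITS_PER_SAMPLE = 16                  # 16-bit quantization
MICROPHONES = 2                       # Sennheiser HMD414 and Crown PCC-160
SD_READ_UTTERANCES = 3_171            # ATIS0 SD-Read, all ten speakers
SD_READ_SPEAKERS = 10
READ_SPEAKERS = 20                    # ATIS0 Read
ADAPTATION_SENTENCES_PER_SPEAKER = 40

# Assumption: uncompressed, single-channel PCM for each microphone.
bytes_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE // 8

# One waveform file per utterance per microphone on the SD-Read discs.
sd_read_waveforms = SD_READ_UTTERANCES * MICROPHONES

# Average number of SD-Read utterances per speaker.
utterances_per_speaker = SD_READ_UTTERANCES / SD_READ_SPEAKERS

# Adaptation sentence readings on the ATIS0 Read disc (readings, not files).
adaptation_readings = READ_SPEAKERS * ADAPTATION_SENTENCES_PER_SPEAKER

print(bytes_per_second, sd_read_waveforms, utterances_per_speaker, adaptation_readings)
```

Doubling the 3,171 SD-Read utterances for the two microphones gives the 6,342 waveform files stated in the description, and dividing by ten speakers gives the quoted figure of approximately 317 utterances per speaker.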
Provider:
Linguistic Data Consortium
Created:
2020-11-30