Samrómur Icelandic Speech 1.0
Source: DataCite Commons. Updated: 2022-05-16. Indexed: 2024-07-13.
Download link:
https://catalog.ldc.upenn.edu/LDC2022S05
Resource description:
<h3>Introduction</h3><br>
<p>Samrómur Icelandic Speech 1.0 was developed by the <a href="https://lvl.ru.is/">Language and Voice Lab, Reykjavik University</a> in cooperation with <a href="https://almannaromur.is/">Almannarómur, Center for Language Technology</a>. The corpus contains 145 hours of prompted Icelandic speech, comprising 100,000 utterances from 8,392 speakers.</p><br>
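<p>For a rough sense of scale, the three figures quoted above imply some simple averages. The sketch below only divides those figures; the hour count is rounded in the description, so the results are approximate, and the variable names are illustrative rather than part of the corpus.</p><br>

```python
# Figures quoted in the corpus description (the hour count is rounded).
TOTAL_HOURS = 145
SPEAKERS = 8_392
UTTERANCES = 100_000

# Average length of one utterance, in seconds.
avg_utterance_seconds = TOTAL_HOURS * 3600 / UTTERANCES

# Average number of utterances contributed by one speaker.
avg_utterances_per_speaker = UTTERANCES / SPEAKERS

# Average amount of audio contributed by one speaker, in minutes.
avg_minutes_per_speaker = TOTAL_HOURS * 60 / SPEAKERS

print(f"{avg_utterance_seconds:.2f} s per utterance")
print(f"{avg_utterances_per_speaker:.1f} utterances per speaker")
print(f"{avg_minutes_per_speaker:.2f} min of audio per speaker")
```

<p>These are means only; the description does not say how utterances are distributed across speakers.</p><br>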
<p>This version 1.0 is equivalent to "Samrómur Icelandic Speech 21.05" as used by the Language Technology Programme for Icelandic 2019-2023.</p><br>
<h3>Data</h3><br>
<p>Speech data was collected between October 2019 and May 2021 using the <a href="https://samromur.is">Samrómur website</a>, which displayed prompts to participants. The prompts were mainly from <a href="http://clarin.is/en/resources/gigaword">The Icelandic Gigaword Corpus</a>, which includes text from novels, news, plays, and from a list of location names in Iceland. Additional prompts were taken from <a href="https://www.visindavefur.is/">the Icelandic Web of Science</a>, and others were created by combining a name followed by a question or a demand. Prompts and speaker metadata are included in the corpus.</p><br>
<p>The audio data is divided into train, dev, and test sets and is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.</p><br>
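<p>The stated audio format can be checked per file by reading the FLAC STREAMINFO header. The following is a minimal sketch using only the Python standard library; the function names and the <code>EXPECTED</code> table are illustrative assumptions, not part of the corpus or of any LDC tooling, and the parser assumes the file starts directly with the <code>fLaC</code> marker (no prepended tags).</p><br>

```python
# Hypothetical helper names; only the Python standard library is used.
EXPECTED = {"sample_rate": 16000, "channels": 1, "bits_per_sample": 16}


def read_flac_streaminfo(data: bytes) -> dict:
    """Parse the STREAMINFO block from the first 42 bytes of a FLAC stream."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    if len(data) < 42:
        raise ValueError("need at least 42 bytes to read STREAMINFO")
    # Metadata block header: 1 flag bit + 7-bit type, then a 24-bit length.
    block_type = data[4] & 0x7F
    block_len = int.from_bytes(data[5:8], "big")
    if block_type != 0 or block_len < 34:
        raise ValueError("first metadata block is not STREAMINFO")
    body = data[8:42]
    # Bytes 10..17 pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def matches_corpus_format(info: dict) -> bool:
    """True if the header agrees with the format stated in the description."""
    return all(info[key] == value for key, value in EXPECTED.items())


def check_file(path: str) -> bool:
    """Read just the header of one file and compare it with EXPECTED."""
    with open(path, "rb") as handle:
        return matches_corpus_format(read_flac_streaminfo(handle.read(42)))
```

<p>Calling <code>check_file</code> on each file of the train, dev, and test sets would flag any file whose header disagrees with the description. Dividing <code>total_samples</code> by <code>sample_rate</code> gives the duration in seconds when the encoder recorded the sample count.</p><br>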
<h3>Samples</h3><br>
<p>Please view this <a href="desc/addenda/LDC2022S05.flac">audio sample (FLAC)</a>.</p><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2022 Reykjavik University, © 2022 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2022-04-26



