Samrómur Children Icelandic Speech 1.0
Source: DataCite Commons. Updated 2025-05-06. Indexed 2024-07-13.
Download link:
https://catalog.ldc.upenn.edu/LDC2022S11
Resource description:
<h3>Introduction</h3>
<p>Samrómur Children Icelandic Speech 1.0 was developed by the <a href="https://lvl.ru.is/">Language and Voice Lab, Reykjavik University</a> in cooperation with <a href="https://almannaromur.is/">Almannarómur, Center for Language Technology</a>. The corpus contains 131 hours of prompted Icelandic speech from 3,175 child speakers (aged 4-17 years), totaling 137,597 utterances.</p>
<p>This version 1.0 is equivalent to "Samrómur Children Icelandic Speech 21.09" as used by the Language Technology Programme for Icelandic 2019-2023.</p>
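<p>The headline figures above imply some rough per-utterance and per-speaker averages. The following is a minimal sketch of that arithmetic; because the 131-hour total is itself a rounded figure, the results are approximations, and the variable names are illustrative rather than taken from the corpus documentation.</p>

```python
# Headline figures quoted in the corpus description.
TOTAL_HOURS = 131          # rounded total duration
NUM_SPEAKERS = 3_175
NUM_UTTERANCES = 137_597

# Derived averages (approximate, since TOTAL_HOURS is rounded).
mean_utterance_seconds = TOTAL_HOURS * 3600 / NUM_UTTERANCES
utterances_per_speaker = NUM_UTTERANCES / NUM_SPEAKERS
minutes_per_speaker = TOTAL_HOURS * 60 / NUM_SPEAKERS

print(f"mean utterance length: {mean_utterance_seconds:.2f} s")
print(f"utterances per speaker: {utterances_per_speaker:.1f}")
print(f"audio per speaker:      {minutes_per_speaker:.2f} min")
```

<p>These are corpus-wide means only; the description does not state how utterances are distributed across speakers or age groups.</p>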
<h3>Data</h3>
<p>Speech data was collected between October 2019 and September 2021 using the <a href="https://samromur.is">Samrómur website</a>, which displayed prompts to participants. The prompts were mainly from <a href="http://clarin.is/en/resources/gigaword">The Icelandic Gigaword Corpus</a>, which includes text from novels, news, plays, and from a list of location names in Iceland. Additional prompts were taken from <a href="https://www.visindavefur.is/">the Icelandic Web of Science</a>, and others were created by following a name with a question or a demand. Prompts and speaker metadata are included in the corpus.</p>
<p>The audio data is divided into train, dev, and test sets and is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.</p>
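<p>One way to confirm that a downloaded file matches the stated format is to read the STREAMINFO block that opens every FLAC stream. The sketch below uses only the Python standard library and assumes the file begins directly with the <code>fLaC</code> marker (no prepended ID3 tag); the function names and the <code>check_file</code> helper are hypothetical and not part of the corpus tooling.</p>

```python
def read_flac_streaminfo(data: bytes) -> dict:
    """Parse the STREAMINFO block from the first 42 bytes of a FLAC stream."""
    if len(data) < 42 or data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing fLaC marker or too short)")
    block_type = data[4] & 0x7F                   # low 7 bits; 0 means STREAMINFO
    block_len = int.from_bytes(data[5:8], "big")  # 24-bit block length
    if block_type != 0 or block_len != 34:
        raise ValueError("first metadata block is not a 34-byte STREAMINFO")
    body = data[8:42]
    # Bytes 10..17 of the block pack four fields into 64 bits:
    # sample rate (20) | channels - 1 (3) | bits per sample - 1 (5) | total samples (36)
    packed = int.from_bytes(body[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def matches_corpus_format(info: dict) -> bool:
    """True if the stream is single-channel, 16 kHz, 16-bit, as the description states."""
    return (
        info["sample_rate"] == 16000
        and info["channels"] == 1
        and info["bits_per_sample"] == 16
    )


def check_file(path: str) -> bool:
    """Hypothetical helper: check one file on disk against the stated format."""
    with open(path, "rb") as f:
        return matches_corpus_format(read_flac_streaminfo(f.read(42)))
```

<p>Calling <code>check_file</code> on each file in the train, dev, and test directories would flag any recording that deviates from the stated format; dividing <code>total_samples</code> by <code>sample_rate</code> also gives the duration in seconds.</p>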
<h3>Samples</h3>
<p>Please listen to this <a href="desc/addenda/LDC2022S11.flac">audio sample (FLAC)</a>.</p>
<h3>Updates</h3>
<p>None at this time.</p>
Provider:
Linguistic Data Consortium
Created:
2022-11-14



