
Samrómur Children Icelandic Speech 1.0

DataCite Commons. Updated 2025-05-06; indexed 2024-07-13.
Download link:
https://catalog.ldc.upenn.edu/LDC2022S11
Resource description:
<h3>Introduction</h3> <p>Samrómur Children Icelandic Speech 1.0 was developed by the <a href="https://lvl.ru.is/">Language and Voice Lab, Reykjavik University</a> in cooperation with <a href="https://almannaromur.is/">Almannarómur, Center for Language Technology</a>. The corpus contains 131 hours of prompted Icelandic speech from 3,175 speakers (children aged 4-17 years), comprising 137,597 utterances.</p> <p>Version 1.0 is equivalent to "Samrómur Children Icelandic Speech 21.09" as used by the Language Technology Programme for Icelandic 2019-2023.</p> <h3>Data</h3> <p>Speech data was collected between October 2019 and September 2021 using the <a href="https://samromur.is">Samrómur website</a>, which displayed prompts to participants. The prompts were mainly from <a href="http://clarin.is/en/resources/gigaword">The Icelandic Gigaword Corpus</a>, which includes text from novels, news, plays, and from a list of location names in Iceland. Additional prompts were taken from <a href="https://www.visindavefur.is/">the Icelandic Web of Science</a>, and others were created by combining a name with a following question or demand. Prompts and speaker metadata are included in the corpus.</p> <p>The audio data is divided into train, dev, and test sets and is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.</p> <h3>Samples</h3> <p>Please listen to this <a href="desc/addenda/LDC2022S11.flac">audio sample (FLAC)</a>.</p> <h3>Updates</h3> <p>None at this time.</p>
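The headline figures in the description can be turned into a few rough per-utterance and per-speaker averages, which may help when budgeting storage or training time. The sketch below is only back-of-the-envelope arithmetic on the quoted numbers. The 131-hour total is rounded, so the results are approximate, and the function and key names are illustrative rather than part of the corpus.

```python
# Figures quoted in the corpus description.
TOTAL_HOURS = 131        # rounded in the description, so results are approximate
NUM_SPEAKERS = 3_175
NUM_UTTERANCES = 137_597
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2     # 16-bit linear PCM
CHANNELS = 1             # single channel


def corpus_summary():
    """Return rough averages derived from the quoted corpus statistics."""
    total_seconds = TOTAL_HOURS * 3600
    # Size of the audio once decoded from FLAC to raw PCM (decimal gigabytes).
    pcm_bytes = total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return {
        "mean_utterance_seconds": total_seconds / NUM_UTTERANCES,
        "mean_utterances_per_speaker": NUM_UTTERANCES / NUM_SPEAKERS,
        "mean_minutes_per_speaker": TOTAL_HOURS * 60 / NUM_SPEAKERS,
        "decoded_pcm_gigabytes": pcm_bytes / 1e9,
    }


if __name__ == "__main__":
    for name, value in corpus_summary().items():
        print(f"{name}: {value:.2f}")
```

These are averages only. The description does not say how utterances or speakers are distributed across the train, dev, and test sets, so per-split figures have to come from the corpus metadata.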
Provider:
Linguistic Data Consortium
Created:
2022-11-14