
Samrómur Icelandic Speech 1.0

DataCite Commons | Updated 2022-05-16 | Indexed 2024-07-13
Download link:
https://catalog.ldc.upenn.edu/LDC2022S05
Description:
<h3>Introduction</h3><br>
<p>Samr&oacute;mur Icelandic Speech 1.0 was developed by the <a href="https://lvl.ru.is/">Language and Voice Lab, Reykjavik University</a> in cooperation with <a href="https://almannaromur.is/">Almannar&oacute;mur, Center for Language Technology</a>. The corpus contains 145 hours of prompted Icelandic speech: 100,000 utterances from 8,392 speakers.</p><br>
<p>This version 1.0 is equivalent to "Samr&oacute;mur Icelandic Speech 21.05" as used by the Language Technology Programme for Icelandic 2019-2023.</p><br>
<h3>Data</h3><br>
<p>Speech data was collected between October 2019 and May 2021 using the <a href="https://samromur.is">Samr&oacute;mur website</a>, which displayed prompts to participants. The prompts were drawn mainly from <a href="http://clarin.is/en/resources/gigaword">The Icelandic Gigaword Corpus</a>, which includes text from novels, news, and plays, and from a list of location names in Iceland. Additional prompts were taken from <a href="https://www.visindavefur.is/">the Icelandic Web of Science</a>, and others were created by combining a name with a following question or demand. Prompts and speaker metadata are included in the corpus.</p><br>
<p>The audio data is divided into train, dev, and test sets and is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.</p><br>
<h3>Samples</h3><br>
<p>Please view this <a href="desc/addenda/LDC2022S05.flac">audio sample (FLAC)</a>.</p><br>
<h3>Updates</h3><br>
<p>None at this time.</p><br>
Portions © 2022 Reykjavik University, © 2022 Trustees of the University of Pennsylvania
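The figures quoted in the description (145 hours, 100,000 utterances, 8,392 speakers, 16 kHz 16-bit mono) allow a rough size estimate. The sketch below is only back-of-the-envelope arithmetic on those rounded numbers. The function name `corpus_stats` and the choice to report decoded PCM size in GiB are illustrative assumptions, not part of the corpus documentation, and the actual FLAC download will be smaller than the decoded size.

```python
# Rounded figures taken from the corpus description above.
HOURS = 145
SPEAKERS = 8_392
UTTERANCES = 100_000
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit linear PCM
CHANNELS = 1          # single channel


def corpus_stats():
    """Return approximate averages and decoded size (hypothetical helper)."""
    total_seconds = HOURS * 3600
    pcm_bytes = total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return {
        "avg_utterance_seconds": total_seconds / UTTERANCES,
        "avg_utterances_per_speaker": UTTERANCES / SPEAKERS,
        # Raw PCM after decoding, excluding any container headers.
        "decoded_pcm_gib": pcm_bytes / 2**30,
    }


if __name__ == "__main__":
    for name, value in corpus_stats().items():
        print(f"{name}: {value:.2f}")
```

On these rounded inputs, an average utterance lasts a little over five seconds and each speaker contributes about a dozen utterances on average. The real per-speaker distribution is likely uneven, so the mean is a coarse guide only.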
Provider:
Linguistic Data Consortium
Created:
2022-04-26