
Ravnursson Faroese Speech and Transcripts

Source: DataCite Commons (updated 2025-06-03, indexed 2025-04-16)
Download link:
https://catalog.ldc.upenn.edu/LDC2024S09
Resource description:
<h3>Introduction</h3> <p>Ravnursson Faroese Speech and Transcripts contains 109 hours of prompted Faroese speech from 433 speakers (249 female, 184 male), along with corresponding transcripts and speaker metadata. It is an extract from the <a href="https://mtd.setur.fo/en/resource/ravnur-blark-1-0/">Basic Language Resource Kit 1.0 (BLARK 1.0)</a> developed by the Faroe Islands' <a href="https://mtd.setur.fo/en/">Ravnur Project</a>.</p> <h3>Data</h3> <p>Speech data was collected in 2022. Speakers from all major dialect areas in the Faroe Islands, in three age groups (15-35, 36-60, and 61+ years), read texts that included a word list, a phrase list, closed-vocabulary readings, and short texts. Recordings also contain spontaneous speech.</p> <p>TASCAM DR-40 Linear PCM audio recorders captured the speech at 48 kHz; the audio was downsampled to 16 kHz for this corpus. The audio data is divided into train, development, and test sets and is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.</p> <p>Recordings were orthographically transcribed and time-stamped. Transcripts and speaker metadata are included in a tab-separated file.</p> <h3>Samples</h3> <p>Please view this <a href="desc/addenda/LDC2024S09.tsv" target="_blank" rel="noopener">metadata sample (TSV)</a> and <a href="desc/addenda/LDC2024S09.flac">audio sample (FLAC)</a>.</p> <h3>Updates</h3> <p>None at this time.</p>
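As a rough illustration of what the stated audio format implies, the sketch below works out the downsampling factor and the approximate size of the decoded (uncompressed) PCM audio. It uses only figures from the description above; the 109-hour total is rounded, so the result is an estimate, and the FLAC files on disk will be smaller than the decoded size. The variable names are illustrative and not part of the corpus.

```python
# Figures taken from the corpus description; TOTAL_HOURS is a rounded total.
CAPTURE_RATE_HZ = 48_000   # rate at which the recorders captured speech
CORPUS_RATE_HZ = 16_000    # rate of the distributed audio
BITS_PER_SAMPLE = 16       # 16-bit linear PCM
CHANNELS = 1               # single channel
TOTAL_HOURS = 109

# Ratio between the capture rate and the distributed rate.
downsample_factor = CAPTURE_RATE_HZ // CORPUS_RATE_HZ

# Bytes needed per second of decoded audio.
bytes_per_second = CORPUS_RATE_HZ * (BITS_PER_SAMPLE // 8) * CHANNELS

# Approximate decoded size of the whole corpus, before FLAC compression.
pcm_bytes = TOTAL_HOURS * 3600 * bytes_per_second

print(f"downsampling factor: {downsample_factor}")
print(f"decoded PCM size: {pcm_bytes / 1e9:.2f} GB (estimate)")
```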
Provider:
Linguistic Data Consortium
Created:
2024-08-15