Blizzard Challenge 2025 - Training Material

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14792456
Description:
Blizzard Challenge 2025 - Participant Information

The 2025 edition of the Blizzard Challenge focuses on synthesizing speech for Bildts. Bildts (Indo-European > West Germanic) is a unique language variety spoken in the north of the Netherlands, specifically in the Bildt region of Friesland, one of the country’s twelve provinces. It represents a living example of language diversity in Europe, with its own distinct characteristics and rich cultural heritage. It has about 10,000 speakers, who acquired it as a first or second language. The language has been documented through grammatical resources, dictionaries, literature, and media, including weekly radio broadcasts, theater productions, and regular newspaper columns.

This choice for the Blizzard Challenge 2025 aligns with our theme "Scaling down: sustainable synthesis for language diversity": it provides an opportunity to advance speech synthesis capabilities for languages beyond the usual major languages while working with carefully curated but naturally limited data resources.

Material

For this challenge, we provide around 7h of speech from one male speaker (Jan de Groot from Omrop Fryslân):

- WAV files (44.1kHz, 16 bits, mono) normalized using sv56demo
- TextGrid files (in UTF-8) composed of three tiers:
  - Full Text (graphemes) :: the full text
  - Segments (graphemes) :: the segmented text, obtained using pydub and then hand-corrected
  - Expanded Segments (graphemes) :: the expanded version of the segmented text, in which numbers and some acronyms are spelled out (words fully in upper case are acronyms that should be spelled out)

We also recommend that participants familiarize themselves with the online resources provided at https://wiki.mercator-research.eu/languages:bildts_in_the_netherlands#online_learning_resources.
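
Since the description states a single audio format (44.1kHz, 16 bits, mono), a downloaded copy can be sanity-checked before training. The following is a minimal sketch using only Python's standard `wave` module. The function names and the idea of scanning a directory recursively are assumptions made for illustration; the actual layout of the archive is not described here.

```python
import wave
from pathlib import Path

# Expected format taken from the dataset description: 44.1 kHz, 16-bit, mono.
EXPECTED = {"sample_rate": 44100, "sample_width_bytes": 2, "channels": 1}


def check_wav(path):
    """Return {property: (found, expected)} for every header mismatch."""
    with wave.open(str(path), "rb") as wav:
        found = {
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }
    return {k: (found[k], v) for k, v in EXPECTED.items() if found[k] != v}


def check_directory(root):
    """Check every .wav file under root; report only files with mismatches."""
    report = {}
    for path in sorted(Path(root).rglob("*.wav")):
        problems = check_wav(path)
        if problems:
            report[str(path)] = problems
    return report
```

Calling `check_directory` on the folder where the archive was extracted (a path you choose yourself) returns an empty dictionary when every file matches the stated format. This only inspects WAV headers; it does not verify loudness normalization or the TextGrid tiers.
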
More information

- Description of the challenge :: https://blogs.helsinki.fi/ssw13-2025/the-blizzard-challenge-2025/
- Rules of the challenge :: https://blogs.helsinki.fi/ssw13-2025/the-rules-of-the-blizzard-challenge-2025/
- About Bildts :: https://wiki.mercator-research.eu/languages:bildts_in_the_netherlands

Change Log

- v0.2 - fix heterogeneous sampling rate (44.1kHz, 48kHz => 44.1kHz), fix some number expansion issues
- v0.1 - initial release
Created:
2025-03-14