Automatic Catalan KWS Database for Projecte AINA

Name: Automatic Catalan KWS Database for Projecte AINA
Creator: CORA.Repositori de Dades de Recerca
Published: 2025-06-10 23:18:57
License: No description available

Source: DataCite Commons. Updated: 2025-06-10. Indexed: 2024-07-13.

Download link:

https://dataverse.csuc.cat/citation?persistentId=doi:10.34810/data1400

Description:

A Catalan word database extracted automatically with forced alignment (the Montreal Forced Aligner, MFA) from transcribed speech corpora, namely Mozilla Common Voice, ParlamentParla, and OpenSLR-69. It can be used to train keyword spotting (KWS) models for home automation. MFA synchronizes the speech signal with the corresponding text at the phoneme level. Two versions of the database have been created. The "general" version contains all the data, for use in a range of analyses and applications. The "split" version is divided into train, dev, and test subsets to make it easier to train a keyword spotting model; the division is made by speaker, in proportions of 80%, 10%, and 10%.
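To illustrate what a speaker-wise division means in practice, here is a minimal sketch in Python. It is not the procedure used to build this database: the function name `split_by_speaker`, the `(speaker_id, clip_path)` input format, and the file names are all hypothetical, and it assumes the 80/10/10 proportions apply to the number of speakers rather than to the number of clips. The point it shows is that every clip from a given speaker lands in exactly one subset.

```python
import random
from collections import defaultdict


def split_by_speaker(clips, ratios=(0.8, 0.1, 0.1), seed=0):
    """Assign whole speakers to train/dev/test (hypothetical helper).

    clips  -- iterable of (speaker_id, clip_path) pairs (assumed format)
    ratios -- fractions of speakers for train and dev; test takes the rest
    """
    clips = list(clips)
    # Shuffle the unique speakers reproducibly, then cut the list in three.
    speakers = sorted({speaker for speaker, _ in clips})
    random.Random(seed).shuffle(speakers)
    n_train = int(len(speakers) * ratios[0])
    n_dev = int(len(speakers) * ratios[1])

    subset_of = {}
    for i, speaker in enumerate(speakers):
        if i < n_train:
            subset_of[speaker] = "train"
        elif i < n_train + n_dev:
            subset_of[speaker] = "dev"
        else:
            subset_of[speaker] = "test"

    # Every clip follows its speaker, so no speaker spans two subsets.
    splits = defaultdict(list)
    for speaker, path in clips:
        splits[subset_of[speaker]].append(path)
    return dict(splits)


if __name__ == "__main__":
    # Toy input: 10 made-up speakers with 3 clips each.
    demo = [(f"spk{i}", f"spk{i}_clip{j}.wav") for i in range(10) for j in range(3)]
    for name, paths in split_by_speaker(demo).items():
        print(name, len(paths))
```

Splitting on speakers rather than on individual clips keeps the dev and test subsets free of voices seen during training, which gives a fairer estimate of how a keyword spotting model behaves with new speakers.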

Provider:

CORA.Repositori de Dades de Recerca

Created:

2024-06-04
