asr-malayalam/Norm_Malayalam_Evaluation_samples

Name: asr-malayalam/Norm_Malayalam_Evaluation_samples
Creator: asr-malayalam
Published: 2024-07-19 18:19:45
License: cc-by-4.0

Source: Hugging Face. Updated: 2024-07-19. Indexed: 2024-07-13.

Download link:

https://hf-mirror.com/datasets/asr-malayalam/Norm_Malayalam_Evaluation_samples

Summary:

This dataset contains evaluation results for the Malayalam ASR model vrclc/Whisper_small_malayalam on the google/fleurs dataset. The model was trained on 50 hours of Malayalam speech and evaluated on the google/fleurs test split; over 500 samples, the word error rate (WER) after removing punctuation is 39%. The dataset is intended for error analysis, to help identify common errors and areas for improvement in Malayalam speech recognition.

Provider:

asr-malayalam

Summary of the original dataset card

Malayalam ASR Reference Prediction dataset

Overview

Dataset name: Malayalam ASR Reference Prediction dataset
Task category: sentence similarity
Language: Malayalam (ml)
Dataset size: n<1K
License: cc-by-4.0

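Details

ASR model name: vrclc/Whisper_small_malayalam
Dataset: google/fleurs
Purpose: evaluating how the vrclc/Whisper_small_malayalam model performs on Malayalam speech recognition.
Evaluation results:
- Evaluated on the test split of the google/fleurs dataset.
- 500 samples were evaluated; the WER after removing punctuation is 39%.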
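
The WER figure above is computed after punctuation removal. As a rough illustration of what that involves, here is a minimal sketch using only the Python standard library. The function names (`strip_punctuation`, `word_edit_distance`, `corpus_wer`) are hypothetical, and the normalization shown (dropping every character in a Unicode punctuation category) is an assumption; the card does not say which normalization or which WER implementation was actually used.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop characters whose Unicode category starts with 'P' (punctuation).

    Assumption: this approximates "removing punctuation"; Malayalam vowel
    signs and joiners fall in other categories and are left untouched.
    """
    return "".join(ch for ch in text if not unicodedata.category(ch).startswith("P"))


def word_edit_distance(ref_words, hyp_words) -> int:
    """Levenshtein distance between two word lists (sub/ins/del all cost 1)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references, predictions) -> float:
    """Total word errors divided by total reference words, after normalization."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, predictions):
        ref_words = strip_punctuation(ref).split()
        hyp_words = strip_punctuation(hyp).split()
        errors += word_edit_distance(ref_words, hyp_words)
        total += len(ref_words)
    return errors / total if total else 0.0
```

Called with the reference and predicted transcripts as two parallel lists of strings, `corpus_wer` returns a fraction; multiply by 100 for a percentage. The sketch pools errors over all samples rather than averaging per-sample WER, which is one common convention but not necessarily the one used for this dataset.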

Uses

Direct use: error analysis, to help identify common errors and areas for improvement in Malayalam speech recognition.
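
One way to start such an error analysis is to align each reference with its prediction and tally what differs. The sketch below does this with `difflib.SequenceMatcher` from the standard library, which produces a reasonable word alignment but not necessarily the minimum-edit alignment that a WER tool uses. The function name `tally_errors` and the input format (a list of reference/prediction string pairs) are assumptions for illustration; the dataset's actual column names are not given here.

```python
from collections import Counter
from difflib import SequenceMatcher


def tally_errors(pairs):
    """Count error types and the most frequent reference -> prediction confusions.

    `pairs` is an iterable of (reference, prediction) strings, assumed to be
    already normalized (for example, with punctuation removed).
    """
    counts = Counter()      # words affected, keyed by 'replace' / 'delete' / 'insert'
    confusions = Counter()  # (reference span, predicted span) -> occurrences
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                continue
            # Size of the differing block, taken from the longer side.
            counts[tag] += max(i2 - i1, j2 - j1)
            if tag == "replace":
                ref_span = " ".join(ref_words[i1:i2])
                hyp_span = " ".join(hyp_words[j1:j2])
                confusions[(ref_span, hyp_span)] += 1
    return counts, confusions
```

Sorting the result with `confusions.most_common(20)` surfaces the substitutions the model makes most often, which is usually a more useful starting point than the aggregate WER alone.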
