tech4humans/Audio-Transcription-Models-Comparison-PT-BR

Name: tech4humans/Audio-Transcription-Models-Comparison-PT-BR
Creator: tech4humans
Published: 2026-02-11 14:35:26
License: no description available

Source: Hugging Face. Updated: 2026-02-11. Indexed: 2026-02-07.

Download link:

https://hf-mirror.com/datasets/tech4humans/Audio-Transcription-Models-Comparison-PT-BR

Resource overview:

This dataset was created to store and compare transcription results from different artificial intelligence models in challenging scenarios, focusing exclusively on Brazilian Portuguese. It covers regionalisms (local vocabulary, accents, and cultural expressions), informal language, disfluencies, and numeric entities in natural speech. The evaluated models include OpenAI Whisper, Google Gemini, Qwen2-Audio, and others, selected for their performance and generalization capabilities in Portuguese. The dataset supports quantitative metrics such as Word Error Rate (WER), Real-Time Factor (RTF), and Words Per Second (WPS), as well as qualitative analysis, cross-model comparison, and error analysis. All experiments and evaluation results are tracked with Weights & Biases for transparency and reproducibility.
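As a rough illustration of the three metrics named in the overview, the sketch below computes WER, RTF, and WPS in plain Python. It is not the dataset authors' evaluation code: the function names, the lowercase-and-split normalization, and the reading of WPS as transcribed words per second of processing time are all assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: simple lowercase + whitespace tokenization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


def words_per_second(hypothesis: str, processing_seconds: float) -> float:
    """Assumed definition: transcribed words per second of processing time."""
    return len(hypothesis.split()) / processing_seconds


if __name__ == "__main__":
    # Hypothetical reference/hypothesis pair with an informal contraction.
    reference = "eu vou para a praia amanhã"
    hypothesis = "eu vou pra praia amanhã"
    print(word_error_rate(reference, hypothesis))
    print(real_time_factor(processing_seconds=2.0, audio_seconds=10.0))
    print(words_per_second(hypothesis, processing_seconds=2.0))
```

In the sample pair, the informal contraction "pra" counts as one substitution and the dropped "a" as one deletion, which is the kind of informality-driven error the dataset is meant to expose. A production evaluation would likely add text normalization (punctuation, numerals) before scoring.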

Provided by:

tech4humans
