Persian Speech to Text dataset

Indexed by NIAID Data Ecosystem on 2026-03-14

Download link:

https://zenodo.org/record/7486153

Description:

The Persian Speech to Text dataset is an open source dataset for training machine learning models to transcribe Persian-language audio into text. Its description presents it as the largest open source dataset of its kind, at approximately 60 GB. It consists of audio files in WAV format and their transcripts in CSV format. The data is described as large and diverse enough to both train and evaluate models, which makes it a useful resource for researchers and developers working on Persian natural language processing. Because the dataset is open source, anyone can freely use and modify it.
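Since the data ships as WAV audio plus CSV transcripts, a first step in most pipelines is pairing each transcript row with its audio file. The following is a minimal sketch of that step using only the Python standard library. The column names `wav_filename` and `transcript`, and the assumption that the CSV stores file names relative to one audio directory, are hypothetical; the description above does not specify the CSV layout, so check the actual header before use.

```python
import csv
import wave
from pathlib import Path


def load_pairs(csv_path, audio_dir, file_col="wav_filename", text_col="transcript"):
    """Yield (wav_path, transcript) for every CSV row whose audio file exists.

    file_col and text_col are assumed column names, not confirmed by the dataset.
    """
    audio_dir = Path(audio_dir)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            wav_path = audio_dir / row[file_col]
            if wav_path.is_file():  # skip rows that point at missing audio
                yield wav_path, row[text_col]


def wav_duration_seconds(wav_path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(wav_path), "rb") as w:
        return w.getnframes() / w.getframerate()
```

Both functions read files lazily, which matters for a corpus of roughly 60 GB: iterating over `load_pairs(...)` never holds more than one row in memory, and `wav_duration_seconds` reads only the WAV header fields rather than the samples.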

Created:

2022-12-28
