Lingalingeswaran/common_voice_tamil_preprocessed

Name: Lingalingeswaran/common_voice_tamil_preprocessed
Creator: Lingalingeswaran
Published: 2024-12-17 21:18:23
License: No description provided

Source: Hugging Face. Updated 2024-12-17; indexed 2024-12-21.

Download link:

https://hf-mirror.com/datasets/Lingalingeswaran/common_voice_tamil_preprocessed
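A minimal sketch of loading one split with the Hugging Face `datasets` library, assuming that library is installed and that the mirror in the link above is reachable. Setting `HF_ENDPOINT` to the mirror host is an assumption about how a reader might want to fetch the data; the helper name `load_split` is hypothetical. Streaming is used so that nothing is downloaded up front.

```python
import os

# Assumption: route Hugging Face requests through the mirror shown in the
# download link. This must be set before `datasets` is imported.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

REPO_ID = "Lingalingeswaran/common_voice_tamil_preprocessed"


def load_split(split="test", streaming=True):
    """Return one split of the dataset (hypothetical helper).

    With streaming=True the examples are fetched lazily instead of
    downloading every data file first.
    """
    # Imported lazily so the HF_ENDPOINT setting above is already in place.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)


def first_example_lengths(split="test"):
    """Fetch the first example and report the lengths of its two features."""
    example = next(iter(load_split(split)))
    return len(example["input_features"]), len(example["labels"])
```

Calling `first_example_lengths("test")` needs network access; it returns the outer length of `input_features` and the length of `labels` for the first test example.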


Overview:

The dataset has two features, input_features and labels. input_features is a sequence of sequences of float32 values (a 2-D array per example); labels is a sequence of int64 values. There are two splits: train, with 53,468 examples occupying 51,373,961,936 bytes, and test, with 11,815 examples occupying 11,351,425,032 bytes. The download size is 12,809,657,418 bytes (about 12.8 GB) and the full dataset size is 62,725,386,968 bytes (about 62.7 GB), which equals the sum of the two splits. Data files are matched by the patterns data/train-* (train) and data/test-* (test).
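The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers stated in the description; the variable and function names are illustrative.

```python
# Figures copied from the dataset description.
SPLITS = {
    "train": {"num_examples": 53_468, "num_bytes": 51_373_961_936},
    "test": {"num_examples": 11_815, "num_bytes": 11_351_425_032},
}
DOWNLOAD_SIZE = 12_809_657_418  # bytes transferred when downloading
DATASET_SIZE = 62_725_386_968   # bytes once the data is materialized


def summarize(splits):
    """Return total examples, total bytes, and average bytes per example."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    avg_bytes = {
        name: s["num_bytes"] / s["num_examples"] for name, s in splits.items()
    }
    return total_examples, total_bytes, avg_bytes


total_examples, total_bytes, avg_bytes = summarize(SPLITS)
size_ratio = DATASET_SIZE / DOWNLOAD_SIZE  # dataset size relative to download

print(f"examples: {total_examples}")
print(f"split bytes sum to dataset size: {total_bytes == DATASET_SIZE}")
for name, value in avg_bytes.items():
    print(f"{name}: about {value:,.0f} bytes per example")
print(f"dataset size / download size: {size_ratio:.2f}")
```

Both splits average a little over 960,000 bytes per example. That is roughly what an 80 x 3000 float32 array plus a short int64 label sequence would occupy, but the feature shape is an inference from the sizes and is not stated on the page.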

Provider:

Lingalingeswaran
