ta4tsering/Lhasa_kanjur_transcription_datasets
Source: Hugging Face. Updated 2024-05-06. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/ta4tsering/Lhasa_kanjur_transcription_datasets
Resource overview:
---
dataset_info:
  features:
  - name: filename
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: train
    num_bytes: 62378404
    num_examples: 133024
  - name: eval
    num_bytes: 7805922
    num_examples: 16640
  - name: test
    num_bytes: 7810197
    num_examples: 16640
  download_size: 23787490
  dataset_size: 77994523
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: eval
    path: data/eval-*
  - split: test
    path: data/test-*
---
This dataset has three splits: train, eval, and test. Each example pairs a filename with a label, both stored as strings. The train split contains 133,024 examples, and the eval and test splits contain 16,640 examples each. The download size is 23,787,490 bytes and the total dataset size is 77,994,523 bytes. There is a single configuration, default, whose data files are matched by the glob patterns data/train-*, data/eval-*, and data/test-*.
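The figures quoted in the card can be cross-checked with a few lines of arithmetic. The sketch below copies the numbers from the YAML header and verifies that the per-split byte counts add up to the reported dataset size; the `summarize` helper and the constant names are illustrative, not part of the dataset.

```python
# Split metadata copied from the dataset card's YAML header.
SPLITS = {
    "train": {"num_bytes": 62378404, "num_examples": 133024},
    "eval": {"num_bytes": 7805922, "num_examples": 16640},
    "test": {"num_bytes": 7810197, "num_examples": 16640},
}
DATASET_SIZE = 77994523   # dataset_size from the card
DOWNLOAD_SIZE = 23787490  # download_size from the card


def summarize(splits):
    """Return total bytes, total examples, and each split's share of examples."""
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    total_examples = sum(s["num_examples"] for s in splits.values())
    shares = {name: s["num_examples"] / total_examples for name, s in splits.items()}
    return total_bytes, total_examples, shares


total_bytes, total_examples, shares = summarize(SPLITS)
print("bytes match dataset_size:", total_bytes == DATASET_SIZE)
print("total examples:", total_examples)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of examples")
print(f"download/dataset size ratio: {DOWNLOAD_SIZE / DATASET_SIZE:.2f}")
```

The byte counts sum exactly to 77,994,523, the splits total 166,304 examples, and the division works out to roughly 80/10/10 across train, eval, and test.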
Provided by:
ta4tsering
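Summary of original information
Dataset overview
Dataset features
- filename: the file name, of type string.
- label: the label, of type string.
Dataset splits
- Training set (train): 133,024 examples, 62,378,404 bytes in total.
- Evaluation set (eval): 16,640 examples, 7,805,922 bytes in total.
- Test set (test): 16,640 examples, 7,810,197 bytes in total.
Dataset size
- Download size: 23,787,490 bytes.
- Total dataset size: 77,994,523 bytes.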
Configuration
- Default configuration (default):
  - Training set path: data/train-*
  - Evaluation set path: data/eval-*
  - Test set path: data/test-*
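Given the split names declared in the configuration, one way to load a split is through the Hugging Face `datasets` library. This is a minimal sketch, assuming the repository is still available on the Hub under the ID shown in the title and that `datasets` is installed; the `load_split` wrapper and the constants are hypothetical helpers, not something the dataset ships.

```python
REPO_ID = "ta4tsering/Lhasa_kanjur_transcription_datasets"
SPLIT_NAMES = ("train", "eval", "test")  # as declared in the card's configs section


def load_split(split):
    """Load one split from the Hub (needs the `datasets` package and network access)."""
    if split not in SPLIT_NAMES:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLIT_NAMES}")
    # Imported lazily so the name check above works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Note that the held-out split is named `eval` rather than the more common `validation`, so `load_split("validation")` raises a `ValueError` here instead of failing later inside the library. A call such as `load_split("eval")` should return a dataset whose rows have the `filename` and `label` string columns.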



