joaosanches/tedtalks_train_no_duplicates
Hugging Face | Updated 2024-01-30 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/joaosanches/tedtalks_train_no_duplicates
Resource overview:
---
dataset_info:
  features:
  - name: pt
    dtype: string
  - name: pt-br
    dtype: string
  splits:
  - name: train
    num_bytes: 26649615
    num_examples: 126984
  download_size: 18481563
  dataset_size: 26649615
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
The dataset has two features, pt and pt-br, both of type string. It contains a single train split with 126984 examples totaling 26649615 bytes. The download size is 18481563 bytes. The only configuration is named default, and its training data files match the path data/train-*.
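As a rough illustration, the figures above can be sanity-checked and the train split loaded with the Hugging Face `datasets` library. This is a minimal sketch that assumes `datasets` is installed and the Hub repository is reachable; the helper names (`average_bytes_per_example`, `compression_ratio`, `load_train_split`) are illustrative and not part of the dataset card.

```python
# Figures copied from the dataset card's metadata.
REPO_ID = "joaosanches/tedtalks_train_no_duplicates"
NUM_EXAMPLES = 126984
DATASET_SIZE_BYTES = 26649615
DOWNLOAD_SIZE_BYTES = 18481563


def average_bytes_per_example(dataset_size: int, num_examples: int) -> float:
    """Mean size of one pt / pt-br record, in bytes."""
    return dataset_size / num_examples


def compression_ratio(download_size: int, dataset_size: int) -> float:
    """Download size expressed as a fraction of the dataset size."""
    return download_size / dataset_size


def load_train_split():
    """Load the train split (assumes `datasets` is installed and the Hub is reachable)."""
    # Imported lazily so the helpers above work without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train")


if __name__ == "__main__":
    avg = average_bytes_per_example(DATASET_SIZE_BYTES, NUM_EXAMPLES)
    ratio = compression_ratio(DOWNLOAD_SIZE_BYTES, DATASET_SIZE_BYTES)
    print(f"{avg:.1f} bytes per example")
    print(f"download is {ratio:.2f} of the dataset size")
```

Given the declared features, each record returned by `load_train_split()` should be a dict with two string fields, `pt` and `pt-br`.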
Provider:
joaosanches
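Summary of original information
Dataset overview
Dataset information
- Features:
  - pt: string
  - pt-br: string
Data splits
- Train split:
  - Size in bytes: 26649615
  - Number of examples: 126984
Dataset size
- Download size: 18481563 bytes
- Dataset size: 26649615 bytes
Configuration
- Default configuration:
  - Data files:
    - Split: train
    - Path: data/train-*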
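The `data/train-*` path is a glob pattern rather than a single file name. The following minimal sketch shows how such a pattern selects shard files, using Python's standard `fnmatch` module. The file names are hypothetical examples, not a listing of the actual repository, and the Hub's own pattern resolution may differ in detail.

```python
from fnmatch import fnmatchcase

PATTERN = "data/train-*"  # path taken from the dataset configuration

# Hypothetical repository file names, for illustration only.
candidate_files = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]


def files_for_split(pattern: str, files: list[str]) -> list[str]:
    """Return the files whose path matches the split's glob pattern."""
    return [name for name in files if fnmatchcase(name, pattern)]


train_files = files_for_split(PATTERN, candidate_files)
print(train_files)
```

Only paths that begin with `data/train-` are selected, so files belonging to other splits or to the repository root are ignored.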



