rasdani/cohere-wikipedia-2023-11-it
Hugging Face. Updated 2024-05-14; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/rasdani/cohere-wikipedia-2023-11-it
Resource description:
---
dataset_info:
  features:
  - name: _id
    dtype: string
  - name: url
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 4931956643
    num_examples: 10462162
  download_size: 2530836541
  dataset_size: 4931956643
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
The dataset has four features, all of type string: _id, url, title, and text. It has a single split, train, with 10,462,162 examples occupying 4,931,956,643 bytes, which is also the total dataset size. The download size is 2,530,836,541 bytes. The only configuration is default, whose data files match the path data/train-*.
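The raw byte counts above are easier to judge in larger units. The following is a minimal sketch that derives a few convenience figures from the card metadata; the constant and function names are illustrative and do not come from the dataset itself.

```python
# Figures copied from the dataset card metadata.
NUM_EXAMPLES = 10_462_162
DATASET_SIZE_BYTES = 4_931_956_643
DOWNLOAD_SIZE_BYTES = 2_530_836_541

GIB = 1024 ** 3  # bytes per gibibyte


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (GiB)."""
    return num_bytes / GIB


def summarize() -> dict:
    """Derive a few convenience figures from the card metadata."""
    return {
        "dataset_gib": to_gib(DATASET_SIZE_BYTES),
        "download_gib": to_gib(DOWNLOAD_SIZE_BYTES),
        "avg_bytes_per_example": DATASET_SIZE_BYTES / NUM_EXAMPLES,
        "download_to_dataset_ratio": DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES,
    }


# Print each derived figure with three decimal places.
for name, value in summarize().items():
    print(f"{name}: {value:.3f}")
```

By this arithmetic the download is roughly half the size of the prepared dataset, and an average example is a little over 470 bytes, which suggests the rows are short passages rather than whole articles.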
Provider:
rasdani
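Original information summary
Dataset overview
Dataset features
- _id: string.
- url: string.
- title: string.
- text: string.
Dataset splits
- Training set (train):
  - Number of records: 10462162
  - Storage size: 4931956643 bytes
Dataset size
- Download size: 2530836541 bytes
- Total dataset size: 4931956643 bytes
Configuration
- Configuration name: default
- Data file paths:
  - Training set path: data/train-*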
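Given the configuration and schema summarized above, one way to inspect the data without fetching the full download is to stream the train split with the Hugging Face `datasets` library. This is a sketch under assumptions: the helper names `has_expected_schema` and `stream_records` are hypothetical, and the repository is assumed to be reachable under the ID shown on this page.

```python
from itertools import islice
from typing import Iterator

REPO_ID = "rasdani/cohere-wikipedia-2023-11-it"
EXPECTED_FIELDS = ("_id", "url", "title", "text")


def has_expected_schema(record: dict) -> bool:
    """Return True if the record has exactly the four string fields listed on the card."""
    return set(record) == set(EXPECTED_FIELDS) and all(
        isinstance(record[name], str) for name in EXPECTED_FIELDS
    )


def stream_records(limit: int = 3) -> Iterator[dict]:
    """Yield the first `limit` records of the train split without a full download."""
    # Imported lazily so the schema helper works without the library installed.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches rows on demand.
    dataset = load_dataset(REPO_ID, split="train", streaming=True)
    yield from islice(dataset, limit)
```

Calling `stream_records(limit=3)` requires network access and the `datasets` package; each yielded record can then be passed to `has_expected_schema` before further processing.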



