
bethea/cc-preprocessed-sampled-1k

Source: Hugging Face. Updated 2024-07-10; indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/bethea/cc-preprocessed-sampled-1k
Description:
The dataset has three main fields: dataID (data ID), summary, and combined_texts (combined text). It is split into training, validation, and test sets of 1,000, 100, and 100 samples respectively. The training set is about 9.11 MB, and the validation and test sets are about 912 KB each. The download size is 5.28 MB, and the total dataset size is about 10.94 MB. Data files are configured at data/train-* (training), data/validation-* (validation), and data/test-* (test).
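The split layout described above can be checked after download with a short script. This is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository is still reachable under the ID shown in the title. The helper names `load_splits` and `check_splits` are hypothetical and not part of the dataset or the library.

```python
# Figures below are copied from the dataset description; nothing is measured here.
REPO_ID = "bethea/cc-preprocessed-sampled-1k"
EXPECTED_ROWS = {"train": 1000, "validation": 100, "test": 100}
EXPECTED_FIELDS = ("dataID", "summary", "combined_texts")


def load_splits(repo_id=REPO_ID):
    """Download every split (needs `pip install datasets` and network access)."""
    # Imported lazily so check_splits() stays usable without the library.
    from datasets import load_dataset
    return load_dataset(repo_id)


def check_splits(rows_per_split, fields_per_split):
    """Return a list of mismatches against the documented layout (empty if none)."""
    problems = []
    for split, expected in EXPECTED_ROWS.items():
        actual = rows_per_split.get(split)
        if actual != expected:
            problems.append(f"{split}: expected {expected} rows, got {actual}")
        present = fields_per_split.get(split, ())
        missing = [name for name in EXPECTED_FIELDS if name not in present]
        if missing:
            problems.append(f"{split}: missing fields {missing}")
    return problems


# Example usage (requires network access, so it is left commented out):
# ds = load_splits()
# rows = {name: split.num_rows for name, split in ds.items()}
# fields = {name: split.column_names for name, split in ds.items()}
# print(check_splits(rows, fields) or "layout matches the description")
```

Keeping the comparison separate from the download means the check can also be run against metadata obtained some other way, such as a local copy of the files.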
Provider:
bethea