kothasuhas/segmented-ensembling-first-half-samples-ctx16-12800000

Name: kothasuhas/segmented-ensembling-first-half-samples-ctx16-12800000
Creator: kothasuhas
Published: 2025-06-29 05:50:13
License: No description available

Hugging Face | Updated 2025-06-29 | Indexed 2025-10-25

Download link:

https://hf-mirror.com/datasets/kothasuhas/segmented-ensembling-first-half-samples-ctx16-12800000

Resource overview:

This dataset contains text data with corresponding input ID sequences and is intended primarily for training machine learning models. It provides a training split of approximately 12.8 million samples, about 1.6GB in total. A default configuration is provided to simplify loading.
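Because the listing mentions a training split and a default configuration, loading could look roughly like the sketch below. It assumes the Hugging Face `datasets` library is installed and that the repository is publicly reachable. The helper names `load_train_stream` and `describe_example` are hypothetical, introduced only for illustration. The listing does not give column names, so the sketch inspects whatever fields a record carries instead of assuming them.

```python
DATASET_ID = "kothasuhas/segmented-ensembling-first-half-samples-ctx16-12800000"


def load_train_stream(dataset_id: str = DATASET_ID):
    """Stream the train split of the default configuration.

    Streaming avoids downloading the full ~1.6GB before reading a record.
    """
    # Imported lazily so describe_example works without the library installed.
    from datasets import load_dataset

    # No config name is passed, so the default configuration is used.
    return load_dataset(dataset_id, split="train", streaming=True)


def describe_example(example: dict) -> dict:
    """Map each field name to (type name, length or None) for one record."""
    summary = {}
    for name, value in example.items():
        length = len(value) if hasattr(value, "__len__") else None
        summary[name] = (type(value).__name__, length)
    return summary
```

Calling `describe_example(next(iter(load_train_stream())))` would then show the actual field names and the length of each sequence in the first record, which is a quick way to confirm the schema before training.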

Provided by:

kothasuhas
