
hido-pinto/ct4-preprocessed

Source: Hugging Face. Updated 2025-01-12. Indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/hido-pinto/ct4-preprocessed
Description:
The dataset has four features: input_values, attention_mask, mask_time_indices, and sampled_negative_indices. input_values is a sequence of float32 values, attention_mask is a sequence of int32 values, mask_time_indices is a sequence of bool values, and sampled_negative_indices is a sequence of int64 sequences. The data is split into a training set of about 138,813 examples (116 GB) and a validation set of about 16,117 examples (13 GB). The total dataset size is 130 GB and the download size is 49 GB. The README does not describe the intended application or the content in detail.
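Because the full download is 49 GB, it may be easier to stream a few records and check them against the schema described above before fetching everything. The following is a minimal sketch, not an official loader. The helper names check_example and stream_split are hypothetical, the split name "validation" is an assumption (the listing does not give exact split names), and the repository ID is taken from the download link. It also assumes records arrive as plain Python lists, which is the default for the `datasets` library.

```python
# Feature names as listed in the description above.
EXPECTED_FEATURES = (
    "input_values",
    "attention_mask",
    "mask_time_indices",
    "sampled_negative_indices",
)


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_example(example):
    """Return a list of schema problems for one record (empty if none found)."""
    problems = [
        f"missing feature: {name}"
        for name in EXPECTED_FEATURES
        if name not in example
    ]
    if problems:
        return problems
    if not all(isinstance(v, float) for v in example["input_values"]):
        problems.append("input_values: expected floats")
    if not all(_is_int(v) for v in example["attention_mask"]):
        problems.append("attention_mask: expected ints")
    if not all(isinstance(v, bool) for v in example["mask_time_indices"]):
        problems.append("mask_time_indices: expected bools")
    if not all(
        isinstance(row, list) and all(_is_int(v) for v in row)
        for row in example["sampled_negative_indices"]
    ):
        problems.append("sampled_negative_indices: expected lists of ints")
    return problems


def stream_split(split="validation"):
    """Stream one split without downloading the whole dataset.

    Assumption: the split is named "validation"; adjust if the repository
    uses a different name. Requires the `datasets` package and network access.
    """
    # Imported lazily so check_example works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("hido-pinto/ct4-preprocessed", split=split, streaming=True)
```

To try it, iterate over stream_split() and pass the first record to check_example; an empty list means the record matches the feature types given in the description.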
Provider:
hido-pinto