aklein4/HuggingFaceTB-finemath-finemath-4plus-tokenized
Source: Hugging Face. Updated: 2025-07-21. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/aklein4/HuggingFaceTB-finemath-finemath-4plus-tokenized
Description:
This dataset pairs text inputs with outputs and is suited to sequence-to-sequence tasks. Each record includes the source text, a list of input IDs, a list of output IDs, the number of input tokens, and the number of output tokens. The training set is about 10 GB and contains approximately 66.99 million examples. The overall dataset size differs from the download size, which may mean the dataset includes additional validation or test set data.
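As a rough illustration of the record layout described above, the sketch below checks that the stored token counts agree with the lengths of the ID lists. The column names (`text`, `input_ids`, `output_ids`, `num_input_tokens`, `num_output_tokens`) are assumptions made for this example, since the listing names the fields only in prose, and the sample record is invented rather than taken from the dataset.

```python
from typing import TypedDict


class Example(TypedDict):
    # Hypothetical column names; the listing does not give the real ones.
    text: str
    input_ids: list[int]
    output_ids: list[int]
    num_input_tokens: int
    num_output_tokens: int


def counts_match(example: Example) -> bool:
    """Return True if each stored token count equals the length of its ID list."""
    return (
        len(example["input_ids"]) == example["num_input_tokens"]
        and len(example["output_ids"]) == example["num_output_tokens"]
    )


# Invented record for illustration only; the IDs are not real tokenizer output.
sample: Example = {
    "text": "What is 2 + 2? The answer is 4.",
    "input_ids": [101, 202, 303, 404],
    "output_ids": [505, 606, 707],
    "num_input_tokens": 4,
    "num_output_tokens": 3,
}

print(counts_match(sample))
```

A check like this could be run over a small sample of records before training, once the actual column names have been confirmed against the dataset itself.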
Provider:
aklein4



