alea-institute/kl3m-data-dotgov-www.eia.gov
Source: Hugging Face. Updated 2025-01-30; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/alea-institute/kl3m-data-dotgov-www.eia.gov
Description:
A training corpus of text data with fields for an identifier, the dataset name, the MIME type, and a token sequence. It has a single train split of approximately 1.21 million examples (also reported as 12.1 million in the original listing), about 4.1 GB in total. The download size is 756 MB.
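One way to inspect a few records without fetching the full download is to stream the train split with the Hugging Face `datasets` library. The following is a minimal sketch, not an official loader: the helper names `summarize_record` and `peek` are hypothetical, and the exact column names are not given in this listing, so the code prints whatever fields each record actually contains.

```python
from itertools import islice


def summarize_record(record: dict) -> dict:
    """Map each field name to a short description of its value."""
    summary = {}
    for name, value in record.items():
        if isinstance(value, (list, tuple)):
            # Long sequences (e.g. a token sequence) are reported by length only.
            summary[name] = f"{type(value).__name__} of {len(value)} items"
        else:
            summary[name] = repr(value)[:60]
    return summary


def peek(repo_id: str, n: int = 3) -> list:
    """Return summaries of the first n records of the train split."""
    # Imported lazily so summarize_record works without the library installed.
    from datasets import load_dataset

    # streaming=True iterates over records instead of downloading everything.
    stream = load_dataset(repo_id, split="train", streaming=True)
    return [summarize_record(r) for r in islice(stream, n)]


# Example call (requires network access and `pip install datasets`):
# for row in peek("alea-institute/kl3m-data-dotgov-www.eia.gov"):
#     print(row)
```

Streaming is used here because the listing reports a 756 MB download; a plain `load_dataset` call without `streaming=True` would fetch the whole split first.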
Provider:
alea-institute



