timaeus/pile-arxiv-elimination-slm-l1sae1568
Source: Hugging Face. Updated: 2025-03-17. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/timaeus/pile-arxiv-elimination-slm-l1sae1568
Description:
The dataset has two features: a text feature of string type and a metadata feature that contains a pile_set_name field. It has a training split of 77,011 samples, approximately 3.7 GB in total. A default configuration is provided that specifies the data files for the training split.
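A minimal sketch of how the training split described above might be read with the Hugging Face `datasets` library, streaming records rather than downloading all 3.7 GB first. The repository ID is taken from the title of this page. The helper names and the `meta_key` argument are hypothetical: the description says only that the metadata feature holds a pile_set_name field, not what the metadata column is called, so the caller has to supply that name after inspecting a record.

```python
from collections import Counter
from itertools import islice

# Repository ID as given in the title of this listing.
REPO_ID = "timaeus/pile-arxiv-elimination-slm-l1sae1568"


def stream_train_split(repo_id=REPO_ID):
    """Return a lazily streamed iterable over the train split."""
    # Imported here so count_pile_sets below works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split="train", streaming=True)


def count_pile_sets(records, meta_key, limit=None):
    """Count pile_set_name values over the first `limit` records.

    `meta_key` is the name of the metadata column. It is an assumption
    supplied by the caller, since the listing does not state it.
    """
    counts = Counter()
    for record in islice(records, limit):
        counts[record[meta_key]["pile_set_name"]] += 1
    return counts


# Example usage (requires network access and `pip install datasets`):
#   stream = stream_train_split()
#   first = next(iter(stream))
#   print(first.keys())  # inspect the real column names first
#   print(count_pile_sets(stream, meta_key="<metadata column>", limit=1000))
```

Streaming keeps memory use flat and lets you inspect the actual column names before committing to a full download.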
Provided by:
timaeus



