timaeus/pile-pubmed_central-elimination-disjoint-slm-l1sae580
Source: Hugging Face. Updated: 2025-03-18. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/timaeus/pile-pubmed_central-elimination-disjoint-slm-l1sae580
Resource overview:
The dataset contains text and metadata. The text field is a string, and the metadata contains a pile_set_name field, also a string. The dataset has a single training split of 48,799 examples, with a reported size of 1,572,292,036.03266 bytes (about 1.57 GB). The reported download size of the whole dataset is 729,174,689 bytes (about 729 MB).
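The figures above imply a rough average example size and a ratio between download size and dataset size. The following is a minimal sketch of that arithmetic, plus a check of one record against the described schema. The column names `text` and `meta` and the helper names are assumptions for illustration; the listing only states that the metadata carries a string field called pile_set_name.

```python
# Figures quoted in the listing; nothing here is fetched from the network.
NUM_TRAIN_EXAMPLES = 48_799
DATASET_SIZE_BYTES = 1_572_292_036.03266
DOWNLOAD_SIZE_BYTES = 729_174_689


def average_bytes_per_example() -> float:
    """Mean size of one training example, from the reported totals."""
    return DATASET_SIZE_BYTES / NUM_TRAIN_EXAMPLES


def download_ratio() -> float:
    """Reported download size as a fraction of the reported dataset size."""
    return DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES


def is_valid_record(record: dict) -> bool:
    """Check a record against the schema described in the listing.

    ASSUMPTION: the text column is named "text" and the metadata column
    is named "meta"; only the pile_set_name field name comes from the listing.
    """
    meta = record.get("meta")
    return (
        isinstance(record.get("text"), str)
        and isinstance(meta, dict)
        and isinstance(meta.get("pile_set_name"), str)
    )


if __name__ == "__main__":
    print(f"{average_bytes_per_example():.0f} bytes per example on average")
    print(f"download is {download_ratio():.1%} of the reported dataset size")
```

The averages work out to roughly 32 KB per example and a download a little under half the reported dataset size, which gives a sense of how long the individual documents are before fetching anything.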
Provider:
timaeus



