
kothasuhas/llama-3b-gold-15M-student-generations_SNIS_2048_tune422v1

Source: Hugging Face. Updated: 2025-04-23. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/kothasuhas/llama-3b-gold-15M-student-generations_SNIS_2048_tune422v1
Description:
This dataset contains text data with three features: the text content (text), a log weight (log_weight), and a scaled sampling probability (sampling_p_scaled). It is split into a training set of 14,999,750 examples and a validation set of 1,000 examples. The total size of the dataset is 35,918,218,757 bytes (about 35.9 GB).
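As a rough illustration of the layout described above, here is a minimal Python sketch that streams records and checks that the three named features are present. It assumes the third-party `datasets` library is installed and that the splits are named `train` and `validation`; the listing only says "training set" and "validation set", so the split names are a guess. The helper names `missing_columns` and `stream_records` are hypothetical and chosen for this example only.

```python
from typing import Any, Iterator, List, Mapping

# Repository ID taken from the listing title.
REPO_ID = "kothasuhas/llama-3b-gold-15M-student-generations_SNIS_2048_tune422v1"

# The three features named in the description.
EXPECTED_COLUMNS = ("text", "log_weight", "sampling_p_scaled")


def missing_columns(record: Mapping[str, Any]) -> List[str]:
    """Return the expected feature names that are absent from one record."""
    return [name for name in EXPECTED_COLUMNS if name not in record]


def stream_records(split: str = "validation") -> Iterator[Mapping[str, Any]]:
    """Yield records one at a time without downloading the whole dataset.

    Assumption: the split names are "train" and "validation".
    Streaming is used because the full dataset is roughly 35.9 GB.
    """
    # Imported lazily so the helper above works without the dependency.
    from datasets import load_dataset  # third-party: pip install datasets

    dataset = load_dataset(REPO_ID, split=split, streaming=True)
    yield from dataset
```

Calling `next(stream_records("validation"))` should return a single record, and `missing_columns` on that record should return an empty list if the description is accurate. Streaming keeps the check cheap, since only the first few records are fetched. When the Hub is reached through a mirror such as the one in the download link, the `HF_ENDPOINT` environment variable is the usual way to point `huggingface_hub` at it.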
Provider:
kothasuhas