
kothasuhas/llama-3b-gold-15M-student-generations_SNIS_2048_tune422v1_N15.00M_T8.0

Source: Hugging Face. Updated: 2025-04-26. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/kothasuhas/llama-3b-gold-15M-student-generations_SNIS_2048_tune422v1_N15.00M_T8.0
Resource overview:
The dataset has four feature fields: text, log_weight, sampling_p_scaled, and sampling_p_temperature_scaled. It is split into a training set of 15 million examples (32,014,763,678 bytes) and a validation set of 1,000 examples (2,444,848 bytes). The total dataset size is 32,017,208,526 bytes, and the download size is 19,311,005,287 bytes.
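The figures above can be sanity-checked with a few lines of Python. This is a minimal sketch, not part of the original listing. It assumes that "15 million" means exactly 15,000,000 examples, that the repository ID on the Hugging Face Hub matches the path in the download link, and that the validation split is named `validation`; none of these is confirmed by the listing. The helper `peek_validation` is hypothetical and is only defined, not called, so the arithmetic runs offline.

```python
# Sizes as reported in the listing, in bytes.
TRAIN_BYTES = 32_014_763_678
VALIDATION_BYTES = 2_444_848
DOWNLOAD_BYTES = 19_311_005_287

# Assumption: "15 million" is exactly 15,000,000 examples.
TRAIN_EXAMPLES = 15_000_000

# Assumption: the Hub repository ID matches the path in the download link.
REPO_ID = (
    "kothasuhas/"
    "llama-3b-gold-15M-student-generations_SNIS_2048_tune422v1_N15.00M_T8.0"
)

# Field names as listed in the resource overview.
EXPECTED_FIELDS = (
    "text",
    "log_weight",
    "sampling_p_scaled",
    "sampling_p_temperature_scaled",
)

# The reported total should equal the sum of the two splits.
total_bytes = TRAIN_BYTES + VALIDATION_BYTES

# Rough average size of one training example, in bytes.
avg_train_bytes = TRAIN_BYTES / TRAIN_EXAMPLES

# Fraction of the in-memory size that is actually downloaded.
download_ratio = DOWNLOAD_BYTES / total_bytes


def peek_validation(n=3):
    """Hypothetical helper: stream the first n rows of the validation split.

    Requires `pip install datasets` and network access. The split name
    "validation" is an assumption based on the listing's wording.
    """
    # Imported lazily so the arithmetic above works without the library.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split="validation", streaming=True)
    return [row for _, row in zip(range(n), stream)]
```

Streaming is used in the helper because the full download is roughly 19 GB; it lets you inspect a handful of rows and confirm the four field names without fetching every file.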
Provided by:
kothasuhas