pxyyy/rlhflow_mixture_intuitive-v2_sampled-600k
收藏Hugging Face2024-11-19 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/pxyyy/rlhflow_mixture_intuitive-v2_sampled-600k
下载链接
链接失效反馈官方服务:
资源简介:
该数据集名为rlhflow_mixture_intuitive-v2_sampled-600k,包含600,000个样本,主要用于训练模型。数据集的结构包括一个名为messages的特征,它是一个包含content和role字段的列表,以及一个conversation_id字段。数据集分为一个训练集,总大小为979,992,064.985267字节。数据集的权重分布表显示了不同数据源在数据集中的权重比例,例如MathInstruct占0.17,SlimOrca占0.13等。
The dataset named rlhflow_mixture_intuitive-v2_sampled-600k contains 600,000 samples and is primarily used for training models. The dataset structure includes a feature named messages, which is a list containing content and role fields, as well as a conversation_id field. The dataset is divided into a training set with a total size of 979,992,064.985267 bytes. The weight distribution table in the README file shows the proportion of different data sources in the dataset, such as MathInstruct at 0.17, SlimOrca at 0.13, etc.
提供机构:
pxyyy



