
gupta-tanish/Ultrafeedback-SimPO-seed-101

Source: Hugging Face. Updated 2025-01-06; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/gupta-tanish/Ultrafeedback-SimPO-seed-101
Overview:
This dataset contains the fields prompt text (prompt), prompt ID (prompt_id), chosen response (chosen), rejected response (rejected), and messages (messages). Field types include strings and floating-point numbers. The data is divided into six splits: preference training (train_prefs), supervised fine-tuning training (train_sft), preference test (test_prefs), supervised fine-tuning test (test_sft), generation training (train_gen), and generation test (test_gen). The dataset metadata records a byte size and an example count for each split. The total download size is 648.98 MB, and the total dataset size is 1.16 GB.
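The split and field names in the description can be checked programmatically. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository is reachable. The helper names `missing_fields` and `load_split` are hypothetical and not part of the dataset or any library, and the sketch makes no assumption about the internal structure of each field.

```python
from __future__ import annotations

from typing import Any

# Repository ID as given on this page.
REPO_ID = "gupta-tanish/Ultrafeedback-SimPO-seed-101"

# Split names as listed in the description.
SPLITS = ("train_prefs", "train_sft", "test_prefs", "test_sft", "train_gen", "test_gen")

# Field names as listed in the description (the real schema may contain more).
EXPECTED_FIELDS = ("prompt", "prompt_id", "chosen", "rejected", "messages")


def missing_fields(record: dict[str, Any]) -> list[str]:
    """Return the expected field names that are absent from a record (hypothetical helper)."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def load_split(split: str = "train_prefs"):
    """Load one split from the Hugging Face Hub (hypothetical helper; needs network access)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so that missing_fields() works without the `datasets` package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

For example, `missing_fields(load_split("test_prefs")[0])` should return an empty list if the first record carries all five fields named above.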
Provider:
gupta-tanish