fblgit/simple-math-DPO

Name: fblgit/simple-math-DPO
Creator: fblgit
Published: 2024-01-27 16:31:57
License: 暂无描述

Hugging Face2024-01-27 更新2024-03-04 收录

下载链接：

https://hf-mirror.com/datasets/fblgit/simple-math-DPO

下载链接

链接失效反馈

官方服务：

资源简介：

--- dataset_info: features: - name: chosen list: - name: content dtype: string - name: role dtype: string - name: messages list: - name: content dtype: string - name: role dtype: string - name: prompt dtype: string - name: rejected list: - name: content dtype: string - name: role dtype: string splits: - name: train num_bytes: 313485868.75 num_examples: 760000 - name: test num_bytes: 16499256.25 num_examples: 40000 download_size: 101158122 dataset_size: 329985125.0 license: cc-by-nc-nd-4.0 task_categories: - conversational - reinforcement-learning tags: - math - simple-math pretty_name: Simple Math (DPO) size_categories: - 100K<n<1M --- # Simple Math: 2+2=4 -1=3 (LoLo: Learning Only Logical Operations) DPO Pairs Just like my teacher gave me homework, i thought maybe we can also add some of these basics on the trainings of our models. It was created with very simple code that is in the repo, if you add more complex operations and so.. **please share the code** :D thank you Current Code Version: 20240127.fblgit (A modification over @win10 for progressive and DPO operation) ![LoLo: Learning Only Logical Operations](https://huggingface.co/datasets/fblgit/simple-math/resolve/main/LOLO.png) ## Versions ``` 27.01.24 First DPO Generator ``` ## Citations If you use Simple Math o train your model, please cite on the modelcard or the paper. ``` @misc{simplemath, title={Simple-Math: 2+2=4 4-1=3}, author={Xavier Murias}, year={2024}, publisher = {Juanako.AI}, journal = {HuggingFace repository}, howpublished = {\url{https://huggingface.co/datasets/fblgit/simple-math}}, } ```

提供机构：

fblgit

原始信息汇总

数据集概述

本数据集详情页面未提供具体的数据集信息，仅包含一段关于模型训练基础的思考。

5,000+

优质数据集

54 个

任务类型

进入经典数据集