five

fblgit/simple-math-DPO

收藏
Hugging Face2024-01-27 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/fblgit/simple-math-DPO
下载链接
链接失效反馈
官方服务:
资源简介:
--- dataset_info: features: - name: chosen list: - name: content dtype: string - name: role dtype: string - name: messages list: - name: content dtype: string - name: role dtype: string - name: prompt dtype: string - name: rejected list: - name: content dtype: string - name: role dtype: string splits: - name: train num_bytes: 313485868.75 num_examples: 760000 - name: test num_bytes: 16499256.25 num_examples: 40000 download_size: 101158122 dataset_size: 329985125.0 license: cc-by-nc-nd-4.0 task_categories: - conversational - reinforcement-learning tags: - math - simple-math pretty_name: Simple Math (DPO) size_categories: - 100K<n<1M --- # Simple Math: 2+2=4 -1=3 (LoLo: Learning Only Logical Operations) DPO Pairs Just like my teacher gave me homework, i thought maybe we can also add some of these basics on the trainings of our models. It was created with very simple code that is in the repo, if you add more complex operations and so.. **please share the code** :D thank you Current Code Version: 20240127.fblgit (A modification over @win10 for progressive and DPO operation) ![LoLo: Learning Only Logical Operations](https://huggingface.co/datasets/fblgit/simple-math/resolve/main/LOLO.png) ## Versions ``` 27.01.24 First DPO Generator ``` ## Citations If you use Simple Math o train your model, please cite on the modelcard or the paper. ``` @misc{simplemath, title={Simple-Math: 2+2=4 4-1=3}, author={Xavier Murias}, year={2024}, publisher = {Juanako.AI}, journal = {HuggingFace repository}, howpublished = {\url{https://huggingface.co/datasets/fblgit/simple-math}}, } ```
提供机构:
fblgit
原始信息汇总

数据集概述

本数据集详情页面未提供具体的数据集信息,仅包含一段关于模型训练基础的思考。

5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作