
prhegde/preference-data-math-stack-exchange

Source: Hugging Face. Updated 2023-12-13; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/prhegde/preference-data-math-stack-exchange
Description:

---
license: apache-2.0
---

The preference dataset is derived from the [stack exchange dataset](https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences), which contains questions and answers on various topics from the Stack Overflow Data Dump. For this work, we used only questions and answers from the [math.stackexchange.com](https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences/tree/main/data/math.meta.stackexchange.com) sub-folder.

The questions are grouped with answers that are assigned a score following the Anthropic paper:

```
score = log2 (1 + upvotes) rounded to the nearest integer, plus 1 if the answer was accepted by the questioner (we assign a score of −1 if the number of upvotes is negative).
```

We performed the following processing to derive the final dataset:

1) Basic pre-processing ([code](https://github.com/PraveenSH/dpo-arithmo-mistral-7B/blob/main/src/data_processing/stack_exchange_data.py)) to clean the text
2) Filter mathematical questions using a regex-based detector ([code](https://github.com/PraveenSH/dpo-arithmo-mistral-7B/blob/main/src/data_processing/stack_exchange_data.py))
3) For each question, extract 2 answers: the one with the highest score and the one with the lowest score. The former is used as the preferred response and the latter as the rejected response.

## References

```
@online{h4stackexchange,
  author = {Lambert, Nathan and Tunstall, Lewis and Rajani, Nazneen and Thrush, Tristan},
  title = {HuggingFace H4 Stack Exchange Preference Dataset},
  year = 2023,
  url = {https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences},
}
```
Provider:
prhegde
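Summary of original information

Dataset overview

Data source

Data processing

  • Pre-processing: basic text cleaning; see the code link in the description above.
  • Filtering: mathematical questions are selected with a regex-based detector; the code is at the same link.
  • Answer selection: for each question, two answers are extracted, the one with the highest score and the one with the lowest score. The highest-scoring answer is used as the preferred response and the lowest-scoring answer as the rejected response.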
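
The filtering and answer-selection steps above can be sketched as follows. This is only an illustration and not the code from the linked repository: the regular expression, the function names `looks_mathematical` and `select_preference_pair`, and the `text`/`score` keys are all assumptions made for the example. Returning no pair when the highest and lowest scores are equal is also an assumption, since the dataset card does not say how ties are handled.

```python
import re
from typing import Optional

# Hypothetical detector: inline LaTeX ($...$), a LaTeX command (\frac, \int, ...),
# or simple arithmetic such as "3 + 4". The real detector's pattern is not given.
MATH_PATTERN = re.compile(r"\$[^$]+\$|\\[a-zA-Z]+|\d\s*[-+*/^=]\s*\d")


def looks_mathematical(text: str) -> bool:
    """Return True if the text contains something that looks like mathematics."""
    return MATH_PATTERN.search(text) is not None


def select_preference_pair(answers: list[dict]) -> Optional[dict]:
    """Pick the highest- and lowest-scored answers for one question.

    Each answer is a dict with the (assumed) keys "text" and "score".
    Returns None when no meaningful pair exists.
    """
    if len(answers) < 2:
        return None
    best = max(answers, key=lambda a: a["score"])
    worst = min(answers, key=lambda a: a["score"])
    if best["score"] == worst["score"]:
        # Assumption: equal scores carry no preference signal, so skip them.
        return None
    return {"chosen": best["text"], "rejected": worst["text"]}


question = "How do I solve $x^2 = 4$?"
answers = [
    {"text": "x = 2 or x = -2", "score": 4},
    {"text": "Use a calculator.", "score": 0},
    {"text": "Take square roots of both sides.", "score": 2},
]
pair = select_preference_pair(answers) if looks_mathematical(question) else None
```

Here `pair` holds the score-4 answer as `chosen` and the score-0 answer as `rejected`, which matches the preferred/rejected convention described in the list.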

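Scoring

  • Answer scores are computed as score = log2(1 + upvotes), rounded to the nearest integer. If the answer was accepted by the questioner, 1 is added; if the number of upvotes is negative, the score is -1.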
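
As a minimal sketch of the scoring rule, the function below computes the score from an upvote count and an acceptance flag. The name `answer_score` and its parameters are hypothetical. The sketch also assumes that an answer with negative upvotes gets -1 even if it was accepted, which the formula does not state explicitly.

```python
import math


def answer_score(upvotes: int, accepted: bool) -> int:
    """Score an answer as round(log2(1 + upvotes)), plus 1 if accepted."""
    if upvotes < 0:
        # Assumption: the -1 rule takes precedence over the acceptance bonus.
        return -1
    # For integer upvotes, log2(1 + upvotes) never lands exactly on .5,
    # so Python's round-half-to-even behaviour does not affect the result.
    score = round(math.log2(1 + upvotes))
    if accepted:
        score += 1
    return score


print(answer_score(0, False))   # log2(1) = 0
print(answer_score(3, False))   # log2(4) = 2
print(answer_score(7, True))    # log2(8) = 3, plus 1 for acceptance
print(answer_score(-2, False))  # negative upvotes
```

Because the score grows logarithmically, an answer needs roughly twice as many upvotes to gain one more point, so acceptance by the questioner carries as much weight as doubling the upvote count.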

License

  • This dataset is released under the Apache-2.0 license.