prhegde/preference-data-math-stack-exchange
Hugging Face · Updated 2023-12-13 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/prhegde/preference-data-math-stack-exchange
Resource description:
---
license: apache-2.0
---
This preference dataset is derived from the [Stack Exchange dataset](https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences), which contains questions and answers on a variety of topics from the Stack Overflow Data Dump. For this work, we used only the questions and answers from the [math.stackexchange.com](https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences/tree/main/data/math.meta.stackexchange.com) sub-folder.
Each question is grouped with its answers, and each answer is assigned a score following the scheme from the Anthropic paper:
```
score = log2 (1 + upvotes) rounded to the nearest integer, plus 1 if the answer was accepted by the questioner (we assign a score of −1 if the number of upvotes is negative).
```
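As a rough illustration, the scoring rule quoted above could be written as follows. This is a minimal sketch, not the code used to build the dataset: the function name `answer_score` and its parameters are hypothetical, and it assumes that an answer with negative upvotes gets a score of -1 whether or not it was accepted.

```python
import math


def answer_score(upvotes: int, accepted: bool = False) -> int:
    """Score an answer from its upvote count (hypothetical helper).

    Assumption: a negative upvote count yields -1 even for accepted answers.
    """
    if upvotes < 0:
        return -1
    # log2(1 + upvotes), rounded to the nearest integer
    score = round(math.log2(1 + upvotes))
    # Add 1 if the questioner accepted this answer
    return score + 1 if accepted else score
```

For example, an accepted answer with 7 upvotes scores `round(log2(8)) + 1 = 4`, while an answer with 0 upvotes that was not accepted scores 0.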
We performed the following processing steps to derive the final dataset:
1) Basic pre-processing ([code](https://github.com/PraveenSH/dpo-arithmo-mistral-7B/blob/main/src/data_processing/stack_exchange_data.py)) to clean the text.
2) Filtering for mathematical questions using a regex-based detector ([code](https://github.com/PraveenSH/dpo-arithmo-mistral-7B/blob/main/src/data_processing/stack_exchange_data.py)).
3) For each question, extracting two answers: the one with the highest score and the one with the lowest score. The former is used as the preferred response and the latter as the rejected response.
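The answer-selection step (step 3) can be sketched roughly as below. This is an illustration under stated assumptions rather than the linked implementation: the function name `select_preference_pair`, the `text`/`score` keys, and the output field names are all hypothetical, and skipping questions whose answers all share one score is an added assumption, since the description does not say how ties are handled.

```python
from typing import Optional


def select_preference_pair(question: str, answers: list[dict]) -> Optional[dict]:
    """Pick the highest- and lowest-scored answers for one question.

    `answers` is a list of dicts with hypothetical keys "text" and "score".
    Returns None when no meaningful pair can be formed.
    """
    if len(answers) < 2:
        return None
    best = max(answers, key=lambda a: a["score"])
    worst = min(answers, key=lambda a: a["score"])
    # Assumption: equal scores carry no preference signal, so skip them.
    if best["score"] == worst["score"]:
        return None
    return {
        "question": question,
        "chosen": best["text"],     # preferred response
        "rejected": worst["text"],  # rejected response
    }
```

With three answers scored 3, 0 and 5, the answer scored 5 becomes the preferred response and the answer scored 0 becomes the rejected one.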
## References
```
@online{h4stackexchange,
author = {Lambert, Nathan and Tunstall, Lewis and Rajani, Nazneen and Thrush, Tristan},
title = {HuggingFace H4 Stack Exchange Preference Dataset},
year = 2023,
url = {https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences},
}
```
Provider:
prhegde
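Summary of original information
Dataset overview
Data source
- This dataset is derived from the Stack Exchange dataset, which contains questions and answers from the Stack Overflow Data Dump.
- Only the questions and answers from the math.stackexchange.com sub-folder were extracted.
Data processing
- Pre-processing: basic text cleaning was applied; the code is linked in the processing steps above.
- Filtering: mathematical questions were selected with a regex-based detector; the code is at the same link.
- Answer selection: for each question, two answers are extracted, the highest-scored and the lowest-scored. The highest-scored answer is used as the preferred response and the lowest-scored answer as the rejected response.
Scoring
- The score of an answer is computed as:
score = log2(1 + upvotes), rounded to the nearest integer, plus 1 if the answer was accepted by the questioner; the score is -1 if the number of upvotes is negative.
License
- This dataset is released under the Apache-2.0 license.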



