SakanaAI/gsm8k-ja-test_250-1319
Source: Hugging Face. Updated 2024-05-14; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/SakanaAI/gsm8k-ja-test_250-1319
Resource description:
---
license: apache-2.0
---
# gsm8k-ja-test_250-1319
This dataset contains 1069 Japanese math problems and their solutions. It was used for optimizing LLMs in the paper "[Evolutionary Optimization of Model Merging Recipes](https://arxiv.org/abs/2403.13187)".
## Dataset Details
This dataset contains Japanese translations of 1069 math problems and solutions from the [GSM8K](https://huggingface.co/datasets/gsm8k) test set,
starting from the 251st example out of 1319.
The translation was done using `gpt-4-0125-preview`.
We did not use the first 250 examples because they are part of the [MGSM](https://huggingface.co/datasets/juletxara/mgsm) dataset.
MGSM is a well-known multilingual version of GSM8K that includes translations of the first 250 samples from the GSM8K test set.
Because we planned to use MGSM for the final evaluations,
we translated only the remaining 1069 GSM8K test samples, which do not appear in MGSM, to avoid any overlap.
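
The index arithmetic behind this split can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the constant and function names (`GSM8K_TEST_SIZE`, `MGSM_OVERLAP`, `split_test_indices`) are hypothetical, and it assumes the GSM8K test set is kept in its original order.

```python
# Sizes taken from the description above; the names are illustrative only.
GSM8K_TEST_SIZE = 1319  # total examples in the GSM8K test set
MGSM_OVERLAP = 250      # leading examples already translated in MGSM


def split_test_indices(total=GSM8K_TEST_SIZE, overlap=MGSM_OVERLAP):
    """Return (mgsm_indices, remaining_indices) as 0-based ranges.

    Assumes the test set is in its original order, so the MGSM overlap
    is exactly the first `overlap` examples.
    """
    if not 0 <= overlap <= total:
        raise ValueError("overlap must be between 0 and total")
    return range(0, overlap), range(overlap, total)


mgsm_idx, remaining_idx = split_test_indices()

# 250 examples are held out for MGSM; 1069 remain for this dataset.
print(len(mgsm_idx), len(remaining_idx))

# 1-based positions of the first and last remaining examples: 251 and 1319.
print(remaining_idx[0] + 1, remaining_idx[-1] + 1)
```

Keeping the two index ranges disjoint is what ensures that problems used during optimization never reappear in the MGSM evaluation.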
### Source Data
* [GSM8K](https://huggingface.co/datasets/gsm8k)
### Models
* [SakanaAI/EvoLLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-7B)
* [SakanaAI/EvoLLM-JP-A-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-A-v1-7B)
* [SakanaAI/EvoLLM-JP-v1-10B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B)
## Citation
```
@article{DBLP:journals/corr/abs-2110-14168,
author = {Karl Cobbe and
Vineet Kosaraju and
Mohammad Bavarian and
Mark Chen and
Heewoo Jun and
Lukasz Kaiser and
Matthias Plappert and
Jerry Tworek and
Jacob Hilton and
Reiichiro Nakano and
Christopher Hesse and
John Schulman},
title = {Training Verifiers to Solve Math Word Problems},
journal = {CoRR},
volume = {abs/2110.14168},
year = {2021},
url = {https://arxiv.org/abs/2110.14168},
eprinttype = {arXiv},
eprint = {2110.14168},
timestamp = {Mon, 12 Jun 2023 08:23:44 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2110-14168.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@article{DBLP:journals/corr/abs-2403-13187,
author = {Takuya Akiba and
Makoto Shing and
Yujin Tang and
Qi Sun and
David Ha},
title = {Evolutionary Optimization of Model Merging Recipes},
journal = {CoRR},
volume = {abs/2403.13187},
year = {2024},
url = {https://doi.org/10.48550/arXiv.2403.13187},
doi = {10.48550/ARXIV.2403.13187},
eprinttype = {arXiv},
eprint = {2403.13187},
timestamp = {Mon, 08 Apr 2024 18:24:51 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2403-13187.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```
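Provider:
SakanaAI
Summary of original information
Dataset overview
Dataset name
gsm8k-ja-test_250-1319
Dataset contents
This dataset contains 1069 Japanese math problems and their solutions.
Dataset usage
Used for optimizing LLMs in the paper "Evolutionary Optimization of Model Merging Recipes".
Dataset details
- Problem source: 1069 samples from the GSM8K test set, starting from the 251st sample.
- Translation tool: translated with gpt-4-0125-preview.
- Excluded samples: the first 250 samples were not used because they are already included in the MGSM dataset.
Data source
- Original dataset: GSM8K
Related models
- SakanaAI/EvoLLM-JP-v1-7B
- SakanaAI/EvoLLM-JP-A-v1-7B
- SakanaAI/EvoLLM-JP-v1-10B
License
apache-2.0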



