OpenLLM-Ro/ro_hellaswag
Source: Hugging Face · Updated: 2024-08-09 · Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/OpenLLM-Ro/ro_hellaswag
Resource description:
---
license: cc-by-nc-4.0
language:
- ro
---
### Dataset Description
[HellaSwag](https://arxiv.org/abs/1905.07830) is a commonsense inference challenge dataset.
Here we provide the Romanian translation of HellaSwag from the paper *"Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback"* ([Lai et al., 2023](https://arxiv.org/abs/2307.16039)).
This dataset is used as a benchmark and is part of the evaluation protocol for Romanian LLMs proposed in *"Vorbeşti Româneşte?" A Recipe to Train Powerful Romanian LLMs with English Instructions* ([Masala et al., 2024](https://arxiv.org/abs/2406.18266)).
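
As a rough illustration of how a HellaSwag-style benchmark is typically scored, the sketch below picks the highest-scoring ending for each context and reports accuracy. The field names `ctx`, `endings`, and `label` follow the original English HellaSwag and are an assumption here; the Romanian release may use different column names. The two items are invented for this example and are not taken from the dataset, and `word_overlap` is a toy stand-in for a real model's scoring function.

```python
from typing import Any, Callable, Dict, List, Sequence

# Assumed item layout, modelled on the original English HellaSwag.
# Check the actual column names of OpenLLM-Ro/ro_hellaswag before relying on it.
Item = Dict[str, Any]
ScoreFn = Callable[[str, str], float]


def pick_ending(ctx: str, endings: Sequence[str], score_fn: ScoreFn) -> int:
    """Return the index of the ending that score_fn rates highest for ctx."""
    scores = [score_fn(ctx, ending) for ending in endings]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(items: List[Item], score_fn: ScoreFn) -> float:
    """Fraction of items where the top-scored ending matches the gold label."""
    correct = 0
    for item in items:
        predicted = pick_ending(item["ctx"], item["endings"], score_fn)
        correct += int(predicted == int(item["label"]))
    return correct / len(items)


def word_overlap(ctx: str, ending: str) -> float:
    """Toy scorer (hypothetical): number of whitespace tokens shared with ctx."""
    return float(len(set(ctx.lower().split()) & set(ending.lower().split())))


# Invented examples for illustration only; not drawn from the dataset.
toy_items: List[Item] = [
    {
        "ctx": "O femeie toarnă apă într-un pahar.",
        "endings": [
            "Ea aleargă pe stadion.",
            "Apoi bea apă din pahar.",
            "Câinele latră la poștaș.",
            "El repară o bicicletă.",
        ],
        "label": "1",
    },
    {
        "ctx": "Un copil construiește un castel de nisip.",
        "endings": [
            "Apoi pune un steag în vârf.",
            "Un castel de nisip cântă la pian.",
            "Pisica doarme pe canapea.",
            "Ea citește o carte.",
        ],
        "label": "0",
    },
]

if __name__ == "__main__":
    # The overlap scorer gets the first item right and is fooled by the second.
    print(accuracy(toy_items, word_overlap))  # 0.5
```

The second item shows why surface overlap is a weak baseline: the implausible ending repeats words from the context and outscores the sensible one. To evaluate on the real data, the items would presumably be loaded with the Hugging Face `datasets` library using the repository ID `OpenLLM-Ro/ro_hellaswag`, after confirming the split and column names in the dataset viewer.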
## Citation
```bibtex
@article{dac2023okapi,
title={Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback},
author={Dac Lai, Viet and Van Nguyen, Chien and Ngo, Nghia Trung and Nguyen, Thuat and Dernoncourt, Franck and Rossi, Ryan A and Nguyen, Thien Huu},
journal={arXiv e-prints},
pages={arXiv--2307},
year={2023}
}
```
```bibtex
@inproceedings{zellers2019hellaswag,
title={HellaSwag: Can a Machine Really Finish Your Sentence?},
author={Zellers, Rowan and Holtzman, Ari and Bisk, Yonatan and Farhadi, Ali and Choi, Yejin},
booktitle={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
year={2019}
}
```
```bibtex
@article{masala2024vorbecstiromanecsterecipetrain,
title={"Vorbe\c{s}ti Rom\^ane\c{s}te?" A Recipe to Train Powerful Romanian LLMs with English Instructions},
author={Mihai Masala and Denis C. Ilie-Ablachim and Alexandru Dima and Dragos Corlatescu and Miruna Zavelca and Ovio Olaru and Simina Terian and Andrei Terian and Marius Leordeanu and Horia Velicu and Marius Popescu and Mihai Dascalu and Traian Rebedea},
year={2024},
eprint={2406.18266},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider:
OpenLLM-Ro



