saillab/alpaca-uzbek-cleaned
Source: Hugging Face. Updated 2024-09-20; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/saillab/alpaca-uzbek-cleaned
---
language:
- uz
pretty_name: Uzbek alpaca-52k
size_categories:
- 100K<n<1M
---
This repository contains the dataset used for the TaCo paper.
Please refer to the paper for more details: [OpenReview](https://openreview.net/forum?id=02MLWBj8HP)
If you have used our dataset, please cite it as follows:
**Citation**
```
@inproceedings{upadhayay2024taco,
title={TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in {LLM}s through Translation-Assisted Chain-of-Thought Processes},
author={Bibek Upadhayay and Vahid Behzadan},
booktitle={5th Workshop on practical ML for limited/low resource settings, ICLR},
year={2024},
url={https://openreview.net/forum?id=02MLWBj8HP}
}
```
This dataset was produced by translating the original [Alpaca-52K](https://github.com/tatsu-lab/stanford_alpaca?tab=readme-ov-file#data-release) dataset into Uzbek with Google Translate.
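The following is a minimal sketch of how the data might be loaded and turned into prompts. It assumes the Hugging Face `datasets` library is installed, that the repository exposes a `train` split, and that records keep the `instruction`, `input`, and `output` fields of the original Alpaca-52K release; none of these are confirmed by this card, so check the dataset viewer before relying on them. The helper names `load_uzbek_alpaca` and `format_record` are hypothetical, and the prompt layout is a simplified Alpaca-style template rather than the format used in the TaCo paper.

```python
from typing import Dict, Optional


def load_uzbek_alpaca(split: str = "train"):
    """Load the dataset from the Hugging Face Hub.

    Assumptions: `pip install datasets`, network access, and a split
    named "train" (the split name is not stated in the dataset card).
    """
    # Imported lazily so format_record() can be used without `datasets`.
    from datasets import load_dataset

    return load_dataset("saillab/alpaca-uzbek-cleaned", split=split)


def format_record(record: Dict[str, Optional[str]]) -> str:
    """Render one record as a simplified Alpaca-style prompt.

    Assumes Alpaca field names: "instruction", "input", "output".
    The "input" section is omitted when it is empty or missing.
    """
    instruction = (record.get("instruction") or "").strip()
    context = (record.get("input") or "").strip()
    response = (record.get("output") or "").strip()

    parts = [f"### Instruction:\n{instruction}"]
    if context:
        parts.append(f"### Input:\n{context}")
    parts.append(f"### Response:\n{response}")
    return "\n\n".join(parts)
```

With network access, `format_record(load_uzbek_alpaca()[0])` would render the first record, provided the field names match the assumptions above.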
**Copyright and Intended Use**
This dataset is released under CC BY-NC and is intended for academic and research purposes only. Before using it for any purpose other than research, please review the licenses and terms and conditions of Alpaca-52K, Dolly-15K, and Google Cloud Translation.
Provider:
saillab



