
the-cramer-project/kyrgyz-alpaca

Source: Hugging Face. Updated 2024-03-24. Indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/the-cramer-project/kyrgyz-alpaca
Resource description:

---
license: cc-by-nc-4.0
language:
- ky
---

# Kyrgyz Alpaca

This repo is made for research use only, i.e., it cannot be used for commercial purposes or entertainment.

## References

All of our achievements were made possible thanks to the robust AI community in Kyrgyzstan and the contributions made by individuals within the AkylAI project (by TheCramer.com). We also express our gratitude to Stanford for their outstanding efforts, and we extend the accessibility of this dataset to a global audience.

## Dataset

Kyrgyz Alpaca can also be downloaded from [here](https://drive.google.com/file/d/1ohiBSoyRxrUpFNRDLKknTn6dLgXFtsVV/view?usp=sharing).

We used ChatGPT and Google Translate to convert [alpaca_data.json](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json) into Kyrgyz. Although the translation wasn't perfect, we found it to strike a reasonable balance between cost and quality. The total cost of translating the entire dataset into Kyrgyz was approximately $700.00. If you're interested in learning more about the dataset's creation process, you can visit [the Stanford Alpaca page](https://github.com/tatsu-lab/stanford_alpaca).

## Next

We are working with Kyrgyz linguists to improve the quality of the translation. Please feel free to reach out to timur.turat@gmail.com if you are interested in any form of collaboration!

## Citation

If you use the data or code from this repo, please cite this repo as follows:

```
@misc{kyrgyz-alpaca,
  author = {Khakim Davurov, Timur Turatali, Ulan Abdurazakov},
  title = {Kyrgyz Alpaca: Models and Datasets},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Akyl-AI/kyrgyz-alpaca}},
}
```
Provided by:
the-cramer-project
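Summary of original information

Dataset overview

Dataset name

Kyrgyz Alpaca

Dataset usage

For research use only; it may not be used for commercial purposes or entertainment.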

Dataset language

  • Kyrgyz (ky)

Dataset access

Available for download via the following link: Kyrgyz Alpaca

Dataset translation

ChatGPT and Google Translate were used to translate alpaca_data.json into Kyrgyz. The translation cost approximately $700.00.
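
Because the dataset is a translation of alpaca_data.json, its records plausibly keep the Stanford Alpaca layout: a JSON array of objects with `instruction`, `input`, and `output` fields. The sketch below rests on that assumption, which this page does not confirm. It parses such records and builds a prompt string from each one. The sample records, the helper names `load_records` and `build_prompt`, and the simplified prompt template are all hypothetical and are not taken from the repository.

```python
import json

# Hypothetical placeholder records in the Stanford Alpaca layout.
# Assumption: the Kyrgyz file keeps the same field names; verify
# against the downloaded data before relying on this.
SAMPLE_JSON = """
[
  {"instruction": "Give three tips for staying healthy.",
   "input": "",
   "output": "<translated answer text>"},
  {"instruction": "Translate the word into Kyrgyz.",
   "input": "book",
   "output": "<translated answer text>"}
]
"""


def load_records(text: str) -> list:
    """Parse a JSON array and check that each record has the expected fields."""
    records = json.loads(text)
    required = {"instruction", "output"}
    for i, rec in enumerate(records):
        missing = required - rec.keys()
        if missing:
            raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
    return records


def build_prompt(record: dict) -> str:
    """Join the instruction and the optional input into one prompt string.

    This is a simplified template for illustration, not the official
    Stanford Alpaca prompt.
    """
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    if context:
        return f"Instruction: {instruction}\nInput: {context}\nResponse:"
    return f"Instruction: {instruction}\nResponse:"


records = load_records(SAMPLE_JSON)
prompts = [build_prompt(r) for r in records]
```

To try this on the real data, read the downloaded file as UTF-8 text and pass its contents to `load_records` in place of `SAMPLE_JSON`. If the field names differ, the validation step raises a `ValueError` that lists the missing fields.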

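Dataset improvement

The authors are working with Kyrgyz linguists to improve translation quality.

Dataset citation

If you use this dataset, please cite it as follows: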

@misc{kyrgyz-alpaca,
  author = {Khakim Davurov, Timur Turatali, Ulan Abdurazakov},
  title = {Kyrgyz Alpaca: Models and Datasets},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Akyl-AI/kyrgyz-alpaca}},
}
